feat(ocsf): create openshell-ocsf crate — standalone OCSF event types, formatters, and tracing layers #392

@johntmyers

Problem Statement

The sandbox supervisor emits 123 log statements across 18 source files using ad-hoc tracing::info!()/warn!() macros with inconsistent field names, no event classification, and no machine-readable output. There is no schema governing the log events, making it impossible to reliably filter, correlate, alert on, or export them to SIEMs.

We need a dedicated crate that implements the OCSF v1.7.0 event model for all sandbox log events, providing typed event structs, dual-format output (human-readable shorthand + JSONL), and schema-validated testing — all independently buildable and testable without modifying the sandbox supervisor.

Proposed Design

Create a new openshell-ocsf crate at crates/openshell-ocsf/ that owns all OCSF logic. The sandbox (Part 2, separate issue) will later depend on this crate and use its builders to construct events. This issue covers everything that can be built and tested standalone.

The full design is documented in .opencode/plans/ocsf-log-export.md. Key sections: "The openshell-ocsf Crate", "Shorthand Format Design", "Event Class Mapping", "Vendored Schema for Test Validation".

Crate Scope

  • 8 OCSF event classes: Network Activity [4001], HTTP Activity [4002], SSH Activity [4007], Process Activity [1007], Detection Finding [2004], Application Lifecycle [6002], Device Config State Change [5019], Base Event [0]
  • 11 enum types: SeverityId, StatusId, ActionId, DispositionId, ActivityId, StateId, AuthTypeId, LaunchTypeId, SecurityLevelId, ConfidenceId, RiskLevelId
  • 20 object types: Metadata, Product, Endpoint, Process, Actor, Container, Image, Device, OsInfo, FirewallRule, FindingInfo, Evidence, Remediation, HttpRequest, HttpResponse, Url, Attack, Technique, Tactic, ConnectionInfo
  • 8 builders: One per event class with SandboxContext for shared metadata
  • Dual formatters: format_shorthand() (single-line human-readable) and to_json()/to_json_line() (OCSF JSONL)
  • Tracing layers: OcsfShorthandLayer and OcsfJsonlLayer for subscriber integration
  • Vendored schemas: OCSF v1.7.0 JSON schemas (8 classes + 17 objects) for offline test validation
  • ocsf_emit! macro: Thin wrapper for emitting events through the tracing system

Module Structure

crates/openshell-ocsf/
├── Cargo.toml
├── src/
│   ├── lib.rs                      # Re-exports, OcsfEvent enum
│   ├── events/                     # Per-class event structs (8 files)
│   ├── objects/                    # Shared OCSF object types (10 files)
│   ├── enums/                      # All OCSF enum types (8 files)
│   ├── builders/                   # SandboxContext + 8 per-class builders
│   ├── format/                     # shorthand.rs + jsonl.rs
│   ├── tracing/                    # Layers + event bridge + ocsf_emit! macro
│   └── validation/                 # Schema validation utilities (test-only)
├── schemas/ocsf/v1.7.0/           # Vendored OCSF schemas (25 files)
└── tests/                          # Integration tests

Shorthand Format

Single-line human-readable format derived from OCSF events:

<HH:MM:SS.mmm> <severity> <CLASS:ACTIVITY> <action> <key fields> [context]

Examples:

14:00:00.000 I NET:OPEN ALLOW python3(42) -> api.example.com:443 [policy:default-egress engine:mechanistic]
14:00:01.000 I HTTP:GET ALLOW curl(88) -> GET https://api.example.com/v1/data [policy:default-egress]
14:00:02.000 I SSH:OPEN ALLOW 10.42.0.1:48201 [auth:NSSH1]
14:00:03.000 I PROC:LAUNCH python3(42) [cmd:python3 /app/main.py]
14:00:04.000 H FINDING:BLOCKED "NSSH1 Nonce Replay Attack" [confidence:high]
14:00:00.000 I LIFECYCLE:START openshell-sandbox success
14:00:10.000 I CONFIG:LOADED policy reloaded [version:v3 hash:sha256:abc123def456]
14:00:00.000 I EVENT Network namespace created [ns:openshell-sandbox-abc123]

Order of Battle

Each step depends on prior steps unless noted. No openshell-sandbox code is modified in this issue.

Step 1: Crate scaffolding (~0.5 day)

  • Create crates/openshell-ocsf/ directory with full module structure (empty files with mod declarations)
  • Create Cargo.toml with dependencies: serde, serde_json, tracing, tracing-subscriber, chrono (all already in workspace)
  • Add openshell-ocsf to workspace Cargo.toml members list
  • Create src/lib.rs with crate-level docs and placeholder re-exports
  • Create empty module files: events/mod.rs, objects/mod.rs, enums/mod.rs, builders/mod.rs, format/mod.rs, tracing/mod.rs, validation/mod.rs

Done when: cargo check -p openshell-ocsf compiles with zero errors and zero warnings. No functional code yet.

Step 2: Vendor OCSF schemas (~0.5 day)

  • Fetch v1.7.0 class schemas (8 classes) and object schemas (17 objects) from schema.ocsf.io/api/1.7.0/
  • Place in crates/openshell-ocsf/schemas/ocsf/v1.7.0/classes/ and objects/
  • Create schemas/ocsf/v1.7.0/VERSION containing 1.7.0
  • Create schemas/ocsf/README.md documenting provenance, fetch date, and upgrade procedure
  • Create mise task ocsf:update-schema that re-fetches schemas for a given version
  • Commit all schema JSON files to the repository

Done when: All 25 schema JSON files exist and are valid JSON, and the VERSION file reads 1.7.0. Running mise run ocsf:update-schema -- 1.7.0 succeeds and reproduces the committed files unchanged.

Schema files to vendor:

Classes (8): network_activity, http_activity, ssh_activity, process_activity, detection_finding, application_lifecycle, device_config_state_change, base_event

Objects (17): metadata, network_endpoint, network_proxy, process, actor, device, container, product, firewall_rule, finding_info, evidences, http_request, http_response, url, attack, remediation, connection_info

Step 3: Schema validation utilities (~0.5 day)

  • Implement validation/schema.rs with:
    • load_class_schema(class: &str) -> Value — loads vendored class schema by name
    • validate_required_fields(event: &Value, schema: &Value) — asserts all required fields present
    • validate_enum_value(event: &Value, field: &str, schema: &Value) — asserts enum values are valid
  • Gate behind #[cfg(test)] — test-only utilities
  • Add unit tests loading each vendored class schema and verifying structure is parseable

Done when: cargo test -p openshell-ocsf validation passes. Each of the 8 class schemas loads successfully. validate_required_fields correctly identifies missing required fields in a synthetic event. validate_enum_value correctly rejects invalid enum values.

Depends on: Steps 1, 2.
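
As a rough illustration of the two checks described in Step 3, the sketch below uses only the standard library. The Fields alias, missing_required_fields, enum_value_is_valid, and synthetic_event are hypothetical stand-ins; the planned utilities take serde_json::Value and read the required fields and allowed values from the vendored schema files rather than from slices.

```rust
use std::collections::BTreeMap;

/// Stand-in for a parsed JSON object (field name -> integer value).
/// Assumption: the real utilities operate on `serde_json::Value` instead.
pub type Fields = BTreeMap<String, i64>;

/// Returns the names from `required` that are absent from `event`.
pub fn missing_required_fields(event: &Fields, required: &[&str]) -> Vec<String> {
    required
        .iter()
        .filter(|name| !event.contains_key(**name))
        .map(|name| name.to_string())
        .collect()
}

/// Returns true when `field` is present and its value is one of `allowed`.
pub fn enum_value_is_valid(event: &Fields, field: &str, allowed: &[i64]) -> bool {
    event.get(field).map_or(false, |v| allowed.contains(v))
}

/// Hypothetical synthetic event: no `time` field and an out-of-range severity.
pub fn synthetic_event() -> Fields {
    let mut e = Fields::new();
    e.insert("class_uid".to_string(), 4001);
    e.insert("severity_id".to_string(), 7); // 7 is not a listed SeverityId value
    e
}
```

A test built on this shape would assert that the missing list contains time and that severity_id 7 is rejected against the 0-6, 99 range.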

Step 4: Core types — enums (~1 day)

  • Implement all OCSF enum types with Serialize/Deserialize derives and integer representation:
    • SeverityId (0-6, 99) — Unknown, Informational, Low, Medium, High, Critical, Fatal, Other
    • StatusId (0-2, 99) — Unknown, Success, Failure, Other
    • ActionId (0-4, 99) — Unknown, Allowed, Denied, ...
    • DispositionId (0-27, 99) — Unknown, Allowed, Blocked, ... Error, ...
    • ActivityId — per-class variants (separate enum types or unified with class context)
    • StateId (0-2, 99) — Unknown, Disabled, Enabled, Other
    • AuthTypeId (0-6, 99) — Unknown, Certificate Based, GSSAPI, Host Based, Keyboard Interactive, Password, Public Key, Other
    • LaunchTypeId (0-3, 99) — Unknown, Spawn, Fork, Exec, Other
    • SecurityLevelId (0-3, 99) — Unknown, Secure, At Risk, Compromised, Other
    • ConfidenceId (0-3, 99) — Unknown, Low, Medium, High, Other
    • RiskLevelId (0-4, 99): Info, Low, Medium, High, Critical, Other
  • Each enum serializes to its integer value in JSON and has fn label(&self) -> &str returning the OCSF string label

Done when: All enum types compile, serialize to correct integer values, and each enum value validates against the corresponding vendored schema enum definition. Unit tests cover every variant of every enum.

Depends on: Steps 1, 3 (uses validation utilities in tests).
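
One possible shape for an enum from Step 4 is sketched below for SeverityId, using the values listed above. It is dependency-free: as_u8 and from_u8 are hypothetical helpers standing in for the Serialize/Deserialize integer representation, which the crate would presumably get from serde.

```rust
/// Sketch of an OCSF severity enum with an explicit integer representation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum SeverityId {
    Unknown = 0,
    Informational = 1,
    Low = 2,
    Medium = 3,
    High = 4,
    Critical = 5,
    Fatal = 6,
    Other = 99,
}

impl SeverityId {
    /// Integer written to the `severity_id` field (hypothetical helper).
    pub fn as_u8(self) -> u8 {
        self as u8
    }

    /// OCSF string label written to the sibling `severity` field.
    pub fn label(&self) -> &'static str {
        match self {
            SeverityId::Unknown => "Unknown",
            SeverityId::Informational => "Informational",
            SeverityId::Low => "Low",
            SeverityId::Medium => "Medium",
            SeverityId::High => "High",
            SeverityId::Critical => "Critical",
            SeverityId::Fatal => "Fatal",
            SeverityId::Other => "Other",
        }
    }

    /// Inverse of `as_u8`; returns `None` for integers outside 0-6 and 99.
    pub fn from_u8(value: u8) -> Option<Self> {
        Some(match value {
            0 => SeverityId::Unknown,
            1 => SeverityId::Informational,
            2 => SeverityId::Low,
            3 => SeverityId::Medium,
            4 => SeverityId::High,
            5 => SeverityId::Critical,
            6 => SeverityId::Fatal,
            99 => SeverityId::Other,
            _ => return None,
        })
    }
}
```

Keeping the integer mapping in one place makes the "every variant of every enum" unit tests a simple loop over from_u8 and as_u8.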

Step 5: Core types — objects (~1-1.5 days)

  • Implement all OCSF object types with Serialize/Deserialize:
    • Metadata, Product — with profiles array, uid, version, log_source
    • Endpoint — with fn domain(name, port), fn ip(addr, port), fn domain_or_ip(&self) -> String
    • Process, Actor (Process has an optional parent_process: Option<Box<Process>> for the ancestor chain)
    • Container, Image
    • Device, OsInfo
    • FirewallRule: name, type
    • FindingInfo, Evidence, Remediation: FindingInfo has uid, title, desc; Remediation has desc
    • HttpRequest, HttpResponse, Url: HttpRequest has http_method, url; Url has scheme, hostname, path, port
    • Attack, Technique, Tactic — with Attack::mitre(technique_uid, name, tactic_uid, name) convenience constructor
    • ConnectionInfo: protocol_name
  • All fields use #[serde(skip_serializing_if = "Option::is_none")] for optional OCSF fields

Done when: All object types compile, serialize to correct JSON structure, and unit tests verify field names match the vendored object schemas. Each object has at least one construction + serialization test.

Depends on: Steps 1, 4 (objects reference enums).
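
The Endpoint constructors and the Process ancestor chain from Step 5 might look roughly like the sketch below. The field layout, the empty-string fallback in domain_or_ip, and the ancestor_names helper are assumptions made for illustration. The sketch also uses Option<Box<Process>> for the parent link, which is the form that works with Option::is_none skipping; serde derives are left out to keep it dependency-free.

```rust
use std::net::IpAddr;

/// Sketch of the Endpoint object; every field is optional as in OCSF.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct Endpoint {
    pub domain: Option<String>,
    pub ip: Option<String>,
    pub port: Option<u16>,
}

impl Endpoint {
    /// Endpoint identified by a domain name.
    pub fn domain(name: &str, port: u16) -> Self {
        Endpoint { domain: Some(name.to_string()), ip: None, port: Some(port) }
    }

    /// Endpoint identified by an IP address.
    pub fn ip(addr: IpAddr, port: u16) -> Self {
        Endpoint { domain: None, ip: Some(addr.to_string()), port: Some(port) }
    }

    /// Prefers the domain, then the IP; empty string if neither is set (assumption).
    pub fn domain_or_ip(&self) -> String {
        self.domain.clone().or_else(|| self.ip.clone()).unwrap_or_default()
    }
}

/// Sketch of the Process object with a boxed parent for the ancestor chain.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Process {
    pub name: String,
    pub pid: u32,
    pub parent_process: Option<Box<Process>>,
}

impl Process {
    /// Names of all ancestors, nearest parent first (hypothetical helper).
    pub fn ancestor_names(&self) -> Vec<&str> {
        let mut names = Vec::new();
        let mut current = self.parent_process.as_deref();
        while let Some(parent) = current {
            names.push(parent.name.as_str());
            current = parent.parent_process.as_deref();
        }
        names
    }
}
```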

Step 6: Event structs (~1-1.5 days)

  • Implement BaseEventData with all OCSF base event fields (class_uid, class_name, category_uid, category_name, activity_id, activity_name, type_uid, type_name, time, severity_id, severity, status_id, status, message, metadata, device, container, unmapped)
  • Implement all 8 event class structs, each embedding BaseEventData via #[serde(flatten)]:
    • NetworkActivityEvent [4001] — adds src_endpoint, dst_endpoint, proxy_endpoint, actor, firewall_rule, connection_info, action_id, action, disposition_id, disposition, observation_point_id, is_src_dst_assignment_known
    • HttpActivityEvent [4002] — adds http_request, http_response, src_endpoint, dst_endpoint, proxy_endpoint, actor, firewall_rule
    • SshActivityEvent [4007] — adds src_endpoint, dst_endpoint, auth_type_id, auth_type, protocol_ver, actor
    • ProcessActivityEvent [1007] — adds process, actor, launch_type_id, launch_type, exit_code
    • DetectionFindingEvent [2004] — adds finding_info, evidences, attacks, remediation, is_alert, confidence_id, confidence, risk_level_id, risk_level
    • ApplicationLifecycleEvent [6002] — adds app (Product)
    • DeviceConfigStateChangeEvent [5019] — adds state_id, state, security_level_id, security_level, prev_security_level_id, prev_security_level
    • BaseEvent [0] — just BaseEventData
  • Implement OcsfEvent enum with variants for all 8 classes
  • Implement type_uid auto-computation: class_uid * 100 + activity_id

Done when: All 8 event structs compile and serialize to JSON with correct class_uid, category_uid, type_uid, and type_name. At least one test per class verifies the serialized JSON validates against the vendored class schema (required fields present, enum values valid).

Depends on: Steps 4, 5 (events reference enums and objects).
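
The type_uid rule from Step 6 is small enough to show directly. In the sketch below, type_uid follows the formula stated above; category_uid is a hypothetical helper based on the observation that, for the eight classes in scope, the category is the thousands part of class_uid.

```rust
/// type_uid = class_uid * 100 + activity_id, as stated in Step 6.
pub fn type_uid(class_uid: u64, activity_id: u64) -> u64 {
    class_uid * 100 + activity_id
}

/// Hypothetical helper: category derived from the thousands part of class_uid.
/// This holds for the eight classes listed in this issue (for example 4001 -> 4).
pub fn category_uid(class_uid: u64) -> u64 {
    class_uid / 1000
}
```

For example, Network Activity [4001] with activity_id 1 yields type_uid 400101, and Base Event [0] with activity_id 0 yields 0.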

Step 7: JSONL serializer (~0.5-1 day)

  • Implement to_json(&self) -> serde_json::Value and to_json_line(&self) -> String on OcsfEvent
  • to_json() returns the full OCSF JSON object
  • to_json_line() returns to_json() serialized as a single line (no pretty-printing) with trailing newline
  • Ensure #[serde(skip_serializing_if)] is correctly applied so absent optional fields are omitted (not null)

Done when: Every event class has at least one test that: (a) serializes to JSON via to_json(), (b) validates against the vendored schema with validate_required_fields(), (c) validates all enum fields with validate_enum_value(), (d) verifies to_json_line() is a single line ending in \n and parses back to the same JSON value.

Depends on: Steps 3, 6 (uses validation utilities against event structs).

Step 8: Shorthand formatter (~1 day)

  • Implement format_shorthand(&self) -> String on OcsfEvent with per-class templates:
    • NET:<activity> <action> <process>(<pid>) -> <dst>:<port> [policy:<rule> engine:<engine>]
    • HTTP:<method> <action> <process>(<pid>) -> <method> <url> [policy:<rule>]
    • SSH:<activity> <action> <peer> [auth:<auth_type>]
    • PROC:<activity> <process>(<pid>) [exit:<code>] [cmd:<cmdline>]
    • FINDING:<disposition> "<title>" [confidence:<level>]
    • LIFECYCLE:<activity> <app> <status>
    • CONFIG:<state> <what> [version:<ver> hash:<hash>]
    • EVENT <message> [<key fields>]
    • Implement format_ts(time_ms: i64) -> String, which renders the time of day as HH:MM:SS.mmm (ISO 8601 time, no date)
    • Implement severity_char(severity_id: SeverityId) -> char, returning I, L, M, H, C, or F for Informational through Fatal
  • Add snapshot tests (using insta or inline expected strings) for every class variant. At least 2 snapshots per class (common case + edge case)

Done when: format_shorthand() produces correct output for all 8 event classes. At least 16 snapshot tests pass (2 per class). Shorthand output is deterministic (same input → same output).

Depends on: Step 6 (formats event structs).
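
A minimal sketch of the Step 8 helpers and the NET template follows. It assumes time_ms is milliseconds since the Unix epoch rendered in UTC, takes the severity as a raw u8 to stay self-contained, and uses '?' as a placeholder for Unknown and Other, whose character is not given here. NetLine is a hypothetical flat struct standing in for NetworkActivityEvent.

```rust
/// Renders the time-of-day part of an epoch-millisecond timestamp (UTC assumed).
pub fn format_ts(time_ms: i64) -> String {
    const MS_PER_DAY: i64 = 86_400_000;
    let ms_of_day = time_ms.rem_euclid(MS_PER_DAY);
    let h = ms_of_day / 3_600_000;
    let m = (ms_of_day / 60_000) % 60;
    let s = (ms_of_day / 1_000) % 60;
    let ms = ms_of_day % 1_000;
    format!("{h:02}:{m:02}:{s:02}.{ms:03}")
}

/// Maps severity_id 1-6 to I, L, M, H, C, F; '?' is a placeholder (assumption).
pub fn severity_char(severity_id: u8) -> char {
    match severity_id {
        1 => 'I',
        2 => 'L',
        3 => 'M',
        4 => 'H',
        5 => 'C',
        6 => 'F',
        _ => '?',
    }
}

/// Hypothetical flat view of the fields the NET template needs.
pub struct NetLine<'a> {
    pub time_ms: i64,
    pub severity_id: u8,
    pub activity: &'a str,
    pub action: &'a str,
    pub process: &'a str,
    pub pid: u32,
    pub dst: &'a str,
    pub port: u16,
    pub policy: &'a str,
    pub engine: &'a str,
}

impl NetLine<'_> {
    /// NET:<activity> <action> <process>(<pid>) -> <dst>:<port> [policy:<rule> engine:<engine>]
    pub fn format_shorthand(&self) -> String {
        format!(
            "{} {} NET:{} {} {}({}) -> {}:{} [policy:{} engine:{}]",
            format_ts(self.time_ms),
            severity_char(self.severity_id),
            self.activity,
            self.action,
            self.process,
            self.pid,
            self.dst,
            self.port,
            self.policy,
            self.engine
        )
    }
}
```

Because the function depends only on its inputs, the same event always yields the same line, which is what the snapshot tests rely on.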

Step 9: Round-trip tests (~0.5 day)

  • For each event class, verify consistency between shorthand and JSON representations:
    • Shorthand class prefix (NET, HTTP, SSH, etc.) matches JSON class_uid
    • Shorthand activity (OPEN, GET, LAUNCH, etc.) matches JSON activity_name
    • Shorthand action (ALLOW, DENY, etc.) matches JSON action (when present)
    • Shorthand severity char matches JSON severity_id
  • Add tests for dual-emit events: BYPASS_DETECT produces one Network Activity + one Detection Finding, both with consistent data fields
  • NSSH1 replay dual-emit consistency test

Done when: At least one round-trip test per event class passes. Dual-emit consistency tests for BYPASS_DETECT and NSSH1 replay pass.

Depends on: Steps 7, 8 (uses both formatters).
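
The first consistency check in Step 9 could be expressed along these lines. The prefix table is taken from the shorthand examples and class UIDs in this issue; class_prefix, shorthand_prefix, and prefix_matches are hypothetical helper names, and the parser assumes the class token is always the third whitespace-separated field.

```rust
/// Shorthand class prefix for each class_uid in scope.
pub fn class_prefix(class_uid: u32) -> Option<&'static str> {
    match class_uid {
        4001 => Some("NET"),
        4002 => Some("HTTP"),
        4007 => Some("SSH"),
        1007 => Some("PROC"),
        2004 => Some("FINDING"),
        6002 => Some("LIFECYCLE"),
        5019 => Some("CONFIG"),
        0 => Some("EVENT"),
        _ => None,
    }
}

/// Extracts the class prefix from a shorthand line: the third token, up to ':'.
pub fn shorthand_prefix(line: &str) -> Option<&str> {
    let token = line.split_whitespace().nth(2)?;
    Some(token.split(':').next().unwrap_or(token))
}

/// True when the shorthand prefix agrees with the JSON class_uid.
pub fn prefix_matches(line: &str, class_uid: u32) -> bool {
    match (shorthand_prefix(line), class_prefix(class_uid)) {
        (Some(found), Some(expected)) => found == expected,
        _ => false,
    }
}
```

The same pattern extends to the activity, action, and severity checks by parsing the remaining tokens.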

Step 10: SandboxContext + Builders (~1-1.5 days)

  • Implement SandboxContext struct with sandbox_id, sandbox_name, container_image, hostname, product_version, proxy_ip, proxy_port
  • Add metadata(), container(), device(), proxy_endpoint() methods on SandboxContext
  • Implement all 8 builders:
    • NetworkActivityBuilder — required: activity, action, disposition, severity, dst_endpoint. Optional: src_endpoint, actor_process, firewall_rule, message, status, connection_info, observation_point
    • HttpActivityBuilder — required: activity (HTTP method), action, disposition, severity, http_request. Optional: http_response, src_endpoint, dst_endpoint, actor_process, firewall_rule, message
    • SshActivityBuilder — required: activity, action, disposition, severity. Optional: src_endpoint, dst_endpoint, auth_type, protocol_ver, message
    • ProcessActivityBuilder — required: activity, severity, process. Optional: action, disposition, launch_type, actor_process, exit_code, message
    • DetectionFindingBuilder — required: activity, severity, finding_info. Optional: action, disposition, is_alert, confidence, risk_level, evidences, attacks, remediation, message
    • AppLifecycleBuilder — required: activity, severity, status. Optional: message
    • ConfigStateChangeBuilder — required: state, severity. Optional: security_level, prev_security_level, status, unmapped, message
    • BaseEventBuilder — required: severity, message. Optional: unmapped
  • Each builder's .build() returns OcsfEvent. Builders auto-populate time, metadata, container, device from SandboxContext

Done when: All 8 builders compile and produce valid OcsfEvent instances. Each builder has at least one test that builds an event and validates it against the vendored schema. Builder ergonomics match the "Before and After" examples in the plan.

Depends on: Steps 4, 5, 6 (builders construct events from types).
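
To make the builder pattern in Step 10 concrete, here is a reduced sketch of BaseEventBuilder. It is not the planned API: SandboxContext is trimmed to two fields, BaseEvent is a hypothetical flat struct instead of an OcsfEvent variant, and build() returns a Result to surface missing required fields, whereas the issue specifies that build() returns OcsfEvent.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Trimmed-down shared context (the full struct has more fields).
#[derive(Debug, Clone)]
pub struct SandboxContext {
    pub sandbox_id: String,
    pub product_version: String,
}

/// Hypothetical flat result type standing in for an `OcsfEvent` variant.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BaseEvent {
    pub class_uid: u32,
    pub time_ms: u128,
    pub severity_id: u8,
    pub message: String,
    pub sandbox_id: String,
    pub product_version: String,
}

pub struct BaseEventBuilder<'a> {
    ctx: &'a SandboxContext,
    severity_id: Option<u8>,
    message: Option<String>,
}

impl<'a> BaseEventBuilder<'a> {
    pub fn new(ctx: &'a SandboxContext) -> Self {
        Self { ctx, severity_id: None, message: None }
    }

    pub fn severity(mut self, severity_id: u8) -> Self {
        self.severity_id = Some(severity_id);
        self
    }

    pub fn message(mut self, message: impl Into<String>) -> Self {
        self.message = Some(message.into());
        self
    }

    /// Fills time and context fields automatically; errors if a required field is unset.
    pub fn build(self) -> Result<BaseEvent, &'static str> {
        let time_ms = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_millis())
            .unwrap_or(0);
        Ok(BaseEvent {
            class_uid: 0,
            time_ms,
            severity_id: self.severity_id.ok_or("severity is required")?,
            message: self.message.ok_or("message is required")?,
            sandbox_id: self.ctx.sandbox_id.clone(),
            product_version: self.ctx.product_version.clone(),
        })
    }
}
```

An alternative that keeps build() infallible is to take the required fields as arguments to new(), leaving only optional fields as chained setters.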

Step 11: ocsf_emit! macro + tracing layers (~1-1.5 days)

  • Implement tracing/event_bridge.rs: emit_ocsf_event(event: OcsfEvent) function emitting with target ocsf, plus ocsf_emit!($event) macro
  • Implement tracing/shorthand_layer.rs: OcsfShorthandLayer, a tracing_subscriber::Layer that intercepts events with the ocsf target, calls format_shorthand(), and writes the result to the provided writer. Non-OCSF events pass through with a fallback format
  • Implement tracing/jsonl_layer.rs: OcsfJsonlLayer, a tracing_subscriber::Layer that intercepts events with the ocsf target, calls to_json_line(), and writes the result to the provided writer
  • Add unit tests with mock writers (Vec<u8>) verifying:
    • An ocsf_emit! call results in both layers receiving the event
    • Shorthand layer produces expected text
    • JSONL layer produces expected JSON
    • Non-OCSF tracing events handled gracefully (shorthand layer falls back, JSONL layer ignores)

Done when: Both layers correctly format OCSF events. ocsf_emit! macro works. Mock-writer tests pass for at least 3 event classes. Non-OCSF event fallback test passes.

Depends on: Steps 7, 8, 10 (layers use formatters and builders).
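
The mock-writer tests in Step 11 need a writer whose contents can be read back after the layer has written to it. The sketch below shows one std-only way to do that. SharedBuf and write_if_ocsf are hypothetical names, and write_if_ocsf only imitates the target filtering; the real layers would implement the Layer trait from tracing-subscriber, which is not shown here.

```rust
use std::io::{self, Write};
use std::sync::{Arc, Mutex};

/// Cloneable in-memory writer: one clone goes to the layer, one stays in the test.
#[derive(Clone, Default)]
pub struct SharedBuf(Arc<Mutex<Vec<u8>>>);

impl Write for SharedBuf {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.0.lock().unwrap().extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

impl SharedBuf {
    /// Everything written so far, decoded as UTF-8 (lossy).
    pub fn contents(&self) -> String {
        let bytes = self.0.lock().unwrap().clone();
        String::from_utf8_lossy(&bytes).into_owned()
    }
}

/// Stand-in for the layer's routing rule: only the "ocsf" target is written.
/// Returns whether the line was written.
pub fn write_if_ocsf<W: Write>(target: &str, line: &str, writer: &mut W) -> io::Result<bool> {
    if target != "ocsf" {
        return Ok(false);
    }
    writer.write_all(line.as_bytes())?;
    Ok(true)
}
```

In a test, the buffer is cloned before being handed to the layer, and assertions run against contents() on the retained clone.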

Step 12: CI integration (~0.5 day)

  • Ensure cargo test -p openshell-ocsf passes in CI with all tests green
  • Add CI check that vendored VERSION file matches OCSF_VERSION constant in Rust code
  • Run mise run pre-commit and fix any lint, format, or license header issues
  • Verify cargo clippy -p openshell-ocsf has zero warnings

Done when: CI green. mise run pre-commit passes. cargo test -p openshell-ocsf runs all tests with zero failures. Vendored schema version matches code constants.

Depends on: All prior steps.
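
The version check in Step 12 could be a plain unit test. The sketch below keeps the comparison in a function so it runs without the schema directory; version_file_matches is a hypothetical name, and in the crate the contents would presumably be read from schemas/ocsf/v1.7.0/VERSION, for example with include_str!.

```rust
/// Version constant the vendored schemas must agree with.
pub const OCSF_VERSION: &str = "1.7.0";

/// True when the VERSION file contents match the constant.
/// Surrounding whitespace, such as a trailing newline, is ignored.
pub fn version_file_matches(contents: &str) -> bool {
    contents.trim() == OCSF_VERSION
}
```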

Acceptance Criteria

  1. cargo check -p openshell-ocsf compiles with zero errors and zero warnings
  2. cargo test -p openshell-ocsf passes with all tests green (target: 80+ tests covering all 8 event classes, all formatters, all builders, schema validation, round-trip consistency)
  3. cargo clippy -p openshell-ocsf has zero warnings
  4. mise run pre-commit passes
  5. Every event class has at least one JSON serialization test validating against the vendored OCSF v1.7.0 schema
  6. Every event class has at least two shorthand format snapshot tests
  7. Dual-emit events (BYPASS_DETECT, NSSH1 replay) have round-trip consistency tests
  8. Both tracing layers (OcsfShorthandLayer, OcsfJsonlLayer) have mock-writer tests demonstrating correct output
  9. The ocsf_emit! macro compiles and correctly routes events to both layers
  10. No code in openshell-sandbox has been modified — the crate is fully standalone

Estimated Effort

~8-10 days
