
SIEM Export

Chio records every kernel decision in a signed, append-only receipt log. The chio-siem crate tails that log and forwards receipts to Splunk HEC and Elasticsearch without linking against the kernel. Receipts are exported in batches on a fixed poll interval, retries are bounded, and failures land in a dead-letter queue. Nothing silently disappears.

Design Constraints

The exporter is deliberately isolated from the kernel. The kernel trusted computing base (TCB) never loads an HTTP client. Instead, the SIEM manager opens its own read-only SQLite connection to the receipt database and scans forward using a sequence cursor.

  • Read-only access: the exporter opens a fresh read-only connection on each tick. It can never mutate the receipt store.
  • No kernel linkage: the crate does not depend on chio-kernel. An operator can disable SIEM by dropping the siem feature at build time, and the resulting binary contains no HTTP-client code at all.
  • Idempotent exporters: both Splunk HEC and Elasticsearch are safe to re-feed on retry. The cursor can advance past a partially failed batch because the failed events are held in the dead-letter queue rather than discarded.

Enabling SIEM

SIEM support is gated behind the siem Cargo feature on chio-cli. Build with the feature enabled:

bash
# With SIEM support
$ cargo build -p chio-cli --features siem

# Default build (no chio-siem crate compiled)
$ cargo build -p chio-cli

Without --features siem, the rest of chio operates identically. The CLI simply omits the chio trust export subcommands and the chio-siem crate is not compiled.


Architecture

The exporter manager sits alongside the kernel process. It reads directly from the receipt SQLite file, wraps each row in a SiemEvent, and fans the batch out to every registered exporter.

Figure: the kernel writes receipts into an append-only SQLite log; the exporter manager pulls by sequence cursor and fans out to Splunk HEC and Elasticsearch.

Sequence cursor, not timestamps

The manager scans by monotonically increasing seq, not by timestamps. That avoids clock-skew bugs and means receipts are always read and batched in log order.

ExporterManager Cursor Pull

The manager is configured with a SiemConfig:

chio-siem/src/config.rs
pub struct SiemConfig {
    /// Path to the kernel receipt SQLite file.
    pub db_path: PathBuf,
    /// How often to poll the receipt log. Default: 5 seconds.
    pub poll_interval: Duration,
    /// Max receipts per poll. Default: 100.
    pub batch_size: usize,
    /// Max retries per exporter per batch. Default: 3.
    pub max_retries: u32,
    /// Base backoff, doubled on each retry. Default: 500 ms.
    pub base_backoff_ms: u64,
    /// Dead-letter queue capacity. Default: 1000 entries.
    pub dlq_capacity: usize,
    /// Optional per-exporter batch throttle.
    pub rate_limit: Option<RateLimitConfig>,
}

On each tick the manager:

  1. Opens a fresh read-only SQLite connection.
  2. Runs SELECT seq, raw_json FROM chio_tool_receipts WHERE seq > cursor ORDER BY seq ASC LIMIT batch_size.
  3. Parses each row into a SiemEvent.
  4. Calls export_batch on every registered exporter.
  5. Advances the cursor past the batch, whether or not some events were routed to the DLQ.

The cursor is in-memory only. It resets to 0 on restart. That is safe because both exporters dedupe on receipt identity (Splunk HEC on timestamp + receipt ID, Elasticsearch on idempotent _id upsert). Re-feeding the log after a restart produces no duplicates downstream.
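The cursor logic in the steps above can be sketched without SQLite. This is an illustration only: the Row alias, pull_batch, and advance_cursor are hypothetical names, and an in-memory slice sorted by seq stands in for the chio_tool_receipts table.

```rust
/// One row of the receipt log as (seq, raw_json).
/// Hypothetical stand-in for a row of the SQLite table.
type Row = (u64, String);

/// Mirrors `WHERE seq > cursor ORDER BY seq ASC LIMIT batch_size`
/// over a slice that is assumed to be sorted by seq already.
fn pull_batch(log: &[Row], cursor: u64, batch_size: usize) -> Vec<Row> {
    log.iter()
        .filter(|(seq, _)| *seq > cursor)
        .take(batch_size)
        .cloned()
        .collect()
}

/// The cursor moves to the highest seq in the batch.
/// An empty batch leaves it where it was.
fn advance_cursor(cursor: u64, batch: &[Row]) -> u64 {
    batch.last().map(|(seq, _)| *seq).unwrap_or(cursor)
}
```

Starting from cursor 0, as happens after a restart, repeated calls walk the whole log in order, one batch per tick.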

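The poll loop runs until it is cancelled through a tokio watch channel:

rust
let (cancel_tx, cancel_rx) = tokio::sync::watch::channel(false);

// Polls until the cancel flag flips to true, then returns.
manager.run(cancel_rx).await;

// To stop gracefully, send from another task or a signal handler:
let _ = cancel_tx.send(true);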

Batching and Rate Limiting

If rate_limit is configured, each exporter gets its own token bucket keyed by exporter name. When a bucket is empty the manager waits for capacity before sending the next batch. Burst traffic is delayed rather than silently dropped.

chio-siem.yaml
siem:
  db_path: /var/lib/chio/receipts.sqlite
  poll_interval_ms: 5000
  batch_size: 100
  max_retries: 3
  base_backoff_ms: 500
  dlq_capacity: 1000

  rate_limit:
    splunk_hec:
      capacity: 500      # burst ceiling in receipts
      refill_per_sec: 50 # sustained rate
    elasticsearch:
      capacity: 1000
      refill_per_sec: 200
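To make the capacity and refill_per_sec numbers concrete, here is a minimal token-bucket sketch. It is not the chio-siem implementation: the TokenBucket type and its methods are hypothetical, and elapsed time is passed in explicitly so the arithmetic stays visible. It assumes a batch never exceeds capacity.

```rust
/// Token bucket with an explicit clock. Units are receipts.
struct TokenBucket {
    capacity: f64,
    refill_per_sec: f64,
    tokens: f64,
}

impl TokenBucket {
    /// A new bucket starts full, so an initial burst up to `capacity` passes.
    fn new(capacity: u32, refill_per_sec: u32) -> Self {
        Self {
            capacity: capacity as f64,
            refill_per_sec: refill_per_sec as f64,
            tokens: capacity as f64,
        }
    }

    /// Add tokens for `elapsed_secs` of wall time, capped at capacity.
    fn refill(&mut self, elapsed_secs: f64) {
        self.tokens = (self.tokens + elapsed_secs * self.refill_per_sec).min(self.capacity);
    }

    /// Seconds to wait before `n` receipts can be sent (0.0 if they fit now).
    fn wait_needed(&self, n: u32) -> f64 {
        let deficit = n as f64 - self.tokens;
        if deficit <= 0.0 { 0.0 } else { deficit / self.refill_per_sec }
    }

    /// Spend `n` tokens. Returns false and leaves the bucket untouched
    /// when too few tokens are available.
    fn try_take(&mut self, n: u32) -> bool {
        if self.tokens >= n as f64 {
            self.tokens -= n as f64;
            true
        } else {
            false
        }
    }
}
```

With the splunk_hec values above (capacity 500, refill_per_sec 50), five batches of 100 receipts pass immediately, and the sixth has to wait two seconds. The batch is delayed, not dropped.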

Retry Policy and Dead-Letter Queue

Each exporter gets up to max_retries attempts per batch. Backoff doubles on each failure: 500 ms, 1 s, 2 s by default. When all retries are exhausted, the failed events go to the bounded DeadLetterQueue.
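The doubling schedule is simple to compute. The sketch below reproduces the default delays from base_backoff_ms and max_retries; the function names are hypothetical and the overflow handling is an assumption, not something the source specifies.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt.
/// Saturates instead of overflowing for very large attempt counts.
fn backoff_delay(base_backoff_ms: u64, attempt: u32) -> Duration {
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    Duration::from_millis(base_backoff_ms.saturating_mul(factor))
}

/// Full schedule of delays for one batch on one exporter.
fn backoff_schedule(base_backoff_ms: u64, max_retries: u32) -> Vec<Duration> {
    (0..max_retries)
        .map(|attempt| backoff_delay(base_backoff_ms, attempt))
        .collect()
}
```

With the defaults (base_backoff_ms = 500, max_retries = 3) the schedule is 500 ms, 1000 ms, 2000 ms, so a batch that fails every attempt spends 3.5 seconds in backoff before it reaches the DLQ.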

| Failure Mode | Behavior |
| --- | --- |
| Transient 5xx | Retry up to max_retries, doubling backoff. |
| Network error | Same retry loop as 5xx. |
| Elasticsearch partial failure | Surfaced as ExportError::PartialFailure; only the failed entries are retried. |
| All retries exhausted | Event lands in the DLQ. The cursor still advances. |
| DLQ full | Oldest entry is dropped and a tracing::error is logged. |

rust
// Inspect DLQ depth from operator tooling:
let dlq_len = manager.dlq_len();

DLQ events are not auto-retried

Events in the DLQ are not automatically retried. They are lost unless you drain and re-feed them. Because both exporters are idempotent, re-feeding DLQ contents is safe and produces no duplicates.
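The drop-oldest behavior and the drain step can be illustrated with a small bounded queue. BoundedDlq is a hypothetical type written for this example, not the crate's DeadLetterQueue. It returns the evicted entry to the caller, where the real manager logs a tracing::error.

```rust
use std::collections::VecDeque;

/// Bounded queue that evicts the oldest entry when full.
struct BoundedDlq<T> {
    capacity: usize,
    entries: VecDeque<T>,
}

impl<T> BoundedDlq<T> {
    fn new(capacity: usize) -> Self {
        Self { capacity, entries: VecDeque::with_capacity(capacity) }
    }

    /// Push a failed event. Returns the evicted oldest entry if the
    /// queue was already full, so the caller can log the loss.
    fn push(&mut self, event: T) -> Option<T> {
        if self.capacity == 0 {
            return Some(event); // nothing can be held at all
        }
        let evicted = if self.entries.len() == self.capacity {
            self.entries.pop_front()
        } else {
            None
        };
        self.entries.push_back(event);
        evicted
    }

    fn len(&self) -> usize {
        self.entries.len()
    }

    /// Remove everything, oldest first, ready to be re-fed to an exporter.
    fn drain(&mut self) -> Vec<T> {
        self.entries.drain(..).collect()
    }
}
```

Draining before the queue fills is what keeps events from being evicted, which is why DLQ depth is worth alerting on.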

Splunk HEC

The Splunk exporter POSTs newline-separated JSON envelopes to {endpoint}/services/collector/event. Each envelope wraps the full ChioReceipt under the event key with time, sourcetype, and optional index / host fields.

rust
use chio_siem::exporters::splunk::{SplunkConfig, SplunkHecExporter};

let config = SplunkConfig {
    endpoint: "https://splunk.example.com:8088".to_string(),
    hec_token: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx".to_string(),
    sourcetype: "chio:receipt".to_string(),
    index: Some("chio_audit".to_string()),
    host: Some("chio-node-01".to_string()),
};

let exporter = SplunkHecExporter::new(config)?;
manager.add_exporter(Box::new(exporter));

Each request carries the header Authorization: Splunk {hec_token}. TLS is handled by reqwest, which validates against the system's native certificate store.

Sample SPL

spl/denied-budget.spl
sourcetype="chio:receipt" event.decision.deny.guard="monetary_budget"
| stats sum(event.metadata.financial.attempted_cost) as total_attempted
        by event.capability_id
| sort - total_attempted

spl/egress-denials-by-tool.spl
sourcetype="chio:receipt" event.decision.deny.guard="egress-allowlist"
| stats count by event.tool_server, event.tool_name
| sort - count

Elasticsearch Bulk

The Elasticsearch exporter POSTs NDJSON to {endpoint}/_bulk. Each receipt produces two lines: an index action keyed on receipt.id as _id (making the write idempotent), and the full receipt document. Partial failures (HTTP 200 with errors: true) are detected and surfaced as ExportError::PartialFailure.
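The two-lines-per-receipt layout looks like this when built by hand. This is a sketch under assumptions: bulk_body and json_escape are hypothetical helpers, the receipt is assumed to be available as single-line raw JSON, and the real exporter presumably uses a JSON library instead of string formatting.

```rust
/// Escape a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a _bulk request body from (receipt_id, raw_json) pairs.
/// Each receipt yields an index action line and a document line,
/// and the body ends with a newline as the bulk API requires.
fn bulk_body(index_name: &str, receipts: &[(&str, &str)]) -> String {
    let mut body = String::new();
    for (id, raw_json) in receipts {
        body.push_str(&format!(
            "{{\"index\":{{\"_index\":\"{}\",\"_id\":\"{}\"}}}}\n",
            json_escape(index_name),
            json_escape(id)
        ));
        body.push_str(raw_json);
        body.push('\n');
    }
    body
}
```

Because the action names an explicit _id, sending the same receipt twice overwrites the existing document instead of creating a second one.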

rust
use chio_siem::exporters::elastic::{
    ElasticAuthConfig, ElasticConfig, ElasticsearchExporter,
};

let config = ElasticConfig {
    endpoint: "https://es.example.com:9200".to_string(),
    index_name: "chio-receipts".to_string(),
    auth: ElasticAuthConfig::ApiKey("base64encodedkey==".to_string()),
    // or Basic { username, password }
};

let exporter = ElasticsearchExporter::new(config)?;
manager.add_exporter(Box::new(exporter));

Sample Elasticsearch DSL

es/deny-by-guard.json
POST chio-receipts/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "decision.verdict": "deny" } },
        { "range": { "timestamp": { "gte": "now-24h/h" } } }
      ]
    }
  },
  "aggs": {
    "by_guard": {
      "terms": { "field": "decision.guard", "size": 20 }
    }
  }
}

es/top-spenders.json
POST chio-receipts/_search
{
  "size": 0,
  "query": { "term": { "decision.verdict": "allow" } },
  "aggs": {
    "by_subject": {
      "terms": {
        "field": "metadata.financial.root_budget_holder",
        "size": 10
      },
      "aggs": {
        "spend": {
          "sum": { "field": "metadata.financial.cost_charged" }
        }
      }
    }
  }
}

The SiemEvent Wrapper

Each receipt is wrapped in a SiemEvent. The wrapper hoists FinancialReceiptMetadata to a top-level field, so SIEM search queries can reach cost_charged, budget_remaining, and friends without JSON-path traversal.

rust
pub struct SiemEvent {
    pub receipt: ChioReceipt,
    pub financial: Option<FinancialReceiptMetadata>,
}

The financial field is extracted from receipt.metadata["financial"]. It exposes cost_charged, currency, budget_remaining, budget_total, delegation_depth, root_budget_holder, settlement_status, and attempted_cost as first-class fields.


OCSF Field Mapping

Many SIEM installations normalize events to the Open Cybersecurity Schema Framework (OCSF). Chio receipts fit naturally into the OCSF Application Activity class. The suggested mapping:

| OCSF Field | Chio Receipt Field | Notes |
| --- | --- | --- |
| time | receipt.timestamp | Seconds · multiply by 1000 for OCSF millis |
| activity_id | derived from decision.verdict | allow → 1, deny → 2 |
| status | decision.verdict | Success or Failure |
| actor.user.name | agent subject key | Hex-encoded ed25519 pubkey |
| actor.session.uid | receipt.capability_id | Capability exercised |
| api.operation | receipt.tool_name | Tool invoked |
| api.service.name | receipt.tool_server | Tool server handling the call |
| unmapped.guard | decision.guard | Kebab-case guard name on deny (forbidden-path, egress-allowlist, mcp-tool, etc.) |
| unmapped.cost_charged | financial.cost_charged | Minor units, currency in financial.currency |
| metadata.correlation_uid | receipt.id | UUIDv7, directly usable for pivoting |

Guard names are stable

Guard identifiers on deny receipts are drawn from a fixed kebab-case set: forbidden-path, path-allowlist, shell-command, egress-allowlist, mcp-tool, secret-leak, patch-integrity, velocity. Detection content can key on these directly without worrying about label drift.
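Part of the suggested OCSF mapping can be written as a plain function. The struct and field names below are illustrative and are not the ChioReceipt definition. The pairing of allow with Success and deny with Failure is an assumption read from the mapping table, and only a subset of the rows is covered.

```rust
/// Minimal view of a receipt holding only the fields this mapping reads.
/// Hypothetical shape for the example.
struct ReceiptView<'a> {
    id: &'a str,
    timestamp_secs: u64,
    verdict: &'a str, // "allow" or "deny"
    tool_name: &'a str,
    tool_server: &'a str,
    capability_id: &'a str,
}

/// Flattened subset of an OCSF Application Activity event.
#[derive(Debug, PartialEq)]
struct OcsfEvent {
    time: u64, // milliseconds since the epoch
    activity_id: u8,
    status: &'static str,
    api_operation: String,
    api_service_name: String,
    actor_session_uid: String,
    correlation_uid: String,
}

/// Returns None for a verdict the mapping table does not cover.
fn to_ocsf(r: &ReceiptView) -> Option<OcsfEvent> {
    let (activity_id, status) = match r.verdict {
        "allow" => (1, "Success"),
        "deny" => (2, "Failure"),
        _ => return None,
    };
    Some(OcsfEvent {
        time: r.timestamp_secs.saturating_mul(1000), // seconds to millis
        activity_id,
        status,
        api_operation: r.tool_name.to_string(),
        api_service_name: r.tool_server.to_string(),
        actor_session_uid: r.capability_id.to_string(),
        correlation_uid: r.id.to_string(),
    })
}
```

Returning None for an unknown verdict keeps unexpected values from being mislabeled as either outcome. A real pipeline may prefer to forward them unmapped.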

Operational Notes

  • Place the exporter next to the kernel: the read-only SQLite connection works best on the same host as the writer. Remote file systems work but the poll interval may need to be larger.
  • Monitor DLQ depth: any sustained non-zero dlq_len indicates the downstream SIEM is failing. Page on it.
  • Rotate HEC tokens and API keys: because idempotency is handled per-exporter, you can run two exporters with different credentials during a rotation window without double-writing.
  • Test restart: restart the exporter and confirm no duplicates land in the index. Both exporters dedupe, but it is worth verifying your index template does not strip the receipt ID.
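The restart check in the last bullet relies on writes being keyed by receipt ID. A toy model shows the expected outcome: UpsertIndex is a hypothetical stand-in for an index that upserts by _id, and feed_from_zero replays the whole log the way a freshly restarted exporter would.

```rust
use std::collections::HashMap;

/// Stand-in for an index that upserts documents by `_id`.
#[derive(Default)]
struct UpsertIndex {
    docs: HashMap<String, String>,
}

impl UpsertIndex {
    /// Insert or overwrite. Returns true when the id was not present before.
    fn index(&mut self, id: &str, doc: &str) -> bool {
        self.docs.insert(id.to_string(), doc.to_string()).is_none()
    }
}

/// Replay the whole log from cursor 0 and count newly created documents.
fn feed_from_zero(index: &mut UpsertIndex, log: &[(&str, &str)]) -> usize {
    log.iter().filter(|(id, doc)| index.index(id, doc)).count()
}
```

The first replay creates one document per receipt and a second replay creates none. If the index rewrites or strips the ID, the second replay would create new documents, which is the failure the restart test is meant to catch.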

Do not disable the kernel receipt log

The SIEM path reads from the kernel receipt log. Dropping or truncating that log breaks the audit chain and the SIEM stream at the same time. The receipt log is append-only for a reason. Leave it that way.