
Session-Aware Guards

Three guards read state that does not live on the request: cumulative bytes in and out, the tool invocation history, and an EMA baseline of receipts per window. They depend on a shared SessionJournal from chio-http-session and a receipt feed for behavioral profiling. Journal-unavailable means deny: every journal read is wrapped in map_err(|e| KernelError::Internal(...)) and the kernel translates every Err to a denial.


The session journal

Source: crates/chio-http-session/src/lib.rs. The journal is a thread-safe, append-only, hash-chained list of per-tool-call entries. The struct is small:

chio-http-session/src/lib.rs
pub struct SessionJournal {
    inner: Mutex<JournalInner>,
    session_id: String,
}

struct JournalInner {
    entries: Vec<JournalEntry>,
    data_flow: CumulativeDataFlow,
    tool_sequence: Vec<String>,
    tool_counts: HashMap<String, u64>,
}

pub struct CumulativeDataFlow {
    pub total_bytes_read: u64,
    pub total_bytes_written: u64,
    pub total_invocations: u64,
    pub max_delegation_depth: u32,
}

Three accessors are load-bearing for the session-aware guards (lib.rs:269, lib.rs:275, lib.rs:217):

  • data_flow() -> Result<CumulativeDataFlow, SessionJournalError> clones the cumulative struct out from under the mutex.
  • tool_sequence() -> Result<Vec<String>, SessionJournalError> clones the ordered tool names.
  • record(RecordParams) -> Result<u64, SessionJournalError> appends a hash-chained entry and returns its sequence number. RecordParams carries tool_name, server_id, agent_id, bytes_read, bytes_written, delegation_depth, and allowed.

The cumulative counters use saturating_add on every record (lib.rs:244-256), so each running total clamps at u64::MAX rather than wrapping. The guards take Arc<SessionJournal> at construction so they share the underlying state without owning it.
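The sharing and clamping behavior can be illustrated with a miniature stand-in. This is a sketch, not the crate's code: `MiniJournal`, `FlowTotals`, and `shared_totals` are hypothetical names, the error type is a plain `String`, and the hash chain is omitted entirely.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the journal's cumulative counters.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct FlowTotals {
    pub total_bytes_read: u64,
    pub total_bytes_written: u64,
    pub total_invocations: u64,
}

/// Hypothetical miniature journal: a mutex around the running totals.
#[derive(Default)]
pub struct MiniJournal {
    inner: Mutex<FlowTotals>,
}

impl MiniJournal {
    /// Fold one call into the totals; every counter clamps at u64::MAX.
    pub fn record(&self, bytes_read: u64, bytes_written: u64) -> Result<u64, String> {
        let mut t = self.inner.lock().map_err(|_| "lock poisoned".to_string())?;
        t.total_bytes_read = t.total_bytes_read.saturating_add(bytes_read);
        t.total_bytes_written = t.total_bytes_written.saturating_add(bytes_written);
        t.total_invocations = t.total_invocations.saturating_add(1);
        Ok(t.total_invocations)
    }

    /// Clone the totals out from under the mutex.
    pub fn data_flow(&self) -> Result<FlowTotals, String> {
        let t = self.inner.lock().map_err(|_| "lock poisoned".to_string())?;
        Ok(t.clone())
    }
}

/// Two holders of the same Arc observe the same totals.
pub fn shared_totals() -> FlowTotals {
    let journal = Arc::new(MiniJournal::default());
    let guard_view = Arc::clone(&journal); // what a guard would hold
    journal.record(u64::MAX - 10, 5).unwrap();
    journal.record(100, 5).unwrap(); // read total clamps instead of wrapping
    guard_view.data_flow().unwrap()
}
```

After the two records, the read total sits at `u64::MAX` rather than wrapping to a small number, and the clone handed to the guard sees exactly what the recorder wrote.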


DataFlowGuard

Source: crates/chio-guards/src/data_flow.rs. Guard name: data-flow (data_flow.rs:52). Reads cumulative bytes from the journal and denies once any configured ceiling is reached.

Struct

chio-guards/src/data_flow.rs
#[derive(Clone, Debug, Default)]
pub struct DataFlowConfig {
    pub max_bytes_read: Option<u64>,
    pub max_bytes_written: Option<u64>,
    pub max_bytes_total: Option<u64>,
}

pub struct DataFlowGuard {
    journal: Arc<SessionJournal>,
    config: DataFlowConfig,
}

Default on DataFlowConfig sets every ceiling to None: a default guard never denies. Per-knob defaults:

| Knob | Type | Default | Behavior |
|---|---|---|---|
| max_bytes_read | Option<u64> | None | Cumulative read ceiling. Inclusive comparison: flow.total_bytes_read >= max_read. |
| max_bytes_written | Option<u64> | None | Cumulative write ceiling. Same inclusive comparison. |
| max_bytes_total | Option<u64> | None | Cumulative read + write ceiling. The total is computed via flow.total_bytes_read.saturating_add(flow.total_bytes_written). |

Algorithm

The full body of evaluate (data_flow.rs:55-85), verbatim:

chio-guards/src/data_flow.rs
fn evaluate(&self, _ctx: &GuardContext) -> Result<Verdict, KernelError> {
    let flow = self.journal.data_flow().map_err(|e| {
        KernelError::Internal(format!("data-flow guard journal error (fail-closed): {e}"))
    })?;

    if let Some(max_read) = self.config.max_bytes_read {
        if flow.total_bytes_read >= max_read {
            return Ok(Verdict::Deny);
        }
    }

    if let Some(max_written) = self.config.max_bytes_written {
        if flow.total_bytes_written >= max_written {
            return Ok(Verdict::Deny);
        }
    }

    if let Some(max_total) = self.config.max_bytes_total {
        let total = flow
            .total_bytes_read
            .saturating_add(flow.total_bytes_written);
        if total >= max_total {
            return Ok(Verdict::Deny);
        }
    }

    Ok(Verdict::Allow)
}

The comparison is inclusive: a session that has already read exactly max_bytes_read denies the next call. The guard does not pre-charge the in-flight request, so the arithmetic only sees what prior callers wrote into the journal. The action enum is ignored: even invocations with zero reported bytes execute the three checks before returning Allow.
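The boundary behavior is easy to check in isolation. The sketch below restates the three comparisons as a pure function; `Ceilings` and `would_deny` are hypothetical names introduced for illustration and are not part of chio-guards.

```rust
/// Hypothetical mirror of the three ceilings (not the crate's type).
#[derive(Clone, Copy, Debug, Default)]
pub struct Ceilings {
    pub max_read: Option<u64>,
    pub max_written: Option<u64>,
    pub max_total: Option<u64>,
}

/// True when the next call would be denied, given the bytes already journaled.
pub fn would_deny(read: u64, written: u64, c: Ceilings) -> bool {
    let total = read.saturating_add(written);
    // Each comparison is inclusive: reaching the ceiling is enough to deny.
    matches!(c.max_read, Some(m) if read >= m)
        || matches!(c.max_written, Some(m) if written >= m)
        || matches!(c.max_total, Some(m) if total >= m)
}
```

With `max_read: Some(100)`, a session at 99 bytes is allowed and a session at exactly 100 bytes is denied. Because the in-flight request is not pre-charged, the call admitted at 99 bytes is allowed whatever its own size turns out to be; the overshoot only shows up on the call after it.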

u64 ceiling

The CumulativeDataFlow byte counters and max_bytes_total are all u64. The journal's saturating_add updates clamp at u64::MAX = 18_446_744_073_709_551_615 bytes (about 18.4 quintillion bytes, or 16 EiB). At any realistic web scale this ceiling is unreachable, so saturation is a defensive boundary rather than an operational concern: a misconfigured journal cannot wrap a counter to zero and silently re-allow a terminated session.

Failure modes

  • Journal lock poisoned :: SessionJournalError::LockPoisoned surfaces via the map_err as KernelError::Internal("data-flow guard journal error (fail-closed): {e}"). The kernel reads Err(_) from a guard as a denial.
  • Saturated counter :: deny stays deny. Once any total reaches its ceiling, every subsequent call denies until the session is replaced.

BehavioralSequenceGuard

Source: crates/chio-guards/src/behavioral_sequence.rs. Guard name: behavioral-sequence (behavioral_sequence.rs:58). Enforces tool-ordering rules over the journal's tool sequence.

Struct

chio-guards/src/behavioral_sequence.rs
#[derive(Clone, Debug, Default)]
pub struct SequencePolicy {
    pub required_predecessors: HashMap<String, HashSet<String>>,
    pub forbidden_transitions: Vec<(String, String)>,
    pub max_consecutive: Option<u32>,
    pub required_first_tool: Option<String>,
}

pub struct BehavioralSequenceGuard {
    journal: Arc<SessionJournal>,
    policy: SequencePolicy,
}

Configuration

| Knob | Type | Default | Check |
|---|---|---|---|
| required_predecessors | HashMap<String, HashSet<String>> | empty | For target tool_name: deny if any name in the required set is missing from the journal sequence (behavioral_sequence.rs:80-87). |
| forbidden_transitions | Vec<(String, String)> | empty | If sequence.last() == from and the requested tool is to, deny (behavioral_sequence.rs:89-96). |
| max_consecutive | Option<u32> | None | Walk the journal in reverse counting matches; deny when the streak reaches the ceiling (behavioral_sequence.rs:98-111). |
| required_first_tool | Option<String> | None | If the journal sequence is empty, deny anything other than this tool (behavioral_sequence.rs:71-77). |

Algorithm and the read-then-write race

The guard's evaluate body reads the journal sequence once, runs four checks against it, and returns. The race condition is structural: the guard does not hold the journal lock across the check-and-record window. Two requests on the same session can both observe the same sequence prefix, both pass max_consecutive, and only afterward append. The interleaving:

text
time   request A (max_consecutive = 3)         request B (max_consecutive = 3)
       sequence: [read, read]
  1    let seq = journal.tool_sequence()?;       // returns [read, read]
  2                                              let seq = journal.tool_sequence()?; // returns [read, read]
  3    streak = 2 < 3, returns Allow             streak = 2 < 3, returns Allow
  4    journal.record(RecordParams { tool_name: "read".into(), .. });  // sequence: [read, read, read]
  5                                              journal.record(RecordParams { tool_name: "read".into(), .. });  // sequence: [read, read, read, read]

Ending streak = 4 under a ceiling of 3. Run serially, request B would have observed a streak of 3 and been denied.

The journal's record path is internally locked (lib.rs:217-265), so the entries themselves are still well-ordered. Only the policy decision is racy. The evaluate body, verbatim (behavioral_sequence.rs:61-114):

chio-guards/src/behavioral_sequence.rs
fn evaluate(&self, ctx: &GuardContext) -> Result<Verdict, KernelError> {
    let tool_name = &ctx.request.tool_name;

    let sequence = self.journal.tool_sequence().map_err(|e| {
        KernelError::Internal(format!(
            "behavioral-sequence guard journal error (fail-closed): {e}"
        ))
    })?;

    if sequence.is_empty() {
        if let Some(ref required_first) = self.policy.required_first_tool {
            if tool_name != required_first {
                return Ok(Verdict::Deny);
            }
        }
    }

    if let Some(required) = self.policy.required_predecessors.get(tool_name) {
        let invoked: HashSet<&str> = sequence.iter().map(|s| s.as_str()).collect();
        for req in required {
            if !invoked.contains(req.as_str()) {
                return Ok(Verdict::Deny);
            }
        }
    }

    if let Some(last_tool) = sequence.last() {
        for (from, to) in &self.policy.forbidden_transitions {
            if last_tool == from && tool_name == to {
                return Ok(Verdict::Deny);
            }
        }
    }

    if let Some(max_consec) = self.policy.max_consecutive {
        let mut count: u32 = 0;
        for t in sequence.iter().rev() {
            if t == tool_name {
                count = count.saturating_add(1);
            } else {
                break;
            }
        }
        if count >= max_consec {
            return Ok(Verdict::Deny);
        }
    }

    Ok(Verdict::Allow)
}

Serialize per-session traffic for strict ordering

The check-then-record window above is the only way to violate max_consecutive or a forbidden-transition pair without a guard regression. If your policy must hold under concurrent calls on a single session, the upstream HTTP edge should serialize per-session_id traffic. The journal itself is thread-safe; only the guard's policy decision is non-atomic with the append.
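One way to serialize at the edge is a lock per session id held across both the guard check and the journal append. The sketch below is an assumption-laden illustration, not Chio code: `SessionLocks`, `serialized`, and `demo_final_streak` are hypothetical, the "journal" is a bare `Vec<String>`, and denied calls are simply not recorded. An async server would likely want an async-aware mutex instead of `std::sync::Mutex`.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical per-session lock table for the HTTP edge.
#[derive(Default)]
pub struct SessionLocks {
    locks: Mutex<HashMap<String, Arc<Mutex<()>>>>,
}

impl SessionLocks {
    /// Fetch (or create) the lock for one session id.
    fn lock_for(&self, session_id: &str) -> Arc<Mutex<()>> {
        // The guarded value is (), so a poisoned lock carries no bad state.
        let mut table = self.locks.lock().unwrap_or_else(|e| e.into_inner());
        table.entry(session_id.to_string()).or_default().clone()
    }

    /// Run check-and-record as one critical section per session.
    pub fn serialized<T>(&self, session_id: &str, f: impl FnOnce() -> T) -> T {
        let lock = self.lock_for(session_id);
        let _held = lock.lock().unwrap_or_else(|e| e.into_inner());
        f()
    }
}

/// Demo: 8 threads request "read" with max_consecutive = 3.
/// Returns how many calls were admitted.
pub fn demo_final_streak() -> usize {
    let locks = Arc::new(SessionLocks::default());
    let sequence = Arc::new(Mutex::new(Vec::<String>::new()));
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let (locks, sequence) = (Arc::clone(&locks), Arc::clone(&sequence));
            std::thread::spawn(move || {
                locks.serialized("sess-1", || {
                    // Check: count the trailing streak from a snapshot.
                    let streak = sequence.lock().unwrap().iter().rev()
                        .take_while(|t| t.as_str() == "read")
                        .count();
                    // Record only if the check allowed the call.
                    if streak < 3 {
                        sequence.lock().unwrap().push("read".to_string());
                    }
                })
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let admitted = sequence.lock().unwrap().len();
    admitted
}
```

Without the outer `serialized` call, the snapshot and the push are two separate lock acquisitions and the streak can overshoot; with it, exactly three calls are admitted regardless of thread scheduling.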

BehavioralProfileGuard

Source: crates/chio-guards/src/behavioral_profile.rs. Guard name: behavioral-profile (behavioral_profile.rs:213). Computes anomaly signals against a per-agent EMA baseline. The verdict path is advisory: even when the sample is anomalous, evaluate returns Verdict::Allow (behavioral_profile.rs:355-365).

Defaults

Every default constant is declared at the top of the module (behavioral_profile.rs:46-54):

chio-guards/src/behavioral_profile.rs
pub const DEFAULT_EMA_ALPHA: f64 = 0.2;
pub const DEFAULT_SIGMA_THRESHOLD: f64 = 2.0;
pub const DEFAULT_WINDOW_SECS: u64 = 60;
pub const DEFAULT_BASELINE_MIN_WINDOWS: u64 = 3;

| Knob | Type | Default | Source |
|---|---|---|---|
| ema_alpha | f64 | 0.2 | behavioral_profile.rs:46 (clamped to (0.0, 1.0] on every update at operator_report.rs:1374). |
| sigma_threshold | f64 | 2.0 | behavioral_profile.rs:48. |
| window_secs | u64 | 60 | behavioral_profile.rs:50. |
| baseline_min_windows | u64 | 3 | behavioral_profile.rs:54. Anomalies cannot fire until at least three windows have folded into the baseline. |

EmaBaselineState

The baseline state is shared with the operator-report module (operator_report.rs:1353-1386):

chio-kernel/src/operator_report.rs
#[derive(Debug, Clone, Default, Serialize, Deserialize, PartialEq)]
#[serde(rename_all = "camelCase")]
pub struct EmaBaselineState {
    pub sample_count: u64,
    pub ema_mean: f64,
    pub ema_variance: f64,
    pub last_update: u64,
}

impl EmaBaselineState {
    pub fn update(&mut self, sample: f64, alpha: f64, now: u64) {
        let alpha = alpha.clamp(f64::MIN_POSITIVE, 1.0);
        if self.sample_count == 0 {
            self.ema_mean = sample;
            self.ema_variance = 0.0;
        } else {
            let prev_mean = self.ema_mean;
            self.ema_mean = prev_mean + alpha * (sample - prev_mean);
            // Incremental EWMA variance, following West (1979) / Welford.
            let diff = sample - prev_mean;
            self.ema_variance = (1.0 - alpha) * (self.ema_variance + alpha * diff * diff);
        }
        self.sample_count = self.sample_count.saturating_add(1);
        self.last_update = now;
    }

    pub fn stddev(&self) -> f64 {
        self.ema_variance.max(0.0).sqrt()
    }
}

Two things to read out of the update body:

  • First sample seeds the mean. When sample_count == 0 the very first call sets ema_mean = sample and ema_variance = 0.0. The arithmetic prev_mean + alpha * (sample - prev_mean) (which is just the standard EMA update) does not run on sample one; it runs from sample two onward.
  • EWMA variance, not sample variance. The variance update (1 - alpha) * (ema_variance + alpha * diff * diff) is the West/Welford incremental EWMA form. stddev() is sqrt(max(ema_variance, 0.0)); there is no Bessel correction and no separate sample-variance path.
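A short worked run makes both points concrete. The sketch below uses a trimmed copy of the state (serde derives and last_update dropped; `Baseline` and `fold` are hypothetical names) and the same update arithmetic as the listing above.

```rust
/// Trimmed copy of the baseline state, for illustration only.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Baseline {
    pub sample_count: u64,
    pub ema_mean: f64,
    pub ema_variance: f64,
}

impl Baseline {
    pub fn update(&mut self, sample: f64, alpha: f64) {
        let alpha = alpha.clamp(f64::MIN_POSITIVE, 1.0);
        if self.sample_count == 0 {
            // First sample seeds the mean; variance starts at zero.
            self.ema_mean = sample;
            self.ema_variance = 0.0;
        } else {
            // diff is taken against the mean *before* it moves.
            let diff = sample - self.ema_mean;
            self.ema_mean += alpha * diff;
            self.ema_variance = (1.0 - alpha) * (self.ema_variance + alpha * diff * diff);
        }
        self.sample_count = self.sample_count.saturating_add(1);
    }

    pub fn stddev(&self) -> f64 {
        self.ema_variance.max(0.0).sqrt()
    }
}

/// Fold a slice of samples into a fresh baseline.
pub fn fold(samples: &[f64], alpha: f64) -> Baseline {
    let mut b = Baseline::default();
    for &s in samples {
        b.update(s, alpha);
    }
    b
}
```

With alpha = 0.2, folding the samples 10 then 20 gives a mean of 10 + 0.2 * 10 = 12 and a variance of 0.8 * (0 + 0.2 * 100) = 16, so stddev() is 4. A single sample leaves the mean at that sample and the variance at zero.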

Robust z-score with Poisson floor

The guard does not call EmaBaselineState::z_score directly; it uses a robust variant that clamps stddev away from zero (behavioral_profile.rs:337-348):

chio-guards/src/behavioral_profile.rs
fn robust_z_score(state: &EmaBaselineState, sample: f64) -> Option<f64> {
    if state.sample_count < 2 {
        return None;
    }
    let measured = state.stddev();
    let floor = state.ema_mean.max(1.0).sqrt();
    let effective = measured.max(floor);
    if effective <= f64::EPSILON {
        return None;
    }
    Some((sample - state.ema_mean) / effective)
}

The early return on sample_count < 2 is why the earliest observations never produce a score: observe_sample calls robust_z_score before EmaBaselineState::update, so the first observation (sample_count = 0) and the second (sample_count = 1) both return z_score = None and anomaly = false. The Poisson floor sqrt(max(mean, 1)) means a 50x spike over a steady 10/window baseline still flags even when EWMA variance is numerically zero: the floor is sqrt(10) ≈ 3.16, so a sample of 500 scores (500 - 10) / 3.16 ≈ 155.
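The floor arithmetic can be tried on plain numbers. The function below is a sketch that takes the baseline fields as arguments rather than an EmaBaselineState; `robust_z` is a hypothetical name, and the body follows the listing above.

```rust
/// Robust z-score on raw numbers: stddev is floored at sqrt(max(mean, 1)).
pub fn robust_z(sample_count: u64, mean: f64, stddev: f64, sample: f64) -> Option<f64> {
    if sample_count < 2 {
        return None; // not enough history to score against
    }
    let floor = mean.max(1.0).sqrt();
    let effective = stddev.max(floor);
    if effective <= f64::EPSILON {
        return None;
    }
    Some((sample - mean) / effective)
}
```

For a flat baseline of 10 per window (stddev 0), a sample of 500 scores about 155, far above the default threshold of 2.0, while a sample of 12 scores about 0.63 and stays quiet. Without the floor, both would divide by zero variance.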

Window-start quantization

The current-window calculation (behavioral_profile.rs:294-297) is a plain integer-division quantizer:

chio-guards/src/behavioral_profile.rs
fn current_window_start(&self, now: u64) -> u64 {
    let window = self.config.window_secs.max(1);
    (now / window) * window
}

now is in Unix seconds (behavioral_profile.rs:322-326), and the window_secs.max(1) guards against a misconfigured zero. Two consequences:

  • Calls within the same window-start bucket fold into one sample. observe_sample short-circuits when last_window_start == window_start (behavioral_profile.rs:246-257) and returns the cached outcome without bumping sample_count.
  • There are no sub-second timestamps. The clock source is SystemTime::now().duration_since(UNIX_EPOCH).map(|d| d.as_secs()) (behavioral_profile.rs:323-326), so anything finer than 1 s is discarded before the divisor sees it.
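The bucketing is simple enough to exercise directly. This sketch restates the quantizer as free functions; `window_start` and `window_bounds` are hypothetical names, and the saturating arithmetic in `window_bounds` is an assumption about how the inclusive end is derived.

```rust
/// Quantize a Unix-seconds timestamp to the start of its window.
pub fn window_start(now: u64, window_secs: u64) -> u64 {
    let window = window_secs.max(1); // a configured 0 behaves like 1
    (now / window) * window
}

/// Inclusive [start, end] range of the window containing `now`.
pub fn window_bounds(now: u64, window_secs: u64) -> (u64, u64) {
    let start = window_start(now, window_secs);
    let end = start
        .saturating_add(window_secs.max(1))
        .saturating_sub(1);
    (start, end)
}
```

With the default 60-second window, timestamps 120 through 179 all land in the bucket starting at 120, and 180 opens the next one. Buckets are aligned to the Unix epoch, not to the first call of a session.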

Metrics

From BehavioralMetric (behavioral_profile.rs:57-79):

  • CallRate :: "call_rate", receipts per window.
  • DenyRate :: "deny_rate", denies per window.
  • UniqueTools :: "unique_tools", distinct tool names per window.
  • AvgParameterEntropy :: "avg_parameter_entropy", Shannon entropy of parameters.

On the synchronous Guard::evaluate path (behavioral_profile.rs:355-365), only CallRate is sampled: the count is receipts.len() as f64 from sample_for_window (behavioral_profile.rs:299-305). The other three metrics are reachable through observe_sample for callers that want to feed values out-of-band (an offline batch or a dashboard).

BehavioralAnomalyScore

The dashboard-visible struct paired with EmaBaselineState (operator_report.rs:1413-1455). Even though the synchronous guard verdict is always Allow, the guard's observation outcome carries everything needed to fill this struct out for the receipt evidence block:

chio-kernel/src/operator_report.rs
#[derive(Debug, Clone, Default, Serialize, Deserialize, PartialEq)]
#[serde(rename_all = "camelCase")]
pub struct BehavioralAnomalyScore {
    pub agent_id: String,
    pub baseline: EmaBaselineState,
    pub current_sample: f64,
    pub z_score: Option<f64>,
    pub sigma_threshold: f64,
    pub anomaly: bool,
    pub generated_at: u64,
}

The matching per-call return type is ObservationOutcome (behavioral_profile.rs:309-320), which pairs the same z_score, anomaly, baseline, and sample fields.

Algorithm

  1. window_start = (now / window_secs) * window_secs (behavioral_profile.rs:294-297).
  2. Read receipts in [window_start, window_end - 1], where window_end = window_start + window_secs.max(1), via the ReceiptFeedSource (behavioral_profile.rs:299-305). The range is inclusive at both ends; the saturating_sub(1) keeps window_end itself out of the window.
  3. Sample = receipts.len() as f64.
  4. Compute robust_z_score(&state, sample) against the pre-update baseline. If sample_count >= baseline_min_windows and |z| > sigma_threshold, mark anomaly = true (behavioral_profile.rs:259-263).
  5. Update the baseline with the new sample (one update per window-start; repeated calls in the same bucket short-circuit at behavioral_profile.rs:246-257).
  6. Return Verdict::Allow. The signal lives on the receipt evidence block, not the verdict.
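The ordering of these steps (score first, update second, one update per bucket) can be sketched end to end. Everything here is a simplified stand-in: `Profile`, `observe`, and `flags` are hypothetical names, the defaults are hard-coded, the sample is passed in rather than read from a receipt feed, and the function returns the anomaly flag instead of a verdict.

```rust
/// Hypothetical single-metric profile with the documented defaults baked in.
#[derive(Default)]
pub struct Profile {
    count: u64,
    mean: f64,
    var: f64,
    last_window: Option<u64>,
    last_anomaly: bool,
}

impl Profile {
    pub fn observe(&mut self, now: u64, sample: f64) -> bool {
        const ALPHA: f64 = 0.2;
        const SIGMA: f64 = 2.0;
        const WINDOW: u64 = 60;
        const MIN_WINDOWS: u64 = 3;

        // Step 1: quantize; repeated calls in one bucket reuse the cached flag.
        let window = (now / WINDOW) * WINDOW;
        if self.last_window == Some(window) {
            return self.last_anomaly;
        }
        // Step 4: score against the baseline *before* it absorbs the sample.
        let z = if self.count < 2 {
            None
        } else {
            let effective = self.var.max(0.0).sqrt().max(self.mean.max(1.0).sqrt());
            Some((sample - self.mean) / effective)
        };
        let anomaly = self.count >= MIN_WINDOWS && matches!(z, Some(v) if v.abs() > SIGMA);
        // Step 5: fold the sample into the baseline.
        if self.count == 0 {
            self.mean = sample;
            self.var = 0.0;
        } else {
            let diff = sample - self.mean;
            self.mean += ALPHA * diff;
            self.var = (1.0 - ALPHA) * (self.var + ALPHA * diff * diff);
        }
        self.count += 1;
        self.last_window = Some(window);
        self.last_anomaly = anomaly;
        anomaly
    }
}

/// Run a series of (timestamp, sample) pairs and collect the anomaly flags.
pub fn flags(samples: &[(u64, f64)]) -> Vec<bool> {
    let mut p = Profile::default();
    samples.iter().map(|&(t, s)| p.observe(t, s)).collect()
}
```

Four quiet windows of 10 followed by a window of 500 flag only the fifth observation; a second call inside that same window returns the cached flag without touching the baseline.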

Failure modes

  • Mutex poisoning :: Err(KernelError::Internal("baseline lock poisoned")) (behavioral_profile.rs:240).
  • Receipt feed error :: propagated through ? (behavioral_profile.rs:359). The kernel reads it as deny even though the normal verdict is advisory.
  • Cold baseline (sample count below baseline_min_windows) :: never flags. The first two observations also return z_score = None from the sample_count < 2 early return, because the score is computed before the baseline is updated.

Storage

Baselines live in memory keyed by (agent_id, BehavioralMetric) behind a single Mutex<HashMap<...>> (behavioral_profile.rs:199). The receipt feed is pluggable through ReceiptFeedSource (behavioral_profile.rs:84-95); InMemoryReceiptFeed (behavioral_profile.rs:104-158) ships with the crate for tests and lightweight deployments. Production wiring backs the trait with ReceiptStore::query_receipts from chio-store-sqlite.


Composition

rust
use std::sync::Arc;
use chio_guards::{
    DataFlowGuard, DataFlowConfig,
    BehavioralSequenceGuard, SequencePolicy,
    BehavioralProfileGuard, BehavioralProfileConfig,
    InMemoryReceiptFeed,
};
use chio_http_session::SessionJournal;

let journal = Arc::new(SessionJournal::new("sess-1".to_string()));
let feed = InMemoryReceiptFeed::new();

let mut pipeline = chio_guards::GuardPipeline::new();

pipeline.add(Box::new(DataFlowGuard::new(
    journal.clone(),
    DataFlowConfig {
        max_bytes_read: Some(50 * 1024 * 1024),
        max_bytes_written: Some(10 * 1024 * 1024),
        max_bytes_total: None,
    },
)));

let mut policy = SequencePolicy::default();
policy.required_first_tool = Some("init".to_string());
pipeline.add(Box::new(BehavioralSequenceGuard::new(journal.clone(), policy)));

pipeline.add(Box::new(BehavioralProfileGuard::with_config(
    Box::new(feed),
    BehavioralProfileConfig::default(),
)));

Journal-unavailable means deny

All three guards are fail-closed. A journal that fails to read, a receipt feed that returns an error, and a poisoned mutex all surface as Err(KernelError::Internal(...)) from the guard. The kernel reads every Err as a denial. A session-aware guard that cannot read session state cannot make a safe allow decision.
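The kernel-side rule can be pictured as a fold over guard results in which only an unbroken run of Ok(Allow) admits the call. This is a sketch of the stated behavior, not the kernel's code: the `Verdict` and `KernelError` enums are local stand-ins and `decide` is a hypothetical name.

```rust
/// Local stand-ins for the kernel types named in this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Allow,
    Deny,
}

#[derive(Debug)]
pub enum KernelError {
    Internal(String),
}

/// Hypothetical fail-closed fold: any Deny or any Err denies the call.
pub fn decide(results: impl IntoIterator<Item = Result<Verdict, KernelError>>) -> Verdict {
    for r in results {
        match r {
            Ok(Verdict::Allow) => continue,
            // An error is treated exactly like an explicit denial.
            Ok(Verdict::Deny) | Err(_) => return Verdict::Deny,
        }
    }
    Verdict::Allow
}
```

A pipeline where two guards allow and one returns `Err(KernelError::Internal(..))` therefore denies, which is the behavior the failure-mode lists above rely on.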
