Response Sanitization
Two guard surfaces handle outbound content after a tool returns: ResponseSanitizationGuard is the simple, fixed-pattern API with binary Block / Redact actions, and OutputSanitizer is the full-featured detector engine with seven redaction strategies and a category-keyed strategy map. A third guard, ContentReviewGuard, is the pre-invocation cousin: it scans outbound SaaS / payment / messaging content before the call leaves the kernel.
Run as a post-invocation hook
ResponseSanitizationGuard and OutputSanitizer operate on outputs, not inputs. The kernel runs them through the post-invocation hook surface (see guard pipelines), not the pre-invocation guard pipeline that gates tool calls. ContentReviewGuard is the pre-invocation cousin and runs in the standard pipeline.
ResponseSanitizationGuard
Source: crates/chio-guards/src/response_sanitization.rs (the simple-API portion). Guard name: response-sanitization.
Struct
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SensitivityLevel { Low, Medium, High }
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SanitizationAction { Block, Redact }
pub struct SensitivePattern {
    pub name: String,
    regex: Regex,
    pub level: SensitivityLevel,
    pub redaction: String,
}
pub struct ResponseSanitizationGuard {
    patterns: Vec<SensitivePattern>,
    min_level: SensitivityLevel,
    action: SanitizationAction,
}
Default patterns
Quoted verbatim from default_patterns() at response_sanitization.rs:66-133. Each pattern is a literal Regex::new(...) call inside an if let Ok(regex) = ... block; a regex that fails to compile is silently dropped from the default set rather than panicking at startup.
// SSN, level High, redaction "[SSN REDACTED]"
Regex::new(r"\b\d{3}-\d{2}-\d{4}\b")
// email, level Medium, redaction "[EMAIL REDACTED]"
Regex::new(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
// phone, level Low, redaction "[PHONE REDACTED]"
Regex::new(r"\b(?:\(\d{3}\)\s*|\d{3}[-.])\d{3}[-.]?\d{4}\b")
// credit-card, level High, redaction "[CARD REDACTED]"
Regex::new(r"\b(?:\d{4}[-\s]?){3}\d{4}\b")
// date-of-birth, level Low, redaction "[DATE REDACTED]"
Regex::new(r"\b(?:\d{2}/\d{2}/\d{4}|\d{4}-\d{2}-\d{2})\b")
// MRN, level High, redaction "[MRN REDACTED]"
Regex::new(r"\bMRN[:\s#]*\d{6,12}\b")
// ICD-10, level Medium, redaction "[ICD REDACTED]"
Regex::new(r"\b[A-Z]\d{2}(?:\.\d{1,4})?\b")
The simple-API redact path (ResponseSanitizationGuard::redact at response_sanitization.rs:190) runs the patterns in declaration order and uses Regex::replace_all per pattern. Patterns below min_level are skipped via level_ord (Low=0, Medium=1, High=2).
Block vs Redact
- SanitizationAction::Block :: when any matching pattern at or above min_level fires, the call path treats the response as denied. The scan_response helper returns ScanResult::Blocked with the list of findings.
- SanitizationAction::Redact :: each match is replaced with its redaction string. The response is allowed through with the redacted text; ScanResult::Redacted carries the count and the original findings for evidence.
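The level filter and the two actions can be sketched together as follows. This is a minimal illustration, not the crate's code: plain substring needles stand in for the compiled regexes so that it runs on std alone, the function name scan_response_sketch is hypothetical, and the shape of ScanResult (including the Clean variant and the field names) is an assumption based only on the description above.

```rust
// Illustrative sketch: substring needles replace the real `regex: Regex` field,
// and the ScanResult variants/fields are assumptions, not the crate's definitions.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SensitivityLevel { Low, Medium, High }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SanitizationAction { Block, Redact }

struct SensitivePattern {
    name: &'static str,
    needle: &'static str, // stand-in for a compiled regex
    level: SensitivityLevel,
    redaction: &'static str,
}

#[derive(Debug, PartialEq)]
enum ScanResult {
    Clean, // assumed variant for "nothing matched"
    Blocked { findings: Vec<String> },
    Redacted { text: String, count: usize, findings: Vec<String> },
}

fn level_ord(level: SensitivityLevel) -> u8 {
    match level {
        SensitivityLevel::Low => 0,
        SensitivityLevel::Medium => 1,
        SensitivityLevel::High => 2,
    }
}

fn scan_response_sketch(
    patterns: &[SensitivePattern],
    min_level: SensitivityLevel,
    action: SanitizationAction,
    response: &str,
) -> ScanResult {
    let mut text = response.to_string();
    let mut findings = Vec::new();
    let mut count = 0;
    // Patterns run in declaration order; those below min_level are skipped.
    for p in patterns {
        if level_ord(p.level) < level_ord(min_level) {
            continue;
        }
        let hits = text.matches(p.needle).count();
        if hits > 0 {
            findings.push(p.name.to_string());
            count += hits;
            text = text.replace(p.needle, p.redaction);
        }
    }
    if findings.is_empty() {
        return ScanResult::Clean;
    }
    match action {
        // Block: the response is treated as denied; only findings are returned.
        SanitizationAction::Block => ScanResult::Blocked { findings },
        // Redact: the rewritten text goes through, with findings kept as evidence.
        SanitizationAction::Redact => ScanResult::Redacted { text, count, findings },
    }
}
```

With min_level set to Medium, a Low-level phone pattern is never evaluated, so a phone number passes through untouched even when the action is Redact.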
Guard impl
The Guard impl on ResponseSanitizationGuard scans the request arguments serialized as a JSON string. If any finding meets the minimum level, the verdict is Verdict::Deny; otherwise Verdict::Allow. For the full read-then-redact-then-emit flow over actual responses, use the post-invocation hook with scan_response directly.
OutputSanitizer (full API)
Source: same file, lower in the module. The full sanitizer is a stateless engine wrapped around the OutputSanitizerConfig struct and a token vault for the Tokenize strategy.
Configuration
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct OutputSanitizerConfig {
    pub categories: CategoryConfig,
    pub redaction_strategies: HashMap<SensitiveCategory, RedactionStrategy>,
    pub entropy: EntropyConfig,
    pub allowlist: AllowlistConfig,
    pub denylist: DenylistConfig,
    pub max_input_bytes: usize,
    pub include_findings: bool,
}
| Knob | Default |
|---|---|
| categories.secrets | true |
| categories.pii | true |
| categories.internal | true |
| redaction_strategies[Secret] | Mask |
| redaction_strategies[Pii] | Partial |
| redaction_strategies[Internal] | TypeLabel |
| entropy.enabled | true |
| entropy.threshold | 4.5 bits/char |
| entropy.min_token_len | 16 |
| max_input_bytes | 1_000_000 |
| include_findings | true |
Detector catalog
Verified from compiled_patterns(). Each detector carries a stable ID, a category, a confidence, and a recommended redaction strategy.
| Detector ID | Category | Confidence | Recommended | Validator |
|---|---|---|---|---|
| secret_aws_access_key_id | Secret | 0.99 | Mask | none |
| secret_aws_secret_access_key | Secret | 0.9 | Mask | none |
| secret_github_token | Secret | 0.99 | Mask | none |
| secret_slack_token | Secret | 0.99 | Mask | none |
| secret_slack_webhook | Secret | 0.95 | Mask | none |
| secret_gcp_service_account | Secret | 0.97 | Drop | none |
| secret_pem_private_key | Secret | 0.99 | Mask | none |
| secret_jwt | Secret | 0.85 | Mask | none |
| secret_oauth_bearer | Secret | 0.85 | Mask | none |
| secret_password_assignment | Secret | 0.7 | Mask | none |
| pii_ssn | PII | 0.9 | Mask | is_valid_ssn_fragments |
| pii_ssn_compact | PII | 0.7 | Mask | is_valid_ssn_compact |
| pii_credit_card | PII | 0.9 | Mask | Luhn check |
| pii_email | PII | 0.95 | Partial | none |
| internal_private_ip | Internal | 0.8 | TypeLabel | none |
Detectors with validators run the regex first and then call the validator on the matched text. Luhn rejects all-same-digit sequences and any number outside 13 to 19 digits. SSN validators reject area 0, 666, 900-999, and zero group / zero serial. The MRN, DOB, ICD-10, and US-phone detectors from the simple API are not included in the full sanitizer's default catalog.
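The validator rules above can be sketched in a few lines of std-only Rust. The function names luhn_ok, ssn_fragments_ok, and ssn_ok are hypothetical and are not the crate's is_valid_ssn_fragments / is_valid_ssn_compact; the sketch also assumes that separators in a card number are ignored before the digit count is checked.

```rust
/// Luhn check for a candidate card number. Non-digit characters (spaces,
/// dashes) are ignored; that normalization is an assumption of this sketch.
fn luhn_ok(candidate: &str) -> bool {
    let digits: Vec<u32> = candidate.chars().filter_map(|c| c.to_digit(10)).collect();
    // Reject anything outside 13 to 19 digits.
    if digits.len() < 13 || digits.len() > 19 {
        return false;
    }
    // Reject all-same-digit sequences such as 0000 0000 0000 0000.
    if digits.iter().all(|&d| d == digits[0]) {
        return false;
    }
    // Standard Luhn: double every second digit counting from the right.
    let sum: u32 = digits
        .iter()
        .rev()
        .enumerate()
        .map(|(i, &d)| {
            if i % 2 == 1 {
                let doubled = d * 2;
                if doubled > 9 { doubled - 9 } else { doubled }
            } else {
                d
            }
        })
        .sum();
    sum % 10 == 0
}

/// SSN fragment rules from the text: area 0, 666, and 900-999 are invalid,
/// as are a zero group and a zero serial.
fn ssn_fragments_ok(area: u16, group: u16, serial: u16) -> bool {
    let area_ok = area != 0 && area != 666 && area < 900;
    area_ok && group != 0 && serial != 0
}

/// Validates a dashed SSN of the form AAA-GG-SSSS.
fn ssn_ok(ssn: &str) -> bool {
    let parts: Vec<&str> = ssn.split('-').collect();
    let shape_ok = parts.len() == 3
        && parts[0].len() == 3
        && parts[1].len() == 2
        && parts[2].len() == 4
        && parts.iter().all(|p| p.bytes().all(|b| b.is_ascii_digit()));
    if !shape_ok {
        return false;
    }
    match (parts[0].parse::<u16>(), parts[1].parse::<u16>(), parts[2].parse::<u16>()) {
        (Ok(area), Ok(group), Ok(serial)) => ssn_fragments_ok(area, group, serial),
        _ => false,
    }
}
```

Running the validator after the regex keeps the pattern simple while cutting false positives: a 16-digit order number that fails Luhn is never reported as pii_credit_card.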
Redaction strategies
| Strategy | Behavior |
|---|---|
| Mask | Replace the match with a constant mask (e.g. ****). |
| Fingerprint | Replace with a stable SHA-256 hex prefix. Allows correlation across calls without exposing the value. |
| Drop | Replace with empty text. At the JSON-field level, the whole field becomes null. |
| Tokenize | Replace with an opaque ID and record the mapping in a shared TokenVault for reversible workflows. |
| Partial | Keep the first 2 and last 2 characters; replace the middle with ***. Strings of 4 chars or fewer become all asterisks. |
| TypeLabel | Replace with a typed label ([REDACTED:email]). |
| Keep | Do not redact. Used to surface a finding without modifying output. |
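A rough sketch of the string-level strategies from the table follows. It covers only the five that need no external state; Fingerprint (SHA-256) and Tokenize (TokenVault) are left out. The Strategy enum, apply_strategy, and partial_redact are hypothetical names, and emitting one asterisk per character for short strings is an assumption about what "all asterisks" means.

```rust
// Hypothetical mirror of five of the seven strategies; not the crate's enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Strategy { Mask, Drop, Partial, TypeLabel, Keep }

/// Partial: keep the first 2 and last 2 characters, replace the middle with ***.
/// Strings of 4 characters or fewer become all asterisks (assumed: one per char).
fn partial_redact(value: &str) -> String {
    let chars: Vec<char> = value.chars().collect();
    if chars.len() <= 4 {
        return "*".repeat(chars.len());
    }
    let head: String = chars[..2].iter().collect();
    let tail: String = chars[chars.len() - 2..].iter().collect();
    format!("{}***{}", head, tail)
}

/// Applies a strategy to one matched substring. `label` is the type name
/// used by TypeLabel, for example "email".
fn apply_strategy(strategy: Strategy, matched: &str, label: &str) -> String {
    match strategy {
        Strategy::Mask => "****".to_string(),
        Strategy::Drop => String::new(),
        Strategy::Partial => partial_redact(matched),
        Strategy::TypeLabel => format!("[REDACTED:{}]", label),
        Strategy::Keep => matched.to_string(),
    }
}
```

For example, apply_strategy(Strategy::Partial, "alice@example.com", "email") yields al***om, which is why Partial is a reasonable default for PII that a reader may still need to recognize.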
Entropy detector
Source: EntropyConfig at response_sanitization.rs:382-395. Verbatim defaults:
pub struct EntropyConfig {
    pub enabled: bool,
    pub threshold: f64,
    pub min_token_len: usize,
}
impl Default for EntropyConfig {
    fn default() -> Self {
        Self {
            enabled: true,
            threshold: 4.5,
            min_token_len: 16,
        }
    }
}
Scanning loop (lines 1048-1061): tokens shorter than min_token_len are skipped with if token.len() < self.config.entropy.min_token_len; tokens whose Shannon entropy is below self.config.entropy.threshold are skipped. The remaining tokens are emitted as findings with category Secret and a confidence of 0.6.
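To make the threshold concrete, here is a minimal sketch of a Shannon entropy check in bits per character. The function names shannon_entropy and is_high_entropy are hypothetical, and computing the distribution over bytes is an assumption; the crate's own calculation may differ in detail.

```rust
use std::collections::HashMap;

/// Shannon entropy of a token in bits per byte: H = -sum(p_i * log2(p_i)).
fn shannon_entropy(token: &str) -> f64 {
    let bytes = token.as_bytes();
    if bytes.is_empty() {
        return 0.0;
    }
    let mut counts: HashMap<u8, usize> = HashMap::new();
    for &b in bytes {
        *counts.entry(b).or_insert(0) += 1;
    }
    let len = bytes.len() as f64;
    counts
        .values()
        .map(|&c| {
            let p = c as f64 / len;
            -p * p.log2()
        })
        .sum()
}

/// Mirrors the two skip conditions described above: too short, or entropy
/// below the threshold. Anything that survives both would be reported.
fn is_high_entropy(token: &str, threshold: f64, min_token_len: usize) -> bool {
    if token.len() < min_token_len {
        return false;
    }
    shannon_entropy(token) >= threshold
}
```

A token of 16 distinct characters has exactly 4.0 bits/char, so it stays under the default 4.5 threshold; a token needs at least 23 distinct symbols to exceed it, which is why ordinary identifiers rarely trip the detector while long random keys do.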
Candidate tokens are restricted to a base64-shaped alphabet:
fn is_candidate_secret_token(token: &str) -> bool {
    token.bytes().all(|b|
        b.is_ascii_alphanumeric()
            || matches!(b, b'+' | b'/' | b'=' | b'-' | b'_'))
}
Allowlist and denylist
- AllowlistConfig { exact, patterns } :: known-safe values that should not be redacted (test fixtures, sample tokens). Both lists are checked.
- DenylistConfig { exact, patterns } :: forced-redaction values. Always redacted regardless of category or detector.
- Pattern compile failure :: OutputSanitizerConfigError::InvalidPattern with the offending list name ("allowlist" or "denylist").
Overlap resolution
When two detectors match overlapping byte ranges, the sanitizer merges them with a deterministic longest-match-wins rule, ties broken by strategy rank. Source: resolve_overlaps at response_sanitization.rs:1270-1325.
The strategy ranking (higher number wins on a tie):
fn strategy_rank(s: &RedactionStrategy) -> u8 {
    match s {
        RedactionStrategy::Keep => 0,
        RedactionStrategy::Partial => 1,
        RedactionStrategy::TypeLabel => 2,
        RedactionStrategy::Fingerprint => 3,
        RedactionStrategy::Tokenize => 4,
        RedactionStrategy::Mask => 5,
        RedactionStrategy::Drop => 6,
    }
}
The merge step sorts spans by start ascending then by end descending (so the longest match at any start position comes first), walks the list, and folds each span into the previous one when it overlaps. On a fold, the merged span's end becomes last.0.end.max(current.0.end) and the strategy is replaced only when strategy_rank(current) > strategy_rank(last). Result: the merged span takes the widest extent, and its strategy comes from the highest-ranked span folded into it: Drop beats Mask beats Tokenize beats Fingerprint beats TypeLabel beats Partial beats Keep. When ranks are equal, the span that sorted first keeps its strategy.
spans.sort_by(|a, b| {
    a.0.start
        .cmp(&b.0.start)
        .then_with(|| b.0.end.cmp(&a.0.end))
});
let mut merged: Vec<ResolvedSpan> = Vec::new();
for current in spans {
    if let Some(last) = merged.last_mut() {
        if current.0.start < last.0.end {
            let new_end = last.0.end.max(current.0.end);
            last.0.end = new_end;
            if strategy_rank(&current.1) > strategy_rank(&last.1) {
                last.1 = current.1;
                last.2 = current.2;
                last.3 = current.3;
                last.4 = current.4;
            }
            continue;
        }
    }
    merged.push(current);
}
One quirk worth knowing: detector recommendations of Drop, Fingerprint, or Tokenize override the per-category default before overlap resolution runs (lines 1282-1291). So a category default of Mask is still replaced by Drop if the detector recommends Drop for the finding.
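The following std-only sketch works through the merge and the override on concrete spans. It is a simplified model: ResolvedSpan is reduced to a (Range<usize>, RedactionStrategy) pair, and the names merge_spans and effective_strategy are hypothetical stand-ins for logic that lives inside resolve_overlaps.

```rust
use std::ops::Range;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RedactionStrategy { Keep, Partial, TypeLabel, Fingerprint, Tokenize, Mask, Drop }

fn strategy_rank(s: RedactionStrategy) -> u8 {
    match s {
        RedactionStrategy::Keep => 0,
        RedactionStrategy::Partial => 1,
        RedactionStrategy::TypeLabel => 2,
        RedactionStrategy::Fingerprint => 3,
        RedactionStrategy::Tokenize => 4,
        RedactionStrategy::Mask => 5,
        RedactionStrategy::Drop => 6,
    }
}

/// Models the override described above: a detector recommendation of Drop,
/// Fingerprint, or Tokenize replaces the per-category default.
fn effective_strategy(
    category_default: RedactionStrategy,
    recommended: RedactionStrategy,
) -> RedactionStrategy {
    match recommended {
        RedactionStrategy::Drop
        | RedactionStrategy::Fingerprint
        | RedactionStrategy::Tokenize => recommended,
        _ => category_default,
    }
}

type Span = (Range<usize>, RedactionStrategy);

/// Simplified merge: same sort order and fold rule as the excerpt above,
/// with the span tuple reduced to two fields.
fn merge_spans(mut spans: Vec<Span>) -> Vec<Span> {
    spans.sort_by(|a, b| {
        a.0.start
            .cmp(&b.0.start)
            .then_with(|| b.0.end.cmp(&a.0.end))
    });
    let mut merged: Vec<Span> = Vec::new();
    for current in spans {
        if let Some(last) = merged.last_mut() {
            if current.0.start < last.0.end {
                // Overlap: extend the extent, keep the higher-ranked strategy.
                last.0.end = last.0.end.max(current.0.end);
                if strategy_rank(current.1) > strategy_rank(last.1) {
                    last.1 = current.1;
                }
                continue;
            }
        }
        merged.push(current);
    }
    merged
}
```

Given a Partial span over bytes 0..10 and a Mask span over 5..12, the result is a single Mask span over 0..12. Spans that merely touch (0..5 and 5..9) are not merged, because the overlap test is a strict less-than on the previous end.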
ContentReviewGuard
Source: crates/chio-guards/src/content_review.rs. Guard name: content-review. Pre-invocation review of outbound content for SaaS / messaging / payment tools (Slack, SendGrid, Twilio, Stripe, etc.) targeted via ToolAction::ExternalApiCall.
Struct
#[derive(Clone, Debug, Default, Deserialize, Serialize)]
#[serde(deny_unknown_fields)]
pub struct ContentReviewRules {
    pub detect_pii: bool,
    pub detect_profanity: bool,
    pub banned_words: Vec<String>,
    pub extra_patterns: Vec<String>,
    pub max_scan_bytes: usize,
}
#[derive(Clone, Debug, Deserialize, Serialize)]
#[serde(deny_unknown_fields)]
pub struct ContentReviewConfig {
    pub enabled: bool,
    pub default_rules: ContentReviewRules,
    pub per_service: HashMap<String, ContentReviewRules>,
}
Defaults
| Knob | Default |
|---|---|
| enabled | true |
| default_rules.detect_pii | true |
| default_rules.detect_profanity | true |
| default_rules.banned_words | empty |
| default_rules.extra_patterns | empty |
| default_rules.max_scan_bytes | 65536 (64 KiB) |
| per_service | empty |
Pattern safety limits
From CompiledRules::compile:
- Maximum extra_patterns per rule set: 64.
- Maximum pattern length: 512 characters.
- Maximum complexity score (heuristic over alternation, quantifier, and group counts): 96.
- Regex builder size limits: EXTRA_PATTERN_REGEX_SIZE_LIMIT = 1 << 20 and EXTRA_PATTERN_DFA_SIZE_LIMIT = 1 << 20.
Patterns that fail any of these checks raise ContentReviewError::UnsafePattern or ContentReviewError::InvalidPattern at policy load.
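The pre-compile checks could look roughly like the sketch below. Only the three numeric limits come from the list above; the error type, the function names, and above all the scoring function are assumptions. The real heuristic in CompiledRules::compile is not shown in this page, so complexity_score here simply counts alternation, quantifier, and group-opening characters. The two builder size limits are omitted because they apply inside the regex builder at compile time.

```rust
const MAX_EXTRA_PATTERNS: usize = 64;
const MAX_PATTERN_LEN: usize = 512;
const MAX_COMPLEXITY: usize = 96;

// Hypothetical error type for this sketch; the crate reports
// ContentReviewError::UnsafePattern / InvalidPattern instead.
#[derive(Debug, PartialEq)]
enum PatternCheckError {
    TooManyPatterns(usize),
    TooLong { index: usize, len: usize },
    TooComplex { index: usize, score: usize },
}

/// ASSUMED heuristic: one point per alternation bar, quantifier character,
/// or opening group parenthesis. The crate's actual scoring may differ.
fn complexity_score(pattern: &str) -> usize {
    pattern
        .chars()
        .filter(|c| matches!(c, '|' | '*' | '+' | '?' | '{' | '('))
        .count()
}

/// Rejects a rule set before any regex is compiled.
fn check_extra_patterns(patterns: &[String]) -> Result<(), PatternCheckError> {
    if patterns.len() > MAX_EXTRA_PATTERNS {
        return Err(PatternCheckError::TooManyPatterns(patterns.len()));
    }
    for (index, pattern) in patterns.iter().enumerate() {
        let len = pattern.chars().count();
        if len > MAX_PATTERN_LEN {
            return Err(PatternCheckError::TooLong { index, len });
        }
        let score = complexity_score(pattern);
        if score > MAX_COMPLEXITY {
            return Err(PatternCheckError::TooComplex { index, score });
        }
    }
    Ok(())
}
```

Checking cheap structural limits first means an oversized or pathological pattern is rejected at policy load without ever reaching the regex compiler.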
Blocking, advisory, and approval-gating
The guard's comment header (verified from source) describes three operational modes folded into one guard:
- PII detection on message bodies / email text. Categories are surfaced as tracing evidence. Detected PII denies the call.
- Tone / profanity filter against the configurable banned-words list. Hits deny.
- Monetary approval gating: payment calls whose amount meets or exceeds the matched grant's Constraint::RequireApprovalAbove threshold yield Verdict::PendingApproval so the HITL flow in chio_kernel::approval can collect signoff.
No advisory-only built-in mode
PII and profanity hits yield Verdict::Deny; only the approval-threshold path uses PendingApproval. If you want to log without blocking, run a custom guard or post-invocation hook instead.
Failure modes
- Invalid extra-pattern regex :: ContentReviewError::InvalidPattern.
- Pattern exceeds safety limits :: ContentReviewError::UnsafePattern.
- Per-service lookup miss :: default_rules applies.
- Non-ExternalApiCall actions pass through with Verdict::Allow.
- A message that hits both PII and profanity yields a single deny verdict but logs both categories in tracing evidence.
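The per-service fallback can be illustrated with a small lookup. The structs below are std-only mirrors of the ones shown earlier (without the serde derives), the helper name rules_for is hypothetical, and the service keys "stripe" and "slack" are assumed for illustration; the page does not say how service names are spelled in per_service.

```rust
use std::collections::HashMap;

// Std-only mirrors of the config structs; serde attributes are omitted.
#[derive(Clone, Debug, Default, PartialEq)]
struct ContentReviewRules {
    detect_pii: bool,
    detect_profanity: bool,
    banned_words: Vec<String>,
    extra_patterns: Vec<String>,
    max_scan_bytes: usize,
}

struct ContentReviewConfig {
    enabled: bool,
    default_rules: ContentReviewRules,
    per_service: HashMap<String, ContentReviewRules>,
}

/// Returns the rule set for a service, falling back to default_rules when
/// the service has no entry in per_service.
fn rules_for<'a>(config: &'a ContentReviewConfig, service: &str) -> &'a ContentReviewRules {
    config
        .per_service
        .get(service)
        .unwrap_or(&config.default_rules)
}
```

A service-specific entry replaces the defaults wholesale in this sketch; whether the real guard merges the two rule sets field by field is not stated here.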
Composition
use chio_guards::{
    ResponseSanitizationGuard, SensitivityLevel, SanitizationAction,
    ContentReviewGuard, ContentReviewConfig,
};
// Pre-invocation pipeline: review outbound SaaS calls.
let mut pipeline = chio_guards::GuardPipeline::default_pipeline();
pipeline.add(Box::new(ContentReviewGuard::new()));
// Post-invocation hook: redact responses.
let response_sanitizer = ResponseSanitizationGuard::new(
    SensitivityLevel::Medium,
    SanitizationAction::Redact,
);
Pair with the secret-leak guard
SecretLeakGuard blocks secrets on writes into the filesystem. OutputSanitizer blocks them on the way out in tool responses. The detector catalogs overlap (AWS keys, GitHub tokens, GCP service-account JSON, PEM keys); running both closes the bidirectional gap. See the filesystem guards page for the matching write-side detector list.
Next Steps
- Guard Pipelines :: where post-invocation hooks live alongside pre-invocation guards.
- Filesystem Guards :: the matching SecretLeakGuard on the write side.
- Jailbreak & Injection Guards :: scan inputs for adversarial framing in the same pipeline pass.