
Fuzz Infrastructure

The standalone Cargo workspace at fuzz/ hosts twenty libFuzzer harnesses, each targeting one trust-boundary parser, decoder, or verifier. ClusterFuzzLite drives changed-target sampling on every PR plus a nightly batch sweep. Crashes are triaged into permanent regression seeds owned by the crate that regressed.

Standalone workspace by design

fuzz/Cargo.toml carries an empty [workspace] stanza so libFuzzer's nightly-only runtime requirements do not leak into the main stable workspace. The main workspace denies unwrap_used and expect_used; the fuzz workspace inherits the same lints so harnesses cannot mask panics behind a hidden .unwrap().

What Fuzzing Does

libFuzzer is a coverage-guided in-process fuzzer. It starts from a seed corpus, executes the harness, observes which code edges the input touched, and mutates the input toward inputs that cover new edges. When an input causes a panic, an assertion violation, or an ASAN/UBSAN report, libFuzzer captures it as a crash and shrinks toward a minimal reproducer.

chio targets only trust-boundary surfaces: places where externally controlled bytes first enter chio code. The contract for every target is that no input causes a panic, an unwrap failure, or undefined behavior; structurally invalid input must result in a typed Err return.
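
As a minimal sketch of that contract, a trust-boundary parser checks every length before it slices and reports malformed input through a typed error. The frame format, `FrameError`, `parse_frame`, and `fuzz_parse_frame` are hypothetical names used for illustration; none of them is part of chio.

```rust
/// Hypothetical error type: every malformed input maps to a typed variant.
#[derive(Debug, PartialEq, Eq)]
pub enum FrameError {
    /// Input ended before the 2-byte length prefix.
    MissingLength,
    /// The prefix promised more payload bytes than the input holds.
    Truncated { declared: usize, available: usize },
    /// Bytes remain after the declared payload.
    TrailingBytes(usize),
}

/// Parses `[len_hi, len_lo, payload...]` and rejects anything else with Err.
pub fn parse_frame(data: &[u8]) -> Result<&[u8], FrameError> {
    // `get` returns None instead of panicking on short input.
    let prefix = data.get(..2).ok_or(FrameError::MissingLength)?;
    let declared = u16::from_be_bytes([prefix[0], prefix[1]]) as usize;
    // Safe: the check above guarantees at least two bytes.
    let body = &data[2..];
    if body.len() < declared {
        return Err(FrameError::Truncated { declared, available: body.len() });
    }
    if body.len() > declared {
        return Err(FrameError::TrailingBytes(body.len() - declared));
    }
    Ok(body)
}

/// Shape of a fuzz entrypoint: Ok and Err are both acceptable outcomes,
/// so the result is discarded and only a panic would count as a finding.
pub fn fuzz_parse_frame(data: &[u8]) {
    let _ = parse_frame(data);
}
```

A harness would call `fuzz_parse_frame(data)` from its `fuzz_target!` body; the fuzzer then only has something to report if the parser panics or trips a sanitizer.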


Fuzz Targets

fuzz/Cargo.toml declares twenty [[bin]] entries, each backed by a fuzz_target! in fuzz/fuzz_targets/<name>.rs. The owning crate for each target is declared in fuzz/owners.toml as [targets.<name>] with crate and path fields.

| Target | Source | Owning crate | Surface |
|---|---|---|---|
| canonical_json | fuzz_targets/canonical_json.rs | chio-core-types | RFC 8785 canonical-JSON round-trip; structure-aware mutator self-driver |
| capability_receipt | fuzz_targets/capability_receipt.rs | chio-core-types | CapabilityToken and ChioReceipt JSON deserialization plus verify_signature |
| manifest_roundtrip | fuzz_targets/manifest_roundtrip.rs | chio-manifest | Tool-manifest decode plus canonicalization |
| attest_verify | fuzz_targets/attest_verify.rs | chio-attest-verify | Sigstore bundle parser plus cert chain verify |
| jwt_vc_verify | fuzz_targets/jwt_vc_verify.rs | chio-credentials | JWT VC verifier; constant-time compare assertions |
| oid4vp_presentation | fuzz_targets/oid4vp_presentation.rs | chio-credentials | OID4VP holder response decode |
| did_resolve | fuzz_targets/did_resolve.rs | chio-did | chio-did parser plus resolver |
| anchor_bundle_verify | fuzz_targets/anchor_bundle_verify.rs | chio-anchor | Anchor proof bundle plus checkpoint records |
| mcp_envelope_decode | fuzz_targets/mcp_envelope_decode.rs | chio-mcp-edge | MCP NDJSON decode plus edge dispatch into the evaluator |
| a2a_envelope_decode | fuzz_targets/a2a_envelope_decode.rs | chio-a2a-adapter | A2A SSE parse plus per-event fan-out |
| acp_envelope_decode | fuzz_targets/acp_envelope_decode.rs | chio-acp-edge | ACP NDJSON plus handle_jsonrpc dispatch |
| wasm_preinstantiate_validate | fuzz_targets/wasm_preinstantiate_validate.rs | chio-wasm-guards | ComponentBackend, WasmtimeBackend, format detection |
| wit_host_call_boundary | fuzz_targets/wit_host_call_boundary.rs | chio-wasm-guards | GuardRequest and GuestDenyResponse serde deserialization |
| chio_yaml_parse | fuzz_targets/chio_yaml_parse.rs | chio-config | chio-config YAML loader |
| openapi_ingest | fuzz_targets/openapi_ingest.rs | chio-openapi-mcp-bridge | OpenApiMcpBridge::from_spec ingest |
| receipt_log_replay | fuzz_targets/receipt_log_replay.rs | chio-kernel-core | Receipt log replay decode plus chain-invariant state machine |
| fuzz_policy_parse_compile | fuzz_targets/fuzz_policy_parse_compile.rs | chio-policy | HushSpec parser, validator, compiler, and YAML round-trip |
| fuzz_sql_parser | fuzz_targets/fuzz_sql_parser.rs | chio-data-guards | SQL parser and SQL guard fail-closed analysis across dialects |
| fuzz_merkle_checkpoint | fuzz_targets/fuzz_merkle_checkpoint.rs | chio-kernel | Merkle tree inclusion proofs and signed checkpoint validation |
| fuzz_tool_action | fuzz_targets/fuzz_tool_action.rs | chio-guards | Tool action classification and guard verdicts for egress, shell, SQL, memory, MCP |

Each target is wired through target-map.toml with the source-path globs that, when changed, must trigger that target on the PR. The list is in lockstep with .clusterfuzzlite/build.sh and infra/oss-fuzz/build.sh.


Harness Shape

The smallest harnesses delegate to a fuzz_ entrypoint exported by the owning crate. From fuzz_targets/jwt_vc_verify.rs:

rust
#![no_main]

use chio_credentials::fuzz::fuzz_jwt_vc_verify;
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    fuzz_jwt_vc_verify(data);
});

Targets that need a structure-aware mutator declare one with fuzz_mutator!. The canonical-JSON mutator at fuzz/mutators/canonical_json.rs is the worked example; it is consumed by canonical_json, capability_receipt, and mcp_envelope_decode. The harness below, which matches the capability_receipt surface, pairs the target body with the mutator:

rust
#![no_main]

use chio_core_types::{CapabilityToken, ChioReceipt};
use chio_fuzz::canonical_json::canonical_json_mutate;
use libfuzzer_sys::{fuzz_mutator, fuzz_target};

fuzz_target!(|data: &[u8]| {
    if let Ok(token) = serde_json::from_slice::<CapabilityToken>(data) {
        let _ = token.verify_signature();
    }

    if let Ok(receipt) = serde_json::from_slice::<ChioReceipt>(data) {
        let _ = receipt.verify_signature();
    }
});

fuzz_mutator!(|data: &mut [u8], size: usize, max_size: usize, seed: u32| {
    canonical_json_mutate(data, size, max_size, seed)
});
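
To make "structure-aware" concrete, here is a minimal sketch of a mutation that takes the same arguments as the fuzz_mutator! closure (buffer, current size, maximum size, seed) and returns the new size. `mutate_one_digit` is a hypothetical stand-in written for this page. It is not chio's `canonical_json_mutate`, whose internals are not shown here.

```rust
/// Rewrites exactly one ASCII digit inside `data[..size]` and returns the
/// (unchanged) size. Hypothetical illustration of a structure-preserving
/// mutation; not chio's `canonical_json_mutate`.
pub fn mutate_one_digit(data: &mut [u8], size: usize, _max_size: usize, seed: u32) -> usize {
    let size = size.min(data.len());
    // Collect positions of ASCII digits; all other bytes (braces, quotes,
    // colons) are left alone so the JSON skeleton survives the mutation.
    let digits: Vec<usize> = (0..size).filter(|&i| data[i].is_ascii_digit()).collect();
    if digits.is_empty() {
        return size; // nothing to mutate; hand the input back unchanged
    }
    // Derive both choices from the seed so a given seed is reproducible.
    let pos = digits[seed as usize % digits.len()];
    // Replacement is always '1'..='9', so no new leading zero is introduced.
    data[pos] = b'1' + ((seed >> 8) % 9) as u8;
    size
}
```

Because only digit bytes change, an input such as `{"n":123}` stays parseable after mutation, so the fuzzer spends its budget past the deserializer rather than on syntax errors. A production mutator would typically also fall back to libFuzzer's default mutation for some share of calls.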

Targets that drive the full state machine (HushSpec policy parse-compile, tool-action classification) embed seed corpora with include_bytes! so the first iteration always exercises a known-good shape.


Corpus

Seed corpora live under fuzz/corpus/<target>/. cargo-fuzz names each corpus directory after its binary, so unprefixed targets use corpus/<target>/ and the main-branch fuzz_* binaries use corpus/fuzz_<target>/. A typical corpus directory holds:

  • A handful of hand-curated seed inputs covering the major structural cases (one valid case, one boundary case, one intentionally malformed case).
  • Promoted seeds: every crash triaged through scripts/promote_fuzz_seed.sh becomes a permanent corpus entry plus a regression test in the owning crate.

Adding a new seed:

bash
# Drop the input bytes
cp my-seed.bin fuzz/corpus/<target>/

# Run the target with the seed in the corpus
cd fuzz
cargo +nightly fuzz run <target>

# If the seed survives the run, commit it:
git add fuzz/corpus/<target>/my-seed.bin

Real corpus snapshot

Seed counts per target as committed in fuzz/corpus/. The unprefixed entries (e.g. canonical_json) are the cargo-fuzz convention for unprefixed targets; the fuzz_* entries are the main-branch fuzz_* binaries, and a few targets carry both for historical reasons (the dual entry is harmless: cargo-fuzz dedups by content hash).

| Target | Seeds | Target | Seeds |
|---|---|---|---|
| a2a_envelope_decode | 8 | acp_envelope_decode | 7 |
| anchor_bundle_verify | 6 | attest_verify | 1 |
| canonical_json | 3 | capability_receipt | 1 |
| chio_yaml_parse | 7 | did_resolve | 6 |
| fuzz_canonical_json | 2 | fuzz_capability_receipt | 13 |
| fuzz_manifest_roundtrip | 6 | fuzz_merkle_checkpoint | 1 |
| fuzz_policy_parse_compile | 6 | fuzz_sql_parser | 10 |
| fuzz_tool_action | 12 | jwt_vc_verify | 5 |
| manifest_roundtrip | 1 | mcp_envelope_decode | 7 |
| oid4vp_presentation | 4 | openapi_ingest | 9 |
| receipt_log_replay | 7 | wasm_preinstantiate_validate | 7 |
| wit_host_call_boundary | 7 | | |

Targets with a single seed (e.g. attest_verify, manifest_roundtrip, fuzz_merkle_checkpoint) rely on the libFuzzer mutator to grow coverage from a single valid seed. Targets with double-digit seed counts (e.g. fuzz_capability_receipt, fuzz_tool_action, fuzz_sql_parser) carry promoted regression seeds from prior crash triages.

Promoted regression seeds also land as a tests/ file in the owning crate (per owners.toml) so the seed stays exercised even when the fuzz lane is offline.


ClusterFuzzLite Integration

ClusterFuzzLite (CFLite) is a continuous fuzzing layer for repos that do not run on Google's OSS-Fuzz infrastructure. Two GitHub Actions workflows wire it in:

  • .github/workflows/cflite_pr.yml · PR-time changed-target sampling. Each PR computes which source-path globs changed, intersects them with target-map.toml triggers, and runs the matching subset for a short budget.
  • .github/workflows/cflite_batch.yml · nightly rotation that runs every target for a longer budget. Crashes feed into fuzz/artifacts/<target>/.

The build entrypoint is .clusterfuzzlite/build.sh; it mirrors infra/oss-fuzz/build.sh target-for-target. Storage is a sibling private repo (the chio fuzz lane does not use GCS); the storage-repo input is passed via FUZZ_CORPUS_PAT in the workflow files.

The PR check, verbatim

The required check on every PR is cflite_pr / changed-target-sampling. It computes the diff against the merge base, intersects the changed source paths with the trigger globs in fuzz/target-map.toml, and runs only the matching targets at 60 seconds each. The PR label fuzz: full promotes the run to a 120-second-per-target full sweep.

.github/workflows/cflite_pr.yml
name: cflite_pr

on:
  pull_request:
    types:
      - opened
      - synchronize
      - reopened
      - labeled
    paths:
      - "crates/**"
      - "Cargo.toml"
      - "Cargo.lock"
      - ".cargo/**"
      - "fuzz/**"
      - "spec/schemas/**"
      - "tests/bindings/vectors/**"
      - ".github/workflows/cflite_pr.yml"
      - ".clusterfuzzlite/**"
      - "scripts/check-fuzz-budget.sh"

env:
  CHIO_FUZZ_DURATION_SECONDS: "60"
  CHIO_FUZZ_DURATION_SECONDS_FULL: "120"

jobs:
  budget-check:
    name: fuzz-budget
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5
      - name: Verify 30-day fuzz budget
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          set -euo pipefail
          scripts/check-fuzz-budget.sh "${{ github.repository }}"

  changed-target-sampling:
    name: changed-target-sampling
    needs: budget-check
    runs-on: ubuntu-latest
    timeout-minutes: 30

The changed-target detection step reads target-map.toml and glob-matches each entry against the diff:

.github/workflows/cflite_pr.yml (target detection)
      - name: Compute changed targets
        id: targets
        if: steps.mode.outputs.full == 'false'
        run: |
          set -euo pipefail
          base_sha="${{ github.event.pull_request.base.sha }}"
          head_sha="${{ github.event.pull_request.head.sha }}"
          git diff --name-only "${base_sha}" "${head_sha}" > changed.txt
          mapfile -t targets < <(yq -p=toml -r '.targets | keys | .[]' \
            fuzz/target-map.toml)
          fired=()
          for tgt in "${targets[@]}"; do
            mapfile -t triggers < <(yq -p=toml -r \
              ".targets.${tgt}.triggers[]" fuzz/target-map.toml)
            for trig in "${triggers[@]}"; do
              regex="$(printf '%s' "${trig}" | sed -e 's|\.|\\.|g' \
                -e 's|\*\*|__DSTAR__|g' -e 's|\*|[^/]*|g' \
                -e 's|__DSTAR__|.*|g')"
              if grep -qE "^${regex}$" changed.txt; then
                fired+=("${tgt}")
                break
              fi
            done
          done
          # Inventory fallback: builder, workflow, and target-map edits
          # can break changed-target sampling without matching a source
          # trigger. Run the full inventory for those control-plane edits.
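
The glob-to-regex translation in that step can be restated as a small matcher. The sketch below is an illustration only: it is not code from the chio repository, and the trigger globs in the usage note are hypothetical. It mirrors the sed rules (`**` becomes `.*`, `*` becomes `[^/]*`, `.` is literal) with one difference: other regex metacharacters in a trigger are treated as literals here, whereas the shell version passes them through to grep.

```rust
/// Returns true when `path` matches `pattern`, where `**` matches any run of
/// characters and `*` matches any run that does not cross a `/`.
pub fn glob_match(pattern: &str, path: &str) -> bool {
    fn go(p: &[u8], s: &[u8]) -> bool {
        match p {
            [] => s.is_empty(),
            // `**` behaves like the regex `.*`: try every split point.
            [b'*', b'*', rest @ ..] => (0..=s.len()).any(|i| go(rest, &s[i..])),
            // `*` behaves like `[^/]*`: extend the match until a `/`.
            [b'*', rest @ ..] => {
                let mut i = 0;
                loop {
                    if go(rest, &s[i..]) {
                        return true;
                    }
                    if i == s.len() || s[i] == b'/' {
                        return false;
                    }
                    i += 1;
                }
            }
            // Every other byte, including `.`, matches only itself.
            [c, rest @ ..] => s.first() == Some(c) && go(rest, &s[1..]),
        }
    }
    go(pattern.as_bytes(), path.as_bytes())
}

/// Lists each target whose trigger globs match at least one changed path.
pub fn fired_targets(map: &[(&str, &[&str])], changed: &[&str]) -> Vec<String> {
    map.iter()
        .filter(|(_, triggers)| {
            triggers.iter().any(|t| changed.iter().any(|p| glob_match(t, p)))
        })
        .map(|(name, _)| name.to_string())
        .collect()
}
```

With a hypothetical map entry `("did_resolve", &["crates/chio-did/**"])`, a diff that touches `crates/chio-did/src/lib.rs` fires `did_resolve`, while a diff that touches only `README.md` fires nothing.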

The actual fuzz invocation is the upstream google/clusterfuzzlite action, pinned by commit SHA, fed the comma-separated target list via the CHIO_CFLITE_TARGETS environment variable that .clusterfuzzlite/build.sh reads:

.github/workflows/cflite_pr.yml (run)
      - name: Build ClusterFuzzLite changed-target fuzzers
        if: steps.mode.outputs.full == 'false' && steps.targets.outputs.count != '0'
        uses: google/clusterfuzzlite/actions/build_fuzzers@82652fb49e77bc29c35da1167bb286e93c6bcc05
        env:
          CHIO_CFLITE_TARGETS: ${{ steps.targets.outputs.targets }}
        with:
          language: rust
          sanitizer: address
          github-token: ${{ secrets.GITHUB_TOKEN }}
          keep-unaffected-fuzz-targets: true

      - name: Run ClusterFuzzLite changed-target sampling without corpus storage
        if: steps.mode.outputs.full == 'false' && steps.targets.outputs.count != '0'
        uses: google/clusterfuzzlite/actions/run_fuzzers@82652fb49e77bc29c35da1167bb286e93c6bcc05
        env:
          CHIO_CFLITE_TARGETS: ${{ steps.targets.outputs.targets }}
        with:
          language: rust
          fuzz-seconds: ${{ steps.mode.outputs.fuzz_seconds }}
          mode: code-change
          sanitizer: address
          github-token: ${{ secrets.GITHUB_TOKEN }}
          report-unreproducible-crashes: false

Crashes are uploaded from ./out/artifacts as the cflite-pr-crashes-<PR> artifact, with a 14-day retention window. The companion cflite_batch.yml runs the full inventory nightly with a longer per-target budget; the project config that both workflows share lives at .clusterfuzzlite/project.yaml.

.clusterfuzzlite/project.yaml
language: rust
primary_contact: "security@chio.world"
auto_ccs:
  - "security@chio.world"
sanitizers:
  - address
  - undefined
architectures:
  - x86_64
fuzzing_engines:
  - libfuzzer
report_to_oss_fuzz: false

Running Locally

Setup once:

bash
rustup toolchain install nightly
cargo install cargo-fuzz --locked

Run a single target. The cargo-fuzz wrapper builds the binary with libFuzzer instrumentation, links the seed corpus from fuzz/corpus/<target>/, and forwards the rest of the args to libFuzzer:

bash
cd fuzz
cargo +nightly fuzz run attest_verify

Sample stdout from a clean run (libFuzzer's standard coverage-guided status lines; the figures are illustrative and do not track the committed seed counts):

text
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 3829461072
INFO: 12 files found in fuzz/corpus/attest_verify
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: seed corpus: files: 12 min: 184b max: 2718b total: 9420b rss: 41Mb
#13	INITED cov: 1842 ft: 4127 corp: 12/9420b exec/s: 0 rss: 89Mb
#256	pulse  cov: 1842 ft: 4127 corp: 12/9420b lim: 184 exec/s: 128 rss: 92Mb
#1024	pulse  cov: 1844 ft: 4131 corp: 13/9711b lim: 612 exec/s: 102 rss: 105Mb
...

Each entry pairs the cumulative iteration count, the coverage edge count (cov), the feature count (ft), and the corpus size. A run that finds no crash inside its time budget exits 0 and prints Done {N} runs in {S} second(s).
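
If you want to track these numbers across runs, the status lines are simple enough to parse by hand. The following is a minimal sketch, assuming the whitespace-separated `key: value` layout shown in the sample above; `Status`, `field`, and `parse_status_line` are hypothetical helpers, not part of cargo-fuzz or chio.

```rust
/// One libFuzzer status line, e.g. `#1024 pulse cov: 1844 ft: 4131 corp: 13/9711b ...`.
#[derive(Debug, PartialEq, Eq)]
pub struct Status {
    pub iteration: u64,
    pub event: String,
    pub cov: u64,
    pub ft: u64,
    pub corpus_units: u64,
}

/// Returns the token that follows `key`, e.g. the `1844` after `cov:`.
fn field<'a>(tokens: &[&'a str], key: &str) -> Option<&'a str> {
    let i = tokens.iter().position(|t| *t == key)?;
    tokens.get(i + 1).copied()
}

/// Parses a status line; returns None for INFO lines and anything else
/// that does not start with `#<iteration>`.
pub fn parse_status_line(line: &str) -> Option<Status> {
    let tokens: Vec<&str> = line.split_whitespace().collect();
    let iteration = tokens.first()?.strip_prefix('#')?.parse().ok()?;
    let event = tokens.get(1)?.to_string();
    Some(Status {
        iteration,
        event,
        cov: field(&tokens, "cov:")?.parse().ok()?,
        ft: field(&tokens, "ft:")?.parse().ok()?,
        // `corp:` is `<units>/<bytes>b`; keep only the unit count.
        corpus_units: field(&tokens, "corp:")?.split('/').next()?.parse().ok()?,
    })
}
```

A `cov` value that stops rising over a long run is a common hint that the target needs new seeds or a richer harness body.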

Build only (the gate that mirrors CI's build_fuzzers action):

bash
cargo +nightly fuzz build attest_verify

Run with a maximum total time budget (useful for local soak runs):

bash
cargo +nightly fuzz run attest_verify -- -max_total_time=300

CI pins a dated nightly so fuzz crashes reproduce across machines; consult .github/workflows/cflite_pr.yml and .github/workflows/nightly.yml for the toolchain currently in force.


Crashes, Minimization, Regression

When libFuzzer finds a crashing input it dumps the bytes to fuzz/artifacts/<target>/crash-... and prints a stack trace. Triage path:

  • Reproduce: cargo +nightly fuzz run <target> fuzz/artifacts/<target>/crash-...
  • Minimize: cargo +nightly fuzz tmin <target> fuzz/artifacts/<target>/crash-... shrinks the input to the smallest reproducer.
  • Classify: parser bug, validator bug, or genuine logic flaw in the surface under test.
  • Fix the bug.
  • Promote the minimized seed via scripts/promote_fuzz_seed.sh: the seed is added to the corpus and a regression test is generated in the owning crate (path resolved through fuzz/owners.toml).

The regression test pins the bug forever; even if the fuzz lane is offline, the seed still runs as a normal cargo test.

Crash report walkthrough

A libFuzzer crash report has three parts: the deduplicated finding banner, the sanitizer stack, and the input-bytes snapshot. A typical OOB-read finding from canonical_json looks like:

libFuzzer crash (canonical_json)
==12345==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000c30
READ of size 1 at 0x602000000c30 thread T0
    #0 0x55c... in chio_core_types::canonical::canonicalize crates/chio-core-types/src/canonical.rs:184
    #1 0x55c... in chio_fuzz::canonical_json::run fuzz/src/lib.rs:42
    #2 0x55c... in rust_fuzzer_test_input fuzz/fuzz_targets/canonical_json.rs:8

SUMMARY: AddressSanitizer: heap-buffer-overflow crates/chio-core-types/src/canonical.rs:184 in canonicalize
==12345==ABORTING
MS: 5 EraseBytes-ChangeBit-CopyPart-CMP-CrossOver-; base unit: 7c6f...
0x7b,0x22,0x6b,0x22,0x3a,0x22,0x5c,0x75,0x64,0x38,0x30,0x30,0x22,0x7d,0x00,
{"k":"\ud800"}\x00
artifact_prefix='./'; Test unit written to ./crash-9d2c1f...
Base64: eyJrIjoiXHVkODAwIn0A

The triage path:

reproduce + minimize
# Run from the repository root so the fuzz/artifacts/... and scripts/...
# paths below resolve as written.

# 1. Reproduce. cargo-fuzz consumes the artifact path positionally;
#    libFuzzer runs it as a single-input replay rather than a fuzz loop.
cargo +nightly fuzz run canonical_json \
  fuzz/artifacts/canonical_json/crash-9d2c1f...

# 2. Minimize. tmin shrinks the input toward a smaller reproducer that
#    still triggers the same crash signature.
cargo +nightly fuzz tmin canonical_json \
  fuzz/artifacts/canonical_json/crash-9d2c1f...

# 3. Promote. The minimized seed lands as a permanent corpus entry
#    plus a regression test in the owning crate (chio-core-types here,
#    resolved through fuzz/owners.toml).
scripts/promote_fuzz_seed.sh canonical_json \
  fuzz/artifacts/canonical_json/min-9d2c1f...

After promotion, the seed lives at fuzz/corpus/canonical_json/min-9d2c1f... and a regression test is generated in the owning crate (resolved through fuzz/owners.toml via the [targets.canonical_json] block, which names crate = "chio-core-types" and the path the regression file writes to). The MS: prefix in the crash banner records the libFuzzer mutation chain that produced the input; the trailing UTF-8 decode of the bytes is the human-readable form of the input.
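
When only a pasted report is available and the artifact file is not, the comma-separated hex dump is enough to rebuild the input. This is a minimal sketch, assuming the `0x..,` layout shown in the report above; `parse_hex_dump` is a hypothetical helper, not part of libFuzzer or chio.

```rust
/// Parses libFuzzer's `0x7b,0x22,...` byte dump back into raw bytes.
/// Returns None if any entry is not a `0x`-prefixed hex byte.
pub fn parse_hex_dump(line: &str) -> Option<Vec<u8>> {
    line.split(',')
        .map(str::trim)
        .filter(|tok| !tok.is_empty()) // the dump ends with a trailing comma
        .map(|tok| u8::from_str_radix(tok.strip_prefix("0x")?, 16).ok())
        .collect()
}
```

Feeding it the dump line from the report yields the 15 bytes of `{"k":"\ud800"}` followed by a NUL, which can be written to a file and passed to the reproduce step. `String::from_utf8_lossy` on the result gives the human-readable form.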


Coverage Tracking

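cargo-fuzz can collect source-based coverage for a target by replaying its corpus; rendering the result as HTML is a separate llvm-cov step (shown here through cargo-binutils, which needs the llvm-tools component on the nightly toolchain):

bash
cd fuzz
# Replay the corpus; writes coverage/<target>/coverage.profdata
cargo +nightly fuzz coverage <target>
# Render HTML with llvm-cov; <instrumented-binary> is the coverage build of the target
cargo +nightly cov -- show <instrumented-binary> --format=html \
  --instr-profile=coverage/<target>/coverage.profdata > coverage.html

The coverage build lands under target/<target-triple>/coverage/, and the report shows per-line and per-region execution counts. Regions that the corpus does not reach are candidates for new seeds or for tightening the fuzz_target body to drive more state.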

Smoke test

fuzz/tests/smoke.rs is a regular cargo test that runs every fuzz target on its corpus seeds without nightly. Treat it as the baseline: if smoke passes, the fuzz targets at least build and run on the trusted corpus.
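
The core of such a smoke test is a loop that replays each seed file through the target's entrypoint. The sketch below shows the general shape and is not the contents of fuzz/tests/smoke.rs; `replay_corpus` is a hypothetical helper name.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Feeds every regular file in `corpus_dir` to `entry` once and returns how
/// many seeds were replayed. A panic inside `entry` fails the calling test.
pub fn replay_corpus(corpus_dir: &Path, entry: fn(&[u8])) -> io::Result<usize> {
    let mut replayed = 0;
    for item in fs::read_dir(corpus_dir)? {
        let path = item?.path();
        // Skip subdirectories; corpus entries are flat files.
        if path.is_file() {
            entry(&fs::read(&path)?);
            replayed += 1;
        }
    }
    Ok(replayed)
}
```

A test built on this would call it once per target with that target's corpus directory and fuzz_ entrypoint, and could also assert that the returned count is non-zero so an accidentally empty corpus is caught. Because it uses only std and ordinary function calls, it needs no libFuzzer runtime and therefore no nightly toolchain.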

Scope and Limits

  • Trust-boundary only. Every target wraps the entry point where externally controlled bytes first hit chio code. Internal helpers are not fuzzed directly; if they fail, the failure surfaces through one of the wrappers above.
  • Fail-closed contract. Targets accept Err(_) as an expected outcome; the only thing they reject is a panic, an assert failure, an unwrap, or a sanitizer report.
  • Sanitizers in CI. The ClusterFuzzLite project config enables the address and undefined sanitizers; the PR sampling steps shown above build and run with address. Memory safety bugs and undefined behavior are surfaced as crashes.
  • Coverage, not exhaustion. Fuzzing finds bugs reachable by mutating from the corpus within the run budget. It does not enumerate all inputs. The Kani lane is the exhaustive (within bounds) complement.
