Formal Assurance Overview

Source records for formal claims

This page cites these source records: formal/proof-manifest.toml, formal/theorem-inventory.json, formal/assumptions.toml, and formal/MAPPING.md. If the docs and the manifests disagree, use the manifests.

Reading Paths by Role

These four paths group the pages by reader role. Each path identifies the pages to open and the information they provide.

I operate the kernel in production (SRE / platform)

Quickstart · run the manifest's gate commands locally.
Failure Modes · what each gate failure looks like in CI output.
Assumptions and TCB · what is trusted vs verified, and what to do if an assumption breaks in production.
Constant-Time Tests · timing-leak detection on the verdict hot path.

I'm auditing or certifying Chio (security auditor / compliance)

Theorem Inventory · the per-theorem catalog: ID, statement, tool, file, status.
P1 Tour · a worked example tracing capability attenuation through its Lean, Rust, and test coverage.
Assumptions and TCB · the explicit trust boundary and the ten audited assumptions the proofs are conditioned on.
Differential Tests · production-vs-spec equivalence on scope subsumption, canonical JSON, anchored roots, receipts.

I'm an academic or formal-methods researcher (Lean / TLA+)

Lean 4 Proofs · theorem statements and the project layout.
Aeneas Pipeline · the Rust-to-Lean extraction pipeline (a Lean-targeted adaptation of Aeneas, paired with explicit equivalence theorems).
TLA+ Specs · the temporal-logic spec for cross-authority revocation propagation.
Theorem Inventory · the catalog you cite when comparing to your own work.

I'm contributing a proof or harness (Chio contributor)

P1 Tour · how a property is represented in Lean, Aeneas, Kani, and differential tests.
Lean 4 Proofs · Lean conventions, naming, and the proof-file layout.
Kani Harnesses · how to write a new bounded-model-checking harness.
Fuzz Infrastructure · adding a new libFuzzer target, owners table, and corpus.
Failure Modes · debugging a gate that fails on your PR.

The Assurance Pyramid

The methods below differ in what they target, what model they check, and what claim they support. Formal proofs cover bounded models; tests and review cover broader code paths but provide weaker evidence.

Higher levels cover less code with stronger guarantees. Lower levels cover more code through tests and review.

Layer	Tool	Coverage	Strength
Mechanized proof	Lean 4	Bounded models of the capability algebra, revocation, evaluation, receipts, protocol	Root-imported and checked without `sorry`.
Refinement extraction	Aeneas (via Charon)	Pure numeric and boolean helpers in `formal_aeneas.rs`	Lean equivalence theorems link extracted models to handwritten models.
Bounded model checking	Kani (CBMC backend)	Public Rust entrypoints: `verify_capability`, `evaluate`, `sign_receipt`, `NormalizedScope::is_subset_of`, `resolve_matching_grants`	Symbolic execution exhausts inputs up to configured bounds.
Refinement contracts	Creusot	Same five public symbols plus the `formal_core` helpers	SMT-discharged contracts on production Rust symbols.
Temporal model checking	TLA+ (Apalache)	Cross-authority revocation propagation and delegation-depth bounds	Four named safety invariants and one liveness property within configured bounds, plus Apalache kernel-state invariants.
Differential property tests	proptest in `chio-formal-diff-tests`	Reference spec vs production for scope subsumption, anchored roots, receipt encoding, canonical JSON	Detects drift between two implementations of the same behavior.
Coverage-guided fuzz	libFuzzer plus ClusterFuzzLite	All trust-boundary parsers, decoders, and verifiers	Medium. Random and mutation-driven inputs over many CPU-hours.
Timing analysis	dudect	Signature byte-equality, scope subset checks	Limited. Statistical detection of data-dependent timing, not a proof.
Integration tests	`cargo test`	End-to-end protocol behavior	Exercises protocol behavior without exhaustive input coverage.
Manual review	Humans	Everything outside the proof boundary	Review of behavior outside the formal boundary; reviewers can miss defects.

Mechanized proofs cover smaller models than tests and review. The Coverage and Strength columns state that distinction row by row.
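The timing-analysis row describes statistical detection rather than proof. As a rough illustration of the idea, the sketch below computes Welch's t statistic over two classes of timing samples, which is the kind of test dudect-style tools rely on. The function names, the sample values, and the 4.5 cutoff are assumptions made for this example and are not taken from Chio's harnesses.

```python
import math
import statistics


def welch_t(samples_a, samples_b):
    """Welch's t statistic for two independent samples with unequal variances.

    Each sample needs at least two values, and at least one of the two
    must have nonzero variance, otherwise the denominator is zero.
    """
    mean_a = statistics.fmean(samples_a)
    mean_b = statistics.fmean(samples_b)
    var_a = statistics.variance(samples_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(samples_b)
    return (mean_a - mean_b) / math.sqrt(
        var_a / len(samples_a) + var_b / len(samples_b)
    )


# Assumed cutoff for this sketch; a real harness chooses its own threshold.
THRESHOLD = 4.5


def looks_leaky(samples_a, samples_b, threshold=THRESHOLD):
    """Flag a timing difference whose |t| exceeds the threshold."""
    return abs(welch_t(samples_a, samples_b)) > threshold


# Hypothetical timing samples in nanoseconds for three input classes.
fixed_class = [100, 101, 99, 100, 102, 98, 100, 101]
random_class = [100, 99, 101, 100, 98, 102, 101, 100]
leaky_class = [120, 121, 119, 122, 118, 120, 121, 119]
```

A large |t| is evidence that the two input classes take different time. A small |t| only means no difference was detected in these samples, which is why the table describes this lane as detection, not proof.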

The Tools Chio Uses

Lean 4 · Proof assistant for the capability algebra, revocation snapshots, evaluation totality, receipt coupling, and protocol-level closure theorems. Source under formal/lean4/Chio/. See Lean 4 Proofs.
Aeneas · Extracts a functional model from Rust source for proving in Lean. Aeneas was first presented with an F* backend; Chio uses Lean as the extraction target and pairs the extracted models with explicit equivalence theorems. Two lanes: a pilot at formal/aeneas/verified_core.rs and the production extraction from crates/kernel/chio-kernel-core/src/formal_aeneas.rs. See Aeneas Pipeline.
Kani · Bounded model checker for Rust via CBMC. Symbolically executes the public proof harnesses in the nightly kani-public-nightly lane. Public harnesses are at crates/kernel/chio-kernel-core/src/kani_public_harnesses.rs. See Kani Harnesses.
Creusot · Deductive verifier for Rust that discharges contracts through Why3 and SMT solvers and pins them to production Rust symbols. Configured in formal/rust-verification/creusot-contracts.toml.
TLA+ · Temporal-logic spec language. Apalache checks the safety invariants of formal/tla/RevocationPropagation.tla on the path-scoped PR lane and its liveness property on a nightly lane. Apalache also checks a delegation-depth spec (formal/tla/DelegationDepthBound.tla) and runs a kernel-state-subset lane under formal/apalache/. See TLA+ Specs.
proptest · Property-based testing library that generates inputs for the differential tests in formal/diff-tests/. See Differential Tests.
dudect · Statistical detector for data-dependent timing in tight inner loops. Lives in crates/kernel/chio-kernel-core/tests/dudect/. See Constant-Time Tests.
libFuzzer + ClusterFuzzLite · Coverage-guided fuzzing across the trust-boundary parsers listed in the target directory. Targets are in fuzz/fuzz_targets/ with corpora at fuzz/corpus/. See Fuzz Infrastructure.
cargo-mutants · Mutation testing across six trust-boundary crates (chio-policy, chio-credentials, chio-attest-verify, chio-kernel-core, chio-guards, chio-anchor). Runs nightly as an advisory lane, with a co-coverage job that replays the fuzz corpus against surviving mutants. Configured in .github/workflows/mutants.yml.
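To make the differential-testing idea concrete, the sketch below compares two implementations of a scope-subset check on randomly generated inputs and reports the first disagreement. It uses Python's `random` module in place of proptest, and every name in it (`spec_is_subset`, `prod_is_subset`, `find_divergence`, the scope strings) is hypothetical rather than taken from `chio-formal-diff-tests`.

```python
import random


def spec_is_subset(child, parent):
    """Reference spec: every child scope appears in the parent."""
    return all(scope in parent for scope in child)


def prod_is_subset(child, parent):
    """Stand-in for an optimized implementation of the same check."""
    return set(child) <= set(parent)


def find_divergence(impl_a, impl_b, trials=1000, seed=0):
    """Return the first (child, parent) input on which the two
    implementations disagree, or None if they agree on every trial."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    universe = ["read", "write", "exec", "admin"]
    for _ in range(trials):
        child = rng.sample(universe, rng.randint(0, len(universe)))
        parent = rng.sample(universe, rng.randint(0, len(universe)))
        if impl_a(child, parent) != impl_b(child, parent):
            return child, parent
    return None
```

Agreement on every trial shows only that no drift was found on the sampled inputs. A returned pair is a concrete counterexample that can be turned into a regression test.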

What Is Formally Verified Today

The proof manifest declares ten required property identifiers, P1 through P10. Each is mapped to a list of Lean theorems plus, in most cases, a Kani harness, an Aeneas equivalence proof, or a Creusot contract.

ID	Property	Lanes
P1	Capability attenuation: a child capability is a subset of its parent	Lean root, differential test, Rust projection, Aeneas equivalence, public Kani
P2	Presented revocation coverage: revoked tokens and revoked ancestors deny	Lean root, audited storage assumption, SQLite projection, Aeneas equivalence
P3	Fail-closed evaluation: deny dominates, allow requires every check to pass	Lean root, Rust core, audited subprocess assumption, public Kani, adapter no-bypass
P4	Receipt integrity: sign-then-verify, immutability, field coupling	Lean root, symbolic crypto, audited crypto assumption, receipt totality, Aeneas equivalence
P5	Presented delegation-chain semantic validity	Lean root, SQLite projection
P6	Local parent-link soundness within an authenticated session	Lean root, audited storage assumption
P7	Receipt-lineage soundness across two signed receipts	Lean root, symbolic crypto, audited crypto assumption
P8	Session continuity: anchor and continuation evidence required	Lean root, audited transport assumption, DPoP binding tests
P9	Delegation and provenance consistency in call-chain references	Lean root, audited registry assumption
P10	Report truthfulness: asserted and observed lineage cannot be relabeled as verified	Lean root, claim gate

For the full per-theorem table, see the Theorem Inventory.
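As an informal illustration of what P1 and P3 assert, here is a toy Python sketch. It is not Chio's Rust code or its Lean model: the function names, the representation of scopes as a set of strings, and the choice to deny on an empty check list are all assumptions made for the example.

```python
def attenuates(child_scopes, parent_scopes):
    """P1-style check: a child capability may only narrow its parent."""
    return set(child_scopes) <= set(parent_scopes)


def toy_evaluate(checks):
    """P3-style fail-closed verdict.

    `checks` is a list of zero-argument callables. The verdict is
    "allow" only if every check returns exactly True. A False result,
    any other value, a raised exception, or an empty list yields "deny".
    """
    if not checks:
        return "deny"  # assumption of this sketch: nothing checked means deny
    for check in checks:
        try:
            if check() is not True:
                return "deny"
        except Exception:
            return "deny"  # errors never fall through to allow
    return "allow"
```

For example, `toy_evaluate([lambda: attenuates(["read"], ["read", "write"])])` allows, while adding any failing or crashing check to the list flips the verdict to deny.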

The Proof Boundary

The manifest names the boundary explicitly: implementation_linked_protocol_core. The verification target is security_critical_protocol_semantics. The following sections define its model, assumptions, boundary, and gaps.

The boundary covers the pure security-decision code inside chio-kernel-core plus bounded models of adjacent state transitions (revocation snapshots, budget commits, DPoP nonce admission, guard pipeline composition, receipt coupling). The covered Rust modules listed in the manifest are:

formal/proof-manifest.toml

covered_rust_modules = [
  "crates/kernel/chio-kernel-core/src/capability_verify.rs",
  "crates/kernel/chio-kernel-core/src/scope.rs",
  "crates/kernel/chio-kernel-core/src/evaluate.rs",
  "crates/kernel/chio-kernel-core/src/formal_aeneas.rs",
  "crates/kernel/chio-kernel-core/src/formal_core.rs",
  "crates/kernel/chio-kernel-core/src/normalized.rs",
  "crates/kernel/chio-kernel-core/src/receipts.rs",
]

Code outside that list is excluded from the proof boundary. The manifest names the exclusions:

Concrete Ed25519, SHA-256, canonical JSON, TLS, OS clock, SQLite, chain, and hosted-registry implementations beyond their audited assumptions.
Async scheduling, network delivery, subprocess effects, and tool-server behavior after the verified decision core allows a call.
Cluster consensus, external settlement rails, and third-party registry availability beyond fail-closed handling.
Aeneas extraction from async, IO, SQLite, crypto, and string-heavy production modules outside formal_aeneas.rs.
Symlink resolution and OS filesystem root enforcement beyond Chio's normalized path-prefix fail-closed checks.

The boundary moves with PRs

When a property graduates from an audited assumption to being discharged by a named theorem, the manifest's discharged_assumptions list grows and required_assumption_ids shrinks. The retired RETIRED-SQLITE-CROSS-ROW entry is a worked example: cross-row SQLite atomicity is no longer assumed. Instead, the conjunction of the TLA+ MonotoneLog invariant and the per-row budget invariant in crates/kernel/chio-kernel/src/budget_store.rs does the work the assumption used to do.
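The bookkeeping described above can be pictured as moving one ID between two lists. The sketch below is only a model of that idea: the field names discharged_assumptions and required_assumption_ids come from the manifest, but the entry shape (`id`, `discharged_by`), the function name, and the IDs used in the example are hypothetical.

```python
def graduate(manifest, assumption_id, theorem_ids):
    """Return a copy of `manifest` with one assumption moved from the
    required list to the discharged list. The input is not mutated."""
    required = manifest["required_assumption_ids"]
    if assumption_id not in required:
        raise KeyError(f"{assumption_id} is not a required assumption")
    if not theorem_ids:
        raise ValueError("a discharged assumption must name its evidence")
    return {
        **manifest,
        "required_assumption_ids": [a for a in required if a != assumption_id],
        "discharged_assumptions": manifest["discharged_assumptions"]
        + [{"id": assumption_id, "discharged_by": list(theorem_ids)}],
    }


# Hypothetical manifest fragment; real IDs live in formal/assumptions.toml.
example = {
    "required_assumption_ids": ["ASSUME-A", "ASSUME-B"],
    "discharged_assumptions": [],
}
```

Refusing to discharge an assumption without naming evidence mirrors the rule that silent assumptions are not allowed.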

Scope and Gaps

The receipt theorems prove properties over a symbolic cryptography model, under the Ed25519 and SHA-256 assumptions recorded in formal/assumptions.toml, rather than over concrete cryptographic implementations. Most catalog entries prove properties of bounded models, not of the running binary. Kani and Lean run nightly, not on each pull request. The drop and cancel-unwind path, which runs when a request is torn down mid-flight, has limited formal coverage. See Formal Assurance for the full treatment of scope and limits.

How to Read a Proof Manifest

The manifest at formal/proof-manifest.toml is the proof inventory. Each release-facing formal claim must cite it, the theorem inventory, and the audited assumptions. Here is the shape of the relevant fields.

formal/proof-manifest.toml

schema = "chio.proof-manifest.v1"
manifest_version = 1
proof_boundary_status = "implementation_linked_protocol_core"
verification_target = "security_critical_protocol_semantics"
assumption_registry = "formal/assumptions.toml"
primary_toolchain = ["lean4", "creusot", "kani", "aeneas"]

# Root Lean modules. Theorems must be reachable from these to count.
root_modules = [
  "formal/lean4/Chio/Chio.lean",
  "formal/lean4/Chio/Chio/Spec/Properties.lean",
  "formal/lean4/Chio/Chio/Proofs/Monotonicity.lean",
  # ... 21 entries total, spanning the Core, Proofs, Capability, and
  # Treaty lanes. See the Lean 4 Proofs page for the full list.
]

# Gate commands. Run all of these to reproduce the assurance posture.
gate_commands = [
  "./scripts/check-formal-proofs.sh",
  "./scripts/check-aeneas-pilot.sh",
  "./scripts/check-aeneas-production.sh",
  "./scripts/check-aeneas-equivalence.sh",
  "./scripts/check-rust-verification-gates.sh",
  "./scripts/check-kani-public-core.sh",
  "./scripts/check-adapter-no-bypass.sh",
  "cargo test -p chio-formal-diff-tests",
  "./scripts/check-portable-kernel.sh",
  "./scripts/check-proof-report.sh",
]

# Property matrix. Each entry has the form:
#   ID|description|lanes|theorem-ids
# A property is "covered" only when every named lane runs and every
# theorem ID exists in theorem-inventory.json.
property_matrix = [
  "P1|capability attenuation|lean_root_imported,differential_test,...|spec.capability_monotonicity,...",
  # ... nine more rows
]
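The pipe-delimited row format described in the manifest comment can be split into fields with a few lines of code. The sketch below is one way to do it in Python. The `PropertyRow` type and `parse_row` function are hypothetical names, the comma separator is inferred from the sample row, and the example uses only the lane and theorem names visible in that row, with the elided entries left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyRow:
    prop_id: str
    description: str
    lanes: tuple
    theorem_ids: tuple


def _split_list(field):
    """Split a comma-separated field, dropping empty items."""
    return tuple(item.strip() for item in field.split(",") if item.strip())


def parse_row(row):
    """Split one 'ID|description|lanes|theorem-ids' entry into fields."""
    parts = row.split("|")
    if len(parts) != 4:
        raise ValueError(
            f"expected 4 '|'-separated fields, got {len(parts)}: {row!r}"
        )
    prop_id, description, lanes, theorem_ids = (p.strip() for p in parts)
    return PropertyRow(
        prop_id, description, _split_list(lanes), _split_list(theorem_ids)
    )


# Shortened form of the P1 row shown in the listing.
example_row = parse_row(
    "P1|capability attenuation|lean_root_imported,differential_test"
    "|spec.capability_monotonicity"
)
```

Rejecting rows that do not have exactly four fields keeps a malformed entry from being read as a property with no lanes or no theorems.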

Three rules govern the manifest:

A theorem only counts as evidence if it is root-imported from the modules in root_modules and contains no sorry.
A property only counts as covered if every lane in its property matrix entry runs green in CI.
Excluded surfaces and audited assumptions must be cited explicitly. Silent assumptions are forbidden.
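The first two rules can be read as a pair of predicates. The sketch below models them in Python under stated assumptions: the inventory entry fields (`root_imported`, `contains_sorry`), the lane status value `"green"`, and both function names are hypothetical, since the real schema of theorem-inventory.json and the CI status format are not shown here.

```python
def theorem_counts(entry):
    """Rule 1: a theorem is evidence only if it is root-imported and
    free of `sorry`. Missing fields are treated as failing."""
    return bool(entry.get("root_imported")) and not entry.get(
        "contains_sorry", True
    )


def property_covered(lanes, theorem_ids, lane_status, inventory):
    """Rule 2: every lane is green and every theorem ID exists in the
    inventory and counts as evidence. Empty lists never count as covered."""
    if not lanes or not theorem_ids:
        return False
    lanes_green = all(lane_status.get(lane) == "green" for lane in lanes)
    theorems_ok = all(
        tid in inventory and theorem_counts(inventory[tid])
        for tid in theorem_ids
    )
    return lanes_green and theorems_ok


# Hypothetical inputs using names from the sample P1 row.
inventory = {
    "spec.capability_monotonicity": {
        "root_imported": True,
        "contains_sorry": False,
    }
}
lane_status = {"lean_root_imported": "green", "differential_test": "green"}
```

Both predicates fail closed: an unknown lane, a missing theorem, or a missing field makes the property uncovered rather than covered by default.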

Reproducing the Proofs Locally

Every gate command in the manifest is a shell script or a Cargo invocation that any contributor can run. Together, the commands reproduce the checks listed in the manifest:

reproduce assurance posture

# Lean root proofs (sorry-free, root-imported)
./scripts/check-formal-proofs.sh

# Aeneas pilot extraction (formal/aeneas/verified_core.rs)
./scripts/check-aeneas-pilot.sh

# Aeneas production extraction (chio-kernel-core/src/formal_aeneas.rs)
./scripts/check-aeneas-production.sh

# Aeneas-Lean equivalence theorems
./scripts/check-aeneas-equivalence.sh

# Creusot and Kani required-lane verification
./scripts/check-rust-verification-gates.sh

# Public Kani harnesses on chio-kernel-core
./scripts/check-kani-public-core.sh

# Adapter no-bypass: every protocol adapter must call evaluate()
./scripts/check-adapter-no-bypass.sh

# Differential tests: reference spec vs production
cargo test -p chio-formal-diff-tests

# Portable-kernel parity (the same verdict on every target)
./scripts/check-portable-kernel.sh

# Proof report: the final aggregator that all gates feed into
./scripts/check-proof-report.sh
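Running the gates by hand is enough, but a small wrapper can make the order and the stop-on-first-failure behavior explicit. The sketch below is one possible wrapper, not a script that ships with Chio. `run_gates` and `dry_run` are hypothetical names, and stopping at the first failing gate is a choice made for this example.

```python
import shlex
import subprocess


def run_gates(commands, run=subprocess.run):
    """Run each gate command in order.

    Returns (True, None) if every command exits with status 0, or
    (False, command) for the first command that fails. Later gates
    are not run after a failure.
    """
    for command in commands:
        result = run(shlex.split(command))
        if result.returncode != 0:
            return False, command
    return True, None


def dry_run(argv):
    """Stand-in runner that pretends every command succeeds."""
    return subprocess.CompletedProcess(argv, 0)


# The first few gate commands from the manifest, in manifest order.
GATES = [
    "./scripts/check-formal-proofs.sh",
    "./scripts/check-aeneas-pilot.sh",
    "cargo test -p chio-formal-diff-tests",
]
```

Calling `run_gates(GATES)` from the repository root executes the real commands. Passing `run=dry_run` exercises the wrapper without running anything.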

The TLA+ lane is separate because Apalache is not a Cargo dependency. The PR job runs the safety invariants:

apalache PR safety lane

# Install Apalache via the pinned tool installer
./tools/install-apalache.sh

# Safety invariants (NoAllowAfterRevoke, MonotoneLog,
#   AttenuationPreserving, RevocationFreshness)
apalache check \
  --inv=SafetyInv \
  --config=formal/tla/MCRevocationPropagation.cfg \
  formal/tla/RevocationPropagation.tla

apalache stdout (excerpt)

PASS #11: BoundedChecker
Checker reports no error up to computation length 16
It took me 0 days  3 hours 12 min 47 sec
Total time: 11567.881 sec
EXITCODE: OK
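A CI step that consumes this output might want the verdict and the checked depth as values rather than as text. The sketch below pulls both out of the excerpt. It matches only the line shapes shown in the excerpt, so treat the patterns as assumptions that may need adjusting for other Apalache versions; `summarize_apalache` is a hypothetical name.

```python
import re


def summarize_apalache(stdout):
    """Return (ok, length) from Apalache stdout.

    ok is True only if a line reads exactly 'EXITCODE: OK'.
    length is the reported computation length, or None if absent.
    """
    ok = any(line.strip() == "EXITCODE: OK" for line in stdout.splitlines())
    match = re.search(r"no error up to computation length (\d+)", stdout)
    length = int(match.group(1)) if match else None
    return ok, length


# The excerpt shown in the listing.
EXCERPT = """\
PASS #11: BoundedChecker
Checker reports no error up to computation length 16
It took me 0 days  3 hours 12 min 47 sec
Total time: 11567.881 sec
EXITCODE: OK
"""
```

Requiring an exact `EXITCODE: OK` line means truncated or unexpected output is reported as a failure.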

The nightly liveness lane keeps the same PROCS=4, CAPS=8 bounds as the PR safety run but checks to a deeper computation length (--length=24). It checks --temporal=RevocationEventuallySeen, which depends on the WF_vars(PropagateAny) weak-fairness conjunct in the spec. It is a scheduled lane, not a required PR gate.
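For readers less familiar with TLA+, the weak-fairness operator used in that conjunct has the standard definition below, with v standing for vars and A for PropagateAny. The definition of RevocationEventuallySeen itself lives in the spec and is not reproduced here.

```latex
\mathrm{WF}_{v}(A) \;\triangleq\;
  \Diamond\Box\,\bigl(\mathrm{ENABLED}\,\langle A \rangle_{v}\bigr)
  \;\Rightarrow\;
  \Box\Diamond\,\langle A \rangle_{v}
```

In words: if an A step that changes v eventually stays enabled forever, then such steps occur infinitely often. Without this conjunct a behavior could stutter forever and never propagate a revocation, so a liveness property of this kind could not hold.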

Lean toolchain pin

The Lean toolchain pin lives at formal/lean4/Chio/lean-toolchain: leanprover/lean4:v4.28.0-rc1. lake build picks it up automatically. Kani, Creusot, and Aeneas are pinned in their own installer scripts under tools/.
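If a script needs to compare the pin against an expected release, the single line in that file splits cleanly at the colon. The sketch below shows one way to do that. `parse_toolchain` is a hypothetical helper, and the origin:version shape is assumed from the pin quoted above.

```python
def parse_toolchain(text):
    """Split a lean-toolchain line such as
    'leanprover/lean4:v4.28.0-rc1' into (origin, version)."""
    origin, sep, version = text.strip().partition(":")
    if not sep or not origin or not version:
        raise ValueError(f"unrecognized toolchain pin: {text!r}")
    return origin, version


# The pin quoted on this page, with a trailing newline as in a file.
PIN = "leanprover/lean4:v4.28.0-rc1\n"
```

Here `parse_toolchain(PIN)` yields the origin `leanprover/lean4` and the version `v4.28.0-rc1`.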

Next Steps

Theorem Inventory · the full table of theorems by ID, statement, tool, file, and status.
Assumptions and TCB · what Chio trusts, why, and what mitigates the failure of each assumption.
Aeneas Pipeline · how the production extraction is wired to the Lean equivalence theorems.
Lean 4 Proofs · the proof structure, file by file.
Kani Harnesses · the public harness inventory plus per-harness wall-clock times.
TLA+ Specs · the revocation-propagation spec and its four named safety invariants.
Differential Tests · reference spec vs production for the surfaces that proofs do not cover.
Constant-Time Tests · dudect harnesses and how to read their output.
Fuzz Infrastructure · the trust-boundary fuzz target inventory.

For the broader trust framing, the Trust Model page explains where formal assurance sits inside Chio's zero-ambient-authority design.