Chio/Docs

Formal Quickstart

This page walks an SRE, an engineer, or an academic from a fresh clone of chio to a green run of every formal gate listed in formal/proof-manifest.toml. Wall clock on a current Apple silicon laptop with hot caches is roughly ten minutes. The first run takes longer because the Lean toolchain and Mathlib cache download from upstream.

What success looks like

Every gate script printed in this page exits zero. Each one prints a final line containing passed or SUCCESSFUL. The Lean build prints Build completed successfully with no sorry findings. Kani prints VERIFICATION:- SUCCESSFUL for every harness in lanes.pr. Apalache prints EXITCODE: OK. The differential test crate finishes with test result: ok.

Time and Machine

Approximate wall clock per stage on an M-series laptop with hot cargo and lake caches:

StageFirst runHot cache
Toolchain install (rustup, elan, kani, apalache)5 to 8 min0
Lean root build (lake build)3 to 5 min10 to 20 s
Aeneas pilot + production extraction1 to 2 min15 to 30 s
Aeneas-Lean equivalence theorems15 to 30 s5 to 10 s
Kani public lane (14 harnesses)3 to 5 min2 to 3 min
Differential tests (cargo test -p chio-formal-diff-tests)1 to 2 min20 to 40 s
TLA+ safety lane via Apalache3 to 4 hr3 to 4 hr
Adapter no-bypass + portable-kernel checks30 to 60 s5 to 10 s

Apalache is slow

The Apalache safety lane runs to computation length 16 onRevocationPropagation.tla and finishes in roughly three hours of wall clock per the manifest's sample stdout. The other gates fit inside a ten-minute window. Run Apalache as a separate background job, or rely on the nightly CI lane to drive it.

Machine baseline: 4 physical cores, 8 GB RAM, 10 GB free disk. Kani's symbolic backend (CBMC) wants room for SAT search; if the machine has under 8 GB you may see the heavier harnesses (public_verify_capability_rejects_untrusted_issuer_before_signature at 31s budget, public_evaluate_rejects_untrusted_issuer_before_dispatch at 28s budget) swap and miss their wall clock. The reference budgets in formal/rust-verification/kani-public-harnesses.toml:47-62 were measured on aarch64-apple-darwin with a hot cargo cache.


Prerequisites

Five toolchains pin to specific versions. Install each one before you run the gates.

Rust

The workspace at Cargo.toml sets rust-version = "1.93" on edition 2021. Install via rustup; cargo will refuse to build with an older toolchain.

install rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source "$HOME/.cargo/env"
rustup toolchain install 1.93.0
rustup default 1.93.0
rustc --version  # rustc 1.93.0

Lean 4

The Lean toolchain pin lives at formal/lean4/Chio/lean-toolchain: leanprover/lean4:v4.28.0-rc1. Install elan; it reads the pin and downloads the matching toolchain automatically on first lake build.

install lean
curl https://raw.githubusercontent.com/leanprover/elan/master/elan-init.sh -sSf | sh -s -- -y
source "$HOME/.elan/env"
# elan picks up the toolchain pin on the first lake invocation; no
# 'rustup default'-style command is required.
elan --version

Kani

Kani is the bounded model checker for Rust on top of CBMC. It installs as a cargo subcommand. The PR lane uses cargo-kani 0.67.0 per the comment block at formal/rust-verification/kani-public-harnesses.toml:47-62.

install kani
cargo install --locked kani-verifier
cargo kani setup
cargo kani --version

Apalache (TLA+)

The TLA+ lane uses Apalache, not the older TLC model checker. Apalache ships as a self-contained Java jar. tools/install-apalache.sh in the chio tree pins the version and unpacks it under tools/apalache/.

install apalache
./tools/install-apalache.sh
./tools/apalache/bin/apalache-mc --version

Aeneas and Charon

Aeneas extracts a functional model from Rust source via Charon. The chio repo pins both versions in its installer scripts under tools/; the production extraction at formal/aeneas/production.toml names the source as crates/chio-kernel-core/src/formal_aeneas.rs. The check scripts call into the pinned binaries directly, so a manual install is rarely needed beyond the first run.

Other tools on PATH

  • python3 (3.10+). The gate scripts call into Python for manifest parsing; macOS and most Linux distros ship a working version.
  • jq for the proof-report aggregator at scripts/check-proof-report.sh.
  • rg (ripgrep) for the sorry-scan in scripts/check-formal-proofs.sh. The script falls back to grep -RInw if rg is missing.

Clone and Bootstrap

Clone chio and warm the cargo cache. The first cargo fetch pulls every dependency the formal lanes need; subsequent gate runs work offline.

clone and warm caches
git clone https://github.com/backbay-labs/chio.git
cd chio
cargo fetch --workspace
( cd formal/lean4/Chio && lake update && lake exe cache get )

lake exe cache get pulls the precompiled Mathlib oleans from the upstream cache server. On a cold machine that step downloads a few hundred megabytes; on a warm machine it is a no-op.


Run Each Gate

The proof manifest at formal/proof-manifest.toml names every gate command. Each one is a shell script under scripts/ and prints a per-stage verdict.

Lean root build and inventory check

scripts/check-formal-proofs.sh wraps three things: lake build on the project at formal/lean4/Chio/, a literal sorry scan across every shipped Lean module, and a Python pass that checks the theorem inventory and the manifest agree.

run lean gate
./scripts/check-formal-proofs.sh
check-formal-proofs.sh stdout (excerpt)
==> Lean 4 proof build
info: leanprover/lean4:v4.28.0-rc1: trusting cached toolchain
Build completed successfully.
==> Lean 4 placeholder scan
==> Proof manifest and theorem inventory sanity
formal proof check passed

Wall clock: 10 to 20 seconds with a hot lake cache, 3 to 5 minutes on first run.

Aeneas pilot, production, equivalence

Three scripts cover the Aeneas lanes. The pilot reads formal/aeneas/verified_core.rs and reports six extracted symbols. The production lane reads crates/chio-kernel-core/src/formal_aeneas.rs and reports the fifteen extracted symbols listed in formal/aeneas/production.toml. The equivalence lane elaborates Chio.Proofs.AeneasEquivalence and writes an artifact JSON with the SHA-256 of every generated Lean file.

run aeneas gates
./scripts/check-aeneas-pilot.sh
./scripts/check-aeneas-production.sh
./scripts/check-aeneas-equivalence.sh
check-aeneas-equivalence.sh stdout (excerpt)
Aeneas equivalence gate passed

The equivalence script writes target/formal/aeneas-production/equivalence-artifacts.json with three sha256 entries (the source rust file, the generated Funs.lean, and the generated Types.lean), the expected symbol list, and a schema tag of chio.aeneas-equivalence-artifacts.v1. Wall clock: 5 to 10 seconds for equivalence, 15 to 30 seconds for each extraction.

Kani public lane

scripts/check-kani-public-core.sh reads the harness list from kani_public_harnesses.rs, confirms every expected name is present, and runs each one with --default-unwind 8 --no-unwinding-checks. The total wall clock is roughly 134 seconds with a hot cargo cache per the timing table in the toml file.

run kani public lane
./scripts/check-kani-public-core.sh
check-kani-public-core.sh stdout (excerpt)
SUMMARY:
 ** 0 of N failed (0 unreachable)
VERIFICATION:- SUCCESSFUL
Verification Time: 31.247s
...
SUMMARY:
 ** 0 of N failed (0 unreachable)
VERIFICATION:- SUCCESSFUL
Verification Time: 1.041s
Kani public core harnesses passed

The script invokes cargo kani -p chio-kernel-core --lib --harness <name> once per name; nine of the fourteen harnesses come from the bash array in the script, and the remaining five algebraic harnesses live alongside them in the same source file. Per-harness budgets run from 1 second (verify_delegation_chain_step) to 31 seconds (public_verify_capability_rejects_untrusted_issuer_before_signature).

Creusot plus required Kani

scripts/check-rust-verification-gates.sh is the umbrella for the required-strict Rust refinement lanes named in rust_refinement_lanes in the manifest. It invokes the Creusot smoke and core harnesses plus the proof-facing Kani harnesses in kani_harnesses.rs.

run rust refinement lane
./scripts/check-rust-verification-gates.sh

Adapter no-bypass

Every protocol adapter must route every tool call through chio_kernel_core::evaluate. The script greps the adapters for the literal call and fails the build if any one bypasses it.

run adapter no-bypass
./scripts/check-adapter-no-bypass.sh

Differential tests

The differential test crate at formal/diff-tests/ uses proptest to compare the reference spec in src/spec.rs against the production Rust on randomly generated input pairs. Default sample count is 256 per property; override with the PROPTEST_CASES env var.

run differential tests
cargo test -p chio-formal-diff-tests
cargo test stdout (excerpt)
running 8 tests
test scope_subset_spec_matches_impl ... ok
test grant_subset_spec_matches_impl ... ok
test resource_grant_subset_spec_matches_impl ... ok
test prompt_grant_subset_spec_matches_impl ... ok
test normalized_scope_projection_matches_spec ... ok
test normalized_scope_subset_spec_matches_impl ... ok
test normalized_grant_projection_matches_spec ... ok
test normalized_grant_subset_spec_matches_impl ... ok

test result: ok. 8 passed; 0 failed; 0 ignored

Portable-kernel parity

Chio ships three kernel variants: chio-kernel (full), chio-kernel-core (proof target), and chio-kernel-browser / chio-kernel-mobile (portable shells). The portable-kernel check ensures every variant produces the same verdict on a fixture input set.

run portable kernel parity
./scripts/check-portable-kernel.sh

Proof report aggregator

scripts/check-proof-report.sh is the final aggregator. It reads the artifacts every prior gate wrote to target/formal/ and emits a release-facing JSON report tagged chio.proof-report.v1.

run proof report aggregator
./scripts/check-proof-report.sh

TLA+ via Apalache (separate lane)

The TLA+ safety lane is gated separately because Apalache wall clock dwarfs the rest. From the manifest's sample stdout the lane runs to length 16 and finishes in roughly three hours.

run apalache safety lane
apalache check \
  --inv=SafetyInv \
  --config=formal/tla/MCRevocationPropagation.cfg \
  formal/tla/RevocationPropagation.tla
apalache stdout (excerpt)
PASS #11: BoundedChecker
Checker reports no error up to computation length 16
It took me 0 days  3 hours 12 min 47 sec
Total time: 11567.881 sec
EXITCODE: OK

The aggregate SafetyInv is the conjunction of three named invariants from formal/tla/RevocationPropagation.tla: NoAllowAfterRevoke (every allow receipt was issued before the issuing authority observed any revocation), MonotoneLog (per-authority receipt-log timestamps strictly increase), and AttenuationPreserving (depth stays in 0..DEPTH_MAX and any attenuated cap has been delegated at least once).


Run All Gates At Once

There is no single make target that drives every gate, by design: the manifest is the source of truth and the CI workflow files invoke gates one by one so a failure surfaces against a named gate. To reproduce the assurance posture in one shell, run the ten commands from gate_commands in order:

reproduce assurance posture
# From formal/proof-manifest.toml gate_commands, run in order.
./scripts/check-formal-proofs.sh
./scripts/check-aeneas-pilot.sh
./scripts/check-aeneas-production.sh
./scripts/check-aeneas-equivalence.sh
./scripts/check-rust-verification-gates.sh
./scripts/check-kani-public-core.sh
./scripts/check-adapter-no-bypass.sh
cargo test -p chio-formal-diff-tests
./scripts/check-portable-kernel.sh
./scripts/check-proof-report.sh

On a hot-cache laptop the sequence above takes 8 to 12 minutes end to end. Add the Apalache lane on a separate background job:

run apalache in background
( apalache check \
    --inv=SafetyInv \
    --config=formal/tla/MCRevocationPropagation.cfg \
    formal/tla/RevocationPropagation.tla \
    > apalache.log 2>&1 ) &
echo "apalache pid: $!"

If a Gate Fails

Every gate has a known failure shape. The Failure Modes page catalogs each one with the literal error format and a triage path. Common cases:

  • A sorry scan hit means a Lean proof ships unfinished. The placeholder scan in scripts/check-formal-proofs.sh fails the build.
  • A Kani VERIFICATION:- FAILED line means CBMC found a counterexample within the unwind bound. See Failure Modes for how to read the counterexample trace and convert it to a regression test with --concrete-playback inplace.
  • A diff-test divergence prints both spec and impl values plus the shrunken proptest input. The crate writes regression seeds under formal/diff-tests/proptest-regressions/ automatically.
  • An Apalache invariant violation prints the offending state path to stdout and writes a counterexample1.tla file. Read it as a state-machine trace from Init through the first Next step that breaks the invariant.

The manifest is the source of truth

If a script fails because it cannot find a named gate, a named theorem, or a named harness, the first thing to check is formal/proof-manifest.toml. The manifest names every root module, every gate command, every property ID, and every required theorem. Drift between the manifest and the source is itself a gate failure.

Continue