Formal Quickstart · Chio Docs

What success looks like

The gate scripts below exit zero on success. Each prints a final line containing passed or SUCCESSFUL. The Lean build prints Build completed successfully with no sorry findings. Kani prints VERIFICATION:- SUCCESSFUL for each harness run by the public-core check. Apalache prints EXITCODE: OK. The differential test crate finishes with test result: ok.

Time and Machine

Approximate wall clock per stage on an M-series laptop with hot cargo and lake caches:

Stage	First run	Hot cache
Toolchain install (rustup, elan, Kani, apalache)	5 to 8 min	0
Lean root build (`lake build`)	3 to 5 min	10 to 20 s
Aeneas pilot + production extraction	1 to 2 min	15 to 30 s
Aeneas-Lean equivalence theorems	15 to 30 s	5 to 10 s
Kani public lane (13 harnesses)	3 to 5 min	2 to 3 min
Differential tests (`cargo test -p chio-formal-diff-tests`)	1 to 2 min	20 to 40 s
TLA+ safety lane via Apalache	3 to 4 hr	3 to 4 hr
Adapter no-bypass + portable-kernel checks	30 to 60 s	5 to 10 s

Apalache is slow

The Apalache safety lane runs to computation length 16 onRevocationPropagation.tla and finishes in roughly three hours of wall clock. The other gates fit inside a ten-minute window. Run Apalache as a separate background job, or rely on the nightly CI lane to drive it.

Machine baseline: 4 physical cores, 8 GB RAM, 10 GB free disk. Kani's symbolic backend (CBMC) wants room for SAT search; if the machine has under 8 GB, the concrete-fixture harnesses (public_verify_capability_rejects_untrusted_issuer_before_signature and public_evaluate_rejects_untrusted_issuer_before_dispatch, which exercise canonical-JSON serialization plus signature verify path) swap and miss their wall clock. The algebraic harnesses model signatures instead and finish in one to three seconds; formal/rust-verification/kani-public-harnesses.toml documents the complete sweep at about 2.2 minutes locally with a hot cargo cache.

Prerequisites

Five toolchains pin to specific versions. Install each one before you run the gates.

Rust

The workspace at Cargo.toml sets rust-version = "1.93" on edition 2021. Install via rustup; cargo will refuse to build with an older toolchain.

install rust

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source "$HOME/.cargo/env"
rustup toolchain install 1.93.0
rustup default 1.93.0
rustc --version  # rustc 1.93.0

Lean 4

The Lean toolchain pin lives at formal/lean4/Chio/lean-toolchain: leanprover/lean4:v4.28.0-rc1. Install elan; it reads the pin and downloads the matching toolchain automatically on first lake build.

install lean

curl https://raw.githubusercontent.com/leanprover/elan/master/elan-init.sh -sSf | sh -s -- -y
source "$HOME/.elan/env"
# elan picks up the toolchain pin on the first lake invocation; no
# 'rustup default'-style command is required.
elan --version

Kani

Kani is the bounded model checker for Rust on top of CBMC. It installs as a cargo subcommand. Chio pins cargo-kani 0.67.0 via CHIO_KANI_VERSION in .github/workflows/nightly.yml, the workflow that runs the Kani lanes.

install kani

cargo install --locked kani-verifier
cargo kani setup
cargo kani --version

Apalache (TLA+)

The TLA+ lane uses Apalache, not the older TLC model checker. Apalache ships as a self-contained Java jar. tools/install-apalache.sh in the chio tree pins the version and unpacks it under tools/apalache/.

install apalache

./tools/install-apalache.sh
./tools/apalache/bin/apalache-mc --version

Aeneas and Charon

Aeneas extracts a functional model from Rust source via Charon. The chio repo pins both versions in its installer scripts under tools/; the production extraction at formal/aeneas/production.toml names the source as crates/kernel/chio-kernel-core/src/formal_aeneas.rs. The check scripts call into the pinned binaries directly, so a manual install is rarely needed beyond the first run.

Other tools on PATH

python3 (3.10+). The gate scripts call into Python for manifest parsing; macOS and most Linux distros ship a working version.
jq for the proof-report aggregator at scripts/check-proof-report.sh.
rg (ripgrep) for the sorry-scan in scripts/check-formal-proofs.sh. The script falls back to grep -RInw if rg is missing.

Clone and Bootstrap

Clone chio and warm the cargo cache. The first cargo fetch pulls the dependencies used by the formal checks; subsequent gate runs work offline.

clone and warm caches

git clone https://github.com/backbay-labs/chio.git
cd chio
cargo fetch --workspace
( cd formal/lean4/Chio && lake update && lake exe cache get )

lake exe cache get pulls the precompiled Mathlib oleans from the upstream cache server. On a cold machine that step downloads a few hundred megabytes; on a warm machine it is a no-op.

Run Each Gate

The proof manifest at formal/proof-manifest.toml lists the gate commands. Each is a shell script under scripts/ and prints a per-stage verdict.

Lean root build and inventory check

scripts/check-formal-proofs.sh wraps three things: lake build on the project at formal/lean4/Chio/, a literal sorry scan across the shipped Lean modules, and a Python pass that checks the theorem inventory and the manifest agree.

run lean gate

./scripts/check-formal-proofs.sh

check-formal-proofs.sh stdout (excerpt)

==> Lean 4 proof build
info: leanprover/lean4:v4.28.0-rc1: trusting cached toolchain
Build completed successfully.
==> Lean 4 placeholder scan
==> Proof manifest and theorem inventory sanity
formal proof check passed

Wall clock: 10 to 20 seconds with a hot lake cache, 3 to 5 minutes on first run.

Aeneas pilot, production, equivalence

Three scripts cover the Aeneas lanes. The pilot reads formal/aeneas/verified_core.rs and reports six extracted symbols. The production lane reads crates/kernel/chio-kernel-core/src/formal_aeneas.rs and reports the fifteen extracted symbols listed in formal/aeneas/production.toml. The equivalence lane elaborates Chio.Proofs.AeneasEquivalence and writes a JSON record with the SHA-256 of each generated Lean file.

run aeneas gates

./scripts/check-aeneas-pilot.sh
./scripts/check-aeneas-production.sh
./scripts/check-aeneas-equivalence.sh

check-aeneas-equivalence.sh stdout (excerpt)

Aeneas equivalence gate passed

The equivalence script writes target/formal/aeneas-production/equivalence-artifacts.json with three sha256 entries (the source rust file, the generated Funs.lean, and the generated Types.lean), the expected symbol list, and a schema tag of chio.aeneas-equivalence-artifacts.v1. Wall clock: 5 to 10 seconds for equivalence, 15 to 30 seconds for each extraction.

Kani public lane

scripts/check-kani-public-core.sh reads the harness list from kani_public_harnesses.rs, confirms the expected names are present, and runs the thirteen harnesses in its PUBLIC_HARNESSES array with --default-unwind 8 --no-unwinding-checks. The total wall clock is roughly two minutes with a hot cargo cache.

run kani public lane

./scripts/check-kani-public-core.sh

check-kani-public-core.sh stdout (excerpt)

SUMMARY:
 ** 0 of N failed (0 unreachable)
VERIFICATION:- SUCCESSFUL
Verification Time: 31.247s
...
SUMMARY:
 ** 0 of N failed (0 unreachable)
VERIFICATION:- SUCCESSFUL
Verification Time: 1.041s
Kani public core harnesses passed

The script invokes cargo kani -p chio-kernel-core --lib --harness <name> once per name. Its PUBLIC_HARNESSES array holds thirteen names: nine concrete-fixture deny and accept edges plus verify_delegate_no_widen, verify_delegation_receipt_canonical, verify_revocation_view_freshness, and verify_oracle_inclusion_soundness. The complete lanes.pr set in formal/rust-verification/kani-public-harnesses.toml lists twenty harnesses; the nightly kani-public-nightly job parses the toml and runs all twenty, while this local script covers the thirteen it names directly. The concrete-fixture harnesses dominate the wall clock; the algebraic harnesses finish in one to three seconds.

Creusot plus required Kani

scripts/check-rust-verification-gates.sh runs the required-strict Rust refinement checks named in rust_refinement_lanes in the manifest. It invokes the Creusot smoke and core harnesses plus the proof-facing Kani harnesses in kani_harnesses.rs.

run rust refinement lane

./scripts/check-rust-verification-gates.sh

Adapter no-bypass

The adapter check requires each protocol adapter to route tool calls through chio_kernel_core::evaluate. The script greps the adapters for the literal call and fails the build if any one bypasses it.

run adapter no-bypass

./scripts/check-adapter-no-bypass.sh

Differential tests

The differential test crate at formal/diff-tests/ uses proptest to compare the reference spec in src/spec.rs against the production Rust on randomly generated input pairs. Default sample count is 256 per property; override with the PROPTEST_CASES env var.

run differential tests

cargo test -p chio-formal-diff-tests

cargo test stdout (excerpt)

running 8 tests
test scope_subset_spec_matches_impl ... ok
test grant_subset_spec_matches_impl ... ok
test resource_grant_subset_spec_matches_impl ... ok
test prompt_grant_subset_spec_matches_impl ... ok
test normalized_scope_projection_matches_spec ... ok
test normalized_scope_subset_spec_matches_impl ... ok
test normalized_grant_projection_matches_spec ... ok
test normalized_grant_subset_spec_matches_impl ... ok

test result: ok. 8 passed; 0 failed; 0 ignored

Portable-kernel parity

Chio ships three kernel variants: chio-kernel (full), chio-kernel-core (proof target), and chio-kernel-browser / chio-kernel-mobile (portable shells). The portable-kernel check ensures each variant produces the same verdict on a fixture input set.

run portable kernel parity

./scripts/check-portable-kernel.sh

Proof report aggregator

scripts/check-proof-report.sh aggregates the reports written by prior gates to target/formal/ and emits a release-facing JSON report tagged chio.proof-report.v1.

run proof report aggregator

./scripts/check-proof-report.sh

TLA+ via Apalache (separate lane)

The TLA+ safety lane is gated separately because Apalache wall clock dwarfs the rest. The sample run below drives the spec to length 16 and finishes in roughly three hours.

run apalache safety lane

apalache check \
  --inv=SafetyInv \
  --config=formal/tla/MCRevocationPropagation.cfg \
  formal/tla/RevocationPropagation.tla

apalache stdout (excerpt)

PASS #11: BoundedChecker
Checker reports no error up to computation length 16
It took me 0 days  3 hours 12 min 47 sec
Total time: 11567.881 sec
EXITCODE: OK

The aggregate SafetyInv is the conjunction of four named safety invariants from formal/tla/RevocationPropagation.tla: NoAllowAfterRevoke (every allow receipt was issued before the issuing authority observed any revocation), MonotoneLog (per-authority receipt-log timestamps strictly increase), AttenuationPreserving (depth stays in 0..DEPTH_MAX and any attenuated cap has been delegated at least once), and RevocationFreshness (every recorded local revocation epoch is strictly less than the global clock).

Run All Gates At Once

The repository has no single target for all gates. CI invokes gates separately so failures identify a named gate. To reproduce the manifest's checks in one shell, run the ten commands from gate_commands in order:

reproduce assurance posture

# From formal/proof-manifest.toml gate_commands, run in order.
./scripts/check-formal-proofs.sh
./scripts/check-aeneas-pilot.sh
./scripts/check-aeneas-production.sh
./scripts/check-aeneas-equivalence.sh
./scripts/check-rust-verification-gates.sh
./scripts/check-kani-public-core.sh
./scripts/check-adapter-no-bypass.sh
cargo test -p chio-formal-diff-tests
./scripts/check-portable-kernel.sh
./scripts/check-proof-report.sh

On a hot-cache laptop the sequence above takes 8 to 12 minutes from start to finish. Add the Apalache lane in a separate background job:

run apalache in background

( apalache check \
    --inv=SafetyInv \
    --config=formal/tla/MCRevocationPropagation.cfg \
    formal/tla/RevocationPropagation.tla \
    > apalache.log 2>&1 ) &
echo "apalache pid: $!"

If a Gate Fails

The documented gates have known failure formats. The Failure Modes page catalogs each one with the literal error format and a triage path. Common cases:

A sorry scan hit means a Lean proof ships unfinished. The placeholder scan in scripts/check-formal-proofs.sh fails the build.
A Kani VERIFICATION:- FAILED line means CBMC found a counterexample within the unwind bound. See Failure Modes for how to read the counterexample trace and convert it to a regression test with --concrete-playback inplace.
A diff-test divergence prints both spec and impl values plus the shrunken proptest input. The crate writes regression seeds under formal/diff-tests/proptest-regressions/ automatically.
An Apalache invariant violation prints the offending state path to stdout and writes a counterexample1.tla file. Read it as a state-machine trace from Init through the first Next step that breaks the invariant.

Check the manifest first

If a script fails because it cannot find a named gate, a named theorem, or a named harness, the first thing to check is formal/proof-manifest.toml. The manifest lists root modules, gate commands, property IDs, and required theorems. Drift between the manifest and the source is itself a gate failure.

Continue

P1 Tour: Capability Monotonicity · trace capability attenuation through its proof and test methods.
Theorem Inventory · the full table of theorems by ID, statement, file, claim class, and property mapping.
Failure Modes · what each gate failure looks like and how to triage.
Formal Assurance Overview · assurance methods, the proof boundary, and audited assumptions.