Human-in-the-Loop
Some tool calls should not execute without a person saying yes. A refund above a threshold. A deploy to production. A write to a sensitive table. Chio's human-in-the-loop (HITL) protocol adds a third outcome to the guard pipeline, alongside Allow and Deny: the call is suspended, a person is asked, and the kernel resumes (or refuses) based on a signed approval token. This page describes the three surfaces that compose HITL in chio and how a request moves through them.
Alpha
Three Surfaces
HITL is not a single knob. Three surfaces compose it, and you have to wire all three for the flow to complete:
- Policy-declared approval guard. A `require_approval` constraint on a capability grant (or a guard configured with `on_trigger = require_approval`) tells the kernel that a matching request needs a human before it can proceed.
- In-flight suspension receipt. When the approval guard returns pending, the kernel signs a receipt with `Decision::Incomplete` whose metadata marks the call as awaiting approval, names the approval request id, and records the deadline. The agent sees this as a "pending approval" response; the trust-control log has it as an auditable intermediate state.
- Approval callback with a signed token. A human reviews the request through a dashboard, Slack, email, or API poll. Their decision is encoded as a `GovernedApprovalToken` signed with the approver's Ed25519 key. The kernel validates the token, re-runs capability and guard checks, then emits a final Allow or Deny receipt.
State Machine
A governed tool call that touches an approval constraint walks the following states:
The priority rule across guards is: any Deny dominates any PendingApproval. If a structural guard such as forbidden-path denies the request, the pipeline does not ask for human approval. Approval is only solicited when the non-approval guards all allow.
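The priority rule can be read as a fold over the verdicts the guards return. The following is a minimal Python model of that rule, not the kernel's Rust code; the `Verdict` dataclass and `combine_verdicts` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Verdict:
    """Illustrative stand-in for the kernel's Verdict enum (assumption)."""
    kind: str                          # "allow" | "deny" | "pending_approval"
    approval_id: Optional[str] = None  # set only for pending_approval


def combine_verdicts(verdicts: Sequence[Verdict]) -> Verdict:
    """Any Deny wins; otherwise any PendingApproval; otherwise Allow."""
    pending = None
    for verdict in verdicts:
        if verdict.kind == "deny":
            return verdict             # Deny dominates everything else
        if verdict.kind == "pending_approval" and pending is None:
            pending = verdict          # remember the first approval request
    return pending if pending is not None else Verdict("allow")
```

Under this model a human is asked only when `combine_verdicts` returns a `pending_approval` verdict, which can happen only if no guard denied.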
The Pending Verdict
Internally the kernel's Verdict enum gains a third variant for this protocol. Guards may return PendingApproval(ApprovalRequest) instead of Allow or Deny:
```rust
pub enum Verdict {
    /// The action is allowed.
    Allow,
    /// The action is denied.
    Deny,
    /// The action requires human approval before proceeding.
    /// Carries the approval request the kernel will dispatch to a channel.
    PendingApproval(ApprovalRequest),
}
```
The guard trait signature is unchanged; only the enum grows. Custom guards that do not care about approval simply never return the new variant.
Receipt Decisions
The signed receipt Decision enum in chio-core-types/src/receipt.rs has exactly four variants. HITL does not add new Decision variants. The approval flow reuses these four plus receipt metadata to distinguish states:
```rust
pub enum Decision {
    /// The tool call was allowed and executed.
    Allow,
    /// The tool call was denied.
    Deny { reason: String, guard: String },
    /// The tool call was interrupted by explicit cancellation.
    Cancelled { reason: String },
    /// The tool call did not reach a complete terminal result.
    Incomplete { reason: String },
}
```
Mapping approval lifecycle states onto these four:
- Suspended awaiting approval: `Incomplete` with a reason identifying the approval request, and an approval-request id plus deadline carried in receipt metadata. The agent-facing label "IncompleteAwaitingApproval" is a metadata convention, not a new enum variant.
- Approved and executed: `Allow`, with approver identity, token id, and the referenced approval-request id carried in receipt metadata.
- Denied by a human: `Deny` with `guard = "human-approval"`; approver identity and optional reason live in metadata.
- Timed out with `timeout_action = deny`: `Deny` with `guard = "approval-timeout"`.
- Cancelled before resolution: `Cancelled`.
PendingApproval itself is a runtime-only Verdict variant in chio-kernel/src/runtime.rs. It never appears on a signed receipt; the signed Decision for a suspended call is always Incomplete with metadata pointing at the approval request.
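The mapping above can be written down as a small lookup table. This is a sketch only: the lifecycle state names and the reason strings are assumptions made for illustration, and the dictionary shape mirrors the `decision` objects in the example receipts later on this page rather than any confirmed wire format.

```python
# Lifecycle state (hypothetical names) -> signed Decision, as a plain dict.
LIFECYCLE_TO_DECISION = {
    "suspended": {"verdict": "incomplete", "reason": "awaiting human approval"},
    "approved": {"verdict": "allow"},
    "denied_by_human": {
        "verdict": "deny",
        "guard": "human-approval",
        "reason": "denied by approver",
    },
    "timed_out_deny": {
        "verdict": "deny",
        "guard": "approval-timeout",
        "reason": "approval deadline passed",
    },
    "cancelled": {"verdict": "cancelled", "reason": "cancelled before resolution"},
}


def signed_decision(state: str) -> dict:
    """Return a copy of the Decision that a receipt for `state` would carry."""
    if state not in LIFECYCLE_TO_DECISION:
        raise ValueError(f"unknown approval lifecycle state: {state}")
    return dict(LIFECYCLE_TO_DECISION[state])
```

Note that no entry uses a "pending approval" verdict: the suspended state is always expressed as `incomplete`, with the approval request id and deadline carried in metadata.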
The Approval Request
When a guard returns PendingApproval, it constructs an ApprovalRequest that the kernel will persist and route to channels. The request contains everything a human needs to render a decision and everything the kernel will later need to validate the returned token:
```rust
pub struct ApprovalRequest {
    /// Unique approval id (UUIDv7). Keys the approval store.
    pub approval_id: String,
    /// Policy / grant id that triggered the approval.
    pub policy_id: String,
    /// Calling agent's identifier.
    pub subject_id: AgentId,
    /// Capability token id bound to this request.
    pub capability_id: String,
    /// Public key of the capability subject. A presented approval token
    /// must carry the same subject.
    pub subject_public_key: Option<PublicKey>,
    /// Server hosting the target tool.
    pub tool_server: ServerId,
    /// Tool being invoked.
    pub tool_name: String,
    /// Short action verb for human summaries (e.g. "invoke", "charge").
    pub action: String,
    /// SHA-256 hex of the canonical JSON of the tool arguments / intent.
    pub parameter_hash: String,
    /// Unix seconds after which the request auto-denies or escalates.
    pub expires_at: u64,
    /// Hint for channels about where the human can respond.
    pub callback_hint: Option<String>,
    /// Unix seconds when the request was created.
    pub created_at: u64,
    /// Short human-readable summary for dashboards.
    pub summary: String,
    /// Original governed intent, when one is bound.
    pub governed_intent: Option<GovernedTransactionIntent>,
    /// Public keys allowed to approve this request. Empty set fails closed.
    pub trusted_approvers: Vec<PublicKey>,
    /// Guards that triggered the approval requirement.
    pub triggered_by: Vec<String>,
}
```
Supported Approval Triggers
Today chio ships exactly one built-in approval constraint on Constraint (see chio-core-types/src/capability.rs):
| Trigger | Fires When |
|---|---|
| `RequireApprovalAbove { threshold_units: u64 }` | The governed intent's monetary envelope meets or exceeds the threshold, measured in minor currency units |
A policy can also flip a normal guard into an approval trigger by routing it through the kernel's approval pipeline, so that the guard returns PendingApproval instead of Deny. For example, a content-review or PII-detection guard that normally denies can be configured to suspend for a human instead.
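The threshold trigger reduces to a comparison plus a fail-closed branch for a missing intent. The sketch below assumes that "fails closed" means the call is denied; the function name and the string outcomes are hypothetical and only model the documented behavior.

```python
from typing import Optional


def approval_outcome(threshold_units: int, intent_units: Optional[int]) -> str:
    """Model of RequireApprovalAbove for a single request.

    `intent_units` is the governed intent's monetary envelope, or None when
    the request carries no governed intent.
    """
    if intent_units is None:
        return "deny"                  # fail closed: no intent, no bypass
    if intent_units >= threshold_units:
        return "pending_approval"      # meets or exceeds the threshold
    return "allow"
```

Because the comparison is "meets or exceeds", a threshold of 0 sends every request that carries an intent to a human.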
Missing intent is not a free pass
If `RequireApprovalAbove` is configured but the incoming request does not carry a governed intent, the kernel fails closed. You cannot bypass the threshold by simply omitting the intent.
Planned trigger families (not yet shipped)
The following trigger families are planned but are not yet variants of the `Constraint` enum. Do not rely on them in policy files:
- `RequireApprovalAlways`: unconditional human approval for every invocation under a grant.
- `RequireApprovalFirstN`: human approval for the first N invocations only.
- `RequireApprovalForActions`: human approval when a tool's declared action category matches a listed category (see Action Categories below).
- `RequireApprovalAboveTier`: human approval when the governed autonomy tier is at or above a named level.
Until they ship, approximate unconditional approval by using `RequireApprovalAbove` with a threshold of 0 on the relevant grants, or route a custom guard through the approval pipeline.
Action Categories (planned)
Design-stage
Action categories belong to the planned `RequireApprovalForActions` trigger described above. They are not yet implemented. This section sketches the intended vocabulary so policy authors can weigh in on the taxonomy before it lands.
The target model is that tools self-declare an action category in their manifest, and a future constraint names the categories that require approval. Candidate categories:
- `Financial`: payments, transfers, trades
- `Communication`: outbound email, SMS, Slack posts
- `Infrastructure`: deploys, scale operations, deletes
- `DataMutation`: writes, deletes, updates on managed data
- `SensitiveDataAccess`: PII, credentials, keys
- `Custom("name")`: declare your own label
End-to-End Flow
Walking through a concrete example: a support agent wants to issue a $450 refund, and the grant carries RequireApprovalAbove { threshold_units: 200 }.
- The agent submits a tool call with a governed intent whose `max_amount.units = 450`.
- The kernel runs guards. Structural guards pass. The approval guard sees the threshold fire and returns `PendingApproval(request)`.
- The kernel persists the request in the approval store, dispatches it to configured channels (Slack, dashboard, email), and signs an `Incomplete` receipt whose metadata marks the call as awaiting approval and carries the approval-request id and deadline. The agent receives a "pending approval" response naming the approval request id and the deadline.
- A human reviews the summary, intent, and (optionally) arguments through a channel. They choose approve or deny.
- The channel sends a signed `GovernedApprovalToken` to `POST /approvals/{id}/respond`. The kernel validates the signature, the request id binding, the intent hash binding, the approver whitelist, and the expiry.
- On approve: the kernel re-runs capability validation (the grant may have been revoked during the wait) and non-approval guards (state may have changed), then dispatches to the tool server and emits an `Allow` receipt. Metadata carries the approver public key, the approval token id, and a reference back to the earlier suspension receipt.
- On deny: the kernel emits a `Deny` receipt with `guard = "human-approval"`, carrying the approver's public key and optional reason in metadata.
The Approval Token
The human's decision is encoded in a `GovernedApprovalToken` (already defined in `chio-core-types::capability` and re-used here):
```rust
pub struct GovernedApprovalToken {
    pub id: String,
    pub approver: PublicKey,                // Ed25519 key that signed
    pub subject: PublicKey,                 // agent's key
    pub governed_intent_hash: String,       // binds to one intent
    pub request_id: String,                 // binds to one tool call
    pub issued_at: u64,
    pub expires_at: u64,
    pub decision: GovernedApprovalDecision, // Approved | Denied
    pub signature: Signature,
}
```
The token is cryptographically bound to five things: the request id, the intent hash, the approver's key, the agent's key, and a time window. The kernel validates all five before accepting. A token for a different request, a different intent, or a different agent, a token signed by a key outside the trusted approver set, or a token presented outside its time window is rejected.
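The binding checks can be sketched as a sequence of comparisons against the stored request. This Python model deliberately omits Ed25519 signature verification, which the kernel also performs; the `Token` and `Pending` dataclasses, the `binding_error` function, and its error strings are hypothetical and exist only to illustrate the order of checks.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Token:
    """Subset of GovernedApprovalToken fields, with keys modelled as strings."""
    approver: str
    subject: str
    governed_intent_hash: str
    request_id: str
    issued_at: int
    expires_at: int


@dataclass(frozen=True)
class Pending:
    """What the kernel stored when it suspended the call (illustrative)."""
    request_id: str
    intent_hash: str
    subject: str
    trusted_approvers: FrozenSet[str]


def binding_error(token: Token, pending: Pending, now: int) -> Optional[str]:
    """Return the first failed binding, or None when every binding holds."""
    if token.request_id != pending.request_id:
        return "wrong request"
    if token.governed_intent_hash != pending.intent_hash:
        return "wrong intent"
    if token.subject != pending.subject:
        return "wrong subject"
    if token.approver not in pending.trusted_approvers:
        return "untrusted approver"    # an empty approver set fails closed
    if not (token.issued_at <= now < token.expires_at):
        return "outside time window"
    return None
```

Any non-`None` result would translate into a deny receipt, in line with the fail-closed behavior described under Security Properties.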
Replay Protection
Approval tokens are single-use. The kernel combines four mechanisms to enforce this:
- Request binding: a token for request A cannot be replayed against request B.
- Time bounds: outside `[issued_at, expires_at)` the token is invalid.
- Lifetime cap: the kernel rejects tokens with a lifetime longer than `MAX_APPROVAL_TTL_SECS` (one hour) to bound the replay-store window.
- Consumption store: an LRU replay store records consumed `(request_id, intent_hash)` pairs. A token presented a second time is rejected with "replay detected".
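The last three mechanisms fit in a few lines. The sketch below is a Python model under stated assumptions: the `ReplayStore` class, its `capacity` parameter, and the error strings other than "replay detected" are hypothetical, and the real store lives in the kernel.

```python
from collections import OrderedDict
from typing import Optional

MAX_APPROVAL_TTL_SECS = 3600  # one hour, as stated above


class ReplayStore:
    """Bounded store of consumed (request_id, intent_hash) pairs."""

    def __init__(self, capacity: int = 1024) -> None:
        self._seen: "OrderedDict[tuple, int]" = OrderedDict()
        self._capacity = capacity

    def consume(self, request_id: str, intent_hash: str,
                issued_at: int, expires_at: int, now: int) -> Optional[str]:
        """Return None if the token may be used once, else a rejection reason."""
        if expires_at - issued_at > MAX_APPROVAL_TTL_SECS:
            return "lifetime exceeds cap"
        if not (issued_at <= now < expires_at):
            return "outside time window"
        key = (request_id, intent_hash)
        if key in self._seen:
            self._seen.move_to_end(key)      # a replay attempt counts as a use
            return "replay detected"
        self._seen[key] = expires_at
        if len(self._seen) > self._capacity:
            self._seen.popitem(last=False)   # evict the least recently used pair
        return None
```

The lifetime cap is what makes a bounded store safe in this model: an entry old enough to be evicted would normally belong to a token that has already expired and is rejected by the time-bounds check anyway.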
Timeout Policies
Every pending approval has a deadline. The behavior when the deadline passes is configured by timeout_action:
| Action | Behavior |
|---|---|
| `deny` | Fail-closed. Default. Kernel signs a `Deny` receipt with `guard = "approval-timeout"`. |
| `escalate` | Advance to the next tier in the escalation chain. Each tier has its own approvers, channels, and timeout. |
| `auto_approve_advisory` | Kernel self-signs an approval token. Receipt metadata flags `auto_approved = true` and `review_required = true`. |
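The table can be modelled as a single decision function that runs when a deadline passes. This is a sketch, not the kernel's implementation: `on_timeout`, its tuple return shape, and the `next_tier` key are hypothetical, and it assumes that an exhausted escalation chain falls through to the `terminal_action` shown in the configuration example on this page.

```python
from typing import Dict, Tuple


def on_timeout(action: str, tier: int, tier_count: int,
               terminal_action: str = "deny") -> Tuple[str, Dict[str, object]]:
    """Return (outcome, receipt metadata) for an expired approval request.

    `tier` is the zero-based index of the tier that just timed out.
    """
    if action == "escalate":
        if tier + 1 < tier_count:
            return "escalate", {"next_tier": tier + 1}
        action = terminal_action       # chain exhausted: apply terminal action
    if action == "auto_approve_advisory":
        return "allow", {"auto_approved": True, "review_required": True}
    # "deny" and anything unrecognized fail closed.
    return "deny", {"guard": "approval-timeout"}
```

Treating unknown actions as `deny` keeps the model consistent with the fail-closed default.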
Auto-approve is a compliance signal, not a safe default
`auto_approve_advisory` is not the same as allow. The call proceeds, but the receipt records that no human actually reviewed it. Use it only where the cost of a missed call is strictly greater than the cost of an unreviewed-but-flagged call, and make sure your audit process actually reviews these receipts afterwards.
Python Sketch
The chio Python SDK surfaces the pending state as a structured response. A call that needs approval does not raise; it returns a result object that the caller inspects:
```python
import asyncio

from chio import Client, PendingApproval, ToolDenied


def unwrap(result):
    """Raise on a denial; otherwise report and return the tool output."""
    if isinstance(result, ToolDenied):
        raise RuntimeError(f"Tool denied: {result.reason}")
    print("Refund issued:", result.output)
    return result.output


async def request_refund(client):
    result = await client.call_tool(
        server="payment-server",
        tool="issue_refund",
        arguments={"customer_id": "cust-9012", "amount": 450, "currency": "USD"},
        governed_intent={
            "purpose": "Customer requested refund for order #8834",
            "max_amount": {"units": 450, "currency": "USD"},
        },
    )
    if isinstance(result, PendingApproval):
        # Persist the approval request id in your workflow engine.
        print(f"Awaiting approval: {result.approval_request_id}")
        print(f"Deadline: {result.deadline}")
        print(f"Summary shown to approver: {result.summary}")
        return result
    return unwrap(result)


async def resume_refund(client, signed_token_from_channel):
    # Later, when you have the signed token back from the channel:
    result = await client.resume_with_approval(
        approval_token=signed_token_from_channel,
    )
    return unwrap(result)


if __name__ == "__main__":
    asyncio.run(request_refund(Client.from_env()))
```
TypeScript Sketch
```typescript
import { ChioClient, isPendingApproval, isToolDenied } from "@chio-protocol/sdk";

// Placeholder for whatever workflow engine parks and resumes the call.
interface WorkflowEngine {
  park(state: { approvalRequestId: unknown; deadline: unknown }): Promise<void>;
}

async function issueRefund(workflow: WorkflowEngine): Promise<void> {
  const client = ChioClient.fromEnv();
  const result = await client.callTool({
    server: "payment-server",
    tool: "issue_refund",
    arguments: {
      customer_id: "cust-9012",
      amount: 450,
      currency: "USD",
    },
    governedIntent: {
      purpose: "Customer requested refund for order #8834",
      maxAmount: { units: 450, currency: "USD" },
    },
  });
  if (isPendingApproval(result)) {
    // The agent does not see the token itself; it sees a request id
    // and a deadline. Hand off to whatever workflow engine resumes
    // the call when the channel posts back the signed token.
    await workflow.park({
      approvalRequestId: result.approvalRequestId,
      deadline: result.deadline,
    });
    return;
  }
  if (isToolDenied(result)) {
    throw new Error(`denied: ${result.reason}`);
  }
  console.log("refund issued:", result.output);
}
```
In both sketches the SDK is careful not to block on a pending approval. The caller is expected to suspend its own workflow (a Temporal signal, a LangGraph interrupt, a Prefect pause) and resume only when the channel has produced a signed token.
Approval Channels
A channel is the surface through which a human sees the request and returns a decision. Chio ships several and lets you implement your own:
| Channel | Best For |
|---|---|
| `WebhookChannel` | Custom dashboards, internal approval tools |
| `SlackChannel` | Teams already on Slack; Block Kit with approve/deny buttons |
| `EmailChannel` | Compliance-heavy environments that want a signed action link |
| `DashboardChannel` | Ops teams watching the chio dashboard in real time (WebSocket push) |
| `ApiPollChannel` | Programmatic approvers that poll `/approvals/pending` |
Configuration
A grant-level HITL policy in TOML looks like this:
```toml
[[grants]]
server = "payment-server"
tool = "issue_refund"
operations = ["invoke"]

[grants.constraints]
require_approval_above = { threshold_units = 200 }
governed_intent_required = true

[grants.approval]
timeout_seconds = 3600
timeout_action = "deny"

[[grants.approval.approvers]]
public_key = "ed25519:abc..."
display_name = "Finance Lead"
contact = { slack = "@finance-lead", email = "finance@example.com" }

[[grants.approval.approvers]]
public_key = "ed25519:def..."
display_name = "CFO"
contact = { slack = "@cfo", email = "cfo@example.com" }

[grants.approval.escalation]
terminal_action = "deny"

[[grants.approval.escalation.tiers]]
approvers = ["Finance Lead"]
channels = ["slack", "dashboard"]
timeout_seconds = 900

[[grants.approval.escalation.tiers]]
approvers = ["CFO"]
channels = ["slack", "email"]
timeout_seconds = 3600
```
Batch Approval
Per-call approval creates friction for repetitive operations. Batch approval lets a human pre-approve a class of calls for a bounded window. A BatchApprovalToken declares a server pattern, tool pattern, per-call and total monetary ceilings, a max call count, and a validity window. The kernel consults the batch store before dispatching to channels: if a matching batch exists, the call is approved immediately and the batch counters are incremented.
```text
# Examples of batch approval scopes:

"Approve all search calls for the next hour"
  server_pattern: "search-server"
  tool_pattern: "*"
  max_calls: None
  not_after: now + 3600

"Approve up to 20 database reads in the next 30 minutes"
  server_pattern: "db-server"
  tool_pattern: "read_*"
  max_calls: Some(20)
  not_after: now + 1800

"Approve payments under $100 for 4 hours, max $500 total"
  server_pattern: "payment-server"
  tool_pattern: "charge"
  max_amount_per_call: { units: 100, currency: USD }
  max_total_amount: { units: 500, currency: USD }
  not_after: now + 14400
```
Receipts for batch-approved calls carry the `batch_approval_id` in metadata so the audit trail never loses the connection between a specific call and the blanket approval that authorized it.
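The batch lookup can be sketched as a match-then-count step. The Python model below makes several assumptions that the text does not confirm: patterns are shell-style globs, the monetary ceilings are inclusive, currency is ignored, and a call arriving at or after `not_after` is out of scope. The `Batch` class and `try_consume` method are hypothetical names.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass
class Batch:
    """Illustrative stand-in for a stored batch approval and its counters."""
    server_pattern: str
    tool_pattern: str
    not_after: int
    max_calls: Optional[int] = None
    max_amount_per_call: Optional[int] = None
    max_total_amount: Optional[int] = None
    calls_used: int = 0
    amount_used: int = 0

    def try_consume(self, server: str, tool: str, now: int, amount: int = 0) -> bool:
        """Approve the call and bump the counters if it fits the batch scope."""
        if now >= self.not_after:
            return False
        if not (fnmatchcase(server, self.server_pattern)
                and fnmatchcase(tool, self.tool_pattern)):
            return False
        if self.max_calls is not None and self.calls_used >= self.max_calls:
            return False
        if self.max_amount_per_call is not None and amount > self.max_amount_per_call:
            return False
        if (self.max_total_amount is not None
                and self.amount_used + amount > self.max_total_amount):
            return False
        self.calls_used += 1
        self.amount_used += amount
        return True
```

A call that returns `False` here would fall back to the normal per-call path and be dispatched to the approval channels.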
Receipts in the Approval Chain
Every transition is a signed receipt. An approved call produces a chain of two receipts: an Incomplete receipt at suspension, and an Allow receipt at execution. A denied-by-human call produces a chain of an Incomplete receipt at suspension and a Deny receipt at denial. A timeout produces one Incomplete at suspension and one Deny (or Allow if auto_approve_advisory is configured). The second receipt's metadata links back to the first by previous_receipt_id.
```jsonc
// Receipt 1: suspended (Decision::Incomplete with approval metadata)
{
  "id": "rc-001",
  "decision": { "verdict": "incomplete", "reason": "awaiting human approval" },
  "metadata": {
    "approval_request_id": "ar-d4e5",
    "summary": "Agent wants to issue a $450 refund",
    "deadline": 1713200400
  }
}

// Receipt 2: approved and dispatched (Decision::Allow)
{
  "id": "rc-002",
  "decision": { "verdict": "allow" },
  "metadata": {
    "approval_request_id": "ar-d4e5",
    "approval_token_id": "at-g6h7",
    "approver": "ed25519:finance-lead...",
    "approval_latency_ms": 127000,
    "approver_display_name": "Finance Lead",
    "channel": "slack",
    "previous_receipt_id": "rc-001"
  }
}
```
Auditing Approval Activity
The Receipt Query API exposes filters that make it easy to audit HITL activity. Common queries:
```shell
# All pending approvals in the last day (Incomplete receipts with
# an approval_request_id in metadata).
chio receipt list --decision incomplete \
  --meta approval_request_id --since 24h

# All human-denied calls.
chio receipt list --decision deny --guard human-approval

# Average approval latency for approved calls (Allow receipts that
# carry an approver in metadata).
chio receipt stats --decision allow --meta approver \
  --group-by metadata.approver_display_name

# Calls auto-approved due to timeout (review_required=true).
chio receipt list --decision allow --meta auto_approved=true
```
Security Properties
- Fail-closed at every step. Channel dispatch failure, invalid signature, expired token, wrong request, capability revoked during wait, or timeout with no response: all produce a deny receipt.
- Kernel-native. The agent never sees the approval token and never talks to the approver directly. The kernel owns the lifecycle, which is why the approval guard can be part of the pipeline rather than an out-of-band hook.
- Non-repudiation. Every decision is signed with the approver's key and recorded in the receipt chain. The chain proves who approved, what they approved, when, and through which channel.
- Separation of concerns. Approvers see only the summary and intent, not raw arguments, unless the policy explicitly exposes them. Tool servers see only validated approved calls; they never learn that HITL was in the loop.
Adopting HITL in an Existing Deployment
The rollout sequence we recommend:
- Start with `RequireApprovalAbove` on a single high-value tool. Pick a threshold that will trigger approvals for only the top few calls per day.
- Configure one channel first (dashboard or Slack). Do not wire all channels at once; each adds review surface area.
- Set `timeout_action = deny` for the first month. Verify the team is meeting the SLA before softening to `escalate`.
- Add batch approval once the team is confident with per-call review. Batch policies are harder to reason about; do not reach for them first.
- Once batch approval is settled and you have a handle on which tool categories are actually routed through the team, layer category-gated approval by routing your custom content-review or category-specific guards through the approval pipeline. (A first-class category constraint is on the roadmap; see Open Design Questions.)
Monitor the denial rate
Open Design Questions
These items are on the roadmap but not yet in the stable protocol:
- Multi-approver quorum. "2 of 3 must approve" policies for high-value operations.
- Approval delegation. Time-bounded handoff of approval authority (vacation coverage) reusing the existing delegation link mechanism.
- Partial approval. A human approves an amended version of the request (approve the refund for $300 instead of $450), requiring the token to carry modified parameters that the kernel re-binds into the intent.
- Cross-kernel approval. In federated deployments, whether an approval from one kernel can satisfy a pending request on another.
Summary
- HITL adds a third verdict to the guard pipeline: `PendingApproval`, alongside Allow and Deny. Deny dominates; approval is solicited only when non-approval guards allow.
- The request is suspended in a signed `Decision::Incomplete` receipt whose metadata carries the approval request id and deadline, and routed to human-facing channels.
- The human's decision is a signed `GovernedApprovalToken` bound to the request, intent, approver, agent, and a time window, with replay protection.
- On approve the kernel re-validates capability and guards before dispatch, then emits an `Allow` receipt whose metadata identifies the approver and token. On deny it signs a `Deny` receipt with `guard = "human-approval"`. On timeout it follows the configured policy.
- The protocol is alpha; expect additions for quorum, delegation, partial approval, and cross-kernel approval.