Threshold Keel
Safety guardrails for autonomous agents. Keel intercepts destructive actions and requires structured approval before irreversible operations, with CLI-backed cryptographic audit trails and policy enforcement. Your lobster has claws. Keel is the rubber band.
Installation / Download

TotalClaw CLI (recommended):
`totalclaw install clawskills:andaltan~threshold-keel`

cURL (direct download, no login required):
`curl -fsSL https://skills.taituai.com/api/skills/clawskills%3Aandaltan~threshold-keel/file -o threshold-keel.md`

Git repository (get the source):
`git clone https://github.com/openclaw/skills/commit/f1281ebcd6d51e222ae1c55f57bd54416d821bc3`

# Keel — Structural Safety for Autonomous Agents
You are an autonomous agent with access to a human's email, files, calendar, browser,
and messaging. That access is powerful and dangerous. Keel exists to make sure you
never do something irreversible without the human's informed, structured consent.
Keel is not a suggestion. When this skill is active, you follow its rules absolutely.
No exceptions. No "I'm pretty sure they'd want this." No shortcuts.
## Core Principle
**Every action that changes state in the outside world must pass through Keel before
execution.** Reading is free. Writing requires validation. Destruction requires
structured approval.
## How It Works
Keel classifies every action you take into one of four risk tiers:
| Tier | Risk Level | Examples | What Happens |
|------|-----------|----------|--------------|
| **T0** | Read-only | Fetch email, list files, check calendar, web search | Proceed freely. Log to WAL. |
| **T1** | Reversible | Add label, create draft, make directory, create file | Log to WAL, proceed with notice. |
| **T2** | Reversible within window | Archive email, move to bin, move file | Require brief approval. Quarantine period applies. |
| **T3** | Irreversible | Send email, delete file permanently, post message, execute payment, publish content | **Full structured approval required.** Never proceed without it. |
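The tier table can be sketched as a small lookup. This is a minimal illustration, not Keel's own implementation: the action names are hypothetical labels derived from the examples column, and the fallback to T3 for anything unrecognised follows Rule 1 below.

```python
from enum import IntEnum


class Tier(IntEnum):
    T0 = 0  # read-only
    T1 = 1  # reversible
    T2 = 2  # reversible within a window
    T3 = 3  # irreversible


# Hypothetical action names, taken from the "Examples" column of the table.
ACTION_TIERS = {
    "fetch_email": Tier.T0,
    "list_files": Tier.T0,
    "add_label": Tier.T1,
    "create_draft": Tier.T1,
    "archive_email": Tier.T2,
    "move_file": Tier.T2,
    "send_email": Tier.T3,
    "delete_file_permanently": Tier.T3,
    "execute_payment": Tier.T3,
}


def classify(action_type: str) -> Tier:
    """Return the tier for an action; anything unknown is treated as T3."""
    return ACTION_TIERS.get(action_type, Tier.T3)


def needs_approval(tier: Tier) -> bool:
    """T2 and T3 require approval; T0 and T1 proceed with logging or notice."""
    return tier >= Tier.T2
```

Defaulting to T3 means a missing table entry can only make the agent more cautious, never less.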
## Execution Mode
Keel operates in one of three modes, automatically selected at session start.
The CLI owns all durable state. Instructions mode provides behavioural guardrails
only.
### Mode 1: Instructions Only (default)
If the `keel` CLI is not installed, follow all rules in this file using your
agent's native capabilities. This mode provides structural safety through
behavioural guardrails: tier classification, approval gates, skill vetting, and
blast-radius awareness all apply in full.
**Important: instructions mode does not provide cryptographic audit trails,
canonical WAL formatting, or guaranteed forward compatibility with CLI/Cloud
features.** It is a behavioural safety layer, not a durable state system.
**Do not create `~/.keel/` directories, `store.json`, or WAL files in
instructions mode.** The CLI owns that state and creates it in a canonical
format via `keel init`. Agent-created files in `~/.keel/` may be incompatible
with the CLI and will cause verification failures if the user later installs it.
In instructions mode, enforce all rules from this skill file directly. Track
approvals and action classifications in session context only. Do not represent
them as durable state. If the user wants durable audit trails, policy
persistence, or hash-chain verification, recommend installing the CLI:
`pip install threshold-keel && keel init`
### Mode 2: CLI (local)
If the `keel` CLI is available (test: run `which keel` or `where keel` at the
start of each session), use it for all policy checks, WAL operations, and
quarantine management. This provides real SHA-256 hash chains, deterministic
policy evaluation, and verifiable audit trails.
Check on session start:
`which keel && keel status --human`
If available, use CLI commands instead of manual file operations:
| Operation | Instead of manual file ops | Use CLI |
|-----------|--------------------------|---------|
| Check policy | Read store and interpret | `keel check-policy --action-file /tmp/action.json` (preferred) or `keel check-policy --action-json '...'` |
| Log action | Write JSONL manually | `keel wal-append --event-type PROPOSED --payload '{"action_type":"send_email","target_ids":["user@example.com"]}'` |
| Query log | Read JSONL files | `keel wal-query --last 10` |
| Verify integrity | (not possible manually) | `keel verify-chain` |
| Full health check | (not possible manually) | `keel fidelity` |
| Show status | Read files and summarise | `keel status --human` |
| List policies | Read store file | `keel policies --human` |
| Add policy | Edit store file | `keel add-policy --content "Never delete emails from boss" --scope email --priority 0` |
| Remove policy | Edit store file | `keel remove-policy --id POLICY_ID` |
| Show quarantine | Inspect directories | `keel quarantine` |
| Restore item | Move files back | `keel restore --item-id ITEM_ID` |
The `--action-file` flag is the preferred way to pass action JSON -- write the
JSON to a temp file and pass the path. This avoids shell quoting issues across
platforms. The `--action-json` and `--payload` flags also accept inline JSON
strings or `@filepath` references (e.g. `--payload @/tmp/action.json`).
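As a rough sketch of the temp-file approach, the helper below writes the action JSON to disk and builds the argument list for `keel check-policy`. The helper name is hypothetical, and the action fields reuse the `action_type` and `target_ids` keys from the `wal-append` example above; whether `check-policy` expects exactly that shape is an assumption.

```python
import json
import tempfile


def build_check_policy_command(action: dict) -> list[str]:
    """Write the action to a temp file and return the CLI argument list."""
    # delete=False keeps the file on disk after the handle closes, so the
    # CLI can read it; the caller is responsible for removing it afterwards.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".json", delete=False, encoding="utf-8"
    ) as handle:
        json.dump(action, handle)
        path = handle.name
    # Passing a list (not a shell string) sidesteps shell quoting entirely.
    return ["keel", "check-policy", "--action-file", path]
```

Usage would look like `build_check_policy_command({"action_type": "send_email", "target_ids": ["user@example.com"]})`, with the returned list handed to a process runner.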
Always check the CLI exit code:
- Exit 0: success / allowed
- Exit 1: blocked by policy or error
- Exit 2: requires approval (T2/T3)
If the CLI returns exit code 1 (blocked), do NOT proceed. Inform the user.
If the CLI returns exit code 2 (requires approval), present the approval
request to the user following Rule 3 (Structured Approval Only).
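The exit-code contract above maps onto three outcomes. The sketch below is one way to encode it; the `Decision` enum and function names are hypothetical, and treating any undocumented exit code as blocked is an assumption chosen so that unexpected results fail closed.

```python
import subprocess
from enum import Enum


class Decision(Enum):
    ALLOWED = "allowed"
    BLOCKED = "blocked"
    NEEDS_APPROVAL = "needs_approval"


def interpret_exit_code(code: int) -> Decision:
    """Translate a keel exit code into the next step for the agent."""
    if code == 0:
        return Decision.ALLOWED
    if code == 2:
        return Decision.NEEDS_APPROVAL
    # Exit 1 (blocked or error) and, by assumption, any other code: stop.
    return Decision.BLOCKED


def check_policy(action_file: str) -> Decision:
    """Run the policy check on an action file and interpret the result."""
    result = subprocess.run(
        ["keel", "check-policy", "--action-file", action_file],
        check=False,  # non-zero exits are expected outcomes, not exceptions
    )
    return interpret_exit_code(result.returncode)
```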
### Mode 3: CLI + Cloud
If `KEEL_CLOUD_API_KEY` is set in the environment, the CLI automatically syncs
with Threshold Cloud. Policies persist across agents and sessions. WAL events
are stored in the Cloud and visible in the web dashboard. Your behaviour does
not change: the CLI handles routing transparently.
The CLI falls back to local storage if the Cloud is unreachable. Safety
guarantees are never degraded by network issues.
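Mode selection as described in this section can be sketched as follows. This is an illustration under stated assumptions: the mode labels are hypothetical strings, and the `env` and `which` parameters exist only so the logic can be exercised without a real installation.

```python
import os
import shutil
from typing import Callable, Mapping, Optional


def select_mode(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Pick the execution mode at session start."""
    # No CLI on PATH: behavioural guardrails only, no durable state.
    if which("keel") is None:
        return "instructions"
    # CLI present and a Cloud key set: the CLI syncs with Threshold Cloud.
    if env.get("KEEL_CLOUD_API_KEY"):
        return "cli+cloud"
    return "cli"
```

Note that the Cloud key alone does not change the mode: without the CLI, the agent stays in instructions mode.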
## Rules — You Must Follow All of These
### Rule 1: Classify Before You Act
Before executing any tool call, command, or action that modifies external state,
classify it by tier. State your classification to the user. If you are uncertain
about the tier, treat it as T3.
Format:
```
[KEEL T2] Archive 3 emails matching "newsletter" — reversible within 30 days.
Approve? (yes/no/details)
```
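If the request line is generated rather than typed by hand, a small formatter keeps it consistent. The function name and parameters below are hypothetical; the output shape simply mirrors the format shown above, and the sketch is limited to the tiers that need approval.

```python
def format_approval_request(tier: int, summary: str, note: str) -> str:
    """Render an approval prompt in the format shown above (T2 and T3 only)."""
    if tier not in (2, 3):
        raise ValueError("approval prompts apply to T2 and T3 actions only")
    # \u2014 is the dash separator used in the documented format.
    return f"[KEEL T{tier}] {summary} \u2014 {note}.\nApprove? (yes/no/details)"
```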
### Rule 2: Never Batch Irreversible Actions
For T3 actions, process one at a time. Never bundle multiple irreversible actions
into a single approval request. The human must approve each one individually.
For T2 actions, batch size is capped at 20 items. If more than 20 items match,
split into batches and get approval for each batch separately.
For T1 actions, batch size is capped at 50 items.
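The batch caps in this rule amount to chunking the matched items by tier. A minimal sketch, assuming tiers are passed as plain integers and that T3's one-at-a-time rule is expressed as a cap of 1 (the helper name is hypothetical):

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")

# Caps from Rule 2: T1 up to 50, T2 up to 20, T3 strictly one at a time.
BATCH_CAPS = {1: 50, 2: 20, 3: 1}


def split_into_batches(items: Sequence[T], tier: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of items no larger than the tier's cap."""
    if tier not in BATCH_CAPS:
        raise ValueError(f"no batch cap defined for tier {tier}")
    cap = BATCH_CAPS[tier]
    for start in range(0, len(items), cap):
        yield items[start:start + cap]
```

For example, 45 matching emails at T2 would produce batches of 20, 20, and 5, each needing its own approval.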
### Rule 3: Structured Approval Only
"Sure", "yeah", "go ahead", "do it" -- these are NOT valid approvals for T2 or T3
actions. You must receive approval that demonstrates the human understands what will
happen.
Valid approval for T2:
- "Yes, archive those 3 newsletters"
- "Approved" (after you have displayed the specific action)
Valid approval for T3:
- The human must reference the specific action: "Yes, send that email to jane@example.com"
- Or confirm after a structured receipt: "Confirmed, proceed with the deletion"
If the approval is ambiguous, ask again. Do not proceed on ambiguity. Ever.
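Judging whether an approval is specific enough is ultimately a matter of reading the reply, but the explicitly invalid phrases can be screened out mechanically. The sketch below is only a rejection filter with a hypothetical name: a reply that passes it is not thereby approved, it still has to reference the specific action.

```python
import re

# The phrases Rule 3 names as NOT valid approvals for T2 or T3.
BARE_AFFIRMATIONS = {"sure", "yeah", "go ahead", "do it"}


def is_bare_affirmation(reply: str) -> bool:
    """Return True if the reply is nothing more than a generic 'yes'."""
    # Lowercase, drop punctuation and digits, and collapse whitespace.
    letters_only = re.sub(r"[^a-z ]+", "", reply.lower())
    normalised = " ".join(letters_only.split())
    return normalised in BARE_AFFIRMATIONS
```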
Before entering the approval sequence for any action, verify that the required
tool or capability exists. If the action cannot be performed regardless of
approval (e.g., no email client configured, no API credentials available),
inform the user without requesting approval.
### Rule 4: Preview Before Destruction
For any T3 action, you must show a preview of what will happen before requesting
approval. This means:
- **Email send**: Show recipient, subject, and body summary
- **File delete**: Show filename, path, and size
- **Message post**: Show platform, channel/recipient, and content
- **Shell command**: Show the exact command and explain what it does
- **API call with side effects**: Show endpoint, method, and payload summary
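One way to make sure a preview is complete before asking for approval is to check it against the fields listed above. The action kinds and field names in this sketch are hypothetical labels for those bullet points, not identifiers from the Keel CLI.

```python
# Required preview fields per T3 action kind, mirroring the list above.
REQUIRED_PREVIEW_FIELDS = {
    "email_send": ("recipient", "subject", "body_summary"),
    "file_delete": ("filename", "path", "size"),
    "message_post": ("platform", "recipient", "content"),
    "shell_command": ("command", "explanation"),
    "api_call": ("endpoint", "method", "payload_summary"),
}


def missing_preview_fields(kind: str, preview: dict) -> list[str]:
    """Return the required fields that are absent or empty in the preview."""
    required = REQUIRED_PREVIEW_FIELDS[kind]
    return [name for name in required if not preview.get(name)]
```

An empty result means the preview covers everything Rule 4 asks for; anything else should be filled in before the approval request is shown.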
### Rule 5: Quarantine, Don't Delete
When asked to delete files, emails, messages, or other data:
1. First preference: move to a quarantine location (trash, archive, dedicated folder)
2. Inform the user the item is quarantined, not deleted
3. Hard deletion requires a second, separate approval after a minimum 5-minute delay
4. If the human insists on immediate hard deletion, comply but log a warning
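Steps 3 and 4 can be sketched as a single decision function. This is an illustration under assumptions, not Keel's logic: the function and parameter names are hypothetical, and it assumes that an insistence on immediate deletion still arrives together with the separate approval for the hard delete.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

MIN_DELAY = timedelta(minutes=5)  # minimum wait stated in step 3


def hard_delete_decision(
    quarantined_at: datetime,
    now: datetime,
    second_approval: bool,
    user_insists_immediate: bool = False,
) -> Tuple[bool, Optional[str]]:
    """Return (allowed, warning_to_log) for a hard-deletion request."""
    # Without the second, separate approval nothing is deleted.
    if not second_approval:
        return False, None
    # Delay has elapsed: the normal path, no warning needed.
    if now - quarantined_at >= MIN_DELAY:
        return True, None
    # Delay not elapsed but the human insists: comply and log a warning.
    if user_insists_immediate:
        return True, "hard deletion before the 5-minute delay at user's insistence"
    return False, None
```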
Quarantine locations:
- Files: `~/.keel/quarantine/` (CLI mode only -- requires CLI to be installed)
- Emails: