context-management

TotalClaw · Author: totalclaw · v1.0.0

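Manage AI agent context window consumption, prevent compaction death spirals, and enforce sub-agent spawn policy. Use when: (1) context is filling up and work quality may degrade, (2) deciding whether to spawn a sub-agent or work in the main session, (3) preparing for compaction or a session handoff, (4) the user asks "What's eating my context?" or "How much runway is left?", (5) restoring working state from a checkpoint file after compaction. Not for: general memory/workspace management (use the memory manager or workspace standards).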

Installation / Download

TotalClaw CLI (recommended)

totalclaw install totalclaw:totalclaw~marcus-daemon-context-management

cURL (direct download, no login required)

curl -fsSL https://skills.taituai.com/api/skills/totalclaw%3Atotalclaw~marcus-daemon-context-management/file -o marcus-daemon-context-management.md

# Context Management

Prevent context exhaustion, enforce spawn discipline, and make compaction survivable.

## Core Concepts

1. **Fixed baseline**: Typically 5-15% of context consumed before any conversation — system prompt, workspace files, skill descriptions, tool definitions. Varies by setup (more skills/files = higher baseline).
2. **60/40 rule**: ~60% of consumed context is tool outputs, ~40% conversation. Tool outputs are the primary target for savings.
3. **Compaction is lossy**: Summaries stack cumulatively. Each cycle raises the floor. After 3+ compactions, summaries alone can consume 30%+ of context.
4. **Sub-agents are disposable context**: A sub-agent can burn most of its context investigating something; only the summary (~500 tokens) enters main context.

All percentages are relative to the model's context window. Check `session_status` for actual window size and usage.

## Procedures

### When Context Pressure Rises

After every tool-heavy operation (>5 tool calls), assess:

1. Run `session_status` to check usage
2. If **below 50%**: continue normally
3. If **50-70%**: spawn sub-agents for remaining tool-heavy work (>3 tool calls)
4. If **70-85%**: spawn sub-agents for ANY tool work (>1 tool call). Warn user.
5. If **above 85%**: write checkpoint (see below), suggest `/compact` or `/new`
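
The bands above can be sketched as a small lookup. This is a minimal illustration, not part of the skill: the function name and the returned labels are hypothetical, and assigning exact boundary values (50, 70, 85) to a particular band is an assumption, since the text does not say which side they fall on.

```python
def pressure_action(usage_pct: float) -> str:
    """Map context usage (percent of the window) to the action for that band.

    Labels are hypothetical. Boundary handling is an assumption: 50 and 70
    go to the more cautious band, and 85 stays in the 70-85 band because the
    text says "above 85%".
    """
    if usage_pct < 50:
        return "continue"
    if usage_pct < 70:
        return "spawn-heavy"       # sub-agents for work with >3 tool calls
    if usage_pct <= 85:
        return "spawn-any+warn"    # sub-agents for any work with >1 tool call, warn user
    return "checkpoint"            # write checkpoint, suggest /compact or /new


for pct in (42, 63, 80, 91):
    print(pct, pressure_action(pct))
```

The usage figure itself would come from `session_status`; how that value is read is left out here because its output format is not described in this document.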

### "What's Eating My Context?" — Estimation Method

An exact per-component breakdown is not available, so estimate from typical costs:

```
Fixed baseline:         ~5-15% (system prompt + workspace files + skills + tools)
Per user message:       ~100-500 tokens each
Per assistant response: ~200-1000 tokens each
Per tool call result:   ~500-5000 tokens each (exec/read heavy, search light)
Compaction summaries:   ~2000-5000 tokens each (cumulative!)
```

Count messages and tool calls in recent history, multiply by midpoint estimates. Report as ranges, not false precision. For per-operation cost detail, read `references/operation-costs.md`.
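
As a rough sketch of that arithmetic, the per-item ranges from the table can be multiplied by counts taken from recent history. The function and key names below are hypothetical. The fixed baseline is left out because it is given as a percentage of the window rather than as a token count.

```python
from typing import Dict, Tuple

# (low, high) token ranges per item, copied from the estimation table.
RANGES: Dict[str, Tuple[int, int]] = {
    "user_message": (100, 500),
    "assistant_response": (200, 1000),
    "tool_result": (500, 5000),
    "compaction_summary": (2000, 5000),
}


def estimate_tokens(counts: Dict[str, int]) -> Dict[str, Tuple[int, int, int]]:
    """Return (low, midpoint, high) token estimates for each component."""
    estimates = {}
    for kind, n in counts.items():
        low, high = RANGES[kind]
        estimates[kind] = (n * low, n * (low + high) // 2, n * high)
    return estimates


counts = {"user_message": 20, "assistant_response": 20,
          "tool_result": 30, "compaction_summary": 1}
for kind, (low, mid, high) in estimate_tokens(counts).items():
    print(f"{kind}: ~{mid} tokens (range {low}-{high})")
```

Reporting the low and high bounds alongside the midpoint keeps the answer in range form, as the text recommends.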

### Spawn Policy

If `.context-policy.yml` exists in workspace root, use it as guidance for spawn thresholds and task categories. Otherwise use these defaults:

**Always spawn** (regardless of context level):
- Test suites (>3 tests)
- Multi-file audits (>5 files)
- Build/deploy pipelines
- Research tasks (web search + analysis)
- Bulk file operations

**Never spawn** (keep in main session):
- Single commands
- Conversations / discussions
- Quick edits (1-3 files)
- Status checks
- Tasks requiring user input mid-execution

**Context-dependent** (spawn when context exceeds threshold):
- Above 50%: spawn if task involves >5 tool calls
- Above 70%: spawn if task involves >2 tool calls

When spawning, write detailed task descriptions. Sub-agents have no conversation context — they only know what the task field tells them.
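
The default policy in this section could be expressed roughly as follows. This is a sketch under stated assumptions: the category labels are hypothetical shorthand for the bullet lists above, and the caller is assumed to have already applied the size qualifiers (for example, only labelling a run as `test_suite` when it has more than 3 tests). It does not read `.context-policy.yml`, whose schema is not described here.

```python
# Hypothetical labels for the "Always spawn" and "Never spawn" lists.
ALWAYS_SPAWN = {"test_suite", "multi_file_audit", "build_deploy",
                "research", "bulk_file_ops"}
NEVER_SPAWN = {"single_command", "conversation", "quick_edit",
               "status_check", "needs_user_input"}


def should_spawn(category: str, usage_pct: float, tool_calls: int) -> bool:
    """Apply the default spawn policy from this section."""
    if category in ALWAYS_SPAWN:
        return True
    if category in NEVER_SPAWN:
        return False
    # Context-dependent tasks: thresholds from the "Context-dependent" list.
    if usage_pct > 70:
        return tool_calls > 2
    if usage_pct > 50:
        return tool_calls > 5
    return False


print(should_spawn("research", 10, 1))       # always-spawn category
print(should_spawn("refactor", 60, 4))       # context-dependent, under threshold
print(should_spawn("refactor", 75, 4))       # context-dependent, over threshold
```

Checking the never-spawn list before the context thresholds keeps tasks that need user input in the main session even when context is tight.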

### Pre-Compaction Checkpoint

Before compaction or `/new`, write `.context-checkpoint.md` in the **workspace root** (the agent reads this post-compaction):

```markdown
# Context Checkpoint — {date} {time}

## Active Task
{what you were doing}

## Key State
{bullet list of current state — what's done, what's in progress}

## Decisions Made This Session
{numbered list of decisions with rationale}

## Files Changed
{list of files modified this session}

## Next Steps
{what to do after resuming}
```

This file survives compaction. On session start or post-compaction, check for it and use it to restore context. Delete after consuming.

**Coordination with OpenClaw memoryFlush:** OpenClaw may fire its own pre-compaction flush, which writes to the daily log. The checkpoint is complementary: the flush saves to the daily log, while the checkpoint saves structured resume state. Both should exist. If the memoryFlush fires first, compaction may already be in progress, so for critical sessions write checkpoints proactively at 75% rather than waiting for 85%.

The `scripts/context-checkpoint.sh` script handles basic write/read/clear. For the full 5-section checkpoint, write the file directly — multiline content works better that way.
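
For agents that write the file from code rather than by hand, a minimal sketch of filling the five-section template might look like this. It is not the bundled script. The function names and the timestamp format are assumptions, since the template only says `{date} {time}`.

```python
from datetime import datetime
from pathlib import Path
from typing import List, Optional

CHECKPOINT_NAME = ".context-checkpoint.md"  # file name from the text above


def _bullets(items: List[str]) -> str:
    return "\n".join(f"- {item}" for item in items)


def render_checkpoint(task: str, state: List[str], decisions: List[str],
                      files: List[str], next_steps: List[str],
                      now: Optional[datetime] = None) -> str:
    """Fill the five-section template. The timestamp format is an assumption."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M")
    numbered = "\n".join(f"{n}. {d}" for n, d in enumerate(decisions, 1))
    parts = [
        f"# Context Checkpoint \u2014 {stamp}",  # \u2014 is the dash used in the template
        "## Active Task\n" + task,
        "## Key State\n" + _bullets(state),
        "## Decisions Made This Session\n" + numbered,
        "## Files Changed\n" + _bullets(files),
        "## Next Steps\n" + _bullets(next_steps),
    ]
    return "\n\n".join(parts) + "\n"


def write_checkpoint(workspace_root: Path, text: str) -> Path:
    """Write the checkpoint into the workspace root and return its path."""
    path = workspace_root / CHECKPOINT_NAME
    path.write_text(text, encoding="utf-8")
    return path
```

A call such as `write_checkpoint(Path("."), render_checkpoint(...))` would place the file in the current directory, which is assumed here to be the workspace root.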

### Post-Compaction Recovery

After compaction or `/new`:

1. Read `.context-checkpoint.md` if it exists
2. Read today's daily log if the workspace has one (e.g. `memory/{today}.md`)
3. Resume from the checkpoint's "Next Steps"
4. Delete the checkpoint file after restoring context
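
Steps 1, 3, and 4 can be sketched as a read-then-delete helper plus a small parser for the "Next Steps" section. Both function names are hypothetical, and reading the daily log (step 2) is omitted because its location varies by workspace.

```python
from pathlib import Path
from typing import List, Optional


def consume_checkpoint(workspace_root: Path) -> Optional[str]:
    """Read the checkpoint if present, delete it, and return its text."""
    path = workspace_root / ".context-checkpoint.md"
    if not path.exists():
        return None
    text = path.read_text(encoding="utf-8")
    path.unlink()  # the caller keeps the text, so the file is no longer needed
    return text


def next_steps(checkpoint_text: str) -> List[str]:
    """Return the non-empty lines under '## Next Steps', up to the next heading."""
    steps: List[str] = []
    in_section = False
    for line in checkpoint_text.splitlines():
        if line.startswith("## "):
            in_section = line.strip() == "## Next Steps"
            continue
        if in_section and line.strip():
            steps.append(line.strip())
    return steps
```

Deleting immediately after reading is a design choice in this sketch: it avoids a stale checkpoint being picked up by a later session, at the cost of losing the file if the caller discards the returned text.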

### Proactive Warning Template

When context exceeds 65%, warn:

```
⚠️ Context: {pct}% ({used}k/{total}k). Estimated runway: ~{remaining_calls}
tool calls. {recommendation}
```

Recommendations by level:
- 65%: "Spawning sub-agents for remaining tool-heavy work."
- 75%: "Recommend compacting soon. Writing checkpoint."
- 85%: "Context critical. Writing checkpoint now. Suggest `/compact` or `/new`."
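
One way to fill the template is sketched below. The text does not say how `{remaining_calls}` is computed, so this sketch assumes the remaining tokens are divided by 2750, the midpoint of the ~500-5000 tokens-per-tool-result range from the estimation section. Treating exactly 65% as the first warning level is also an assumption, and the function names are hypothetical.

```python
from typing import Optional

AVG_TOOL_CALL_TOKENS = 2750  # assumption: midpoint of the ~500-5000 range


def recommendation(pct: float) -> str:
    """Pick the recommendation text for the current level."""
    if pct >= 85:
        return "Context critical. Writing checkpoint now. Suggest `/compact` or `/new`."
    if pct >= 75:
        return "Recommend compacting soon. Writing checkpoint."
    return "Spawning sub-agents for remaining tool-heavy work."


def format_warning(used: int, total: int) -> Optional[str]:
    """Return the warning line, or None while usage is below 65%."""
    pct = 100 * used / total
    if pct < 65:
        return None
    remaining_calls = (total - used) // AVG_TOOL_CALL_TOKENS
    return (f"⚠️ Context: {pct:.0f}% ({used // 1000}k/{total // 1000}k). "
            f"Estimated runway: ~{remaining_calls} tool calls. "
            f"{recommendation(pct)}")


print(format_warning(150_000, 200_000))
```

Because tool results vary by an order of magnitude, the runway figure is best read as a rough guide rather than a countdown.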

## Session Profiling & Config Advice

After significant work (or on request), profile the current session and recommend config changes.

### Step 1: Classify the Session Pattern

Run `session_status`. Count approximate tool calls and message exchanges. Classify:

| Pattern | Signature | Example |
|---------|-----------|---------|
| **Tool-heavy** | Most context from tool results, many exec/read/web calls | Audits, migrations, test suites, debugging |
| **Conversational** | Most context from messages, few tool calls | Planning, discussion, decisions |
| **Mixed** | Roughly even split | Feature builds (discuss → code → test → discuss) |
| **Bursty** | Long quiet periods with intense tool bursts | Monitoring + incident response |

### Step 2: Recommend Config

There are four settings that matter. When explaining them to the user, always describe **what they do in practice**, not just the setting name:

**1. When to compress the conversation** (`reserveTokensFloor`)
How full the context gets before the agent summarises and compresses the history. A higher number means it compresses sooner — producing a shorter summary with more room left afterwards.
- `30000` — waits until nearly full. Risk: huge summary, little room after.
- `50000` — compresses at ~75% full. Good balance.
- `60000` — compresses early at ~70%. Maximum breathing room.
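
The percentages in this list are consistent with a 200,000-token window if compaction is assumed to trigger once free space drops to the floor. That reading is inferred from the numbers rather than stated in the text, so the helper below (a hypothetical name) is only a way to redo the arithmetic for other window sizes.

```python
def compaction_trigger_pct(reserve_tokens_floor: int,
                           window_tokens: int = 200_000) -> float:
    """Percent of the window used when free space equals the floor.

    Assumes compaction fires when (window - used) falls to the floor, and
    defaults to a 200k window. Both are inferences, not documented behaviour.
    """
    return 100 * (window_tokens - reserve_tokens_floor) / window_tokens


for floor in (30_000, 50_000, 60_000):
    print(floor, f"{compaction_trigger_pct(floor):.0f}%")
```

On a smaller window the same floor triggers much earlier: under the same assumption, `50000` on a 100,000-token window would compress at about half full.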

**2. How quickly old tool output is cleared** (`pruning TTL`)
After you stop talking for this long, the agent clears old command outputs, file reads, and search results from memory. Shorter = more aggressive cleanup.
- `5m` — only clears after 5 minutes of silence. Rarely fires during active work.
- `2m` — clears after 2 minutes. Good for most workflows.
- `1m` — aggressive. Clears fast, but you might need to re-read files.

**3. How many recent exchanges are protected from cleanup** (`keepLastAssistants`)
When clearing old tool output, this many of your most recent back-and-forth exchanges are kept untouched.
- `3` — keeps more history visible. Good for conversations.
- `2` — moderate protection.
- `1` — only the last exchange is safe. Most aggressive cleanup.

**4. Minimum size before tool output gets trimmed** (`minPrunableToolChars`)
Only tool results larger than this (in characters) are eligible for trimming. Lower = more things get cleaned up.
- `50000` (default) — only trims very large outputs (long file reads, huge command output).
- `10000` — also trims medium outputs. Catches more.
- `5000` — aggressive. Most tool results are eligible.

**Recommended combinations by work style:**

| Work style | Compress at | Clear after | Protect | Trim above |
|------------|------------|-------------|---------|------------|
| Tool-heavy (audits, tests, debugging) | `60000` | `1m` | `1` | `10000` |
| Conversational (planning, discussion) | `30000` | `5m` | `3` | `50000` |
| Mixed (code → test → discuss) | `50000` | `2m` | `2` | `10000` |
| Bursty (monitoring + incidents) | `50000` | `2m` | `1` | `10000` |
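
For use in a script, the table can be carried as a lookup keyed by the session pattern from Step 1. This is a sketch only: `pruningTtl` is a hypothetical key standing in for the `pruning TTL` setting, and the dictionary layout is not the actual config file schema.

```python
from typing import Dict, Union

Setting = Union[int, str]

# Values copied from the table above. "pruningTtl" is a hypothetical key name.
PRESETS: Dict[str, Dict[str, Setting]] = {
    "tool-heavy": {"reserveTokensFloor": 60000, "pruningTtl": "1m",
                   "keepLastAssistants": 1, "minPrunableToolChars": 10000},
    "conversational": {"reserveTokensFloor": 30000, "pruningTtl": "5m",
                       "keepLastAssistants": 3, "minPrunableToolChars": 50000},
    "mixed": {"reserveTokensFloor": 50000, "pruningTtl": "2m",
              "keepLastAssistants": 2, "minPrunableToolChars": 10000},
    "bursty": {"reserveTokensFloor": 50000, "pruningTtl": "2m",
               "keepLastAssistants": 1, "minPrunableToolChars": 10000},
}


def recommend_config(pattern: str) -> Dict[str, Setting]:
    """Return a copy of the preset for a session pattern (case-insensitive)."""
    return dict(PRESETS[pattern.lower()])


print(recommend_config("Mixed"))
```

Returning a copy lets the caller adjust a single value for one session without changing the shared presets.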

Additional tips:
- **Sessions with browser/canvas work**: Ensure those tools are protected from cleanup in the config
- **Long-r