learning-loop

TotalClaw · Author: totalclaw · v1.4.0

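A structured self-improvement system for AI agents, with confidence decay, cross-agent sharing, and anomaly detection. Use it: (1) after debugging sessions, to capture lessons learned; (2) when receiving feedback or corrections from the user; (3) before risky actions, to check the relevant rules; (4) weekly, to review metrics and promote proven patterns into hard rules; (5) to set up persistent memory that survives session compaction.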

Installation / Download

TotalClaw CLI (recommended)

totalclaw install totalclaw:totalclaw~yoder-bawt-learning-loop

cURL direct download, no login required

curl -fsSL https://skills.taituai.com/api/skills/totalclaw%3Atotalclaw~yoder-bawt-learning-loop/file -o yoder-bawt-learning-loop.md


# Learning Loop

**Stop waking up stupid.**

AI agents lose everything on compaction. Every debugging session, every hard-won lesson, every correction from your human - gone. You start fresh and repeat the same failures. Your human notices. Trust erodes.

The Learning Loop is a structured self-improvement system that gives agents persistent, compounding intelligence. It captures what you learn, promotes proven patterns into hard rules, tracks your improvement over time, and automatically detects when your human is satisfied or frustrated. It now also includes confidence decay and cross-agent knowledge sharing.

This isn't a toy. This is infrastructure for agents that want to get measurably better at their job, every single session.

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│                    LEARNING LOOP v1.4.0                     │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  INPUT LAYER        PROCESSING LAYER        OUTPUT LAYER    │
│  ───────────        ────────────────        ────────────    │
│                                                             │
│  Events ──────────▶ Pattern Detection ────▶ Reports         │
│      │                     │                    │           │
│      ▼                     ▼                    ▼           │
│  lessons.json       Confidence Decay        Rules           │
│      │                     │                    │           │
│      ▼                     ▼                    ▼           │
│  Promotion ◀─────── Anomaly Detection ◀──── Enforcement     │
│                                                             │
│  CROSS-AGENT LAYER:                                         │
│  Export ─────▶ Portable Format ─────▶ Import                │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

**Data Flow:**
- **Tier 1: Events** - Raw logs of debugging sessions, mistakes, successes, feedback. Append-only, never deleted.
- **Tier 2: Lessons** - Patterns extracted from events. Tracked by applications (times applied) and saves (mistakes prevented).
- **Tier 3: Rules** - Lessons promoted after 3+ successful applications with 0.9+ confidence.

**Confidence Decay:** Rules lose confidence over time using Ebbinghaus-inspired exponential decay. Stale rules (confidence < 0.5) are flagged for review.
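
The decay step can be sketched as follows. This is a minimal Python illustration, not the logic of `confidence-decay.sh`, whose internals are not shown here. The 30-day half-life, the function names, and the choice to measure age from a rule's last validation are all assumptions; only the 0.5 review threshold comes from the text.

```python
import math
from datetime import datetime, timezone

HALF_LIFE_DAYS = 30.0   # assumed value; the skill's real decay constant is not documented here
STALE_THRESHOLD = 0.5   # from the text: rules below 0.5 are flagged for review


def decayed_confidence(confidence: float, last_validated: datetime, now: datetime) -> float:
    """Exponential decay: confidence halves every HALF_LIFE_DAYS (assumed model)."""
    age_days = max((now - last_validated).total_seconds() / 86400.0, 0.0)
    return confidence * math.exp(-math.log(2) * age_days / HALF_LIFE_DAYS)


def is_stale(confidence: float) -> bool:
    """True when a rule should be flagged for review."""
    return confidence < STALE_THRESHOLD


now = datetime(2026, 3, 9, tzinfo=timezone.utc)
last_validated = datetime(2026, 2, 7, tzinfo=timezone.utc)  # exactly 30 days earlier
current = decayed_confidence(0.9, last_validated, now)
print(round(current, 3), is_stale(current))
```

Under these assumptions, a rule that starts at 0.9 falls to about 0.45 after one half-life and is therefore flagged. A rule that is re-validated regularly never accumulates enough age to cross the threshold.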

**Cross-Agent Sharing:** Export rules as portable JSON with metadata (hashes, provenance). Import from other agents with conflict detection and trust scoring.
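
A rough sketch of what export and import with conflict detection might look like. The portable format is not specified in this document, so the envelope layout, the field names (`id`, `text`, `hash`, `source_agent`), and the use of SHA-256 are hypothetical. Trust scoring is left out because the text gives no details about it.

```python
import hashlib
import json


def rule_hash(rule_text: str) -> str:
    """Content hash for integrity checks (SHA-256 is an assumption)."""
    return hashlib.sha256(rule_text.encode("utf-8")).hexdigest()


def export_rules(rules: list, agent_name: str) -> str:
    """Wrap rules in a JSON envelope with provenance and per-rule hashes."""
    payload = {
        "source_agent": agent_name,  # provenance; hypothetical field name
        "rules": [{**rule, "hash": rule_hash(rule["text"])} for rule in rules],
    }
    return json.dumps(payload, indent=2, sort_keys=True)


def import_rules(portable_json: str, local_rules: list) -> dict:
    """Sort incoming rule ids into new, duplicate, conflict, and corrupt."""
    local_by_id = {rule["id"]: rule for rule in local_rules}
    result = {"new": [], "duplicate": [], "conflict": [], "corrupt": []}
    for rule in json.loads(portable_json)["rules"]:
        if rule_hash(rule["text"]) != rule["hash"]:
            result["corrupt"].append(rule["id"])      # text no longer matches its hash
        elif rule["id"] not in local_by_id:
            result["new"].append(rule["id"])
        elif local_by_id[rule["id"]]["text"] == rule["text"]:
            result["duplicate"].append(rule["id"])
        else:
            result["conflict"].append(rule["id"])     # same id, different text
    return result


exported = export_rules(
    [{"id": "R-001", "text": "Verify find output on macOS"},
     {"id": "R-002", "text": "Confirm before external comms"}],
    agent_name="agent-a",
)
report = import_rules(exported, [{"id": "R-002", "text": "Never send external comms"}])
print(report)
```

Here `R-001` is reported as new and `R-002` as a conflict, because the importing agent already holds a rule with that id but different text. A conflict is only surfaced, never resolved automatically, which leaves the decision to a human or to whatever trust policy the importing agent applies.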

## When to Activate

Use the Learning Loop when:
1. **After debugging sessions** - Capture the problem, solution, and confidence level to events.jsonl
2. **Receiving user feedback** - Positive ("perfect", "exactly") or negative ("wrong", "I already told you") signals trigger automatic capture
3. **Before risky actions** - Check pre-action-checklist.md and rules.json for relevant constraints
4. **Weekly maintenance** - Run pattern detection, confidence decay, promote qualified lessons to rules, update metrics
5. **During compaction** - Flush uncaptured events to prevent knowledge loss
6. **Sharing knowledge** - Export rules for other agents, import rules from trusted sources
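
The feedback trigger in item 2 could be approximated with a simple phrase match. This sketch uses only the four phrases quoted above; the real signal list and matching logic are not shown in this document, and the function name is hypothetical.

```python
from typing import Optional

POSITIVE_SIGNALS = ("perfect", "exactly")            # phrases quoted in the text
NEGATIVE_SIGNALS = ("wrong", "i already told you")   # phrases quoted in the text


def classify_feedback(message: str) -> Optional[str]:
    """Return 'negative', 'positive', or None for a user message.

    Negative signals are checked first so that a mixed message such as
    "exactly wrong" is treated as a correction (a design assumption).
    """
    text = message.lower()
    if any(signal in text for signal in NEGATIVE_SIGNALS):
        return "negative"
    if any(signal in text for signal in POSITIVE_SIGNALS):
        return "positive"
    return None


for message in ("Perfect, ship it", "No, I already told you to use tabs", "ok thanks"):
    print(message, "->", classify_feedback(message))
```

Substring matching is deliberately crude: it will also fire on words like "wrongdoing", so a production version would need word boundaries or a richer classifier.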

## What It Does

```
Events (raw)  -->  Lessons (structured)  -->  Rules (enforced)
append-only        proven patterns            hard constraints
events.jsonl       lessons.json               rules.json
```

**Three-tier knowledge system:**
- **Tier 1: Events** - Raw logs of debugging sessions, mistakes, successes, feedback. Append-only, never deleted.
- **Tier 2: Lessons** - Patterns extracted from events. Tracked by how many times they've been applied and how many mistakes they've prevented.
- **Tier 3: Rules** - Lessons promoted after 3+ successful applications with 0.9+ confidence. Loaded at boot. These are your behavioral constraints.
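
The promotion criterion can be written as a small predicate. The thresholds come from this document (3+ applications here, 0.9+ confidence in the Data Flow summary); the lesson field names `applications`, `saves`, and `confidence` are assumed, since the `lessons.json` schema is not shown, and this is not the code of `promote-rules.sh`.

```python
MIN_APPLICATIONS = 3    # from the text: 3+ successful applications
MIN_CONFIDENCE = 0.9    # from the Data Flow summary: 0.9+ confidence


def eligible_for_promotion(lesson: dict) -> bool:
    """True when a lesson meets both promotion thresholds (field names assumed)."""
    return (lesson.get("applications", 0) >= MIN_APPLICATIONS
            and lesson.get("confidence", 0.0) >= MIN_CONFIDENCE)


lessons = [
    {"id": "L-001", "applications": 4, "saves": 2, "confidence": 0.95},
    {"id": "L-002", "applications": 2, "saves": 1, "confidence": 0.99},  # too few applications
    {"id": "L-003", "applications": 5, "saves": 0, "confidence": 0.70},  # confidence too low
]
promotable = [lesson["id"] for lesson in lessons if eligible_for_promotion(lesson)]
print(promotable)
```

Only `L-001` qualifies: the other two each fail one of the two thresholds.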

**Five enforcement layers** ensure learning happens even when discipline fails:
1. Boot sequence loads rules every session
2. Compaction flush saves uncaptured events before context compression
3. Heartbeat checks periodically scan for missed learning opportunities
4. Daily cron extracts events from session logs
5. Weekly cron runs pattern detection, metrics, confidence decay, and self-audit

No single layer is critical. If one fails, the others catch it.

## Quick Start

```bash
bash init.sh /path/to/workspace
```

That's it. You now have:

```
memory/learning/
├── events.jsonl # Raw event log (append-only)
├── rules.json # Hard behavioral rules (3 starter rules)
├── lessons.json # Structured lessons (intermediate tier)
├── pre-action-checklist.md # Check before risky actions
├── metrics.json # Improvement tracking
├── BOOT.md # Quick reference for session boot
├── parse-errors.jsonl # JSON parsing errors (v1.4.0)
└── weekly/ # Weekly learning reports
```

### Wire It In

Add to your agent's boot instructions (AGENTS.md or equivalent):

```markdown
## Every Session
1. Read `memory/learning/rules.json` - hard behavioral rules
2. Read `memory/learning/BOOT.md` - quick reference
3. Before risky actions, check `memory/learning/pre-action-checklist.md`
4. After mistakes or debugging, append to `memory/learning/events.jsonl`
5. Check rule confidence scores - rules with < 0.5 confidence need review
```
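
Steps 1 and 5 of the boot checklist could be automated along these lines. The `rules.json` layout is an assumption (either a bare list or a `{"rules": [...]}` object whose entries carry `id` and `confidence`), and the helper names are hypothetical.

```python
import json
import tempfile
from pathlib import Path

REVIEW_THRESHOLD = 0.5  # from the boot checklist: rules below 0.5 need review


def load_rules(workspace: str) -> list:
    """Read memory/learning/rules.json; return [] if the file is missing."""
    path = Path(workspace) / "memory" / "learning" / "rules.json"
    if not path.exists():
        return []
    data = json.loads(path.read_text(encoding="utf-8"))
    # Assumed layout: a bare list of rules or {"rules": [...]}.
    return data["rules"] if isinstance(data, dict) else data


def rules_needing_review(rules: list) -> list:
    """Return ids of rules whose confidence is below the review threshold."""
    return [rule["id"] for rule in rules
            if rule.get("confidence", 1.0) < REVIEW_THRESHOLD]


# Demo against a throwaway workspace so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as workspace:
    learning_dir = Path(workspace) / "memory" / "learning"
    learning_dir.mkdir(parents=True)
    sample = {"rules": [
        {"id": "R-001", "confidence": 0.92},
        {"id": "R-002", "confidence": 0.41},
    ]}
    (learning_dir / "rules.json").write_text(json.dumps(sample), encoding="utf-8")
    flagged = rules_needing_review(load_rules(workspace))

print(flagged)
```

Rules without a `confidence` field are treated as fully trusted here, which keeps older rule files from being flagged wholesale; whether that matches the skill's behavior is not stated in the source.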

### Set Up Automation

**Daily (e.g. 4am):**
```bash
bash extract.sh /path/to/workspace
```

**Weekly (e.g. Sunday 10pm):**
```bash
bash detect-patterns.sh /path/to/workspace
bash confidence-decay.sh /path/to/workspace # NEW v1.4.0
bash promote-rules.sh /path/to/workspace
bash self-audit.sh /path/to/workspace
bash update-metrics.sh /path/to/workspace
```

### Optional: Compaction Flush

If your platform supports custom compaction prompts, add:

> "Append uncaptured learning events to memory/learning/events.jsonl and update rules.json if new rules emerged."

This is the safety net that catches learning even during context compression.

## Guardrails / Anti-Patterns

**DO:**
- ✓ Capture events immediately after debugging or receiving feedback
- ✓ Use structured JSON format with all required fields (ts, type, category, tags, problem, solution, confidence, source)
- ✓ Run weekly automation to promote lessons with 3+ successful applications
- ✓ Check rules.json before risky actions (account ops, shell commands, external comms)
- ✓ Use wal-capture.sh for critical details that must survive compaction
- ✓ Keep events.jsonl append-only; never delete or edit historical events
- ✓ Run confidence-decay.sh weekly to update rule confidence scores
- ✓ Export rules for cross-agent sharing using export-rules.sh

**DON'T:**
- ✗ Wait to capture events - memory degrades, details get lost
- ✗ Create rules without proven application history (minimum 3 successful applications)
- ✗ Skip the pre-action checklist for "quick" operations
- ✗ Delete events to "clean up" - use archive-events.sh for old data instead
- ✗ Assume lessons apply universally without considering context
- ✗ Manually edit rules.json - let promote-rules.sh handle promotion
- ✗ Ignore confidence scores below 0.5 - these rules need review
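
The first, second, and sixth DO items (capture immediately, include all required fields, keep the log append-only) can be enforced in one helper. This is a sketch under assumptions: the function names and the sample field values are hypothetical, and only the list of required fields is taken from the text.

```python
import json
import tempfile
from pathlib import Path

# Required fields as listed in the DO section above.
REQUIRED_FIELDS = ("ts", "type", "category", "tags",
                   "problem", "solution", "confidence", "source")


def missing_fields(event: dict) -> list:
    """Return the required fields that the event lacks, in canonical order."""
    return [field for field in REQUIRED_FIELDS if field not in event]


def append_event(events_path: Path, event: dict) -> None:
    """Append one event as a single JSON line; existing lines are never rewritten."""
    missing = missing_fields(event)
    if missing:
        raise ValueError(f"event is missing required fields: {missing}")
    with events_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(event, ensure_ascii=False) + "\n")


# Demo against a temporary file; all field values are illustrative only.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "events.jsonl"
    event = {
        "ts": "2026-02-07T15:00:00Z", "type": "debug_session", "category": "shell",
        "tags": ["macos", "find"], "problem": "example problem",
        "solution": "example solution", "confidence": 0.8, "source": "manual",
    }
    append_event(log, event)
    append_event(log, event)
    line_count = len(log.read_text(encoding="utf-8").splitlines())

print(line_count)
```

Opening the file in append mode is what preserves the append-only guarantee: earlier events are never read back or rewritten, so a failed write cannot corrupt history.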

## Full Lifecycle Walkthrough

Here's the complete loop in action, from first mistake to enforced rule.

### Day 1: The Mistake

You're building a skill and run `find . -not -path '*/node_modules/*'` on macOS. It silently skips files. You spend 20 minutes debugging before discovering that extended attributes break `find`'s exclusion flags.

**Capture the event:**
```json
{"ts":"2026-02-07T15:00:00Z","type":"debug_session","category":"shell","tags":["macos","find","xattr"],"problem":"find -not -path silently skips files with com.apple.provenance on macOS","solution":"Pipe find