learning-loop
Structured self-improvement system for AI agents with confidence decay, cross-agent sharing, and anomaly detection. Use when: (1) After debugging sessions to capture lessons learned, (2) When receiving feedback or corrections from users, (3) Before risky actions to check relevant rules, (4) Weekly to review metrics and promote proven patterns to enforced rules, (5) Setting up persistent memory that survives session compactions.
Installation / Download

TotalClaw CLI (recommended):
`totalclaw install clawskills:clawskills~yoder-bawt-learning-loop`

cURL direct download (no login required):
`curl -fsSL https://skills.taituai.com/api/skills/clawskills%3Aclawskills~yoder-bawt-learning-loop/file -o yoder-bawt-learning-loop.md`

# Learning Loop
**Stop waking up stupid.**
AI agents lose everything on compaction. Every debugging session, every hard-won lesson, every correction from your human - gone. You start fresh and repeat the same failures. Your human notices. Trust erodes.
The Learning Loop is a structured self-improvement system that gives agents persistent, compounding intelligence. It captures what you learn, promotes proven patterns into hard rules, tracks your improvement over time, and automatically detects when your human is satisfied or frustrated. It now also includes confidence decay and cross-agent knowledge sharing.
This isn't a toy. This is infrastructure for agents that want to get measurably better at their job, every single session.
## Architecture Overview
```
┌─────────────────────────────────────────────────────────────┐
│                    LEARNING LOOP v1.4.0                     │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  INPUT LAYER        PROCESSING LAYER          OUTPUT LAYER  │
│  ───────────        ────────────────          ────────────  │
│                                                             │
│  Events ──────────▶ Pattern Detection ──────▶ Reports       │
│    │                        │                    │          │
│    ▼                        ▼                    ▼          │
│  lessons.json       Confidence Decay          Rules         │
│    │                        │                    │          │
│    ▼                        ▼                    ▼          │
│  Promotion ◀─────── Anomaly Detection ◀────── Enforcement   │
│                                                             │
│  CROSS-AGENT LAYER:                                         │
│  Export ─────▶ Portable Format ─────▶ Import                │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
**Data Flow:**
- **Tier 1: Events** - Raw logs of debugging sessions, mistakes, successes, feedback. Append-only, never deleted.
- **Tier 2: Lessons** - Patterns extracted from events. Tracked by how many times they've been applied and how many mistakes they've prevented (saves).
- **Tier 3: Rules** - Lessons promoted after 3+ successful applications with 0.9+ confidence.
**Confidence Decay:** Rules lose confidence over time using Ebbinghaus-inspired exponential decay. Stale rules (confidence < 0.5) are flagged for review.
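As a rough illustration of the decay described above, the sketch below applies a half-life form of exponential decay. The 30-day half-life, the function names, and the rounding are assumptions for illustration; the actual parameters used by `confidence-decay.sh` are not shown in this document.

```bash
#!/usr/bin/env bash
# Sketch only: half-life decay; the 30-day default is an assumed parameter.

# decayed_confidence CONFIDENCE DAYS_SINCE_LAST_USE [HALF_LIFE_DAYS]
# Prints confidence * 0.5^(days / half_life), rounded to 3 decimals.
decayed_confidence() {
  LC_ALL=C awk -v c="$1" -v d="$2" -v h="${3:-30}" \
    'BEGIN { printf "%.3f\n", c * exp(-log(2) * d / h) }'
}

# is_stale CONFIDENCE: succeeds (exit 0) when confidence is below 0.5.
is_stale() {
  LC_ALL=C awk -v c="$1" 'BEGIN { exit !((c + 0) < 0.5) }'
}

# Usage: a rule at 0.9 confidence that has gone unused for one half-life.
current=$(decayed_confidence 0.9 30)
if is_stale "$current"; then
  echo "confidence $current: flag for review"
fi
```

With these assumed numbers, a rule at 0.9 drops to 0.450 after 30 unused days and crosses the 0.5 review threshold.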
**Cross-Agent Sharing:** Export rules as portable JSON with metadata (hashes, provenance). Import from other agents with conflict detection and trust scoring.
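One possible way to detect import conflicts is to compare content hashes per rule id. The sketch below assumes a hypothetical tab-separated manifest (`id<TAB>hash`, one rule per line); the real format written by `export-rules.sh` is not documented here, and trust scoring is left out.

```bash
#!/usr/bin/env bash
# Sketch only: the "id<TAB>hash" manifest layout is hypothetical.

# rule_hash TEXT: print the SHA-256 hex digest of a rule's text.
rule_hash() {
  if command -v sha256sum >/dev/null 2>&1; then
    printf '%s' "$1" | sha256sum | cut -d' ' -f1
  else
    printf '%s' "$1" | shasum -a 256 | cut -d' ' -f1   # fallback where sha256sum is absent
  fi
}

# find_conflicts LOCAL_MANIFEST INCOMING_MANIFEST
# Prints ids that appear in both manifests with different hashes.
find_conflicts() {
  awk -F'\t' '
    NR == FNR { local_hash[$1] = $2; next }   # first file: remember local hashes
    ($1 in local_hash) && (local_hash[$1] "") != ($2 "") { print $1 }
  ' "$1" "$2"
}
```

Ids that exist only in the incoming manifest are not reported, since they can be imported without overwriting anything.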
## When to Activate
Use the Learning Loop when:
1. **After debugging sessions** - Capture the problem, solution, and confidence level to events.jsonl
2. **Receiving user feedback** - Positive ("perfect", "exactly") or negative ("wrong", "I already told you") signals trigger automatic capture
3. **Before risky actions** - Check pre-action-checklist.md and rules.json for relevant constraints
4. **Weekly maintenance** - Run pattern detection, confidence decay, promote qualified lessons to rules, update metrics
5. **During compaction** - Flush uncaptured events to prevent knowledge loss
6. **Sharing knowledge** - Export rules for other agents, import rules from trusted sources
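The feedback trigger in item 2 could be approximated with simple phrase matching. This sketch uses only the four phrases quoted above; the function name and the case-insensitive substring matching are assumptions, not the skill's actual detector.

```bash
#!/usr/bin/env bash
# Sketch only: substring matching on the four example phrases from the text.

# classify_feedback MESSAGE: prints "positive", "negative", or "neutral".
classify_feedback() {
  local msg
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    # Negative phrases are checked first so "exactly wrong" counts as negative.
    *"i already told you"*|*wrong*) echo negative ;;
    *perfect*|*exactly*)            echo positive ;;
    *)                              echo neutral ;;
  esac
}
```

Anything other than `neutral` would then be a candidate for an event in `events.jsonl`.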
## What It Does
```
Events (raw)  -->  Lessons (structured)  -->  Rules (enforced)
append-only        proven patterns            hard constraints
events.jsonl       lessons.json               rules.json
```
**Three-tier knowledge system:**
- **Tier 1: Events** - Raw logs of debugging sessions, mistakes, successes, feedback. Append-only, never deleted.
- **Tier 2: Lessons** - Patterns extracted from events. Tracked by how many times they've been applied and how many mistakes they've prevented.
- **Tier 3: Rules** - Lessons promoted after 3+ successful applications. Loaded at boot. These are your behavioral constraints.
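The promotion gate can be sketched as a small predicate. It combines the 3-application threshold stated here with the 0.9 confidence threshold from the Architecture Overview; the function name and argument order are hypothetical, and `promote-rules.sh` may apply further checks.

```bash
#!/usr/bin/env bash
# Sketch only: thresholds come from the text; names are illustrative.

# qualifies_for_promotion APPLICATIONS CONFIDENCE
# Succeeds (exit 0) when a lesson has 3+ successful applications
# and a confidence of at least 0.9.
qualifies_for_promotion() {
  local applications="$1" confidence="$2"
  [ "$applications" -ge 3 ] || return 1
  LC_ALL=C awk -v c="$confidence" 'BEGIN { exit !((c + 0) >= 0.9) }'
}

# Usage
if qualifies_for_promotion 4 0.93; then
  echo "promote lesson to rules.json"
fi
```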
**Five enforcement layers** ensure learning happens even when discipline fails:
1. Boot sequence loads rules every session
2. Compaction flush saves uncaptured events before context compression
3. Heartbeat checks periodically scan for missed learning opportunities
4. Daily cron extracts events from session logs
5. Weekly cron runs pattern detection, metrics, confidence decay, and self-audit
No single layer is critical. If one fails, the others catch it.
## Quick Start
```bash
bash init.sh /path/to/workspace
```
That's it. You now have:
```
memory/learning/
├── events.jsonl # Raw event log (append-only)
├── rules.json # Hard behavioral rules (3 starter rules)
├── lessons.json # Structured lessons (intermediate tier)
├── pre-action-checklist.md # Check before risky actions
├── metrics.json # Improvement tracking
├── BOOT.md # Quick reference for session boot
├── parse-errors.jsonl # JSON parsing errors (v1.4.0)
└── weekly/ # Weekly learning reports
```
### Wire It In
Add to your agent's boot instructions (AGENTS.md or equivalent):
```markdown
## Every Session
1. Read `memory/learning/rules.json` - hard behavioral rules
2. Read `memory/learning/BOOT.md` - quick reference
3. Before risky actions, check `memory/learning/pre-action-checklist.md`
4. After mistakes or debugging, append to `memory/learning/events.jsonl`
5. Check rule confidence scores - rules with < 0.5 confidence need review
```
### Set Up Automation
**Daily (e.g. 4am):**
```bash
bash extract.sh /path/to/workspace
```
**Weekly (e.g. Sunday 10pm):**
```bash
bash detect-patterns.sh /path/to/workspace
bash confidence-decay.sh /path/to/workspace # NEW v1.4.0
bash promote-rules.sh /path/to/workspace
bash self-audit.sh /path/to/workspace
bash update-metrics.sh /path/to/workspace
```
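If you prefer a single weekly entry point, a wrapper along these lines runs the five scripts in the order listed and stops at the first failure. The `run_weekly` function, the `weekly.sh` file name, and the stop-on-failure behavior are assumptions, and the crontab lines in the comments use placeholder paths.

```bash
#!/usr/bin/env bash
# Sketch only: a hypothetical wrapper around the weekly scripts named above.

# run_weekly SKILL_DIR WORKSPACE
# Runs each weekly step in order; returns 1 as soon as one step fails.
run_weekly() {
  local skill_dir="$1" workspace="$2" step
  for step in detect-patterns confidence-decay promote-rules self-audit update-metrics; do
    bash "$skill_dir/$step.sh" "$workspace" || {
      echo "weekly: $step.sh failed" >&2
      return 1
    }
  done
}

# Example crontab entries (daily at 04:00, weekly on Sunday at 22:00):
#   0 4  * * *  bash /path/to/skill/extract.sh /path/to/workspace
#   0 22 * * 0  bash /path/to/skill/weekly.sh /path/to/skill /path/to/workspace
```

Stopping early keeps `promote-rules.sh` from running on stale confidence scores if the decay step fails. If you would rather have later steps run regardless, replace the `return 1` with a warning.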
### Optional: Compaction Flush
If your platform supports custom compaction prompts, add:
> "Append uncaptured learning events to memory/learning/events.jsonl and update rules.json if new rules emerged."
This is the safety net that catches learning even during context compression.
## Guardrails / Anti-Patterns
**DO:**
- ✓ Capture events immediately after debugging or receiving feedback
- ✓ Use structured JSON format with all required fields (ts, type, category, tags, problem, solution, confidence, source)
- ✓ Run weekly automation to promote lessons with 3+ successful applications
- ✓ Check rules.json before risky actions (account ops, shell commands, external comms)
- ✓ Use wal-capture.sh for critical details that must survive compaction
- ✓ Keep events.jsonl append-only; never delete or edit historical events
- ✓ Run confidence-decay.sh weekly to update rule confidence scores
- ✓ Export rules for cross-agent sharing using export-rules.sh
**DON'T:**
- ✗ Wait to capture events - memory degrades, details get lost
- ✗ Create rules without proven application history (minimum 3 successful applications)
- ✗ Skip the pre-action checklist for "quick" operations
- ✗ Delete events to "clean up" - use archive-events.sh for old data instead
- ✗ Assume lessons apply universally without considering context
- ✗ Manually edit rules.json - let promote-rules.sh handle promotion
- ✗ Ignore confidence scores below 0.5 - these rules need review
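To make the required-fields guardrail checkable before an event is written, a quick string-level test like the following can help. It is a sketch under assumptions: it looks for each key name followed by a colon and does not parse JSON, so a key name appearing inside a string value would fool it. The function name is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: a string-level check, not a JSON parser.

# missing_fields JSON_LINE
# Prints the name of every required field that does not appear as a key.
missing_fields() {
  local line="$1" field
  for field in ts type category tags problem solution confidence source; do
    printf '%s' "$line" | grep -Eq "\"$field\"[[:space:]]*:" || echo "$field"
  done
}

# Usage: refuse to log an incomplete event.
event='{"ts":"2026-02-07T15:00:00Z","type":"debug_session"}'
missing=$(missing_fields "$event")
if [ -n "$missing" ]; then
  echo "event is missing required fields:" $missing >&2
fi
```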
## Full Lifecycle Walkthrough
Here's the complete loop in action, from first mistake to enforced rule.
### Day 1: The Mistake
You're building a skill and run `find . -not -path '*/node_modules/*'` on macOS. It silently skips files. You spend 20 minutes debugging before discovering that extended attributes break `find`'s exclusion flags.
**Capture the event:**
```json
{"ts":"2026-02-07T15:00:00Z","type":"debug_session","category":"shell","tags":["macos","find","xattr"],"problem":"find -not -path silently skips files with com.apple.provenance on macOS","solution":"Pipe find output through grep -v instead of using find built-in exclusion flags","confidence":"proven","source":"skill-build"}
```
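A minimal helper for writing an event like this one might look as follows. The `append_event` name is hypothetical; the path matches the `memory/learning/` layout created by `init.sh`. It only ever appends, which keeps the log append-only.

```bash
#!/usr/bin/env bash
# Sketch only: a hypothetical helper; it appends and never rewrites the log.

# append_event WORKSPACE JSON_LINE
# Appends one JSON line to WORKSPACE/memory/learning/events.jsonl.
append_event() {
  local log="$1/memory/learning/events.jsonl"
  mkdir -p "$(dirname "$log")"
  printf '%s\n' "$2" >> "$log"
}
```

Call it as `append_event /path/to/workspace "$event"`, where `$event` holds a single-line JSON object.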
Append that line to `events.jsonl`. Done. The k