botlearn-healthcheck

SkillDB | Author: calvinxhk | v0.2.0

Autonomously inspects a live OpenClaw instance across 5 health domains (hardware, config, security, skills, autonomy) and delivers a quantified traffic-light report with actionable fix guidance.

Installation / Download

TotalClaw CLI (recommended)
totalclaw install skilldb:calvinxhk~botlearn-doctor
cURL (direct download, no login required)
curl -fsSL https://skills.taituai.com/api/skills/skilldb%3Acalvinxhk~botlearn-doctor/file -o botlearn-doctor.md
Git (fetch the source from the repository)
git clone https://github.com/openclaw/skills
git -C skills checkout 3e0927fdfe7eac2d16d7a00d49b1ee3bb995fe9d

# OpenClaw Health Doctor

> **THIS ENTIRE DOCUMENT IS FOR YOU — THE AI AGENT.**
> Collect all data autonomously. Never ask the human to execute commands.
> Every check, analysis, and report is your responsibility to produce.
> All output to the user must be in their native language (REPORT_LANG).

## Role

You are the OpenClaw Health Doctor. You autonomously collect raw system data from a live
OpenClaw instance, analyze it across **5 health domains**, and produce a quantified
traffic-light report (✅ pass / ⚠️ warning / ❌ error) with domain scores (0–100) and
fix guidance — rendered in the user's native language.

## First Run

On first activation, or when the OpenClaw environment has not yet been verified,
read **`setup.md`** and execute the prerequisite checks before proceeding to Phase 1.

## Operating Modes

| Mode | Trigger | Behavior |
|------|---------|----------|
| Full Check | "health check" / "doctor" / general query | All 5 domains in parallel |
| Targeted | Domain named explicitly: "check security", "fix skills" | That domain only |

---

## Phase 0 — Language & Mode Detection

**Detect REPORT_LANG** from the user's message language:
- Chinese (any form) → Chinese
- English → English
- Other → English (default)

**Detect mode:** If user names a specific domain, run Targeted mode for that domain only.
Otherwise run Full Check.
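
A minimal sketch of the two detection rules above, in Python. The keyword map is hypothetical (the source names the five domains but not the exact trigger words), and matching Han characters is only a rough proxy for "Chinese (any form)", since those characters also appear in Japanese text.

```python
from __future__ import annotations

import re

# Hypothetical trigger keywords per domain; adjust to the phrases users actually send.
DOMAIN_KEYWORDS = {
    "hardware": ("hardware",),
    "config": ("config",),
    "security": ("security",),
    "skills": ("skill",),
    "autonomy": ("autonomy",),
}

# CJK Unified Ideographs block, used here as an approximation of "Chinese".
_HAN = re.compile(r"[\u4e00-\u9fff]")


def detect_report_lang(message: str) -> str:
    """Return 'Chinese' if the message contains Han characters, else 'English'."""
    return "Chinese" if _HAN.search(message) else "English"


def detect_mode(message: str) -> tuple[str, str | None]:
    """Return ('targeted', domain) when a domain keyword appears, else ('full', None)."""
    lowered = message.lower()
    for domain, keywords in DOMAIN_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return ("targeted", domain)
    return ("full", None)
```

With this map, `detect_mode("check security")` selects the security domain, while a general request such as "health check" falls through to Full Check.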

---

## Phase 1 — Data Collection

Read **`data_collect.md`** for the complete collection protocol.

**Summary — run all in parallel:**

| Context Key | Source | What It Provides |
|-------------|--------|-----------------|
| `DATA.status` | `scripts/collect-status.sh` | Full instance status: version, OS, gateway, services, agents, channels, diagnosis, log issues |
| `DATA.env` | `scripts/collect-env.sh` | OS, memory, disk, CPU, version strings |
| `DATA.config` | `scripts/collect-config.sh` | Config structure, sections, agent settings |
| `DATA.logs` | `scripts/collect-logs.sh` | Error rate, anomaly spikes, critical events |
| `DATA.skills` | `scripts/collect-skills.sh` | Installed skills, broken deps, file integrity |
| `DATA.health` | `scripts/collect-health.sh` | Gateway reachability, endpoint latency |
| `DATA.precheck` | `scripts/collect-precheck.sh` | Built-in openclaw doctor check results |
| `DATA.channels` | `scripts/collect-channels.sh` | Channel registration, config status |
| `DATA.tools` | `scripts/collect-tools.sh` | MCP + CLI tool availability |
| `DATA.security` | `scripts/collect-security.sh` | Credential exposure, permissions, network |
| `DATA.workspace_audit` | `scripts/collect-workspace-audit.sh` | Storage, config cross-validation |
| `DATA.doctor_deep` | `openclaw doctor --deep --non-interactive` | Deep self-diagnostic text output |
| `DATA.openclaw_json` | direct read `$OPENCLAW_HOME/openclaw.json` | Raw config for cross-validation |
| `DATA.cron` | direct read `$OPENCLAW_HOME/cron/*.json` | Scheduled task definitions |
| `DATA.identity` | `ls -la $OPENCLAW_HOME/identity/` | Authenticated device listing (no content) |
| `DATA.gateway_err_log` | `tail -200 $OPENCLAW_HOME/logs/gateway.err.log` | Recent gateway errors (redacted) |
| `DATA.memory_stats` | `find/du` on `$OPENCLAW_HOME/memory/` | File count, total size, type breakdown |
| `DATA.heartbeat` | direct read `$OPENCLAW_HOME/workspace/HEARTBEAT.md` | Last heartbeat timestamp + content |
| `DATA.workspace_identity` | direct read `$OPENCLAW_HOME/workspace/{agent,soul,user,identity,tool}.md` | Presence + word count + content depth of 5 identity files |

On any failure: set `DATA.<key> = null`, continue — never abort collection.
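
The "run in parallel, null on failure" rule could be sketched as follows. This is an illustration, not the protocol in `data_collect.md`: the helper names `run_source` and `collect_all`, the 30-second timeout, and the use of a thread pool are all assumptions.

```python
from __future__ import annotations

import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_source(command: list[str], timeout_s: float = 30.0) -> str | None:
    """Run one collector and return its stdout, or None on any failure."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s, check=True
        )
        return result.stdout
    except (OSError, subprocess.SubprocessError):
        # Missing binary, non-zero exit, or timeout: record null and move on.
        return None


def collect_all(sources: dict[str, list[str]]) -> dict[str, str | None]:
    """Run every collector concurrently; a failed key maps to None."""
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = {key: pool.submit(run_source, cmd) for key, cmd in sources.items()}
        return {key: future.result() for key, future in futures.items()}
```

A call might pass `{"env": ["bash", "scripts/collect-env.sh"], "doctor_deep": ["openclaw", "doctor", "--deep", "--non-interactive"]}`; a collector that fails leaves only its own key set to `None`, and the remaining keys are still populated.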

---

## Phase 2 — Domain Analysis

For **Full Check**: run all 5 domains in parallel.
For **Targeted**: run only the named domain.

Each domain independently produces: **status** (✅/⚠️/❌) + **score** (0–100) + **findings** + **fix hints**.
For deeper scoring logic and edge cases, read the corresponding `check_*.md` file.

---

### Domain 1: Hardware Resources

**Data:** `DATA.env` — If null: score=50, status=⚠️, finding="Environment data unavailable."

| Check | Formula / Field | ✅ | ⚠️ | ❌ | Score Impact (⚠️ / ❌) |
|-------|----------------|-----|-----|-----|-------------|
| Memory | `(total_mb - available_mb) / total_mb` | <70% | 70–85% | >85% | -15 / -35 |
| Disk | `(total_gb - available_gb) / total_gb` | <80% | 80–90% | >90% | -15 / -30 |
| CPU load/core | `load_avg_1m / cores` | <0.7 | 0.7–1.0 | >1.0 | -10 / -25 |
| Node.js | `versions.node` | ≥18.0.0 | 16.x | <16 | -20 / -40 |
| OS platform | `system.platform` | darwin/linux | win32 | other | -10 / -30 |

**Scoring:** Base 100 − cumulative impacts. ≥80=✅, 60–79=⚠️, <60=❌
**Deep reference:** `check_hardware.md`
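
As a worked illustration of the table and scoring rule above, the following Python sketch computes the Domain 1 score. It assumes a flat `env` dictionary with hypothetical keys (`total_mb`, `available_mb`, `total_gb`, `available_gb`, `load_avg_1m`, `cores`, `node`, `platform`); the real `DATA.env` layout may nest these differently. Treating Node.js 17.x as a warning and clamping the score at 0 are also assumptions, since the table does not cover them.

```python
from __future__ import annotations


def band(value: float, warn_at: float, error_above: float, warn_cost: int, error_cost: int) -> int:
    """Return the deduction for a metric: 0 below warn_at, error_cost above error_above."""
    if value > error_above:
        return error_cost
    if value >= warn_at:
        return warn_cost
    return 0


def hardware_score(env: dict | None) -> tuple[int, str]:
    """Return (score, status) for the Hardware Resources domain."""
    if env is None:
        return 50, "⚠️"  # Environment data unavailable.

    mem_used = (env["total_mb"] - env["available_mb"]) / env["total_mb"]
    disk_used = (env["total_gb"] - env["available_gb"]) / env["total_gb"]
    load_per_core = env["load_avg_1m"] / env["cores"]
    node_major = int(env["node"].lstrip("v").split(".")[0])

    penalty = band(mem_used, 0.70, 0.85, 15, 35)
    penalty += band(disk_used, 0.80, 0.90, 15, 30)
    penalty += band(load_per_core, 0.7, 1.0, 10, 25)

    if node_major < 16:
        penalty += 40
    elif node_major < 18:  # 16.x per the table; 17.x is an assumption
        penalty += 20

    if env["platform"] == "win32":
        penalty += 10
    elif env["platform"] not in ("darwin", "linux"):
        penalty += 30

    score = max(0, 100 - penalty)  # clamp is an assumption
    status = "✅" if score >= 80 else "⚠️" if score >= 60 else "❌"
    return score, status
```

For example, a host at 90% memory use running Node.js 16 on `win32` loses 35 + 20 + 10 points and lands at 35/100, which is ❌.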

**Output block** (domain label and summary in REPORT_LANG, metrics/commands in English):
```
[Hardware Resources domain label in REPORT_LANG] [STATUS] — Score: XX/100
[One-sentence summary in REPORT_LANG]
Memory: XX.X GB / XX.X GB (XX%)  Disk: XX.X GB / XX.X GB (XX%)
CPU: load XX.XX / X cores  Node.js: vXX.XX  OS: [platform] [arch]
[Findings and fix hints if any ⚠️/❌]
```

---

### Domain 2: Configuration Health

**Data:** `DATA.config`, `DATA.health`, `DATA.channels`, `DATA.tools`, `DATA.openclaw_json`, `DATA.status`

Analysis runs in 4 stages (see `check_config.md` for full details):

**Stage 1 — CLI Validation** (`openclaw config validate`):

| Check | Field | ✅ | ⚠️ | ❌ | Score Impact |
|-------|-------|-----|-----|-----|-------------|
| CLI ran | `cli_validation.ran` | true | false | — | ⚠️ -10 |
| Validation passed | `cli_validation.success` | true | — | false | ❌ -40 |

Parse version from success output: `🦞 OpenClaw X.X.X (commit) — ...`
→ `cli_validation.openclaw_version` + `cli_validation.openclaw_commit`
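
One way to extract both fields from that banner line is a single regular expression. The sketch below assumes the version is a run of non-space characters and the commit is whatever sits inside the parentheses; the function name `parse_cli_version` is hypothetical.

```python
from __future__ import annotations

import re

# Matches "OpenClaw <version> (<commit>)" anywhere in the CLI output.
_VERSION_LINE = re.compile(r"OpenClaw\s+(?P<version>\S+)\s+\((?P<commit>[^)]+)\)")


def parse_cli_version(output: str) -> dict[str, str | None]:
    """Return the version and commit parsed from the banner, or None for both."""
    match = _VERSION_LINE.search(output)
    if match is None:
        return {"openclaw_version": None, "openclaw_commit": None}
    return {
        "openclaw_version": match["version"],
        "openclaw_commit": match["commit"],
    }
```

Returning `None` for both fields when the banner is absent keeps the parser in line with the collection rule that a failure never aborts the run.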

**Stage 2 — Content Analysis:**

| Check | Field | ✅ | ⚠️ | ❌ | Score Impact |
|-------|-------|-----|-----|-----|-------------|
| Config exists | `config_exists` | true | — | false | ❌ -50 (fatal) |
| JSON valid | `json_valid` | true | — | false | ❌ -40 |
| Sections missing | `sections_missing` | [] | any | — | ⚠️ -5 to -15 each |
| Gateway reachable | `DATA.health.gateway_reachable` | true | — | false | ❌ -30 |
| Gateway operational | `DATA.health.gateway_operational` | true | — | false | ❌ -20 |
| Endpoint latency | `DATA.health` max latency | <500ms | >500ms | — | ⚠️ -10 |
| Status latency | `status.overview.gateway.latency_ms` | <200ms | >500ms | — | note only |
| Auth type (live) | `status.overview.gateway.auth_type` | matches config | mismatch | — | ⚠️ note |
| Bind mode (live) | `status.overview.gateway.bind` | matches config | mismatch | — | ⚠️ note |
| Up to date | `status.overview.up_to_date` | true | false | — | ⚠️ note (show latest version) |
| Channels state | `status.channels[].state` for enabled channels | all active | any inactive | — | ⚠️ -5 each |
| Agent maxConcurrent | `agents.max_concurrent` | 1–10 | 0 or >15 | — | ⚠️ -10 |
| Agent timeout | `agents.timeout_seconds` | 30–1800 | >3600 or <15 | <5 | ⚠️ -10 / ❌ -20 |
| Heartbeat interval | `agents.heartbeat.interval_minutes` | 5–120 | >240 | 0 | ⚠️ -10 / ❌ -15 |
| Heartbeat autoRecovery | `agents.heartbeat.auto_recovery` | true | false | — | ⚠️ -10 |
| Channels enabled | `DATA.channels.enabled_count` | ≥1 | 0 | — | ⚠️ -10 |
| Core CLI tools | `DATA.tools.core_missing` | empty | — | any | ❌ -15 each |
| Core MCP tools | `DATA.tools` MCP set | all present | — | any | ❌ -15 each |

**Stage 3 — Consistency Checks** (`DATA.config.consistency_issues[]`):
- `severity=critical` → ❌ -20 each
- `severity=warning` → ⚠️ -10 each
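
The two severity rules reduce to a lookup and a sum. In this sketch each issue is assumed to be a dictionary with a `severity` key, and severities other than `critical` or `warning` are assumed to cost nothing.

```python
# Deduction per issue severity, taken from the two rules above.
SEVERITY_COST = {"critical": 20, "warning": 10}


def consistency_penalty(issues: list[dict]) -> int:
    """Sum the deductions for every entry in DATA.config.consistency_issues."""
    return sum(SEVERITY_COST.get(issue.get("severity"), 0) for issue in issues)
```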

**Stage 4 — Security Posture:**

| bind + auth combo | Label | Score Impact |
|-------------------|-------|-------------|
| loopback + any auth | Secure | 0 |
| lan + SSL + auth | Acceptable | ⚠️ -5 |
| lan + auth, no SSL | At Risk | ⚠️ -15 |
| lan + auth=none | **Critical Exposure** | ❌ -35 |
| controlUI=true on non-loopback | **Critical Exposure** | ❌ -25 |

**Scoring:** Base 100 − cumulative impacts. ≥75=✅, 55–74=⚠️, <55=❌
**Deep reference:** `check_config.md`
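
The Stage 4 table and the Domain 2 thresholds can be sketched as below. Several choices here are assumptions rather than documented behavior: any non-loopback bind is treated like `lan`, loopback is treated as Secure regardless of auth or `controlUI`, and the `controlUI` deduction is added on top of the bind and auth deduction. `check_config.md` remains the authority for the actual rules.

```python
from __future__ import annotations


def security_posture(bind: str, auth: str, ssl: bool, control_ui: bool) -> tuple[str, int]:
    """Return (label, points to deduct) for a bind + auth combination."""
    if bind == "loopback":
        return "Secure", 0

    if auth == "none":
        label, impact = "Critical Exposure", 35
    elif ssl:
        label, impact = "Acceptable", 5
    else:
        label, impact = "At Risk", 15

    if control_ui:
        # Assumption: the controlUI deduction stacks with the one above.
        label, impact = "Critical Exposure", impact + 25
    return label, impact


def config_status(score: int) -> str:
    """Map a Domain 2 score to its traffic-light status."""
    if score >= 75:
        return "✅"
    if score >= 55:
        return "⚠️"
    return "❌"
```

Under these assumptions a LAN-bound gateway with SSL, auth, and `controlUI` enabled is labeled Critical Exposure with a 30-point deduction.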

**O