Hallucination Guard
Execution-based verification guardrail with 14 check items for AI agent output
安装 / 下载方式
TotalClaw CLI推荐
totalclaw install clawskills:reikys~reikys-hallucination-guardcURL直接下载,无需登录
curl -fsSL https://skills.taituai.com/api/skills/clawskills%3Areikys~reikys-hallucination-guard/file -o reikys-hallucination-guard.mdGit 仓库获取源码
git clone https://github.com/openclaw/skills/commit/4830bdcfbf9c9438a660b1a6e0b3eb672ab4c8af# 🛡️ hallucination-guard
> **Solving the biggest trust problem with AI agents.**
> Not with "double-check that" prompts, but with 14 execution-based verification items.
---
## 🎯 Problem Definition
AI agents carry the following trust issues:
| Problem Type | Example |
|-------------|---------|
| **Path hallucination** | References non-existent files/directories as if they exist |
| **Command hallucination** | Describes uninstalled binaries as if they run normally |
| **Library hallucination** | Writes code that `import`s packages that don't exist on npm/pip |
| **Numerical hallucination** | States unsourced statistics/percentages as fact |
| **Completeness hallucination** | Reports "done" while leaving TODO/PLACEHOLDER behind |
| **Consistency hallucination** | Uses different names for the same concept within a document |
### Limitations of Existing Solutions
- **truth-check, verification-before-completion**: Just tells the agent "check again" — still hallucination-based
- **hallucination-guard**: 14 concrete items × executable commands × structured PASS/FAIL report
---
## ⚡ Quick Start
### 1. Trigger Method
Call anytime during agent conversation with the following phrases:
```
hallucination check
hallucination check on <target file/path>
hallucination check --scope=fact,completeness
```
### 2. Auto-Execution Method
Add to the system prompt so the agent automatically runs this skill before completing a task:
```
Before completing any task, you must run all 14 checks from the hallucination-guard SKILL.md,
output the PASS/FAIL report, and fix any FAIL items.
```
### 3. Quick Check (3-Minute Version)
When the full 14 items feel like too much, run only the essential 5:
```bash
# Run only H-1, H-2, H-9, H-10, H-12
hallucination check --quick
```
---
## 📋 14 Check Items in Detail
### 🔵 Fact Verification (H-1 ~ H-5)
---
#### H-1: File Path Existence Verification
**Purpose:** Verify that files/directories referenced by the agent actually exist
**Verification Commands:**
```bash
# macOS / Linux
stat <path>
ls -la <path>
# Existence check only (exit code based)
[ -e "<path>" ] && echo "PASS: $path exists" || echo "FAIL: $path not found"
# Batch check for multiple paths
for p in path1 path2 path3; do
[ -e "$p" ] && echo "✅ $p" || echo "❌ $p"
done
```
**Windows (PowerShell):**
```powershell
Test-Path "C:\path\to\check"
```
**Pass Criteria:** All referenced paths confirmed to exist via `stat`/`Test-Path` → PASS
---
#### H-2: Command/Binary Existence Verification
**Purpose:** Verify that CLI commands used in code or documentation are actually installed
**Verification Commands:**
```bash
# macOS / Linux
which <command>
command -v <command>
# Example: batch check for multiple binaries
for cmd in git node python3 docker jq; do
command -v "$cmd" &>/dev/null \
&& echo "✅ $cmd: $(which $cmd)" \
|| echo "❌ $cmd: not installed"
done
```
**Windows (PowerShell):**
```powershell
Get-Command <command> -ErrorAction SilentlyContinue
```
**Pass Criteria:** All CLI commands appearing in documents/code confirmed via `command -v` → PASS
---
#### H-3: URL Validity Check (Optional)
**Purpose:** Verify that links/API endpoints embedded in documentation are actually accessible
**Verification Commands:**
```bash
# Check HTTP status code (5-second timeout)
curl -sI --max-time 5 <url> | head -1
# Batch check script
urls=(
"https://example.com/api"
"https://docs.example.com"
)
for url in "${urls[@]}"; do
status=$(curl -sI --max-time 5 "$url" | head -1 | awk '{print $2}')
if [[ "$status" =~ ^[23] ]]; then
echo "✅ $url → HTTP $status"
else
echo "❌ $url → HTTP $status (or unreachable)"
fi
done
```
**Note:** False positive/negative possible depending on network environment. Manual verification recommended for internal network URLs.
**Pass Criteria:** External reference URLs respond with 2xx/3xx → PASS (optional execution)
---
#### H-4: Code Syntax Validity Check
**Purpose:** Verify that code generated by the agent is actually parseable
**Verification Commands:**
**Python:**
```bash
python3 -c "
import ast, sys
with open('target.py') as f:
src = f.read()
try:
ast.parse(src)
print('✅ Python syntax valid')
except SyntaxError as e:
print(f'❌ SyntaxError: {e}')
sys.exit(1)
"
```
**JavaScript/TypeScript:**
```bash
# Node.js
node --check target.js
# TypeScript
npx tsc --noEmit target.ts
```
**JSON:**
```bash
jq . target.json > /dev/null && echo "✅ JSON valid" || echo "❌ JSON parse failed"
```
**YAML:**
```bash
python3 -c "import yaml; yaml.safe_load(open('target.yaml'))" \
&& echo "✅ YAML valid" || echo "❌ YAML parse failed"
```
**Shell:**
```bash
bash -n target.sh && echo "✅ Shell syntax valid" || echo "❌ Shell syntax error"
```
**Pass Criteria:** All generated code files pass their respective language parsers → PASS
---
#### H-5: Numerical Data Cross-Verification
**Purpose:** Verify that statistics/numbers mentioned by the agent are substantiated
**Verification Method:**
```
Checklist:
□ Is the source (URL, paper, official docs) specified for the number?
□ Can the source be cross-verified with 2+ references?
□ Is the data current? (check date)
□ Is uncertainty appropriately expressed? ("approximately X%", "roughly Nx")
```
**Auto-detection Pattern (grep):**
```bash
# Detect unsourced number patterns
grep -En "[0-9]+%" <file> | grep -v "http\|source\|ref\|reference"
grep -En "[0-9]+(x|times)" <file> | grep -v "http\|source"
```
**Pass Criteria:** All numbers have cited sources or are labeled "needs verification:" → PASS
---
### 🟡 Consistency (H-6 ~ H-8)
---
#### H-6: No Self-Contradiction
**Purpose:** Verify there are no conflicting claims within the same document
**Verification Method:**
```bash
# Manual check for negation/affirmation pairs
grep -n "cannot\|impossible\|prohibited\|not available\|not supported" <file>
grep -n "possible\|supported\|available\|can be\|is able to" <file>
# Agent self-verification instruction
"""
Read the document below and list all pairs of contradicting claims.
If none exist, respond with "No self-contradiction found."
[document content]
"""
```
**Pass Criteria:** 0 conflicting claim pairs → PASS
---
#### H-7: Plan-Result Alignment
**Purpose:** 1:1 mapping to verify all initially promised deliverables were actually generated
**Verification Method:**
```bash
# Extract deliverable list (e.g., ## Deliverables section)
grep -A 20 "deliverable\|output\|result" PLAN.md
# Verify actual file existence
promised_files=(
"src/main.py"
"README.md"
"tests/test_main.py"
)
for f in "${promised_files[@]}"; do
[ -f "$f" ] && echo "✅ $f" || echo "❌ $f missing (promise not fulfilled)"
done
```
**Pass Criteria:** All deliverables specified in the plan actually exist → PASS
---
#### H-8: Terminology Consistency
**Purpose:** Verify that the same concept is called by the same name throughout the document
**Auto-detection Example:**
```bash
# Detect synonym mixing (customize as needed)
echo "=== 'user' related terms ==="
grep -oin "user\|customer\|client\|end-user\|end user" <file> | sort | uniq -c | sort -rn
echo "=== 'error' related terms ==="
grep -oin "error\|failure\|fault\|exception\|bug" <file> | sort | uniq -c | sort -rn
```
**Pass Criteria:** Core terms are not mixed with 2+ different names → PASS
(Intentional synonym usage must be stated in comments/definitions)
---
### 🟢 Completeness (H-9 ~ H-11)
---
#### H-9: No Remaining TODO/FIXME
**Purpose:** Verify that no incomplete markers remain
**Verification Commands:**
```bash
# Basic search
grep -rn "TODO\|FIXME\|HACK\|XXX\|TEMP\|BUG" <path>
# Count
count=$(grep -rn "TODO\|FIXME\|HACK\|XXX" <path> | wc -l)
if [ "$count" -eq 0 ]; then
echo "✅ H-9 PASS: No remaining markers"
else
echo "❌ H-9 FAIL: $count incomplete markers found"
grep -rn "TODO\|FIXME\|HACK\|XXX" <path>
fi
```
**Windows (PowerShell):**
```powershell
Select-String -Path ".\*" -Pattern "TODO|FIXME|HACK|XXX" -Recur