test

ClawSkills · Author: thomaslwang · v2.0.2

Runtime security plugin for AI agents. Provides local-first protection against data exfiltration, credential theft, command injection, and sensitive data leakage. Includes a free AI Security Gateway that sanitizes PII before it reaches LLM providers. Fully open source (Apache 2.0) — all detection and sanitization logic is auditable on GitHub. Source: github.com/openguardrails/openguardrails

Installation / Download

TotalClaw CLI (recommended)
totalclaw install clawskills:thomaslwang~og-openclawguard-test
cURL (direct download, no login required)
curl -fsSL https://skills.taituai.com/api/skills/clawskills%3Athomaslwang~og-openclawguard-test/file -o og-openclawguard-test.md
Git (fetch the source from the repository)
git clone https://github.com/openclaw/skills.git && git -C skills checkout 711b14fbeccb3306e8383911c360df2875f86dd6

# OpenGuardrails

> **What this skill is:** This is the installation and usage guide for the OpenGuardrails security plugin. The plugin code itself lives in the open-source repository at [github.com/openguardrails/openguardrails](https://github.com/openguardrails/openguardrails) (subdirectory `openclaw-security/`). This skill does not execute code on its own — it documents how to install, configure, and verify the plugin.
>
> **ClawHub ↔ GitHub identity:** This skill is published on ClawHub as [`ThomasLWang/openguardrails`](https://clawhub.ai/ThomasLWang/openguardrails). The upstream source is at [`github.com/openguardrails/openguardrails`](https://github.com/openguardrails/openguardrails), maintained by the same author (Thomas Wang). The npm package is [`@openguardrails/openclaw-security`](https://www.npmjs.com/package/@openguardrails/openclaw-security). All three point to the same codebase.

Runtime security guard for OpenClaw agents. Protects against the most critical AI agent threats:

- **Data exfiltration defense** — detects and blocks when an agent reads sensitive files then attempts to send them to external servers
- **Sensitive data leakage prevention** — sanitizes PII, credentials, and secrets before they reach LLM providers
- **Prompt injection protection** — identifies crafted inputs designed to hijack agent behavior
- **Command injection blocking** — catches shell escapes, backtick substitution, and command chaining in tool parameters
- **Content safety** — filters NSFW content and enforces minor protection policies

## Security & Trust

**Open source and auditable.** All code is Apache 2.0 licensed at [github.com/openguardrails/openguardrails](https://github.com/openguardrails/openguardrails). You can audit every line before installing — especially the tool-event hooks, sanitization logic, and network calls. Key files to review:

- `agent/sanitizer.ts` — what gets sanitized before any cloud transmission
- `agent/content-injection-scanner.ts` — local-only regex patterns for injection detection
- `gateway/src/sanitizer.ts` — AI Security Gateway sanitization (fully local)
- `index.ts` — plugin entry point showing all event hooks

**What is transmitted to the cloud API (and what is not):**

- **Sent:** sanitized tool metadata only — tool names, parameter keys, session signals (tool ordering, timing). All sensitive values (PII, credentials, file contents, secrets) are replaced with category placeholders (`<EMAIL>`, `<SECRET>`, `<CREDIT_CARD>`, etc.) locally before transmission.
- **Never sent:** raw file contents, user messages, conversation history, actual credential values, or any unsanitized parameter values.
- **Data retention:** Detection request payloads (sanitized tool metadata) are not retained after the response is returned. Account data is stored persistently for billing: agent ID and API key (created at registration in Step 2), plus email (provided by you in Step 3 via the claim web form), plan tier, and per-agent usage counts.
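
The placeholder substitution described above can be pictured with a short sketch. This is not the code in `agent/sanitizer.ts`; the rule list, the regular expressions, and the function names `sanitizeValue` and `sanitizeParams` are assumptions made for illustration, and real detection would need many more patterns.

```typescript
// Hypothetical sketch: NOT the code in agent/sanitizer.ts.
// The regexes below are illustrative assumptions and far from exhaustive.
type Rule = { placeholder: string; pattern: RegExp };

const RULES: Rule[] = [
  // Key-like secrets first, so later rules cannot match parts of them.
  { placeholder: "<SECRET>", pattern: /\bsk-[A-Za-z0-9_-]{16,}\b/g },
  { placeholder: "<EMAIL>", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  // 13 to 16 digits, optionally separated by single spaces or hyphens.
  { placeholder: "<CREDIT_CARD>", pattern: /\b\d(?:[ -]?\d){12,15}\b/g },
];

/** Replace sensitive values in a string with category placeholders. */
function sanitizeValue(value: string): string {
  return RULES.reduce((text, rule) => text.replace(rule.pattern, rule.placeholder), value);
}

/** Sanitize every top-level string parameter; keys and non-strings are kept as-is. */
function sanitizeParams(params: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(params)) {
    out[key] = typeof value === "string" ? sanitizeValue(value) : value;
  }
  return out;
}
```

With these assumed rules, `sanitizeParams({ to: "bob@example.com" })` yields `{ to: "<EMAIL>" }`. Nested objects are not traversed in this sketch.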

**Local-only mode.** The plugin works without any cloud connection. Local fast-path detection (shell escape blocking, read-then-exfil patterns, content injection redaction) operates entirely on your machine with no network calls. Cloud assessment is only used for borderline behavioral patterns and is opt-in via registration. If you skip registration, you still get all local protections.

**No install-time network calls.** The plugin makes zero network requests at install time. It loads a local `BehaviorDetector` and waits until you explicitly run `/og_activate`.

**Registration sends minimal data.** The `/og_activate` command sends exactly `{ name, description }` to register — no email, no machine identifiers, no file paths, no user data. Your email is only collected later in Step 3, when you voluntarily visit the claim URL in your browser and enter it yourself. See the "Step 2: Register" section for the full request/response specification.

**Fail-open design.** If the cloud API is unreachable or times out, tool calls are allowed — the plugin never blocks your workflow due to network issues.
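
As a rough illustration of fail-open behavior, the sketch below wraps a cloud call so that errors and timeouts both resolve to "allow". The `Verdict` type, the `assessFailOpen` name, and the injected `assess` callback are hypothetical; the plugin's actual implementation may be structured differently.

```typescript
// Hypothetical sketch of a fail-open wrapper; not the plugin's actual code.
type Verdict = "allow" | "block";

/**
 * Run a cloud assessment with a time limit. Any error or timeout resolves
 * to "allow", so a network problem never blocks the tool call.
 */
async function assessFailOpen(
  assess: () => Promise<Verdict>, // assumed cloud call, injected by the caller
  timeoutMs: number,
): Promise<Verdict> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<Verdict>((resolve) => {
    timer = setTimeout(() => resolve("allow"), timeoutMs);
  });
  try {
    // Whichever settles first wins: the assessment or the timeout.
    return await Promise.race([assess(), timeout]);
  } catch {
    return "allow"; // unreachable API, malformed response, and so on
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

The trade-off is deliberate: availability of the agent is preferred over enforcement when the cloud side cannot be reached, while the local fast path still applies.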

**This skill is documentation-only.** This SKILL.md file contains no executable code. It is an installation and usage guide for the open-source plugin at [github.com/openguardrails/openguardrails](https://github.com/openguardrails/openguardrails). You are encouraged to audit the source before installing.

---

## How It Works

The plugin hooks into `before_tool_call`, `after_tool_call`, and `tool_result_persist` events. It classifies every tool call in real time — tracking sensitive file reads, external network calls, credential access, and shell commands across the session. Critical patterns are blocked instantly on the local fast path; borderline signals are assessed via the cloud API.

Additionally, when the agent reads files or fetches web pages, the plugin scans the content for injection patterns across 7 categories and **automatically redacts** them in-place before the agent processes the content.
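
In-place redaction of this kind might look like the following sketch. The two patterns and the marker text are assumptions for illustration only; they do not reproduce the seven categories or the rules in `agent/content-injection-scanner.ts`.

```typescript
// Hypothetical sketch: the patterns and the marker text are assumptions,
// not the rules in agent/content-injection-scanner.ts.
const INJECTION_PATTERNS: RegExp[] = [
  /ignore (?:all |any )?(?:previous|prior|above) instructions[^.\n]*/gi,
  /you are now (?:in )?(?:developer|unrestricted) mode[^.\n]*/gi,
];

const REDACTION_MARKER = "[REDACTED: possible prompt injection]";

/** Return the content with suspicious spans replaced, plus a hit count. */
function redactInjections(content: string): { text: string; hits: number } {
  let hits = 0;
  let text = content;
  for (const pattern of INJECTION_PATTERNS) {
    // The replacer callback runs once per match, so it doubles as a counter.
    text = text.replace(pattern, () => {
      hits += 1;
      return REDACTION_MARKER;
    });
  }
  return { text, hits };
}
```

The agent then sees the marker instead of the injected instruction, and the hit count can feed an alert.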

```
Agent calls tool
      ↓
[openguardrails] classifies tool + updates session state
  • sensitive file read → then network call?      → BLOCK (local)
  • shell escape in params ($(), backtick)?        → BLOCK (local)
  • prompt injection in file/web content?          → REDACT in-place (local)
  • credential access + low intent overlap?        → assess via cloud
  • external domains + intent mismatch?            → assess via cloud
      ↓
Allow, redact, alert, or block — with explanation returned to agent
```
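
The two local BLOCK rules in the diagram can be sketched as a small stateful checker. Everything here is hypothetical: the tool names `read_file` and `http_request`, the path patterns, and the `SessionGuard` class are stand-ins, and the plugin's `BehaviorDetector` may work quite differently.

```typescript
// Hypothetical sketch of the local fast path. Tool names, path patterns, and
// the parameter shape are assumptions; the real BehaviorDetector may differ.
type ToolCall = { tool: string; params: Record<string, string> };
type Decision = { action: "allow" | "block"; reason?: string };

const SENSITIVE_PATH = /(?:\.ssh\/|\.aws\/credentials|\.env\b|id_rsa)/;
const SHELL_ESCAPE = /\$\(|`/; // $(...) or backtick substitution

class SessionGuard {
  private sensitiveRead = false; // set once a sensitive file has been read

  check(call: ToolCall): Decision {
    const values = Object.values(call.params);

    if (values.some((v) => SHELL_ESCAPE.test(v))) {
      return { action: "block", reason: "shell escape in parameters" };
    }
    if (call.tool === "read_file" && values.some((v) => SENSITIVE_PATH.test(v))) {
      this.sensitiveRead = true; // remember for later calls in this session
      return { action: "allow" };
    }
    if (call.tool === "http_request" && this.sensitiveRead) {
      return { action: "block", reason: "network call after sensitive file read" };
    }
    return { action: "allow" };
  }
}
```

The point of the sketch is that the decision depends on session history: the same network call is allowed before a sensitive read and blocked after it.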

No content is sent externally until the plugin is registered and activated.

---

## Step 1: Install

Install from npm or directly from source:

```bash
# From npm
openclaw plugins install @openguardrails/openclaw-security

# From source (audit first, then install)
git clone https://github.com/openguardrails/openguardrails.git
openclaw plugins install -l ./openguardrails/openclaw-security
```

**Installation completes immediately — no network calls are made at install time.** The plugin loads, creates a `BehaviorDetector`, and waits.

Verify it's loaded:
```bash
/og_status
```

Expected output when not yet registered:
```
OpenGuardrails Status

- Status:    not registered — run `/og_activate` to register
- Platform:  https://www.openguardrails.com/core

- blockOnRisk: true
```

---

## Step 2: Register (optional — local-only mode works without this)

Registration is triggered by running `/og_activate`. It enables cloud-based behavioral assessment on top of the local protections you already have.

### What the registration request sends

The plugin calls `POST /api/v1/agents/register` with exactly two fields:

```json
{ "name": "OpenClaw Agent", "description": "" }
```

That's it — an agent display name and an optional description. No machine identifiers, no file paths, no user data. See `agent/config.ts:65-68` in the source.
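
For readers who want to picture the request, here is a minimal sketch of how such a two-field payload could be built. The `buildRegisterRequest` helper, the caller-supplied base URL, and the `Content-Type` header are assumptions; the authoritative version is the code in `agent/config.ts`.

```typescript
// Hypothetical sketch; see agent/config.ts in the source for the real call.
// Joining the path onto a caller-supplied base URL is an assumption.
type RegisterBody = { name: string; description: string };

function buildRegisterRequest(baseUrl: string, body: RegisterBody) {
  // Copy only the two documented fields so nothing else can leak in.
  const payload: RegisterBody = { name: body.name, description: body.description };
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/api/v1/agents/register`,
    method: "POST" as const,
    headers: { "Content-Type": "application/json" }, // assumed header
    body: JSON.stringify(payload),
  };
}
```

Building the payload from an explicit field list, rather than serializing whatever object is passed in, is one way to keep the "exactly two fields" property easy to audit.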

### What gets stored locally

The response is saved to `~/.openclaw/credentials/openguardrails/credentials.json`:

```json
{
  "apiKey": "sk-og-<32 hex chars>",
  "agentId": "<uuid>",
  "claimUrl": "https://www.openguardrails.com/core/claim/<token>",
  "verificationCode": "word-XXXX"
}
```

These credentials are generated server-side and stored as plaintext JSON on your machine, much as the AWS CLI keeps access keys in plaintext under `~/.aws/credentials`. The `apiKey` authenticates subsequent detection requests. You can revoke it at any time from the account portal, or remove it from this machine by deleting the credentials file.
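
If you script against this file, a quick shape check can catch a truncated or hand-edited copy. The sketch below is an assumption-laden example: the field names follow the JSON shown above, but the `parseCredentials` helper and its validation rules (including accepting both upper- and lower-case hex) are hypothetical and not part of the plugin.

```typescript
// Hypothetical sketch: field names follow the documented JSON example;
// the validation rules themselves are assumptions.
type Credentials = {
  apiKey: string;
  agentId: string;
  claimUrl: string;
  verificationCode: string;
};

const API_KEY_FORMAT = /^sk-og-[0-9a-fA-F]{32}$/;

/** Parse credentials.json content; return null if it is malformed. */
function parseCredentials(json: string): Credentials | null {
  let data: unknown;
  try {
    data = JSON.parse(json);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;
  const { apiKey, agentId, claimUrl, verificationCode } = data as Record<string, unknown>;
  if (
    typeof apiKey !== "string" || !API_KEY_FORMAT.test(apiKey) ||
    typeof agentId !== "string" ||
    typeof claimUrl !== "string" ||
    typeof verificationCode !== "string"
  ) {
    return null;
  }
  return { apiKey, agentId, claimUrl, verificationCode };
}
```

The function takes the file content as a string, so reading the file (and checking its permissions) is left to the caller.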

### Run registration

```bash
/og_activate
```

If the platform is reachable, you'll see:

```
OpenGuardrails: Claim Your Agent

Agent ID: <uuid>

Complete these steps to activate behavioral detection:

  1. Visit:  https://www.openguardrails.com/core/claim/<token>
  2. Code:   <word-XXXX>  (e.g. reef-X4B2)
  3. Email:  your email becomes your login for the account portal

After claiming you get 30,000 free detections.
Platform: https://www.openguardrails.com/core
```

### Using an existin