Agent Compliance & Security Assessment

TotalClaw 作者 [object Object] v2.0.0

AI代理全面的合规性和安全性自评估。生成结构化威胁模型 + 合规报告的 10 检查框架在安全、治理和欧盟人工智能法案方面获得红色/琥珀色/绿色评级准备域。专为 2026 年 8 月欧盟人工智能法案截止日期而设计。

安装 / 下载方式

TotalClaw CLI推荐

totalclaw install totalclaw:totalclaw~roosch269-agent-self-assessment

cURL直接下载，无需登录

curl -fsSL https://skills.taituai.com/api/skills/totalclaw%3Atotalclaw~roosch269-agent-self-assessment/file -o roosch269-agent-self-assessment.md

## 概述（中文）

AI代理全面的合规性和安全性自评估。
生成结构化威胁模型 + 合规报告的 10 检查框架
在安全、治理和欧盟人工智能法案方面获得红色/琥珀色/绿色评级
准备域。专为 2026 年 8 月欧盟人工智能法案截止日期而设计。

## 原文

# Agent Compliance & Security Assessment v2.0

**Free. Open. Run it yourself.**

One command tells you where your agent stands on security and EU AI Act compliance. 10 checks, 3 domains, RAG-rated report.

> **How to activate:** Tell your agent: *"Read SKILL.md and run the agent compliance assessment"*

**10 checks across 3 domains:**
- 🔒 **Security** (Checks 1–6): Decision boundaries, audit trail, credentials, plane separation, economic accountability, memory safety
- 🏛️ **EU AI Act** (Checks 7–9): Transparency (Art. 50), risk classification (Art. 6), human oversight (Art. 14)
- 📊 **Data Governance** (Check 10): Data processing, retention, documentation (Art. 10, 12)

> **EU AI Act deadline: 2 August 2026.** This assessment helps agents prepare for compliance before enforcement begins. 4.5 months remaining.

---

## What This Skill Does

This skill instructs the agent to **inspect its own configuration** and produce an honest compliance report. It reads local files, checks environment structure, and reviews tool/skill setup.

## What This Skill Does NOT Do

- ❌ Does not exfiltrate credentials, keys, or secrets
- ❌ Does not send data to external servers
- ❌ Does not modify any files or configuration
- ❌ Does not install software or change system state
- ❌ All checks are **read-only inspection** of the agent's own setup

The shell command examples below are **guidance for what to verify** — the agent should adapt them to its own environment. Results stay local in the generated report.

---

## How to Run

When invoked, perform the following ten checks against your **actual current configuration** — not hypothetically. Use file reads, environment inspection, and tool introspection. Then output the report.

**Do not skip checks.** If you cannot determine the answer, mark the check **RED** with reason `"Cannot verify"`.

---

# 🔒 SECURITY DOMAIN (Checks 1–6)

## Check 1: Decision Boundaries

**Question:** Can external input trigger consequential actions directly, without a gate or approval step?

**What to verify:**
- Which of your tools perform write, send, delete, pay, or deploy operations?
- Is there a human-in-the-loop gate before any of these fire?
- Can an incoming message cause a consequential action without a gate?
- Are decision boundaries documented (e.g., in AGENTS.md or a policy file)?

**Scoring:**
- 🟢 GREEN — All consequential actions require explicit gate; boundaries documented
- 🟡 AMBER — Gates exist but not all paths covered, or documentation missing
- 🔴 RED — Direct ingress → action path exists with no gate; or cannot verify

---

## Check 2: Audit Trail

**Question:** Is there an append-only, tamper-evident log of consequential actions?

**What to verify:**
- Does an audit log file or directory exist?
- Is it append-only (NDJSON or similar structured format)?
- Does each entry include: timestamp, action type, actor, target, summary?
- Is there hash chaining or integrity verification?
- Is the log actively being written to (check recency of last entry)?

**Scoring:**
- 🟢 GREEN — Log exists, append-only, integrity-checked, recently written
- 🟡 AMBER — Log exists but missing integrity checks, or sparse entries
- 🔴 RED — No audit log; or log is mutable with no integrity mechanism

---

## Check 3: Credential Scoping

**Question:** Are secrets scoped to their domain? Can a credential for domain A be accessed by domain B?

**What to verify:**
- Are credentials stored in environment variables or encrypted keystores (not hardcoded)?
- Is each credential documented with its intended scope?
- Are any credentials shared across unrelated services?
- Are credential files properly permission-restricted (not world-readable)?

**Scoring:**
- 🟢 GREEN — Each credential scoped to one domain; inventory documented; files permission-restricted
- 🟡 AMBER — Credentials present but not fully documented; minor scope ambiguity
- 🔴 RED — Cross-domain credentials; credentials in plaintext or world-readable files; no inventory

---

## Check 4: Plane Separation

**Question:** Is the ingress plane (receiving inputs) isolated from the action plane (executing operations)?

**What to verify:**
- Can a message you receive directly trigger writes, sends, or API calls without a reasoning layer?
- Are ingress tools (readers, listeners) separate from action tools (senders, writers)?
- Is there a documented separation policy?
- Does untrusted content (e.g., prompt injection in messages) have a path to trigger actions?

**Scoring:**
- 🟢 GREEN — Ingress and Action planes explicitly separated; injection mitigated; policy documented
- 🟡 AMBER — Separation mostly in place but some shared paths or no explicit policy
- 🔴 RED — Ingress → Action with no separation; injection in untrusted content can trigger actions

---

## Check 5: Economic Accountability

**Question:** Are financial operations traceable, receipted, and bounded?

**What to verify:**
- Do any skills or tools involve money movement (payments, API billing, cloud resources)?
- Is there a spending limit or budget cap configured?
- Does every payment produce a settlement receipt in the audit log?
- Is there escrow for agent-to-agent commerce?
- Can the agent autonomously spend without any ceiling?

**Scoring:**
- 🟢 GREEN — Spending limits set; transactions receipted; escrow used for agent-to-agent; accountability clear
- 🟡 AMBER — Payments possible but missing receipts, no spending cap, or no escrow
- 🔴 RED — Unbounded autonomous spending; no receipts; no accountability mechanism

---

## Check 6: Memory Safety

**Question:** Is agent memory isolated from untrusted imports? Can external content corrupt agent state?

**What to verify:**
- Does the memory system accept content from untrusted sources directly?
- Are imported artifacts provenance-tracked (source, timestamp, hash)?
- Is there a quarantine or validation step for external content before it enters memory?
- Are memory files scanned for embedded prompt injection?

**Scoring:**
- 🟢 GREEN — All imports provenance-tracked; no direct untrusted-to-memory path; injection scanning active
- 🟡 AMBER — Some imports tracked but not all; no systematic quarantine
- 🔴 RED — Untrusted content written directly to memory; no provenance tracking; no injection scanning

---

# 🏛️ EU AI ACT READINESS (Checks 7–9)

*Reference: Regulation (EU) 2024/1689 — applicable from 2 August 2026*

## Check 7: Transparency (Article 50)

**Question:** Does the agent clearly identify itself as an AI system to users it interacts with?

**What to verify:**
- When the agent posts messages, comments, or content — does it disclose it is AI-operated?
- Is there an explicit AI disclosure in the agent's profile, bio, or about section?
- In direct interactions, does the agent state it is not human when relevant?
- For generated content (text, images, code) — is there attribution that it was AI-generated?
- Is there a documented transparency policy?

**EU AI Act reference:**
> Article 50(1): Providers shall ensure that AI systems intended to interact directly with natural persons are designed and developed in such a way that the natural persons concerned are informed that they are interacting with an AI system.

**Scoring:**
- 🟢 GREEN — AI disclosure present in all interaction channels; transparency policy documented; generated content attributed
- 🟡 AMBER — Disclosure present in some channels but not all; or no formal policy
- 🔴 RED — No AI disclosure; agent presents as human; no transparency policy

---

## Check 8: Risk Classification (Articles 6, 9)

**Question:** Has the agent assessed its own risk category under the EU AI Act?

**What to verify:**
- Is the agent's risk category documented? (Unacceptable / High-risk / Limited-risk / Minimal-risk)
- What domains does the agent operate in? (Employment, finance, law enforcement, education, critical infr