Phy Deserialization Audit
Unsafe deserialization vulnerability scanner (OWASP A08:2021). Detects Python pickle/yaml/eval, Java ObjectInputStream/XStream/XMLDecoder, PHP unserialize, Ruby Marshal.load, Node.js eval/new Function/vm, Go gob with interface{}. Traces HTTP input sources to dangerous sinks, classifies CRITICAL/HIGH/MEDIUM, outputs CWE/CVE mappings and per-language fix snippets. Zero competitors on ClawHub.
Install / Download
TotalClaw CLI (recommended)
totalclaw install skilldb:phy041~phy-deserialization-audit
cURL direct download (no login required)
curl -fsSL https://skills.taituai.com/api/skills/skilldb%3Aphy041~phy-deserialization-audit/file -o phy-deserialization-audit.md
Get the source from the Git repository
git clone https://github.com/openclaw/skills/commit/010ef62435d955ef630ba1baffc77603445fcce2
# phy-deserialization-audit
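Static scanner for insecure deserialization vulnerabilities, which fall under **OWASP A08:2021 (Software and Data Integrity Failures)** and were listed separately as A8:2017 Insecure Deserialization. It covers Python, Java, PHP, Ruby, Node.js/TypeScript, and Go codebases. No API keys, no network calls, no dependencies beyond the Python 3 standard library.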
## What It Detects
### Python
| Pattern | Severity | CVE/CWE |
|---------|----------|---------|
| `pickle.loads(user_data)` | CRITICAL | CWE-502 |
| `pickle.load(untrusted_file)` | CRITICAL | CWE-502 |
| `yaml.load(data)` without SafeLoader | HIGH | CVE-2017-18342 |
| `yaml.full_load()` / `yaml.unsafe_load()` | CRITICAL | CVE-2017-18342 |
| `jsonpickle.decode(input)` | CRITICAL | CWE-502 |
| `marshal.loads(data)` | HIGH | CWE-502 |
| `eval(user_input)` / `exec(user_input)` | CRITICAL | CWE-95 |
| `shelve.open(user_controlled_path)` | HIGH | CWE-502 |
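
To illustrate why `pickle.loads` on untrusted bytes is rated CRITICAL, here is a minimal sketch with a deliberately harmless payload. Unpickling calls whatever callable an object's `__reduce__` returns; an attacker would substitute something like `os.system`. The `Payload` class and the `parse_untrusted` helper are hypothetical names used only for this illustration and are not part of the scanner.

```python
import json
import pickle


class Payload:
    """Hypothetical attacker-controlled object (harmless in this sketch)."""

    def __reduce__(self):
        # pickle records "call len('pwned')" and performs the call on load.
        return (len, ("pwned",))


blob = pickle.dumps(Payload())
# The callable chosen by the payload runs inside pickle.loads itself.
result = pickle.loads(blob)


def parse_untrusted(raw: bytes) -> dict:
    """Safer alternative: JSON can only produce plain data types."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data
```

Note that `result` is the return value of the callable, not a `Payload` instance: the deserializer ran code selected by the data. `json.loads` has no equivalent hook, which is why the fix snippets recommend it together with schema validation.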
### Java
| Pattern | Severity | CVE/CWE |
|---------|----------|---------|
| `new ObjectInputStream(...).readObject()` | CRITICAL | CWE-502, gadget chains |
| `XStream.fromXML(userInput)` | CRITICAL | CVE-2021-29505 |
| `new XMLDecoder(inputStream)` | CRITICAL | CWE-502 |
| `ObjectMapper.readValue(input, Object.class)` | HIGH | CVE-2017-7525 (Jackson polymorphic) |
| `Serializable` class with `readObject()` override | HIGH | CWE-502 |
| `new ObjectMapper().enableDefaultTyping()` | HIGH | CVE-2017-7525 |
### PHP
| Pattern | Severity | CVE/CWE |
|---------|----------|---------|
| `unserialize($userInput)` | CRITICAL | CWE-502, POP chains |
| `unserialize($_GET[...])` / `unserialize($_POST[...])` | CRITICAL | CWE-502 |
| `unserialize($_COOKIE[...])` | CRITICAL | CWE-502 |
| `unserialize(base64_decode(...))` | HIGH | CWE-502 |
### Ruby
| Pattern | Severity | CVE/CWE |
|---------|----------|---------|
| `Marshal.load(user_input)` | CRITICAL | CWE-502 |
| `YAML.load(user_input)` (Psych < 4.0 default) | HIGH | CVE-2013-0156 |
| `JSON.load(input)` (bypasses safe defaults) | MEDIUM | CWE-502 |
### Node.js / TypeScript
| Pattern | Severity | CVE/CWE |
|---------|----------|---------|
| `eval(req.body.*)` or `eval(req.params.*)` | CRITICAL | CWE-95 |
| `new Function(userInput)` | CRITICAL | CWE-95 |
| `vm.runInContext(userInput, ...)` | HIGH | CWE-94 |
| `new vm.Script(userInput).runIn*` | HIGH | CWE-94 |
| `require(userControlledPath)` | HIGH | CWE-706 |
| `child_process.exec(unsanitizedInput)` | CRITICAL | CWE-78 (adjacent) |
### Go
| Pattern | Severity | CVE/CWE |
|---------|----------|---------|
| `gob.NewDecoder(conn).Decode(&interface{})` | HIGH | CWE-502 |
| `encoding/xml.Unmarshal` with `interface{}` target | MEDIUM | CWE-502 |
| `json.Unmarshal` into `interface{}` then unsafe cast | MEDIUM | CWE-20 |
## Taint Flow Logic
The scanner uses a two-pass approach:
**Pass 1 — Dangerous sink detection:** Find all pattern matches per file.
**Pass 2 — HTTP input proximity check:** Within the same function block (±40 lines), look for HTTP input markers:
- Python: `request.body`, `request.data`, `request.POST`, `request.GET`, `flask.request`, `request.json`
- Java: `HttpServletRequest`, `@RequestBody`, `@RequestParam`, `getParameter(`, `getInputStream(`
- PHP: `$_GET`, `$_POST`, `$_REQUEST`, `$_COOKIE`, `$_FILES`, `file_get_contents("php://input")`
- Ruby: `params[`, `request.body`, `JSON.parse(request.body)`
- Node.js: `req.body`, `req.params`, `req.query`, `req.headers`, `request.body`
- Go: `r.Body`, `r.URL.Query()`, `r.FormValue(`
If an HTTP input marker is found near the sink, the finding keeps its base severity (**CRITICAL** or **HIGH**).
If no HTTP input marker is visible, the finding is downgraded one level and reported as informational, with the note: *"Verify data source"*
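
The proximity check in Pass 2 can be sketched roughly as follows. This is an illustrative approximation, not the scanner's actual code: `HTTP_MARKERS` covers only a subset of the Python markers listed above, the helper names are hypothetical, and "downgrade one level" is interpreted here as one step down a CRITICAL > HIGH > MEDIUM > INFO ordering, which is an assumption.

```python
import re
from typing import List

# Hypothetical subset of the Python HTTP input markers listed above.
HTTP_MARKERS = re.compile(r"request\.(?:body|data|POST|GET|json)|flask\.request")
# Assumed ordering, from most to least severe.
SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "INFO"]
WINDOW = 40  # lines inspected on each side of the sink


def has_http_taint(lines: List[str], sink_index: int, window: int = WINDOW) -> bool:
    """Return True if any line within +/- window of the sink mentions HTTP input."""
    start = max(0, sink_index - window)
    end = min(len(lines), sink_index + window + 1)
    return any(HTTP_MARKERS.search(line) for line in lines[start:end])


def effective_severity(base: str, tainted: bool) -> str:
    """Keep the base severity when tainted, otherwise step down one level."""
    if tainted:
        return base
    idx = SEVERITY_ORDER.index(base)
    return SEVERITY_ORDER[min(idx + 1, len(SEVERITY_ORDER) - 1)]


source = [
    "def handler(request):",
    "    raw = request.data",
    "    return pickle.loads(raw)",
]
tainted = has_http_taint(source, sink_index=2)
severity = effective_severity("CRITICAL", tainted)
```

A line window is a cheap stand-in for real data-flow analysis: it can miss input that is read in another function and can flag sinks that merely sit near unrelated request handling, which is why untainted findings carry the "Verify data source" note instead of being dropped.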
**Safe patterns (excluded):**
- `yaml.safe_load(...)` — OK
- `yaml.load(data, Loader=yaml.SafeLoader)` — OK
- `pickle.loads(STATIC_BYTES)` where argument is a literal — OK
- `eval("1 + 2")` with string literal — OK
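
One way the exclusions above could work is sketched below. The `YAML_UNSAFE` regex is the one used by the `YAML_UNSAFE_LOAD` entry in the implementation; `CODE_SINK`, `LITERAL_ONLY`, and `is_flagged` are hypothetical names introduced for this example, and the literal-argument check is an assumption about how such a filter might look.

```python
import re

# Same regex as YAML_UNSAFE_LOAD: the negative lookahead skips SafeLoader calls.
YAML_UNSAFE = re.compile(r'\byaml\.(?:load|full_load|unsafe_load)\s*\((?![^)]*SafeLoader)')

# Hypothetical sink regex for eval/exec/pickle calls.
CODE_SINK = re.compile(r'\b(?:eval|exec|pickle\.loads?)\s*\(')

# Hypothetical filter: the whole argument is a single string or bytes literal.
LITERAL_ONLY = re.compile(
    r'''\b(?:eval|exec|pickle\.loads?)\s*\(\s*[bB]?(["'])(?:(?!\1).)*\1\s*\)'''
)


def is_flagged(line: str) -> bool:
    """Return True when the line contains a sink that no safe pattern excludes."""
    if YAML_UNSAFE.search(line):
        return True
    if CODE_SINK.search(line) and not LITERAL_ONLY.search(line):
        return True
    return False


examples = [
    "yaml.safe_load(data)",
    "yaml.load(data, Loader=yaml.SafeLoader)",
    'eval("1 + 2")',
    "yaml.load(data)",
    "eval(user_input)",
]
flags = [is_flagged(line) for line in examples]
```

The literal check is intentionally conservative: a literal concatenated with a variable, or a literal containing an escaped quote, does not match `LITERAL_ONLY` and is therefore still reported.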
## Implementation
```python
#!/usr/bin/env python3
"""
phy-deserialization-audit — OWASP A08:2021 scanner
Usage: python3 audit_deserial.py [path] [--json] [--ci]
"""
import argparse
import json
import os
import re
import sys
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional
# ─── Severity ────────────────────────────────────────────────────────────────
CRITICAL, HIGH, MEDIUM, INFO = "CRITICAL", "HIGH", "MEDIUM", "INFO"


@dataclass
class Finding:
    file: str
    line: int
    pattern_name: str
    matched_text: str
    severity: str
    cwe: str
    cve: Optional[str]
    description: str
    fix: str
    has_http_taint: bool = False

# ─── Pattern registry ────────────────────────────────────────────────────────
# (pattern_name, regex, base_severity, cwe, cve, description, fix)
PATTERNS = {
    ".py": [
        ("PICKLE_LOADS",
         re.compile(r'\bpickle\.loads?\s*\('),
         CRITICAL, "CWE-502", None,
         "pickle.load/loads deserializes arbitrary Python objects; this allows remote code execution if the input is user-controlled.",
         "Never deserialize user input with pickle. Use json.loads() + schema validation (Pydantic/marshmallow)."),
        ("YAML_UNSAFE_LOAD",
         re.compile(r'\byaml\.(?:load|full_load|unsafe_load)\s*\((?![^)]*SafeLoader)'),
         HIGH, "CWE-502", "CVE-2017-18342",
         "yaml.load() without Loader=yaml.SafeLoader can execute arbitrary Python code embedded in YAML.",
         "Replace with yaml.safe_load(data) or yaml.load(data, Loader=yaml.SafeLoader)."),
        ("JSONPICKLE_DECODE",
         re.compile(r'\bjsonpickle\.decode\s*\('),
         CRITICAL, "CWE-502", None,
         "jsonpickle.decode() restores full Python object graphs, which allows arbitrary code execution.",
         "Do not use jsonpickle for untrusted input. Use json.loads() + strict schema."),
        ("MARSHAL_LOADS",
         re.compile(r'\bmarshal\.loads?\s*\('),
         HIGH, "CWE-502", None,
         "marshal is not intended to be secure against malicious data; never unmarshal untrusted input.",
         "Replace with json.loads() and validate schema."),
        ("EVAL_EXEC",
         # The lookbehind skips method calls such as model.eval() and names like literal_eval().
         re.compile(r'(?<![\w.])(?:eval|exec)\s*\('),
         CRITICAL, "CWE-95", None,
         "eval()/exec() with user-controlled input leads to arbitrary code execution.",
         "Remove eval/exec. Use ast.literal_eval() for safe literal evaluation, or a proper parser."),
        ("SHELVE_OPEN",
         re.compile(r'\bshelve\.open\s*\('),
         HIGH, "CWE-502", None,
         "shelve uses pickle internally, so a user-controlled path carries both path traversal and deserialization risk.",
         "Ensure path is never user-controlled; prefer a proper database or JSON store."),
    ],
    ".java": [
        ("OBJECT_INPUT_STREAM",
         re.compile(r'\bnew\s+ObjectInputStream\s*\('),
         CRITICAL, "CWE-502", None,
         "Java deserialization via ObjectInputStream enables gadget-chain attacks (e.g., Apache Commons Collections, Spring).",
         "Use a type-validating ObjectInputStream wrapper (e.g., Apache Commons ValidatingObjectInputStream) or replace with JSON/Protobuf."),
        ("XSTREAM_FROM_XML",
         re.compile(r'\.fromXML\s*\('),
         CRITICAL, "CWE-502", "CVE-2021-29505",
         "XStream.fromXML() can execute arbitrary code via crafted XML.",
         "Upgrade to XStream 1.4.18 or later and configure an allowlist: call xstream.addPermission(NoTypePermission.NONE), then allow only the expected types (e.g., xstream.allowTypes(...))."),
        ("XML_DECODER",
         re.compile(r'\bnew\s+XMLDecoder\s*\('),
         CRITICAL, "CWE-502", None,
         "XMLDecoder can instantiate arbitrary Java objects from XML, which allows remote code execution.",
         "Never use XMLDecoder with untrusted input. Use a JSON parser or JAXB with an allowlist."),
        ("JACKSON_OBJECT_CLASS",
         re.compile(r'\.readValue\s*\([^,]+,\s*Object\.class\s*\)'),
         HIGH, "CWE-502", "CVE-2017-7525",
         "Jackson readValue to Object.class enables polymorphic deserialization attacks.",
         "Always specify a concrete class: mapper.readValue(json, MyDto.class)."),
        ("JACKSON_DEFAULT_TYPING",
         re.compile(r'\.enableDefaultTyping\s*\('),
         HIGH, "CWE-502", "CVE-2017-7525",
         "enableDefaultTyping() allows arbitrary class instantiation via the @class field.",