Phy Path Traversal Audit
Path traversal and Local File Inclusion (LFI) vulnerability scanner (OWASP A01:2021). Detects user-controlled paths passed to file system sinks in Python/Java/PHP/Node.js/Go/Ruby without containment checks. Identifies missing os.path.abspath+startswith, realpath validation, basename stripping, and PHP include/require with user input. Outputs CWE-22/CWE-23 findings with HTTP taint analysis and per-language safe-path-handling code snippets. Zero competitors on ClawHub.
Installation / Download
TotalClaw CLI (recommended)
`totalclaw install skilldb:phy041~phy-path-traversal-audit`
cURL direct download, no login required
`curl -fsSL https://skills.taituai.com/api/skills/skilldb%3Aphy041~phy-path-traversal-audit/file -o phy-path-traversal-audit.md`
Get the source from the Git repository
`git clone https://github.com/openclaw/skills/commit/085e75d942d082f0c31550bd262d3712988e6056`
# phy-path-traversal-audit
Static scanner for **OWASP A01:2021 (Broken Access Control / Path Traversal)** (CWE-22) and **Local File Inclusion** (CWE-98). It finds file system sinks that accept user-controlled paths, checks for missing containment guards, and flags PHP `include`/`require` calls that take user input. It makes no external API calls and has no dependencies beyond the Python 3 standard library.
## What Is Path Traversal?
An attacker passes `../../etc/passwd` or `..%2F..%2Fetc%2Fshadow` as a filename parameter. Without validation, your code reads arbitrary files outside the intended base directory. With PHP `include`, it can lead to Remote Code Execution.
**Classic exploit:**
```
GET /api/files?path=../../etc/passwd HTTP/1.1
```
If your handler does `open("uploads/" + request.args["path"])`, the attacker can read `/etc/passwd`.
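To make the exploit concrete, the sketch below shows where a naive concatenation ends up once the parameter is percent-decoded. It uses `posixpath` so the result does not depend on the platform. The base directory `/var/uploads` and the helper name `naive_target` are illustrative assumptions, not part of the scanner.

```python
import posixpath
from urllib.parse import unquote


def naive_target(base: str, raw_param: str) -> str:
    """Return the path a naive handler would open (hypothetical helper)."""
    # Web frameworks usually percent-decode query values before the handler runs.
    decoded = unquote(raw_param)
    # Mirrors open(base + "/" + param): the path is joined with no containment check.
    return posixpath.normpath(posixpath.join(base, decoded))


for param in ("report.txt", "../../etc/passwd", "..%2F..%2Fetc%2Fshadow"):
    print(param, "->", naive_target("/var/uploads", param))
```

Each `../` segment cancels one directory of the base, so two segments are enough to leave a base that is two levels deep.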
## What It Detects
### Python
| Pattern | Severity | Notes |
|---------|----------|-------|
| `open(user_path)` | CRITICAL | Direct file read with user path |
| `open(os.path.join(base, user_input))` without `abspath`+`startswith` | HIGH | Join doesn't sanitize `../` |
| `pathlib.Path(user_path).read_text()` | CRITICAL | pathlib doesn't sanitize traversal |
| `Path(base).joinpath(user_input)` without `.resolve().is_relative_to(base)` | HIGH | |
| `os.listdir(user_path)` | HIGH | Directory listing disclosure |
| `os.open(user_path, os.O_RDONLY)` | CRITICAL | Low-level file open |
| `shutil.copy/move(user_src, ...)` | HIGH | File operation with user src |
| `tarfile.open(user_path)` | HIGH | Zip/tar slip (CVE-class) |
| `zipfile.ZipFile(user_path)` | HIGH | Zip slip attack |
### Java
| Pattern | Severity | Notes |
|---------|----------|-------|
| `new File(baseDir + userInput)` | CRITICAL | String concat without normalization |
| `new FileInputStream(userInput)` | CRITICAL | Direct file read |
| `Paths.get(userInput)` | HIGH | Path construction without validation |
| `Files.readAllBytes(Path.of(userInput))` | CRITICAL | File read |
| `Files.newBufferedReader(Paths.get(userInput))` | CRITICAL | File read |
| `new File(request.getServletContext().getRealPath(userInput))` | CRITICAL | Servlet path traversal |
| `response.setHeader("Content-Disposition", "..."+userInput)` | MEDIUM | Filename injection |
### PHP
| Pattern | Severity | Notes |
|---------|----------|-------|
| `include($_GET['page'])` / `include($_POST['file'])` | CRITICAL | LFI → RCE via log poisoning |
| `require($_GET['file'])` | CRITICAL | LFI |
| `include_once($_GET[...])` / `require_once($_GET[...])` | CRITICAL | LFI |
| `readfile($_GET['file'])` | CRITICAL | File disclosure |
| `file_get_contents($_GET['path'])` | HIGH | File/URL read |
| `fopen($_GET['file'], 'r')` | CRITICAL | File open |
| `file($_GET['path'])` | HIGH | Read file into array |
| `highlight_file($_GET['file'])` | CRITICAL | PHP source disclosure |
| `include("pages/" . $_GET['page'] . ".php")` | HIGH | Partial mitigation (extension appended), but `../` still traverses to other `.php` files, and null-byte truncation removes the suffix on PHP versions before 5.3.4 |
### Node.js / TypeScript
| Pattern | Severity | Notes |
|---------|----------|-------|
| `fs.readFile(req.params.path, ...)` | CRITICAL | Direct file read |
| `fs.readFileSync(req.query.file)` | CRITICAL | Sync file read |
| `fs.createReadStream(req.body.path)` | CRITICAL | Stream file read |
| `res.sendFile(req.params.filename)` | HIGH | Express static file serve |
| `res.download(req.query.file)` | HIGH | File download |
| `path.join(__dirname, req.params.file)` without `path.resolve`+`startsWith` check | HIGH | Join alone is insufficient |
| `require(req.params.module)` | CRITICAL | Path traversal + arbitrary code execution |
| `fs.readdirSync(req.query.dir)` | HIGH | Directory listing |
### Go
| Pattern | Severity | Notes |
|---------|----------|-------|
| `os.Open(r.FormValue("path"))` | CRITICAL | Direct file open |
| `os.ReadFile(r.URL.Query().Get("file"))` | CRITICAL | File read |
| `http.ServeFile(w, r, r.FormValue("path"))` | CRITICAL | File serve. `http.ServeFile` rejects `..` elements in `r.URL.Path` only; it does not sanitize the `name` argument |
| `filepath.Join(base, r.FormValue("name"))` without `filepath.Clean`+containment | HIGH | |
| `os.Stat(r.FormValue("path"))` | MEDIUM | Path existence disclosure |
### Ruby
| Pattern | Severity | Notes |
|---------|----------|-------|
| `File.open(params[:path])` | CRITICAL | File open |
| `File.read(params[:file])` | CRITICAL | File read |
| `IO.read(params[:file])` | CRITICAL | File read |
| `send_file(params[:path])` | CRITICAL | Rails file serve |
| `send_data(File.read(params[:file]))` | CRITICAL | File read + serve |
| `render params[:template]` | CRITICAL | Template injection (+ path traversal) |
| `erb.result(binding)` where `erb` is built from `params` | CRITICAL | Template injection |
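One way to read the tables above is as a list of (pattern, severity) rules applied line by line. The sketch below encodes three rows as regular expressions. The rule names and the exact regexes are assumptions made for illustration; the scanner's real rule set may be written differently.

```python
import re
from typing import List, NamedTuple, Tuple


class Rule(NamedTuple):
    name: str
    pattern: "re.Pattern[str]"
    severity: str


# Hypothetical rules mirroring three table rows (Python, PHP, Node.js).
RULES = [
    Rule("py-open-request",
         re.compile(r"\bopen\(\s*request\.(args|form|values)\b"), "CRITICAL"),
    Rule("php-include-superglobal",
         re.compile(r"\b(include|require)(_once)?\s*\(\s*\$_(GET|POST|REQUEST)\b"), "CRITICAL"),
    Rule("node-fs-read-req",
         re.compile(r"\bfs\.readFile(Sync)?\(\s*req\.(params|query|body)\b"), "CRITICAL"),
]


def scan_line(line: str) -> List[Tuple[str, str]]:
    """Return (rule name, severity) for every rule that matches the line."""
    return [(r.name, r.severity) for r in RULES if r.pattern.search(line)]


print(scan_line('data = open(request.args["path"]).read()'))
print(scan_line("include($_GET['page']);"))
print(scan_line('open("config.yaml")'))
```

Line-level regexes only catch sinks whose argument is visibly a request value; a path that passes through an intermediate variable needs taint tracking on top of this.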
## Containment Guard Detection
After finding a sink, the scanner checks if a safe-path guard exists within ±40 lines. If found, the finding is downgraded or suppressed:
**Python safe guards:**
```python
# Correct: resolve to absolute, then check containment
base = os.path.abspath(BASE_DIR)
safe_path = os.path.abspath(os.path.join(base, user_input))
# Append the separator so "/srv/data_evil" does not pass as a child of "/srv/data"
if not safe_path.startswith(base + os.sep):
    raise PermissionError("path traversal detected")

# Also safe: pathlib resolve() + is_relative_to() (Python 3.9+)
base_path = Path(BASE_DIR).resolve()
resolved = (base_path / user_input).resolve()
if not resolved.is_relative_to(base_path):
    raise ValueError("path traversal")
```
**Java safe guards:**
```java
// Correct: normalize and verify containment
Path safePath = Paths.get(baseDir).resolve(userInput).normalize().toAbsolutePath();
if (!safePath.startsWith(Paths.get(baseDir).toAbsolutePath())) {
    throw new SecurityException("Path traversal detected");
}
```
**Node.js safe guards:**
```javascript
// Correct: resolve and check startsWith
const safePath = path.resolve(BASE_DIR, userInput);
if (!safePath.startsWith(BASE_DIR + path.sep)) {
    throw new Error("Path traversal detected");
}
```
**PHP safe guards:**
```php
// Correct: realpath + containment check
$base = realpath(BASE_DIR);
$path = realpath($base . '/' . $userInput);
// Append the separator so "/var/www_evil" does not match "/var/www"
if ($path === false || strpos($path, $base . DIRECTORY_SEPARATOR) !== 0) {
    http_response_code(403);
    exit('Forbidden');
}
// Also: basename() strips directory components (partial mitigation)
$filename = basename($_GET['file']); // "../../etc/passwd" becomes "passwd"
```
**Go safe guards:**
```go
// Correct: Clean + containment check
safePath := filepath.Join(baseDir, filepath.Clean("/"+r.FormValue("path")))
if !strings.HasPrefix(safePath, baseDir+string(os.PathSeparator)) {
    http.Error(w, "Forbidden", http.StatusForbidden)
    return
}
```
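The ±40-line proximity check described at the start of this section could look roughly like the sketch below. The marker list in `GUARD_RE` is an assumption assembled from the guard snippets in this section, and `has_guard_nearby` is a hypothetical helper name, not necessarily what the scanner uses.

```python
import re

# Hypothetical guard markers, one or more per language covered above.
GUARD_RE = re.compile(
    r"abspath|realpath|is_relative_to|\.normalize\(\)"
    r"|path\.resolve|filepath\.Clean|basename"
)
WINDOW = 40  # lines inspected on each side of the sink


def has_guard_nearby(lines, sink_index, window=WINDOW):
    """Return True if any line within +/- window of the sink mentions a guard."""
    start = max(0, sink_index - window)
    end = min(len(lines), sink_index + window + 1)
    return any(GUARD_RE.search(line) for line in lines[start:end])


source = ["safe = os.path.abspath(p)"] + ["pass"] * 60
print(has_guard_nearby(source, 10))  # guard is 10 lines above the sink
print(has_guard_nearby(source, 50))  # guard is 50 lines above the sink
```

A textual window is a heuristic: it cannot tell whether the guard is applied to the same variable that reaches the sink, which is one reason to downgrade a finding rather than always suppress it.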
## Zip Slip Detection
Tar/zip extraction without path validation enables a special case of path traversal where malicious archives overwrite files outside the extraction directory:
```python
import os
import zipfile

# DANGEROUS: zip slip (member names not validated)
with zipfile.ZipFile(archive) as zf:
    zf.extractall(extract_dir)

# SAFE: validate every member before extracting it
base = os.path.abspath(extract_dir)
with zipfile.ZipFile(archive) as zf:
    for member in zf.infolist():
        member_path = os.path.abspath(os.path.join(base, member.filename))
        if not member_path.startswith(base + os.sep):
            raise ValueError(f"Zip slip: {member.filename}")
        zf.extract(member, base)
```
The scanner detects `zipfile.ZipFile.extractall()`, `tarfile.TarFile.extractall()`, `ZipInputStream` in Java, and `net/http` archiver patterns in Go.
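The same containment idea applies to tar archives. As a minimal sketch, assuming POSIX-style paths, the helper below takes a list of member names (for example from `tarfile.TarFile.getnames()`) and reports those that would land outside the extraction directory. The name `escaping_members` is hypothetical, and the check looks at names only, so symlink and hardlink members need separate handling.

```python
import os


def escaping_members(member_names, extract_dir):
    """Return the member names that would resolve outside extract_dir."""
    base = os.path.abspath(extract_dir)
    unsafe = []
    for name in member_names:
        target = os.path.abspath(os.path.join(base, name))
        # commonpath compares whole path components, so "/tmp/out_evil"
        # is not mistaken for a child of "/tmp/out".
        if os.path.commonpath([base, target]) != base:
            unsafe.append(name)
    return unsafe


names = ["a.txt", "sub/../b.txt", "../evil.txt", "/etc/passwd", "../out_evil/x"]
print(escaping_members(names, "/tmp/out"))
```

Rejecting the whole archive when the returned list is non-empty is the conservative choice; extracting only the safe members is possible but hides the fact that the archive was hostile.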
## Implementation
```python
#!/usr/bin/env python3
"""
phy-path-traversal-audit — OWASP A01:2021 path traversal scanner
Usage: python3 audit_path_traversal.py [path] [--json] [--ci]
"""
import argparse
import json
import os
import re
import sys
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

CRITICAL, HIGH, MEDIUM, INFO = "CRITICAL", "HIGH", "MEDIUM", "INFO"
SEV_ORDER = {CRITICAL: 0, HIGH: 1, MEDIUM: 2, INFO: 3}
ICONS = {CRITICAL: "🔴", HIGH: "🟠", MEDIUM: "🟡", INFO: "⚪"}


@dataclass
class Finding:
    file: str
    line: int
    pattern_name: str
    matched_text: str
    severity: str
    cwe: str
    description: str
    fix: str
    has_http_taint: bool