vtl-image-analysis

TotalClaw author: totalclaw

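Measure compositional structure in AI-generated images using the Visual Thinking Lens (VTL) framework. Detect default-mode bias (center lock, radial collapse, low tension) and generate targeted re-prompts through configurable operators. Run after image generation to diagnose and improve compositional quality.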

Installation / Download

TotalClaw CLI (recommended)

totalclaw install totalclaw:totalclaw~rusparrish-vtl-image-analysis

cURL (direct download, no login required)

curl -fsSL https://skills.taituai.com/api/skills/totalclaw%3Atotalclaw~rusparrish-vtl-image-analysis/file -o rusparrish-vtl-image-analysis.md

# VTL Image Analysis

Use this skill whenever a user asks to analyze, diagnose, or improve a
generated image's composition. Also invoke it proactively after image
generation if the user has requested better compositional quality.

## When to Use

- User says "analyze this image", "why does this look generic/flat/boring"
- User asks to improve a generated image's composition
- After generating an image with openai-image-gen or similar skills
- User asks why their prompts aren't producing interesting layouts

## Step 1 — Measure

Run the probe script on the image:

```bash
python3 scripts/vtl_probe.py <image_path>
```

This returns JSON. Example:
```json
{
  "valid": true,
  "mask_status": "PASS",
  "delta_x": -0.027,
  "delta_y": 0.008,
  "r_v": 0.875,
  "rho_r": 12.4,
  "dRC": 0.40,
  "dRC_label": "mass-dominant",
  "k_var": 1.12,
  "infl_density": 0.16,
  "flags": ["CENTER_LOCK"]
}
```
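The measurement step could be wrapped from Python roughly as follows. This is a minimal sketch, assuming the probe prints its JSON to stdout (the text above implies this but does not state it). `run_probe` and `parse_probe_output` are hypothetical helper names, not part of the skill.

```python
import json
import subprocess
import sys

# Fields the refusal gate needs; both appear in the example output above.
REQUIRED_KEYS = ("valid", "mask_status")


def parse_probe_output(stdout: str) -> dict:
    """Parse the probe's JSON text and check that the gate fields are present."""
    metrics = json.loads(stdout)
    if not isinstance(metrics, dict):
        raise ValueError("probe output is not a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in metrics]
    if missing:
        raise ValueError(f"probe output is missing keys: {missing}")
    return metrics


def run_probe(image_path: str, script: str = "scripts/vtl_probe.py") -> dict:
    """Run the probe script on one image (assumption: JSON arrives on stdout)."""
    result = subprocess.run(
        [sys.executable, script, image_path],
        capture_output=True,
        text=True,
        check=False,  # a failed measurement may still describe itself in JSON
    )
    return parse_probe_output(result.stdout)
```

Keeping parsing separate from the subprocess call means the validation can be exercised without an image or the script on disk.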

## HARD STOP — Refusal Gate

**Before reporting any results, check `valid` and `mask_status`.**

If `valid` is false OR `mask_status` is `"FAIL"`:
> "VTL measurement failed: [error message]. The image does not have sufficient
> structural signal for reliable compositional analysis. Try a different image
> or one with more defined edges and contrast."

**Stop here. Do not report coordinates. Do not generate re-prompts.**

If `mask_status` is `"WARN"`:
> "VTL measurement returned low-confidence results (sparse structural signal).
> Coordinates are reported but treat them as indicative, not definitive."

Then continue, attaching this caveat to all outputs.

This refusal is non-negotiable. Fabricating a compositional reading from a
failed measurement produces false diagnosis. The framework is deterministic
by design — an uncertain measurement is reported as uncertain, not smoothed over.
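The gate above can be expressed as a small pure function. This is a sketch under stated assumptions: `refusal_gate` is a hypothetical name, and the `error` key is assumed, since the source shows only an `[error message]` placeholder and does not say which field carries it.

```python
from typing import Optional, Tuple

FAIL_TEMPLATE = (
    "VTL measurement failed: {error}. The image does not have sufficient "
    "structural signal for reliable compositional analysis. Try a different image "
    "or one with more defined edges and contrast."
)
WARN_TEXT = (
    "VTL measurement returned low-confidence results (sparse structural signal). "
    "Coordinates are reported but treat them as indicative, not definitive."
)


def refusal_gate(metrics: dict) -> Tuple[str, Optional[str]]:
    """Return ("stop" | "warn" | "ok", message) for one probe result."""
    # A missing "valid" field is treated as invalid, the conservative reading.
    valid = metrics.get("valid") is True
    status = metrics.get("mask_status")
    if not valid or status == "FAIL":
        # ASSUMPTION: the probe reports its error text under an "error" key.
        error = metrics.get("error", "unknown error")
        return "stop", FAIL_TEMPLATE.format(error=error)
    if status == "WARN":
        return "warn", WARN_TEXT
    return "ok", None
```

On `"stop"`, the caller shows the message and does nothing else. On `"warn"`, the message travels with every later output.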

---

## Step 2 — Report Coordinates

Report the five coordinates plainly:

```
VTL ANALYSIS
────────────────────────────────
Placement   Δx={delta_x}  Δy={delta_y}
Void        rᵥ={r_v}
Packing     ρᵣ={rho_r}
Radial      dRC={dRC}  [{dRC_label}]
Tension     k_var={k_var}

FLAGS: {flags or NONE}
```
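Filling the template from the metrics JSON might look like the sketch below. `format_report` is a hypothetical helper. Joining multiple flags with a comma and the optional `caveat` argument (for the WARN case) are assumptions, not something the skill specifies.

```python
from typing import Optional

REPORT_TEMPLATE = """\
VTL ANALYSIS
────────────────────────────────
Placement   Δx={delta_x}  Δy={delta_y}
Void        rᵥ={r_v}
Packing     ρᵣ={rho_r}
Radial      dRC={dRC}  [{dRC_label}]
Tension     k_var={k_var}

FLAGS: {flags}"""


def format_report(metrics: dict, caveat: Optional[str] = None) -> str:
    """Render the five coordinates; values are printed as-is, never rounded."""
    # An empty or missing flag list is shown as NONE.
    flags = ", ".join(metrics.get("flags") or []) or "NONE"
    body = REPORT_TEMPLATE.format(**{**metrics, "flags": flags})
    # Put the low-confidence caveat first so it cannot be overlooked.
    return f"{caveat}\n\n{body}" if caveat else body
```

A missing coordinate raises `KeyError` rather than printing a blank, which fits the rule that a failed measurement is never smoothed over.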

---

## Step 3 — Generate Re-Prompt (if flags present)

Run the regen script with the user's original prompt and the metrics from Step 1, saved as a JSON file:

```bash
python3 scripts/vtl_regen.py \
  --prompt "USER'S ORIGINAL PROMPT" \
  --metrics <path_to_metrics.json> \
  --out prompts.json
```

This selects operators from `operators.yaml` based on which flags fired and
returns up to 3 prompt variants. Report the `selected` variant as the primary
recommendation and offer the alternatives.

If no flags fired, report: "No default-mode patterns detected. Coordinates are
within normal range."
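One way to wire this step up from Python is sketched here. It assumes the Step 1 metrics are held in a dict and must be written to disk before the script can read them. `prepare_regen` and the `metrics.json` file name are hypothetical; the flags passed to `vtl_regen.py` are the ones shown above.

```python
import json
from pathlib import Path
from typing import List, Optional

NO_FLAGS_MESSAGE = (
    "No default-mode patterns detected. Coordinates are within normal range."
)


def prepare_regen(prompt: str, metrics: dict, workdir: str) -> Optional[List[str]]:
    """Save the metrics and build the regen command; None means no flags fired."""
    if not metrics.get("flags"):
        return None  # caller reports NO_FLAGS_MESSAGE instead of re-prompting
    metrics_path = Path(workdir) / "metrics.json"
    metrics_path.write_text(json.dumps(metrics, indent=2), encoding="utf-8")
    return [
        "python3", "scripts/vtl_regen.py",
        "--prompt", prompt,
        "--metrics", str(metrics_path),
        "--out", str(Path(workdir) / "prompts.json"),
    ]
```

Passing the result to `subprocess.run(argv, check=True)` as a list, not a shell string, keeps apostrophes and quotes in the user's prompt from being reinterpreted by the shell.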

---

## Operator Logic

Operators live in `operators.yaml`. They are rule-based — triggers are evaluated
deterministically against the metric values. The AI does not invent or modify
operators. If a trigger fires, the patch is applied. If not, it isn't.

Do not override operator logic. Do not substitute your own re-prompt language
for what the operator specifies. The operators are the prescription layer:
their wording belongs to whoever maintains `operators.yaml`, not to the AI's
improvisation.

If the user wants to modify re-prompt behavior, direct them to edit `operators.yaml`.
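The "trigger fires, patch is applied" rule can be illustrated with a toy table. Everything in this sketch is hypothetical: the real schema of `operators.yaml` is not shown in this document, so the field names, the flag-based trigger, the comma join, and the placeholder patch text are assumptions made only to show the deterministic shape of the logic.

```python
# HYPOTHETICAL operator table. The real operators.yaml schema is not documented
# here, and the patch string is a placeholder, not an actual operator.
OPERATORS = [
    {
        "name": "example_operator",
        "trigger_flag": "CENTER_LOCK",
        "patch": "<patch text from operators.yaml>",
    },
]


def select_patches(metrics: dict, operators: list) -> list:
    """Return the patches whose trigger flag fired, in table order."""
    fired = set(metrics.get("flags") or [])
    return [op["patch"] for op in operators if op["trigger_flag"] in fired]


def apply_patches(prompt: str, patches: list) -> str:
    """Append patches verbatim; the text is never paraphrased or extended."""
    return ", ".join([prompt, *patches])
```

The same metrics and the same table always yield the same patched prompt, which is the property the section insists on.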

---

## Notes

- Metrics describe compositional coordinates, not quality. CENTER_LOCK is not
  "bad" — it's a signal that the model defaulted. A portrait photographer
  choosing center composition is authorship. An AI doing it on every prompt
  regardless of content is prior behavior. VTL measures the difference.
- dRC requires radial eligibility. If the mass centroid is very close to frame
  center, dRC is labeled "dual-center"; report the label and do not interpret
  the number.
- Full metric definitions: references/vtl-metrics.md
- Full framework: https://github.com/rusparrish/Visual-Thinking-Lens
- Author: Russell Parrish — https://artistinfluencer.com
