Phy Db Index Advisor
数据库索引顾问,静态分析 ORM 查询模式,在丢失的索引成为生产瓶颈之前预测它们。扫描 SQLAlchemy、Django ORM、TypeORM、Prisma、GORM、ActiveRecord 和 Sequelize 代码以查找 WHERE/filter、ORDER BY 和 JOIN 条件中使用的列。交叉引用现有模型索引定义和迁移文件以抑制已索引的列。按查询频率对建议进行排名,并输出准备运行的 CREATE INDEX SQL + 每个 ORM 迁移片段。 ClawHub 上零竞争对手 — 13,700 多个文件中没有一个 db-index-advisor SKILL.md。
安装 / 下载方式
TotalClaw CLI推荐
totalclaw install totalclaw:phy041~phy-db-index-advisorcURL直接下载,无需登录
curl -fsSL https://skills.taituai.com/api/skills/totalclaw%3Aphy041~phy-db-index-advisor/file -o phy-db-index-advisor.mdGit 仓库获取源码
git clone https://github.com/openclaw/skills/commit/94fd7e7799362b664f285aa4c06dace426832846## 概述(中文)
数据库索引顾问,静态分析 ORM 查询模式,在丢失的索引成为生产瓶颈之前预测它们。扫描 SQLAlchemy、Django ORM、TypeORM、Prisma、GORM、ActiveRecord 和 Sequelize 代码以查找 WHERE/filter、ORDER BY 和 JOIN 条件中使用的列。交叉引用现有模型索引定义和迁移文件以抑制已索引的列。按查询频率对建议进行排名,并输出准备运行的 CREATE INDEX SQL + 每个 ORM 迁移片段。 ClawHub 上零竞争对手 — 13,700 多个文件中没有一个 db-index-advisor SKILL.md。
## 原文
# phy-db-index-advisor
Static analysis tool that reads your **ORM query patterns** and predicts which database columns are missing indexes — before a slow query alert fires in production. Works by counting how often each column appears in `.filter()`, `.where()`, `.order_by()`, and JOIN conditions across your entire codebase, then cross-referencing model definitions to suppress columns already indexed.
## Why This Exists
- 80% of production slow queries stem from missing indexes on columns used in WHERE clauses
- `User.objects.filter(email=email)` running 1,000× per minute causes full table scans
- Existing linters don't know your query patterns; `EXPLAIN ANALYZE` only catches issues after the fact
- This skill finds them **before deployment**
## What It Detects
### Query Patterns Scanned
| Access Pattern | Why It Matters |
|---------------|----------------|
| **WHERE / filter()** | Full table scan without index — O(n) per query |
| **ORDER BY / order_by()** | Sort without index reads all rows then sorts in memory |
| **JOIN ON column** | Nested-loop join without index is O(n²) |
| **UNIQUE constraint candidates** | Columns with `unique=True` queries need unique indexes |
### Supported ORMs
| ORM | Language | Patterns Detected |
|-----|----------|-------------------|
| **Django ORM** | Python | `.filter(col=)`, `.get(col=)`, `.exclude(col=)`, `.order_by('col')`, `Meta.ordering` |
| **SQLAlchemy** | Python | `.filter(Model.col ==)`, `.filter_by(col=)`, `.order_by(col)`, `join(Model, on=)` |
| **Peewee** | Python | `.where(Model.col ==)`, `.order_by(Model.col)` |
| **TypeORM** | TypeScript | `.where("t.col = :val")`, `findBy({col:})`, `.orderBy("t.col")`, `@JoinColumn({name: 'col'})` |
| **Prisma** | TypeScript | `where: { col: }`, `orderBy: { col: }`, `include: { relation: }` |
| **Sequelize** | TypeScript/JS | `where: { col: }`, `order: [['col', 'ASC']]` |
| **GORM** | Go | `.Where("col = ?")`, `.Order("col")`, `.Joins("JOIN ... ON col")` |
| **ActiveRecord** | Ruby | `.where(col:)`, `.find_by(col:)`, `.order(:col)`, `.joins()` |
### Existing Index Detection (Suppression)
The scanner reads existing index definitions so it doesn't recommend indexes that already exist:
| ORM | Where Indexes Are Found |
|-----|------------------------|
| Django | `db_index=True` on field, `Meta.indexes`, `Meta.unique_together` |
| SQLAlchemy | `Column(index=True)`, `Column(unique=True)`, `Index(...)` objects |
| TypeORM | `@Index()` decorator, `@Column({index: true})`, `@Unique()` |
| Prisma | `@@index([col])`, `@@unique([col])`, `@unique` on field |
| GORM | `gorm:"index"`, `gorm:"uniqueIndex"` struct tags |
| ActiveRecord | `add_index` in migrations, `index: true` in column definition |
| SQL migrations | `CREATE INDEX`, `CREATE UNIQUE INDEX` statements |
## Implementation
```python
#!/usr/bin/env python3
"""
phy-db-index-advisor — ORM query pattern analyzer for missing indexes
Usage: python3 advisor.py [path] [--json] [--min-count N]
"""
import argparse
import json
import os
import re
import sys
from collections import defaultdict
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional
# ─── Data structures ─────────────────────────────────────────────────────────
@dataclass
class QueryHit:
file: str
line: int
pattern: str
orm: str
access_type: str # WHERE, ORDER_BY, JOIN
@dataclass
class ColumnReport:
table_hint: str # Guessed model/table name
column: str
where_count: int = 0
order_count: int = 0
join_count: int = 0
files: set = field(default_factory=set)
hits: list = field(default_factory=list)
already_indexed: bool = False
@property
def total_count(self) -> int:
return self.where_count + self.order_count + self.join_count
@property
def priority(self) -> str:
if self.already_indexed:
return "INDEXED"
if self.where_count >= 10 or self.total_count >= 15:
return "CRITICAL"
if self.where_count >= 5 or self.total_count >= 8:
return "HIGH"
if self.total_count >= 3:
return "MEDIUM"
return "LOW"
# ─── Query pattern registry ───────────────────────────────────────────────────
# (orm_name, access_type, regex, model_group_idx, col_group_idx)
QUERY_PATTERNS = [
# ── Django ORM ──
("Django", "WHERE",
re.compile(r'\.(?:filter|get|exclude|count|exists)\s*\([^)]*?(\w+)__?\w*\s*='),
None, 1),
("Django", "WHERE",
re.compile(r'\.(?:filter|get|exclude)\s*\(\s*(\w+)\s*='),
None, 1),
("Django", "ORDER_BY",
re.compile(r'\.order_by\s*\(\s*[\'"-](\w+)[\'"]\s*\)'),
None, 1),
("Django", "ORDER_BY",
re.compile(r'ordering\s*=\s*\[[^\]]*?[\'"](\w+)[\'"]'),
None, 1),
# ── SQLAlchemy ──
("SQLAlchemy", "WHERE",
re.compile(r'\.filter\s*\(\s*(\w+)\.(\w+)\s*=='),
1, 2),
("SQLAlchemy", "WHERE",
re.compile(r'\.filter_by\s*\([^)]*?(\w+)\s*='),
None, 1),
("SQLAlchemy", "ORDER_BY",
re.compile(r'\.order_by\s*\(\s*(\w+)\.(\w+)'),
1, 2),
("SQLAlchemy", "ORDER_BY",
re.compile(r'\.order_by\s*\(\s*(?:asc|desc)\s*\(\s*(\w+)\.(\w+)'),
1, 2),
# ── TypeORM ──
("TypeORM", "WHERE",
re.compile(r'where\s*:\s*\{[^}]*?(\w+)\s*:'),
None, 1),
("TypeORM", "WHERE",
re.compile(r'\.where\s*\(\s*[\'"`](?:\w+\.)?(\w+)\s*(?:=|LIKE|IN|>|<)'),
None, 1),
("TypeORM", "ORDER_BY",
re.compile(r'\.orderBy\s*\(\s*[\'"`](?:\w+\.)?(\w+)[\'"`]'),
None, 1),
("TypeORM", "ORDER_BY",
re.compile(r'orderBy\s*:\s*\{[^}]*?(\w+)\s*:'),
None, 1),
# ── Prisma ──
("Prisma", "WHERE",
re.compile(r'where\s*:\s*\{[^}]*?(\w+)\s*:'),
None, 1),
("Prisma", "ORDER_BY",
re.compile(r'orderBy\s*:\s*\{[^}]*?(\w+)\s*:'),
None, 1),
# ── Sequelize ──
("Sequelize", "WHERE",
re.compile(r'where\s*:\s*\{[^}]*?(\w+)\s*:'),
None, 1),
("Sequelize", "ORDER_BY",
re.compile(r'order\s*:\s*\[\s*\[\s*[\'"`](\w+)[\'"`]'),
None, 1),
# ── GORM ──
("GORM", "WHERE",
re.compile(r'\.(?:Where|Find|First|Last)\s*\([^,)]*?[\'"`](?:\w+\.)?(\w+)\s*(?:=|LIKE|IN|>|<|\?)'),
None, 1),
("GORM", "ORDER_BY",
re.compile(r'\.Order\s*\(\s*[\'"`](\w+)'),
None, 1),
("GORM", "JOIN",
re.compile(r'\.Joins\s*\([^)]*?ON\s+\w+\.(\w+)\s*=\s*\w+\.(\w+)'),
None, 1),
# ── ActiveRecord (Ruby) ──
("ActiveRecord", "WHERE",
re.compile(r'\.where\s*\(\s*(\w+):\s*'),
None, 1),
("ActiveRecord", "WHERE",
re.compile(r'\.find_by\s*\(\s*(\w+):\s*'),
None, 1),
("ActiveRecord", "ORDER_BY",
re.compile(r'\.order\s*\(\s*:(\w+)\s*\)'),
None, 1),
("ActiveRecord", "ORDER_BY",
re.compile(r'\.order\s*\(\s*[\'"](\w+)'),
None, 1),
]
# ─── Existing index detection ─────────────────────────────────────────────────
EXISTING_INDEX_PATTERNS = [
# Django
re.compile(r'(\w+)\s*=\s*\w+Field\s*\([^)]*\bdb_index\s*=\s*True'),
re.compile(r'(\w+)\s*=\s*\w+Field\s*\([^)]*\bunique\s*=\s*True'),
re.compile(r'models\.Index\s*\(\s*fields\s*=\s*\[([^\]]+)\]'),
# SQLAlchemy
re.compile(r'Column\s*\([^)]*\bindex\s*=\s*True[^)]*\).*?#.*?(\w+)'),
re.compile(r'(\w+)\s*=\s*Column\s*\([^)]*\bindex\s*=\s*True'),
re.compile(r'(\w+)\s*=\s*Column\s*\([^)]*\bunique\s*=\s*True'),
re.compile(r'Index\s*\(\s*[\'"`]\w+[\'"`]\s*,\s*\w+\.(\w+)'),
# TypeORM
re.compile(r'@(?:Index|Unique|Column)\s*\([^)]*\bindex\s*:\s*true'),
re.compile(r'@Column\s*\([^)]*\bunique\s*:\s*true[^)]*\)\s*\w+\s*:\s*\w+\s*(\w+)'),
# Prisma
re.compile(r'@@index\s*\(\s*\