Phy Db Index Advisor

ClawSkills 作者 PHY041 v1.0.0

Database index advisor that statically analyzes ORM query patterns to predict missing indexes before they become production bottlenecks. Scans SQLAlchemy, Django ORM, TypeORM, Prisma, GORM, ActiveRecord, and Sequelize code for columns used in WHERE/filter, ORDER BY, and JOIN conditions. Cross-references existing model index definitions and migration files to suppress already-indexed columns. Ranks recommendations by query frequency and outputs ready-to-run CREATE INDEX SQL + per-ORM migration snippets. Zero competitors on ClawHub — not a single db-index-advisor SKILL.md in 13,700+ files.

源码 ↗

安装 / 下载方式

TotalClaw CLI推荐
totalclaw install clawskills:phy041~phy-db-index-advisor
cURL直接下载,无需登录
curl -fsSL https://skills.taituai.com/api/skills/clawskills%3Aphy041~phy-db-index-advisor/file -o phy-db-index-advisor.md
Git 仓库获取源码
git clone https://github.com/openclaw/skills/commit/94fd7e7799362b664f285aa4c06dace426832846
# phy-db-index-advisor

Static analysis tool that reads your **ORM query patterns** and predicts which database columns are missing indexes — before a slow query alert fires in production. Works by counting how often each column appears in `.filter()`, `.where()`, `.order_by()`, and JOIN conditions across your entire codebase, then cross-referencing model definitions to suppress columns already indexed.

## Why This Exists

- 80% of production slow queries stem from missing indexes on columns used in WHERE clauses
- `User.objects.filter(email=email)` running 1,000× per minute causes full table scans
- Existing linters don't know your query patterns; `EXPLAIN ANALYZE` only catches issues after the fact
- This skill finds them **before deployment**

## What It Detects

### Query Patterns Scanned
| Access Pattern | Why It Matters |
|---------------|----------------|
| **WHERE / filter()** | Full table scan without index — O(n) per query |
| **ORDER BY / order_by()** | Sort without index reads all rows then sorts in memory |
| **JOIN ON column** | Nested-loop join without index is O(n²) |
| **UNIQUE constraint candidates** | Columns with `unique=True` queries need unique indexes |

### Supported ORMs
| ORM | Language | Patterns Detected |
|-----|----------|-------------------|
| **Django ORM** | Python | `.filter(col=)`, `.get(col=)`, `.exclude(col=)`, `.order_by('col')`, `Meta.ordering` |
| **SQLAlchemy** | Python | `.filter(Model.col ==)`, `.filter_by(col=)`, `.order_by(col)`, `join(Model, on=)` |
| **Peewee** | Python | `.where(Model.col ==)`, `.order_by(Model.col)` |
| **TypeORM** | TypeScript | `.where("t.col = :val")`, `findBy({col:})`, `.orderBy("t.col")`, `@JoinColumn({name: 'col'})` |
| **Prisma** | TypeScript | `where: { col: }`, `orderBy: { col: }`, `include: { relation: }` |
| **Sequelize** | TypeScript/JS | `where: { col: }`, `order: [['col', 'ASC']]` |
| **GORM** | Go | `.Where("col = ?")`, `.Order("col")`, `.Joins("JOIN ... ON col")` |
| **ActiveRecord** | Ruby | `.where(col:)`, `.find_by(col:)`, `.order(:col)`, `.joins()` |

### Existing Index Detection (Suppression)
The scanner reads existing index definitions so it doesn't recommend indexes that already exist:

| ORM | Where Indexes Are Found |
|-----|------------------------|
| Django | `db_index=True` on field, `Meta.indexes`, `Meta.unique_together` |
| SQLAlchemy | `Column(index=True)`, `Column(unique=True)`, `Index(...)` objects |
| TypeORM | `@Index()` decorator, `@Column({index: true})`, `@Unique()` |
| Prisma | `@@index([col])`, `@@unique([col])`, `@unique` on field |
| GORM | `gorm:"index"`, `gorm:"uniqueIndex"` struct tags |
| ActiveRecord | `add_index` in migrations, `index: true` in column definition |
| SQL migrations | `CREATE INDEX`, `CREATE UNIQUE INDEX` statements |

## Implementation

```python
#!/usr/bin/env python3
"""
phy-db-index-advisor — ORM query pattern analyzer for missing indexes
Usage: python3 advisor.py [path] [--json] [--min-count N]
"""
import argparse
import json
import os
import re
import sys
from collections import defaultdict
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional

# ─── Data structures ─────────────────────────────────────────────────────────

@dataclass
class QueryHit:
    file: str
    line: int
    pattern: str
    orm: str
    access_type: str  # WHERE, ORDER_BY, JOIN

@dataclass
class ColumnReport:
    table_hint: str  # Guessed model/table name
    column: str
    where_count: int = 0
    order_count: int = 0
    join_count: int = 0
    files: set = field(default_factory=set)
    hits: list = field(default_factory=list)
    already_indexed: bool = False

    @property
    def total_count(self) -> int:
        return self.where_count + self.order_count + self.join_count

    @property
    def priority(self) -> str:
        if self.already_indexed:
            return "INDEXED"
        if self.where_count >= 10 or self.total_count >= 15:
            return "CRITICAL"
        if self.where_count >= 5 or self.total_count >= 8:
            return "HIGH"
        if self.total_count >= 3:
            return "MEDIUM"
        return "LOW"

# ─── Query pattern registry ───────────────────────────────────────────────────

# (orm_name, access_type, regex, model_group_idx, col_group_idx)
QUERY_PATTERNS = [
    # ── Django ORM ──
    ("Django", "WHERE",
     re.compile(r'\.(?:filter|get|exclude|count|exists)\s*\([^)]*?(\w+)__?\w*\s*='),
     None, 1),
    ("Django", "WHERE",
     re.compile(r'\.(?:filter|get|exclude)\s*\(\s*(\w+)\s*='),
     None, 1),
    ("Django", "ORDER_BY",
     re.compile(r'\.order_by\s*\(\s*[\'"-](\w+)[\'"]\s*\)'),
     None, 1),
    ("Django", "ORDER_BY",
     re.compile(r'ordering\s*=\s*\[[^\]]*?[\'"](\w+)[\'"]'),
     None, 1),

    # ── SQLAlchemy ──
    ("SQLAlchemy", "WHERE",
     re.compile(r'\.filter\s*\(\s*(\w+)\.(\w+)\s*=='),
     1, 2),
    ("SQLAlchemy", "WHERE",
     re.compile(r'\.filter_by\s*\([^)]*?(\w+)\s*='),
     None, 1),
    ("SQLAlchemy", "ORDER_BY",
     re.compile(r'\.order_by\s*\(\s*(\w+)\.(\w+)'),
     1, 2),
    ("SQLAlchemy", "ORDER_BY",
     re.compile(r'\.order_by\s*\(\s*(?:asc|desc)\s*\(\s*(\w+)\.(\w+)'),
     1, 2),

    # ── TypeORM ──
    ("TypeORM", "WHERE",
     re.compile(r'where\s*:\s*\{[^}]*?(\w+)\s*:'),
     None, 1),
    ("TypeORM", "WHERE",
     re.compile(r'\.where\s*\(\s*[\'"`](?:\w+\.)?(\w+)\s*(?:=|LIKE|IN|>|<)'),
     None, 1),
    ("TypeORM", "ORDER_BY",
     re.compile(r'\.orderBy\s*\(\s*[\'"`](?:\w+\.)?(\w+)[\'"`]'),
     None, 1),
    ("TypeORM", "ORDER_BY",
     re.compile(r'orderBy\s*:\s*\{[^}]*?(\w+)\s*:'),
     None, 1),

    # ── Prisma ──
    ("Prisma", "WHERE",
     re.compile(r'where\s*:\s*\{[^}]*?(\w+)\s*:'),
     None, 1),
    ("Prisma", "ORDER_BY",
     re.compile(r'orderBy\s*:\s*\{[^}]*?(\w+)\s*:'),
     None, 1),

    # ── Sequelize ──
    ("Sequelize", "WHERE",
     re.compile(r'where\s*:\s*\{[^}]*?(\w+)\s*:'),
     None, 1),
    ("Sequelize", "ORDER_BY",
     re.compile(r'order\s*:\s*\[\s*\[\s*[\'"`](\w+)[\'"`]'),
     None, 1),

    # ── GORM ──
    ("GORM", "WHERE",
     re.compile(r'\.(?:Where|Find|First|Last)\s*\([^,)]*?[\'"`](?:\w+\.)?(\w+)\s*(?:=|LIKE|IN|>|<|\?)'),
     None, 1),
    ("GORM", "ORDER_BY",
     re.compile(r'\.Order\s*\(\s*[\'"`](\w+)'),
     None, 1),
    ("GORM", "JOIN",
     re.compile(r'\.Joins\s*\([^)]*?ON\s+\w+\.(\w+)\s*=\s*\w+\.(\w+)'),
     None, 1),

    # ── ActiveRecord (Ruby) ──
    ("ActiveRecord", "WHERE",
     re.compile(r'\.where\s*\(\s*(\w+):\s*'),
     None, 1),
    ("ActiveRecord", "WHERE",
     re.compile(r'\.find_by\s*\(\s*(\w+):\s*'),
     None, 1),
    ("ActiveRecord", "ORDER_BY",
     re.compile(r'\.order\s*\(\s*:(\w+)\s*\)'),
     None, 1),
    ("ActiveRecord", "ORDER_BY",
     re.compile(r'\.order\s*\(\s*[\'"](\w+)'),
     None, 1),
]

# ─── Existing index detection ─────────────────────────────────────────────────

EXISTING_INDEX_PATTERNS = [
    # Django
    re.compile(r'(\w+)\s*=\s*\w+Field\s*\([^)]*\bdb_index\s*=\s*True'),
    re.compile(r'(\w+)\s*=\s*\w+Field\s*\([^)]*\bunique\s*=\s*True'),
    re.compile(r'models\.Index\s*\(\s*fields\s*=\s*\[([^\]]+)\]'),
    # SQLAlchemy
    re.compile(r'Column\s*\([^)]*\bindex\s*=\s*True[^)]*\).*?#.*?(\w+)'),
    re.compile(r'(\w+)\s*=\s*Column\s*\([^)]*\bindex\s*=\s*True'),
    re.compile(r'(\w+)\s*=\s*Column\s*\([^)]*\bunique\s*=\s*True'),
    re.compile(r'Index\s*\(\s*[\'"`]\w+[\'"`]\s*,\s*\w+\.(\w+)'),
    # TypeORM
    re.compile(r'@(?:Index|Unique|Column)\s*\([^)]*\bindex\s*:\s*true'),
    re.compile(r'@Column\s*\([^)]*\bunique\s*:\s*true[^)]*\)\s*\w+\s*:\s*\w+\s*(\w+)'),
    # Prisma
    re.compile(r'@@index\s*\(\s*\[([^\]]+)\]'),
    re.compile(r'@@unique\s*\(\s*\[([^\]]+)\]'),
    re.compile(r'(\w+)\s+\w+\s+@unique'),
    # GORM
    re.compile(r'(\w+)\s+\w+\s+`[^`]*gorm:"[^"]*(?:index|uniqueIndex)[^"]*"`'),
    # SQL migrations
    re.compile(r'CREATE\s+(?:UNIQUE\s+)?INDEX\s+\w+\s+ON\s+\w+\s*\(([^)]+)\)', re.IGNORECASE),