sql-query-generator
Generate secure SQL queries with validation, pagination helpers, risk analysis, and audit-focused safeguards.
安装 / 下载方式
TotalClaw CLI推荐
totalclaw install github:LeoYeAI~openclaw-master-skills~sql-query-generatorcURL直接下载,无需登录
curl -fsSL https://skills.taituai.com/api/skills/github%3ALeoYeAI~openclaw-master-skills~sql-query-generator/file -o sql-query-generator.md# SQL Query Generator Skill
## Overview
This skill enables AI agents to generate accurate, optimized SQL queries from natural language descriptions. It supports multiple database systems and follows best practices for query construction, security, and performance.
## Installation
### Method 1: Direct Download
```bash
# Clone or download the repository
git clone https://github.com/yourusername/sql-query-generator.git
cd sql-query-generator
# No external dependencies required for core functionality
python sql_query_generator.py
```
### Method 2: Using as a Module
```bash
# Copy sql_query_generator.py to your project
cp sql_query_generator.py /path/to/your/project/
# Import in your code
from sql_query_generator import SQLQueryGenerator, DatabaseType
```
### Method 3: AI Agent Integration
For AI agents using this skill:
1. Read this SKILL.md file completely before generating queries
2. Follow all security guidelines strictly
3. Always use parameterized queries
4. Validate all inputs before query generation
5. Include security warnings in responses
### Optional Database Drivers
Install only the drivers you need:
```bash
# PostgreSQL
pip install psycopg2-binary
# MySQL
pip install mysql-connector-python
# SQL Server
pip install pyodbc
# Oracle
pip install cx_Oracle
# For testing and development
pip install pytest pytest-cov
```
### System Requirements
- Python 3.7 or higher
- No external dependencies for core query generation
- Database drivers only needed for actual query execution
## Supported Database Systems
- PostgreSQL
- MySQL
- SQLite
- Microsoft SQL Server
- Oracle Database
- MariaDB
## Core Capabilities
### 1. Query Generation
- **SELECT Queries**: Simple and complex data retrieval
- **JOIN Operations**: INNER, LEFT, RIGHT, FULL OUTER, CROSS
- **Aggregations**: GROUP BY, HAVING, aggregate functions
- **Subqueries**: Correlated and non-correlated
- **CTEs**: Common Table Expressions (WITH clause)
- **Window Functions**: OVER, PARTITION BY, ROW_NUMBER, RANK
- **INSERT/UPDATE/DELETE**: Data manipulation queries
- **DDL**: CREATE, ALTER, DROP statements
### 2. Query Optimization
- Index usage recommendations
- Query execution plan analysis
- Performance optimization suggestions
- Avoiding N+1 query problems
### 3. Security Features
- SQL injection prevention
- Parameterized query generation
- Input validation patterns
- Role-based access control patterns
## Usage Instructions
### Basic Query Generation
When generating SQL queries, follow these steps:
1. **Understand the Request**
- Parse natural language input
- Identify required tables
- Determine join conditions
- Extract filter criteria
2. **Generate Base Query**
```sql
-- Example structure
SELECT
column1,
column2,
aggregate_function(column3) AS alias
FROM
table1
JOIN
table2 ON table1.id = table2.foreign_id
WHERE
condition1 = value1
AND condition2 > value2
GROUP BY
column1, column2
HAVING
aggregate_condition
ORDER BY
column1 DESC
LIMIT 100;
```
3. **Apply Security Measures**
- Use parameterized queries
- Validate all inputs
- Escape special characters
### Query Patterns
#### Pattern 1: Simple SELECT
```sql
-- Natural language: "Get all users who registered after January 1, 2024"
SELECT
id,
username,
email,
registration_date
FROM
users
WHERE
registration_date > $1 -- Parameterized
ORDER BY
registration_date DESC;
```
#### Pattern 2: JOIN with Aggregation
```sql
-- Natural language: "Show total orders by customer in 2024"
SELECT
c.customer_name,
c.email,
COUNT(o.order_id) AS total_orders,
SUM(o.total_amount) AS total_spent
FROM
customers c
INNER JOIN
orders o ON c.customer_id = o.customer_id
WHERE
EXTRACT(YEAR FROM o.order_date) = $1
GROUP BY
c.customer_id,
c.customer_name,
c.email
HAVING
COUNT(o.order_id) > 5
ORDER BY
total_spent DESC;
```
#### Pattern 3: Subquery
```sql
-- Natural language: "Find products with above-average prices"
SELECT
product_name,
price,
category
FROM
products
WHERE
price > (
SELECT AVG(price)
FROM products
)
ORDER BY
price DESC;
```
#### Pattern 4: CTE (Common Table Expression)
```sql
-- Natural language: "Get top 3 products per category by sales"
WITH product_sales AS (
SELECT
p.product_id,
p.product_name,
p.category_id,
c.category_name,
SUM(oi.quantity * oi.unit_price) AS total_sales,
ROW_NUMBER() OVER (
PARTITION BY p.category_id
ORDER BY SUM(oi.quantity * oi.unit_price) DESC
) AS rank_in_category
FROM
products p
JOIN
order_items oi ON p.product_id = oi.product_id
JOIN
categories c ON p.category_id = c.category_id
GROUP BY
p.product_id,
p.product_name,
p.category_id,
c.category_name
)
SELECT
category_name,
product_name,
total_sales,
rank_in_category
FROM
product_sales
WHERE
rank_in_category <= 3
ORDER BY
category_name,
rank_in_category;
```
#### Pattern 5: Window Functions
```sql
-- Natural language: "Show running total of sales per day"
SELECT
sale_date,
daily_total,
SUM(daily_total) OVER (
ORDER BY sale_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_total,
AVG(daily_total) OVER (
ORDER BY sale_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS moving_average_7days
FROM (
SELECT
DATE(order_date) AS sale_date,
SUM(total_amount) AS daily_total
FROM
orders
GROUP BY
DATE(order_date)
) daily_sales
ORDER BY
sale_date;
```
## Best Practices
### 1. Query Structure
- Always use explicit column names (avoid SELECT *)
- Use meaningful table aliases
- Indent for readability
- Comment complex logic
### 2. Performance
- Create appropriate indexes
- Avoid SELECT DISTINCT when possible (use GROUP BY instead)
- Use EXISTS instead of IN for large datasets
- Limit result sets when appropriate
- Use EXPLAIN to analyze query plans
### 3. Security (CRITICAL)
#### 3.1 MANDATORY Security Rules
**THESE RULES ARE NON-NEGOTIABLE AND MUST ALWAYS BE FOLLOWED:**
1. **NEVER CONCATENATE USER INPUT INTO SQL**
```python
# WRONG - CRITICAL SECURITY VULNERABILITY
query = f"SELECT * FROM users WHERE username = '{user_input}'"
# CORRECT - Always use parameters
query = "SELECT * FROM users WHERE username = %s"
cursor.execute(query, (user_input,))
```
2. **ALL VALUES MUST BE PARAMETERIZED**
- Even seemingly "safe" values like numbers
- Even values from "trusted" sources
- Even internal application values
- NO EXCEPTIONS
3. **VALIDATE AND SANITIZE ALL INPUTS**
```python
# Whitelist validation
VALID_STATUSES = ['active', 'inactive', 'pending']
if status not in VALID_STATUSES:
raise ValueError("Invalid status")
# Type validation
if not isinstance(user_id, int):
raise TypeError("user_id must be integer")
# Length validation
if len(username) > 50:
raise ValueError("username too long")
```
4. **ESCAPE DYNAMIC IDENTIFIERS PROPERLY**
```python
from psycopg2 import sql
# For table/column names that must be dynamic
query = sql.SQL("SELECT * FROM {} WHERE id = %s").format(
sql.Identifier(table_name)
)
cursor.execute(query, (user_id,))
```
#### 3.2 Input Validation Framework
```python
import re
from typing import Any, List, Optional
class SQLInputValidator:
"""Comprehensive input validation for SQL queries"""
@staticmethod
def validate_identifier(identifier: str, max_length: int = 63) -> str:
"""Validate table/column names"""
# Check length
if len(identifier) > max_length:
raise ValueError(f"Identifier too long: {len(ide