chemistry-query

TotalClaw 作者 totalclaw

PubChem API 查询的化学代理技能(化合物信息/属性、结构/SMILES/图像、合成路线/参考)+ RDKit 化学信息学(SMILES 到分子道具/logP/TPSA、2D PNG/SVG 即摩根指纹、逆合成/BRICS 断开、多步合成规划)。用于涉及化合物、分子、结构、PubChem 数据、RDKit 分析、SMILES 处理、合成路线、逆合成、反应模拟的化学任务。触发化学、化合物、分子、化学数据/特性、PubChem、RDKit、SMILES、结构、合成、反应、逆合成、合成计划/路线。

安装 / 下载方式

TotalClaw CLI推荐
totalclaw install totalclaw:totalclaw~cheminem-chemistry-query
cURL直接下载,无需登录
curl -fsSL https://skills.taituai.com/api/skills/totalclaw%3Atotalclaw~cheminem-chemistry-query/file -o cheminem-chemistry-query.md
## 概述(中文)

PubChem API 查询的化学代理技能(化合物信息/属性、结构/SMILES/图像、合成路线/参考)+ RDKit 化学信息学(SMILES 到分子道具/logP/TPSA、2D PNG/SVG 即摩根指纹、逆合成/BRICS 断开、多步合成规划)。用于涉及化合物、分子、结构、PubChem 数据、RDKit 分析、SMILES 处理、合成路线、逆合成、反应模拟的化学任务。触发化学、化合物、分子、化学数据/特性、PubChem、RDKit、SMILES、结构、合成、反应、逆合成、合成计划/路线。

## 原文

# Chemistry Query Agent v1.4.0

## Overview

Full-stack chemistry toolkit combining PubChem data retrieval with RDKit molecule processing, visualization, analysis, retrosynthesis, and synthesis planning. All outputs are structured JSON for easy downstream chaining. Generates PNG/SVG images on demand.

**Key capabilities:**
- PubChem compound lookup (info, structure, synthesis refs, similarity search)
- RDKit molecular properties (MW, logP, TPSA, HBD/HBA, rotatable bonds, aromatic rings)
- 2D molecule visualization (PNG/SVG)
- BRICS retrosynthesis with recursive depth control
- Multi-step synthesis route planning
- Forward reaction simulation with SMARTS templates
- Morgan fingerprints and similarity/substructure search
- 21 named reaction templates (Suzuki, Heck, Grignard, Wittig, Diels-Alder, etc.)

## Quick Start

```bash
# PubChem compound info
exec python scripts/query_pubchem.py --compound "aspirin" --type info

# Molecular properties from SMILES
exec python scripts/rdkit_mol.py --smiles "CC(=O)Oc1ccccc1C(=O)O" --action props

# Retrosynthesis
exec python scripts/rdkit_mol.py --target "CC(=O)Oc1ccccc1C(=O)O" --action retro --depth 2

# Full chain (name → props + draw + retro)
exec python scripts/chain_entry.py --input-json '{"name": "caffeine", "context": "user"}'
```

## Scripts

### `scripts/query_pubchem.py`
PubChem REST API queries with automatic name→CID resolution and timeout handling.

```
--compound <name|CID> --type <info|structure|synthesis|similar> [--format smiles|inchi|image|json] [--threshold 80]
```

- **info:** Formula, MW, IUPAC name, InChIKey (JSON)
- **structure:** SMILES, InChI, image URL, or full JSON
- **synthesis:** Synonyms/references for a compound
- **similar:** Similar compounds by 2D fingerprint (top 20)

### `scripts/rdkit_mol.py`
RDKit cheminformatics engine. Resolves names via PubChem automatically.

```
--smiles <SMILES> --action <props|draw|fingerprint|similarity|substruct|xyz|react|retro|plan>
```

| Action | Description | Key Args |
|--------|-------------|----------|
| props | MW, logP, TPSA, HBD, HBA, rotB, aromRings | `--smiles` |
| draw | 2D PNG/SVG (300×300) | `--smiles --output file.png --format png\|svg` |
| retro | BRICS recursive retrosynthesis | `--target <SMILES\|name> --depth N` |
| plan | Multi-step retro route | `--target <SMILES\|name> --steps N` |
| react | Forward reaction via SMARTS | `--reactants "smi1 smi2" --smarts "<SMARTS>"` |
| fingerprint | Morgan fingerprint bitvector | `--smiles --radius 2` |
| similarity | Tanimoto similarity scoring | `--query_smiles --target_smiles "smi1,smi2"` |
| substruct | Substructure matching | `--query_smiles --target_smiles "smi1,smi2"` |
| xyz | 3D coordinates (MMFF optimized) | `--smiles` |

### `scripts/chain_entry.py`
Standard agent chain interface. Accepts `{"smiles": "...", "context": "..."}` or `{"name": "...", "context": "..."}`. Returns unified JSON with props, visualization, and retrosynthesis.

```bash
python scripts/chain_entry.py --input-json '{"name": "sotorasib", "context": "user"}'
```

Output schema:
```json
{
  "agent": "chemistry-query",
  "version": "1.4.0",
  "smiles": "<canonical>",
  "status": "success|error",
  "report": {"props": {...}, "draw": {...}, "retro": {...}},
  "risks": [],
  "viz": ["path/to/image.png"],
  "recommend_next": ["pharmacology", "toxicology"],
  "confidence": 0.95,
  "warnings": [],
  "timestamp": "ISO8601"
}
```

### `scripts/templates.json`
21 named reaction templates with SMARTS, expected yields, conditions, and references. Includes: Suzuki, Heck, Buchwald-Hartwig, Grignard, Wittig, Diels-Alder, Click, Sonogashira, Negishi, and more.

## Chaining

1. **Name → Full Profile:** `chain_entry.py` with `{"name": "ibuprofen"}` → props + draw + retro
2. **Chemistry → Pharmacology:** Output feeds directly into `pharma-pharmacology-agent`
3. **Retro + Viz:** Get precursors, then draw each one
4. **Suzuki Test:** `--action react --reactants "c1ccccc1Br c1ccccc1B(O)O" --smarts "[c:1][Br:2].[c:3][B]([c:4])(O)O>>[c:1][c:3]"`

## Tested With

All features verified end-to-end with RDKit 2024.03+:

| Molecule | SMILES | Tests Passed |
|----------|--------|-------------|
| Caffeine | `CN1C=NC2=C1C(=O)N(C(=O)N2C)C` | info, structure, props, draw, retro, plan, chain |
| Aspirin | `CC(=O)Oc1ccccc1C(=O)O` | info, structure, props, draw, retro, plan, chain |
| Sotorasib | PubChem name lookup | info, structure, props, draw, retro, chain |
| Ibuprofen | PubChem name lookup | info, structure, props, chain |
| Invalid SMILES | `XXXINVALID` | Graceful JSON error |
| Empty input | `{}` | Graceful JSON error |

## Resources

- `references/api_endpoints.md` — PubChem API endpoint reference and rate limits
- `scripts/rdkit_reaction.py` — Legacy reaction module
- `scripts/chembl_query.py`, `scripts/pubmed_search.py`, `scripts/admet_predict.py` — Additional query modules

## Changelog

**v1.4.0** (2026-02-14)
- Fixed PubChem SMILES/InChI endpoint (property/CanonicalSMILES/TXT)
- Fixed chain_entry.py HTML entity corruption
- Fixed brics_retro to handle BRICSDecompose string output correctly
- Added request timeouts (15s) to all PubChem calls
- Graceful error handling for invalid SMILES and empty input
- Updated chain output version and schema
- Comprehensive end-to-end testing

**v1.3.0**
- RDKit props NoneType fixes, invalid SMILES graceful errors
- React fix: ReactionFromSmarts import
- Name resolution via PubChem for all RDKit actions

**v1.2.0**
- BRICS retrosynthesis + 21 reaction templates library
- Multi-step synthesis planning