pubmed-edirect

ClawSkills 作者 clawskills

Search and retrieve literature from PubMed using NCBI's EDirect command-line tools. ⚠️ Advanced skill requiring manual installation.

安装 / 下载方式

TotalClaw CLI推荐
totalclaw install clawskills:clawskills~killgfat-pubmed-edirect
cURL直接下载,无需登录
curl -fsSL https://skills.taituai.com/api/skills/clawskills%3Aclawskills~killgfat-pubmed-edirect/file -o killgfat-pubmed-edirect.md
# PubMed EDirect Skill

Search and retrieve literature from PubMed using NCBI's EDirect command-line tools.

## ⚠️ Security Advisory

**Important**: This skill requires installation of external command-line tools. The installation process involves:

1. **External script execution**: Downloading and executing installation scripts from the official NCBI FTP server
2. **System modifications**: Adding directories to your PATH environment variable
3. **Permission requirements**: May require installation of Perl modules and dependencies

**Before installation, you must**:
1. Review the installer script content after downloading
2. Confirm the source is trustworthy (official `ftp.ncbi.nlm.nih.gov` domain)
3. Validate in a test environment
4. Understand all commands that will be executed

## Overview

This skill provides access to PubMed and other NCBI databases through the official EDirect (Entrez Direct) utilities. EDirect is a suite of programs that provide access to the NCBI's suite of interconnected databases (publication, sequence, structure, gene, variation, expression, etc.) from Unix terminals.

**Note: This is a local installation skill** – all tools run directly on your system without Docker or containerization. Follow the [INSTALL.md](INSTALL.md) guide for local setup.

## Structure

The skill is organized into the following files:

- **`INSTALL.md`** - Installation and configuration guide
- **`BASICS.md`** - Basic usage and common commands
- **`ADVANCED.md`** - Advanced techniques and complex queries
- **`EXAMPLES.md`** - Practical usage examples
- **`REFERENCE.md`** - Quick reference (field qualifiers, formats, etc.)
- **`OPENCLAW_INTEGRATION.md`** - OpenClaw-specific usage guide
- **`scripts/`** - Useful bash scripts for common tasks

## Quick Start

1. **Read the installation guide**: Review [INSTALL.md](INSTALL.md) for secure installation steps
2. **Manually install EDirect**:
   ```bash
   # Step 1: Download the script
   wget -q https://ftp.ncbi.nlm.nih.gov/entrez/entrezdirect/install-edirect.sh
   
   # Step 2: Review content (important for security)
   less install-edirect.sh
   
   # Step 3: Execute installation
   ./install-edirect.sh
   ```
3. **Verify installation**:
   ```bash
   esearch -db pubmed -query "test" -retmax 1
   ```
4. **Explore examples**: Check [EXAMPLES.md](EXAMPLES.md)

## Core Tools

The skill provides access to EDirect tools through OpenClaw's `exec` capability:

- `esearch` - Search databases
- `efetch` - Retrieve records
- `elink` - Find related records
- `efilter` - Filter results
- `xtract` - Extract data from XML
- `einfo` - Get database information

## Databases Supported

EDirect supports numerous NCBI databases including:

- `pubmed` - Biomedical literature
- `pmc` - PubMed Central full-text articles
- `gene` - Gene information
- `nuccore` - Nucleotide sequences
- `protein` - Protein sequences
- `mesh` - Medical Subject Headings
- And many more...

## Key Features

- **Command-line access** to NCBI databases
- **Pipeline architecture** using Unix pipes
- **Structured data extraction** with XML parsing
- **Batch processing** capabilities
- **Cross-database linking** between records

## Getting Help

- Use `-help` with any EDirect command: `esearch -help`
- Consult the [official documentation](https://www.ncbi.nlm.nih.gov/books/NBK179288/)
- Check troubleshooting in installation guide

## Included Scripts

The `scripts/` directory contains ready-to-use bash scripts:

### `batch_fetch_abstracts.sh`

Fetch abstracts for a list of PMIDs with error handling and rate limiting.

```bash
./scripts/batch_fetch_abstracts.sh pmids.txt abstracts/ 0.5
```

### `search_export_csv.sh`

Search PubMed and export results to CSV with metadata.

```bash
./scripts/search_export_csv.sh "CRISPR [TIAB]" 100 results.csv
```

### `publication_trends.sh`

Analyze publication trends over time with visualization.

```bash
./scripts/publication_trends.sh "machine learning" 2010 2023 trends.csv
```

## Security Best Practices

### 1. Script Review
```bash
# Always download first and review scripts
wget -q SOURCE_URL -O script.sh
less script.sh  # or cat script.sh | head -50
# Execute only after review
./script.sh
```

### 2. Environment Isolation
- Running in Docker containers provides isolation
- Use virtual machines for testing
- Set up dedicated user accounts

### 3. Least Privilege
- Do not run as root
- Set appropriate file permissions
- Use dedicated directories for data

### 4. Network Controls
- Configure firewall rules
- Use proxies for controlled access
- Monitor network traffic

## Notes

**Important**: This skill requires manual installation and configuration. All installation steps require explicit user confirmation and execution.

This skill provides command-line access to NCBI databases through local installation of EDirect tools.