wechat-search

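TotalClaw | Author: totalclaw

Searches WeChat Official Account articles using OpenClaw's web search, the Tavily API, and web fetch, with a compliance-centered design.

Installation / Download

TotalClaw CLI (recommended)
totalclaw install totalclaw:totalclaw~jixsonwang-wechat-search
cURL (direct download, no login required)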
curl -fsSL https://skills.taituai.com/api/skills/totalclaw%3Atotalclaw~jixsonwang-wechat-search/file -o jixsonwang-wechat-search.md
# WeChat Search Skill

Search for WeChat Official Account (微信公众号) articles using a compliant, three-layer approach that prioritizes legal search APIs and falls back to respectful web scraping when needed.

## Features
- **Compliant Design**: Prioritizes legal search APIs, respects robots.txt and rate limits
- **Three-Layer Strategy**: 
  - Primary: OpenClaw web_search (Brave Search API)
  - Secondary: Tavily Search API (if Brave unavailable)
  - Fallback: Direct page fetching from Sogou WeChat Search (搜狗微信搜索)
- **Recent Results**: Returns the 5 most recent articles by default (configurable)
- **Time Filtering**: Support for date range and recency filters
- **Multiple Output Formats**: Text, JSON, and markdown formats available

## Prerequisites
- **OpenClaw Web Tools**: Requires `web_search`, `web_fetch` tools to be available
- **API Keys** (optional but recommended):
  - Brave Search API Key (for primary search)
  - Tavily API Key (for secondary search, already configured in your environment)

## Usage

### Basic Search
```bash
wechat-search "人工智能"
```

### Advanced Options
```bash
# Return 10 results instead of default 5
wechat-search "机器学习" --max-results 10

# Search within past week
wechat-search "大模型" --past-week

# Custom date range
wechat-search "AI应用" --from 2026-01-01 --to 2026-02-01

# JSON output format
wechat-search "开源AI" --output json

# Force specific strategy
wechat-search "最新技术" --strategy tavily_only
```
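
How the time flags map onto a date window is not spelled out here. The following is a minimal sketch, assuming `--past-week` means the seven days ending today, that `--from`/`--to` are inclusive ISO dates, and that an explicit range takes precedence. The names `resolve_date_window` and `in_window` are hypothetical and not part of the skill.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

Window = Tuple[Optional[date], Optional[date]]


def resolve_date_window(
    today: date,
    past_week: bool = False,
    date_from: Optional[str] = None,
    date_to: Optional[str] = None,
) -> Window:
    """Translate the CLI time flags into an inclusive (start, end) window.

    Assumption: an explicit --from/--to range wins over --past-week.
    """
    if date_from or date_to:
        start = date.fromisoformat(date_from) if date_from else None
        end = date.fromisoformat(date_to) if date_to else None
        if start and end and start > end:
            raise ValueError("--from must not be later than --to")
        return start, end
    if past_week:
        return today - timedelta(days=7), today
    return None, None  # no time filter requested


def in_window(published: date, window: Window) -> bool:
    """Return True when a publication date falls inside the window."""
    start, end = window
    return (start is None or published >= start) and (end is None or published <= end)
```

An open-ended range (only `--from` or only `--to`) leaves the other bound as `None`, which `in_window` treats as unbounded.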

## Configuration
Create `~/.openclaw/wechat-search-config.json` to customize behavior:

```json
{
  "defaultMaxResults": 5,
  "maxResultsLimit": 20,
  "requestDelayMs": 5000,
  "cacheDurationHours": 1,
  "userAgent": "OpenClaw-WeChat-Search-Bot/1.0 (+https://github.com/your-username/wechat-search-skill)"
}
```
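
One plausible reading of these keys is that the file overrides built-in defaults key by key and that `maxResultsLimit` caps whatever `--max-results` requests. The sketch below works under those assumptions; `load_config` and `effective_max_results` are hypothetical helper names, not the skill's actual code.

```python
import json
from pathlib import Path
from typing import Optional

# Defaults mirror the example configuration shown above.
DEFAULTS = {
    "defaultMaxResults": 5,
    "maxResultsLimit": 20,
    "requestDelayMs": 5000,
    "cacheDurationHours": 1,
}


def load_config(path: Path) -> dict:
    """Merge user overrides from `path` onto DEFAULTS; a missing file is fine."""
    config = dict(DEFAULTS)
    if path.is_file():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    return config


def effective_max_results(config: dict, requested: Optional[int] = None) -> int:
    """Clamp the requested result count to the range [1, maxResultsLimit]."""
    wanted = config["defaultMaxResults"] if requested is None else requested
    return max(1, min(wanted, config["maxResultsLimit"]))
```

With the defaults above, a request for 50 results would be reduced to 20 under this reading.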

## Search Strategy Details

### Layer 1: OpenClaw Web Search (Brave Search)
- Uses Brave Search API with `site:mp.weixin.qq.com` filter
- Fastest and most reliable when an API key is configured
- Respects search engine's indexing and ranking
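
The site filter amounts to appending a `site:` operator to the user's keywords. A minimal sketch follows, with hypothetical helper names; the post-filter on returned URLs is an added precaution assumed here, not something the skill is documented to do.

```python
from urllib.parse import urlparse

WECHAT_ARTICLE_HOST = "mp.weixin.qq.com"


def build_site_query(keywords: str) -> str:
    """Append the site: operator so results are limited to WeChat articles."""
    return f"{keywords.strip()} site:{WECHAT_ARTICLE_HOST}"


def is_wechat_article(url: str) -> bool:
    """Keep only URLs whose host is exactly mp.weixin.qq.com."""
    return urlparse(url).hostname == WECHAT_ARTICLE_HOST
```

Comparing the parsed hostname, rather than searching the URL string, avoids accepting links that merely mention the domain in their path.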

### Layer 2: Tavily Search API
- Activated when Brave Search is unavailable or fails
- Uses Tavily's AI-powered search with WeChat site restriction
- Provides high-quality, relevant results with good coverage
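
How the WeChat site restriction reaches Tavily is not stated. One way it could be expressed is through the `include_domains` field of a Tavily search request, sketched below. The function name `build_tavily_payload` is hypothetical, and authentication and the HTTP call are deliberately left out.

```python
def build_tavily_payload(keywords: str, max_results: int = 5) -> dict:
    """Build a Tavily search request body restricted to WeChat articles.

    Assumption: the restriction is done with `include_domains` rather than
    a site: operator inside the query text.
    """
    return {
        "query": keywords.strip(),
        "include_domains": ["mp.weixin.qq.com"],
        "max_results": max_results,
    }
```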

### Layer 3: Direct Web Fetch
- Final fallback when both APIs are unavailable
- Scrapes WeChat search results directly from Sogou WeChat Search (搜狗微信搜索)
- Implements proper delays and respects robots.txt
- Parses HTML to extract article metadata
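
The HTML parsing step can be illustrated with the standard library alone. This sketch assumes the result page contains plain `<a>` elements that link straight to `mp.weixin.qq.com`; the real markup of the search page is not described here and may differ (for example, it may use redirect links), so treat the selector logic as hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ArticleLinkParser(HTMLParser):
    """Collect title/url pairs for anchors pointing at mp.weixin.qq.com."""

    def __init__(self):
        super().__init__()
        self.articles = []
        self._href = None   # href of the anchor currently being read
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if urlparse(href).hostname == "mp.weixin.qq.com":
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.articles.append({"title": title, "url": self._href})
            self._href = None


def extract_articles(html: str) -> list:
    """Return metadata dicts for every WeChat article link in the page."""
    parser = ArticleLinkParser()
    parser.feed(html)
    return parser.articles
```

Only the title and URL are kept, which matches the metadata-only approach; fields such as author or publication date would need selectors specific to the actual page.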

## Compliance & Ethics
- **Respects robots.txt**: Checks and follows robots.txt directives
- **Rate limiting**: Minimum 5-second delay between requests
- **Transparent identification**: Clear User-Agent string identifying the bot
- **Public content only**: Only accesses publicly available articles
- **No data retention**: Does not store full article content, only metadata
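
Two of these rules, the robots.txt check and the minimum delay, can be sketched with the standard library. This is an illustration under stated assumptions, not the skill's implementation: `is_allowed` and `MinDelayLimiter` are hypothetical names, and the robots.txt text is assumed to have been downloaded already.

```python
import time
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against already-downloaded robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class MinDelayLimiter:
    """Enforce a minimum gap between consecutive requests."""

    def __init__(self, min_delay_s: float = 5.0, clock=time.monotonic, sleep=time.sleep):
        self.min_delay_s = min_delay_s
        self._clock = clock    # injectable so the limiter can be tested
        self._sleep = sleep
        self._last = None      # time of the previous request, if any

    def wait(self) -> float:
        """Block until the gap has elapsed; return the seconds slept."""
        slept = 0.0
        if self._last is not None:
            remaining = self.min_delay_s - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
        self._last = self._clock()
        return slept
```

Calling `wait()` before each request keeps consecutive requests at least `min_delay_s` apart; the 5.0 default corresponds to `requestDelayMs: 5000`.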

## Error Handling
- Automatic retry on network failures (up to 3 attempts)
- Graceful fallback between all three search strategies
- Clear error messages for debugging
- Handles missing API keys gracefully
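
The retry and fallback behavior can be combined in one loop. The sketch below assumes each layer is a callable that takes the query and returns a list of results, that network failures surface as `OSError` subclasses, and that a missing API key is signalled by a dedicated exception so it skips retries. `StrategyUnavailable` and `search_with_fallback` are hypothetical names.

```python
from typing import Callable, List, Sequence, Tuple

SearchFn = Callable[[str], List[dict]]


class StrategyUnavailable(Exception):
    """Raised by a layer that cannot run at all, e.g. a missing API key."""


def search_with_fallback(
    query: str,
    strategies: Sequence[Tuple[str, SearchFn]],
    max_attempts: int = 3,
) -> Tuple[str, List[dict]]:
    """Try each layer in order, retrying network failures up to max_attempts."""
    errors = []
    for name, search in strategies:
        for attempt in range(1, max_attempts + 1):
            try:
                return name, search(query)
            except StrategyUnavailable as exc:
                errors.append(f"{name}: unavailable ({exc})")
                break  # retrying cannot help; move to the next layer
            except OSError as exc:  # covers ConnectionError and TimeoutError
                errors.append(f"{name}: attempt {attempt} failed ({exc})")
    raise RuntimeError("all search strategies failed:\n" + "\n".join(errors))
```

Collecting every failure into the final error keeps the message useful for debugging when all three layers fail.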

## Future Enhancements
- RSS feed integration support
- Article content summarization
- Author/subscription management
- Enhanced filtering options

This skill is designed to be both useful and responsible, providing access to valuable WeChat Official Account content while respecting platform rules and legal requirements.
