search-cluster

TotalClaw 作者 totalclaw

使用 Google CSE、GNews RSS、维基百科、Reddit 和 Scrapling 的聚合搜索聚合器。

安装 / 下载方式

TotalClaw CLI推荐

totalclaw install totalclaw:totalclaw~1999azzar-search-cluster

cURL直接下载，无需登录

curl -fsSL https://skills.taituai.com/api/skills/totalclaw%3Atotalclaw~1999azzar-search-cluster/file -o 1999azzar-search-cluster.md

## 概述（中文）

使用 Google CSE、GNews RSS、维基百科、Reddit 和 Scrapling 的聚合搜索聚合器。

## 原文

# Search Cluster (Industrial Standard v3.1)

A multi-provider search aggregator designed for high-availability and security.

## Installation
The scrapling provider requires a dedicated virtual environment.
1. Create a venv: python3 -m venv venv/scrapling
2. Install scrapling: venv/scrapling/bin/pip install scrapling
3. Provide the path to the venv binary in SCRAPLING_PYTHON_PATH.

## Security Posture
- Subprocess Isolation: Query inputs are passed as arguments to stealth_fetch.py.
- Strict TLS: Mandatory SSL verification on all providers.
- Sanitization: Integrated native internal scrubber (Path Neutral).

## Requirements and Environment
Declare these variables in your environment or vault:

| Variable | Requirement | Description |
|---|---|---|
| GOOGLE_API_KEY | Optional | API Key for Google Custom Search. |
| GOOGLE_CSE_ID | Optional | Search Engine ID for Google CSE. |
| SCRAPLING_PYTHON_PATH | Optional | Path to the scrapling venv python binary. |
| REDIS_HOST | Optional | Host for result caching. |
| REDIS_PORT | Optional | Port for result caching (Default: 6379). |
| SEARCH_USER_AGENT | Optional | Custom User-Agent string. |

## Providers
- google: Official Google Custom Search.
- wiki: Wikipedia OpenSearch API.
- reddit: Reddit JSON search API.
- gnews: Google News RSS aggregator.
- scrapling: Headless stealth scraping (via DuckDuckGo).

## Included Scripts
- scripts/search-cluster.py: Main entry point.
- scripts/stealth_fetch.py: Scrapling fetcher (REQUIRED for scrapling provider).

## Workflow
1. Execute: scripts/search-cluster.py all "<query>"
2. Output is structured JSON with source, title, link, and sanitized snippet.