external-ai-integration
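Use external AI models (ChatGPT, Claude, Hugging Face, and others) as tools through browser automation (Chrome Relay) and, optionally, the Hugging Face API. Use this skill when you need an external LLM to extend the assistant's capabilities for reasoning, summarization, code generation, or other tasks without spawning isolated sub-agents.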
Installation / download
TotalClaw CLI (recommended):
`totalclaw install totalclaw:totalclaw~konscious0beast-external-ai-integration`
cURL (direct download, no login required):
`curl -fsSL https://skills.taituai.com/api/skills/totalclaw%3Atotalclaw~konscious0beast-external-ai-integration/file -o konscious0beast-external-ai-integration.md`
# External AI Integration Skill
This skill provides patterns for using external AI models as **tools** that the assistant can call on‑demand. It extends existing browser‑automation and API‑integration skills, enabling the assistant to:
- **Automate interactions** with ChatGPT, Claude, Gemini, or other web‑based LLMs via Chrome Relay (browser automation).
- **Call the Hugging Face Inference API** for models hosted on the Hugging Face Hub (text‑generation, summarization, translation, etc.).
- **Integrate external reasoning** into the assistant's own workflow—e.g., asking ChatGPT for a second opinion, using Claude for detailed analysis, or leveraging Hugging Face for domain‑specific tasks.
- **Avoid spawning isolated sub‑agents** by treating external models as tools, keeping control and context within the main assistant session.
## When to use
- You need additional reasoning power, a different model's perspective, or a specialized model (e.g., code generation, translation) that your primary model lacks.
- The task benefits from a second opinion or parallel evaluation (e.g., reviewing code, analyzing strategy).
- You want to use a model with a larger context window, better coding ability, or specific domain knowledge (Claude, ChatGPT, Hugging Face models).
- You are asked to “integrate external AI via browser” or “use ChatGPT/Claude as a tool”.
- You need to call Hugging Face Inference API for a specific model (e.g., summarization, sentiment analysis) and incorporate the result into your response.
## Core patterns
### 1. Browser Automation (Chrome Relay) for Web‑Based LLMs
Use Chrome Relay to automate interactions with ChatGPT, Claude, Gemini, or any other web‑based LLM that requires a browser interface.
**Prerequisites:**
- Chrome Relay extension installed and a tab attached (user must click the OpenClaw Browser Relay toolbar icon).
- The target LLM website (e.g., `chatgpt.com`, `claude.ai`) already logged in (session cookies present).
- Basic familiarity with the browser automation playbook (`memory/patterns/playbooks.md` – “Browser Automation (Chrome Relay)”).
**Steps:**
1. **Attach to the Chrome Relay profile** (`profile="chrome"`).
2. **Navigate to the target LLM** (or reuse an already‑open tab).
3. **Take a snapshot** to locate the input field and send button (use `refs="aria"` for stable references).
4. **Type the prompt** into the input field and submit (click send button or press Enter).
5. **Wait for the response** (poll for a new element, detect typing indicators, or use a fixed timeout).
6. **Extract the response text** from the appropriate DOM element.
7. **Return the response** to the assistant's workflow.
**Example workflow:**
```python
# This is a conceptual example; the actual implementation uses browser tool calls.
# `browser`, `find_element`, and `find_last_message` stand in for those calls
# and are not importable Python APIs.
import time


def ask_chatgpt(prompt):
    # 1. Ensure Chrome Relay is attached and open the target LLM
    browser(action="open", profile="chrome", targetUrl="https://chatgpt.com")
    # 2. Snapshot to get references
    snap = browser(action="snapshot", refs="aria")
    # 3. Find input field (aria role="textbox") and send button
    input_ref = snap.find_element(role="textbox", name="Message")
    send_ref = snap.find_element(role="button", name="Send")
    # 4. Type prompt and click send
    browser(action="act", request={"kind": "type", "ref": input_ref, "text": prompt})
    browser(action="act", request={"kind": "click", "ref": send_ref})
    # 5. Wait for response (simplified: fixed delay)
    time.sleep(10)
    # 6. Snapshot again, extract response from last message bubble
    snap2 = browser(action="snapshot", refs="aria")
    response_element = snap2.find_last_message()
    return response_element.text
```
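The fixed `time.sleep(10)` in the example above is the simplest wait strategy; step 5 also mentions polling. The following is a minimal sketch of polling until the response stops changing. It assumes a hypothetical `read_text` callable that returns the current text of the last message bubble (for example, by taking a fresh snapshot). The function name, its parameters, and the "unchanged for N polls" heuristic are illustrative assumptions, not part of the browser tool.

```python
import time
from typing import Callable


def wait_for_stable_text(
    read_text: Callable[[], str],
    timeout: float = 60.0,
    interval: float = 1.0,
    stable_polls: int = 3,
) -> str:
    """Poll read_text() until it returns the same non-empty text
    stable_polls times in a row, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    last = ""
    unchanged = 0
    while time.monotonic() < deadline:
        current = read_text()
        if current and current == last:
            unchanged += 1
        else:
            # Text is empty or still growing: restart the stability count.
            unchanged = 1 if current else 0
            last = current
        if current and unchanged >= stable_polls:
            return current
        time.sleep(interval)
    raise TimeoutError(f"response did not stabilize within {timeout} seconds")
```

Passing the reader as a callable keeps the wait logic independent of how the text is obtained, so it can be exercised with a fake reader before wiring it to real snapshots.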
**Key considerations:**
- **Session persistence:** The attached tab must stay logged in; avoid actions that log out.
- **Rate limits:** Be aware of the LLM's rate limits and usage policies.
- **Error handling:** Detect captchas, “network error” messages, or “try again” buttons and fall back gracefully.
- **Multi‑turn conversations:** Maintain conversation context by keeping the same tab and not refreshing.
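The error-handling consideration above can be sketched as a plain text check on the page content. The marker phrases and the `detect_failure` name are assumptions for illustration; real pages may word these states differently, and the check is best applied to text outside the model's own reply, since a reply can legitimately contain phrases like "try again".

```python
from typing import Optional

# Illustrative marker phrases; adjust them to what the target site actually shows.
FAILURE_MARKERS = {
    "captcha": ("verify you are human", "captcha"),
    "network_error": ("network error",),
    "retry_prompt": ("try again",),
}


def detect_failure(page_text: str) -> Optional[str]:
    """Return the name of the first failure category found in page_text, or None."""
    lowered = page_text.lower()
    for category, phrases in FAILURE_MARKERS.items():
        if any(phrase in lowered for phrase in phrases):
            return category
    return None
```

A caller could run this on each snapshot while waiting and abort the browser path as soon as it returns a category.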
### 2. Hugging Face Inference API Integration
For models hosted on Hugging Face Spaces or the Inference API, you can call them directly via HTTP requests.
**Prerequisites:**
- Hugging Face API token (stored in 1Password or environment variable).
- Model identifier (e.g., `"gpt2"`, `"google/flan-t5-large"`, `"microsoft/DialoGPT-medium"`).
- Knowledge of the model's expected input/output format.
**Steps:**
1. **Retrieve the API token** (use 1Password skill or read from `~/.cache/huggingface/token`; older `huggingface_hub` installs used `~/.huggingface/token`).
2. **Construct the request** (URL, headers, JSON payload).
3. **Send the request** via `curl`, or via `exec` running a Python script that uses the `requests` module.
4. **Parse the response** and extract the generated text.
5. **Handle errors** (rate limits, model loading, invalid token).
**Example script (using curl):**
```bash
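#!/bin/bash
set -euo pipefail

MODEL="google/flan-t5-large"
PROMPT="Translate English to German: How are you?"
# Read the token from 1Password (the op:// path depends on your vault layout).
API_TOKEN=$(op read "op://Personal/HuggingFace/api_token")

# Build the JSON body with jq so quotes or newlines in the prompt are escaped correctly.
PAYLOAD=$(jq -n --arg inputs "$PROMPT" '{inputs: $inputs}')

curl -sS --fail "https://api-inference.huggingface.co/models/$MODEL" \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" | jq -r '.[0].generated_text'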
```
**Example Python function (using requests):**
```python
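import os

import requests


def hf_inference(model, inputs, parameters=None):
    """POST `inputs` to the Inference API and return the parsed JSON response.

    `parameters` is merged into the top level of the payload, so pass e.g.
    {"options": {"wait_for_model": True}} or {"parameters": {...}}.
    """
    api_token = os.getenv("HF_TOKEN")  # or retrieve via 1Password
    if not api_token:
        raise RuntimeError("HF_TOKEN is not set")
    url = f"https://api-inference.huggingface.co/models/{model}"
    headers = {"Authorization": f"Bearer {api_token}"}
    payload = {"inputs": inputs}
    if parameters:
        payload.update(parameters)
    resp = requests.post(url, headers=headers, json=payload, timeout=60)
    resp.raise_for_status()
    return resp.json()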
```
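Step 5 lists rate limits and model loading as errors to handle. One common approach, sketched here under assumptions, is to retry with exponential backoff on HTTP 429 and 503 and fail fast on anything else. `call_with_retry` and its `send` argument are hypothetical names: `send` is any zero-argument callable that performs one request and returns `(status_code, parsed_body)`, so the retry logic stays independent of the HTTP library.

```python
import time
from typing import Any, Callable, Tuple

# Assumed retryable statuses: 429 (rate limited) and 503 (model still loading).
RETRYABLE_STATUS = {429, 503}


def call_with_retry(
    send: Callable[[], Tuple[int, Any]],
    max_attempts: int = 4,
    base_delay: float = 2.0,
) -> Any:
    """Call send() until it succeeds, retrying only on retryable statuses."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:
            return body
        is_last = attempt == max_attempts - 1
        if status not in RETRYABLE_STATUS or is_last:
            raise RuntimeError(f"request failed with status {status}: {body}")
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        time.sleep(base_delay * (2 ** attempt))
```

With the `requests`-based function above, `send` could wrap a single `requests.post(...)` call and return `(resp.status_code, resp.json())` instead of calling `raise_for_status()` directly.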
**Key considerations:**
- **Cost:** Inference API may have costs; monitor usage.
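- **Model readiness:** Some models are loaded on demand; pass `{"options": {"wait_for_model": true}}` at the top level of the request payload (via `parameters` in the function above) so the call waits for loading instead of failing with a 503.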
- **Output format:** Response structure varies by model; inspect with a test call first.
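Because the response structure varies by model, a small normalizer can keep the rest of the workflow uniform. The sketch below assumes the list-of-dicts shapes commonly returned for text tasks (`generated_text`, `translation_text`, `summary_text`) and an `{"error": ...}` dict on failure. `extract_text` is a hypothetical helper, and other pipelines (classification, embeddings) return shapes it does not cover.

```python
from typing import Any

# Keys used by common text tasks; other pipelines may return different shapes.
TEXT_KEYS = ("generated_text", "translation_text", "summary_text")


def extract_text(response: Any) -> str:
    """Pull the first text field out of a parsed Inference API response."""
    if isinstance(response, dict) and "error" in response:
        raise RuntimeError(f"inference error: {response['error']}")
    # Text tasks usually return a list with one dict per input.
    first = response[0] if isinstance(response, list) and response else response
    if isinstance(first, dict):
        for key in TEXT_KEYS:
            if key in first:
                return first[key]
    raise ValueError(f"unrecognized response shape: {response!r}")
```

Raising on an unrecognized shape, rather than returning an empty string, makes a format mismatch visible during the test call.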
### 3. Orchestrating External AI as a Tool
Instead of spawning a sub‑agent, the assistant calls external AI within its own reasoning flow.
**Pattern:**
1. **Determine need:** Decide which external model is appropriate (ChatGPT for creative tasks, Claude for analysis, Hugging Face for specialized models).
2. **Prepare the prompt:** Format the prompt with clear instructions, context, and expected output format.
3. **Call the tool:** Use browser automation for web‑based LLMs or API call for Hugging Face.
4. **Integrate the result:** Parse, validate, and incorporate the external response into your own answer.
5. **Fallback:** If the external call fails, continue with your own reasoning or try an alternative.
**Example decision logic:**
```python
def external_ai_assist(task_type, prompt):
    if task_type == "code_review":
        # Use Claude via browser automation
        return ask_claude(prompt)
    elif task_type == "translation":
        # Use Hugging Face translation model
        return hf_inference("Helsinki-NLP/opus-mt-en-de", prompt)
    elif task_type == "creative_writing":
        # Use ChatGPT via browser automation
        return ask_chatgpt(prompt)
    else:
        raise ValueError(f"No external AI configured for {task_type}")
```
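The fallback step of the pattern (step 5) is not covered by the decision logic above. A minimal sketch, assuming each external model is wrapped as a callable that takes a prompt and returns text; `call_with_fallback` is a hypothetical helper, and returning `None` is one possible way to signal that the assistant should continue with its own reasoning.

```python
import logging
from typing import Callable, Optional, Sequence

logger = logging.getLogger(__name__)


def call_with_fallback(
    prompt: str,
    providers: Sequence[Callable[[str], str]],
) -> Optional[str]:
    """Try each provider in order and return the first non-empty answer.

    Returns None when every provider fails or answers with blank text.
    """
    for provider in providers:
        name = getattr(provider, "__name__", repr(provider))
        try:
            answer = provider(prompt)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            logger.warning("%s failed: %s", name, exc)
            continue
        if answer and answer.strip():
            return answer
        logger.warning("%s returned an empty answer", name)
    return None
```

For example, `call_with_fallback(prompt, [ask_claude, ask_chatgpt])` would try Claude first and fall back to ChatGPT, with the assistant answering on its own if the result is `None`.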
### 4. Prompt Engineering for External Models
External models may require different prompting styles than the assistant's native model.
- **ChatGPT/Claude:** Use conversational style, system prompts, and markdown formatting.
- **Hugging Face models:** Follow the model's expected input format (e.g., `"Translate English to German: ..."` for T5).
- **Include context:** Provide necessary background, constraints, and examples in the prompt.
- **Specify output format:** Ask for JSON, bullet points, code block