external-ai-integration

TotalClaw · Author: totalclaw

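Use external AI models (ChatGPT, Claude, Hugging Face, etc.) as tools through browser automation (Chrome Relay) and the optional Hugging Face API. Use this skill when you need an external LLM to extend the assistant's capabilities for reasoning, summarization, code generation, or other tasks without spawning isolated sub-agents.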

Installation / Download

TotalClaw CLI (recommended):
totalclaw install totalclaw:totalclaw~konscious0beast-external-ai-integration

cURL (direct download, no login required):
curl -fsSL https://skills.taituai.com/api/skills/totalclaw%3Atotalclaw~konscious0beast-external-ai-integration/file -o konscious0beast-external-ai-integration.md

## Original text

# External AI Integration Skill

This skill provides patterns for using external AI models as **tools** that the assistant can call on‑demand. It extends existing browser‑automation and API‑integration skills, enabling the assistant to:

- **Automate interactions** with ChatGPT, Claude, Gemini, or other web‑based LLMs via Chrome Relay (browser automation).
- **Call the Hugging Face Inference API** for models hosted on the Hugging Face Hub (text‑generation, summarization, translation, etc.).
- **Integrate external reasoning** into the assistant's own workflow—e.g., asking ChatGPT for a second opinion, using Claude for detailed analysis, or leveraging Hugging Face for domain‑specific tasks.
- **Avoid spawning isolated sub‑agents** by treating external models as tools, keeping control and context within the main assistant session.

## When to use

- You need additional reasoning power, a different model's perspective, or a specialized model (e.g., code generation, translation) that your primary model lacks.
- The task benefits from a second opinion or parallel evaluation (e.g., reviewing code, analyzing strategy).
- You want to use a model with a larger context window, better coding ability, or specific domain knowledge (Claude, ChatGPT, Hugging Face models).
- You are asked to “integrate external AI via browser” or “use ChatGPT/Claude as a tool”.
- You need to call Hugging Face Inference API for a specific model (e.g., summarization, sentiment analysis) and incorporate the result into your response.

## Core patterns

### 1. Browser Automation (Chrome Relay) for Web‑Based LLMs

Use Chrome Relay to automate interactions with ChatGPT, Claude, Gemini, or any other web‑based LLM that requires a browser interface.

**Prerequisites:**
- Chrome Relay extension installed and a tab attached (user must click the OpenClaw Browser Relay toolbar icon).
- The target LLM website (e.g., `chatgpt.com`, `claude.ai`) already logged in (session cookies present).
- Basic familiarity with the browser automation playbook (`memory/patterns/playbooks.md` – “Browser Automation (Chrome Relay)”).

**Steps:**

1. **Attach to the Chrome Relay profile** (`profile="chrome"`).
2. **Navigate to the target LLM** (or reuse an already‑open tab).
3. **Take a snapshot** to locate the input field and send button (use `refs="aria"` for stable references).
4. **Type the prompt** into the input field and submit (click send button or press Enter).
5. **Wait for the response** (poll for a new element, detect typing indicators, or use a fixed timeout).
6. **Extract the response text** from the appropriate DOM element.
7. **Return the response** to the assistant's workflow.

**Example workflow:**

```python
# Conceptual example: `browser` stands for the assistant's browser tool call, and
# `find_element` / `find_last_message` are hypothetical snapshot helpers.
import time

def ask_chatgpt(prompt):
    # 1. Open ChatGPT in the attached Chrome Relay profile
    browser(action="open", profile="chrome", targetUrl="https://chatgpt.com")
    # 2. Snapshot the page to get stable aria references
    snap = browser(action="snapshot", refs="aria")
    # 3. Locate the input field (aria role="textbox") and the send button
    input_ref = snap.find_element(role="textbox", name="Message")
    send_ref = snap.find_element(role="button", name="Send")
    # 4. Type the prompt and click send
    browser(action="act", request={"kind": "type", "ref": input_ref, "text": prompt})
    browser(action="act", request={"kind": "click", "ref": send_ref})
    # 5. Wait for the response (simplified: a fixed delay; polling is more robust)
    time.sleep(10)
    # 6. Snapshot again and extract the text of the last message bubble
    snap2 = browser(action="snapshot", refs="aria")
    response_element = snap2.find_last_message()
    return response_element.text
```
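
Step 5 mentions polling as an alternative to a fixed timeout. The following is a minimal sketch of that idea, assuming the response text can be read through some callable (here the hypothetical `read_text` argument); it waits until the text stops changing across several polls. The function name, parameters, and defaults are illustrative and not part of Chrome Relay.

```python
import time
from typing import Callable


def wait_for_stable_text(
    read_text: Callable[[], str],
    timeout: float = 60.0,
    interval: float = 1.0,
    stable_polls: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll read_text() until it returns the same non-empty value on
    `stable_polls` consecutive polls; raise TimeoutError otherwise."""
    deadline = time.monotonic() + timeout
    last, unchanged = "", 0
    while time.monotonic() < deadline:
        current = read_text()
        if current and current == last:
            unchanged += 1
            # `unchanged` counts repeats after the first sighting of the text.
            if unchanged >= stable_polls - 1:
                return current
        else:
            # Text is empty or still growing: restart the stability count.
            last, unchanged = current, 0
        sleep(interval)
    raise TimeoutError(f"response did not stabilize within {timeout} s")
```

In the workflow above, a call such as `wait_for_stable_text(lambda: browser(action="snapshot", refs="aria").find_last_message().text)` could take the place of `time.sleep(10)`, under the same assumption that those snapshot helpers exist.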

**Key considerations:**
- **Session persistence:** The attached tab must stay logged in; avoid actions that log out.
- **Rate limits:** Be aware of the LLM's rate limits and usage policies.
- **Error handling:** Detect captchas, “network error” messages, or “try again” buttons and fall back gracefully.
- **Multi‑turn conversations:** Maintain conversation context by keeping the same tab and not refreshing.
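
The error-handling point above can be sketched as a small text classifier run on the page text before a response is trusted. The marker phrases below are assumptions chosen for illustration; the real wording differs per site and changes over time, so treat the table as a starting point to adjust.

```python
import re
from typing import Optional

# Hypothetical marker phrases, checked in this order.
FAILURE_PATTERNS = {
    "captcha": re.compile(r"verify you are human|captcha", re.IGNORECASE),
    "network_error": re.compile(r"network error|something went wrong", re.IGNORECASE),
    "retry_prompt": re.compile(r"try again|regenerate", re.IGNORECASE),
}


def classify_failure(page_text: str) -> Optional[str]:
    """Return the label of the first failure pattern found, or None if the
    text shows no known failure marker."""
    for label, pattern in FAILURE_PATTERNS.items():
        if pattern.search(page_text):
            return label
    return None
```

Running the check on banner or dialog text rather than on the whole page reduces false positives, since a model's own answer may legitimately contain a phrase like "try again".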

### 2. Hugging Face Inference API Integration

Models hosted on the Hugging Face Hub and served through the Inference API can be called directly over HTTP.

**Prerequisites:**
- Hugging Face API token (stored in 1Password or environment variable).
- Model identifier (e.g., `"gpt2"`, `"google/flan-t5-large"`, `"microsoft/DialoGPT-medium"`).
- Knowledge of the model's expected input/output format.

**Steps:**

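1. **Retrieve the API token** (use the 1Password skill, or read the file written by `huggingface-cli login`: `~/.cache/huggingface/token` by default, `~/.huggingface/token` in older versions).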
2. **Construct the request** (URL, headers, JSON payload).
3. **Send the request** via `curl`, or via `exec` running a Python script that uses the `requests` module.
4. **Parse the response** and extract the generated text.
5. **Handle errors** (rate limits, model loading, invalid token).

**Example script (using curl):**

```bash
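#!/bin/bash
set -euo pipefail

MODEL="google/flan-t5-large"
PROMPT="Translate English to German: How are you?"
API_TOKEN=$(op read "op://Personal/HuggingFace/api_token")

# Build the JSON body with jq so quotes or newlines in the prompt are escaped correctly.
PAYLOAD=$(jq -n --arg inputs "$PROMPT" '{inputs: $inputs}')

# -f makes curl exit non-zero on HTTP errors; -S still reports the error despite -s.
curl -fsS "https://api-inference.huggingface.co/models/$MODEL" \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" | jq -r '.[0].generated_text'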
```

**Example Python function (using requests):**

```python
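import os

import requests

def hf_inference(model, inputs, parameters=None, timeout=60):
    """POST `inputs` to the Inference API and return the parsed JSON response.

    `parameters` is merged into the top level of the payload, so pass e.g.
    {"parameters": {...}, "options": {"wait_for_model": True}}.
    """
    api_token = os.getenv("HF_TOKEN")  # or retrieve via 1Password
    if not api_token:
        raise RuntimeError("HF_TOKEN is not set")
    url = f"https://api-inference.huggingface.co/models/{model}"
    headers = {"Authorization": f"Bearer {api_token}"}
    payload = {"inputs": inputs}
    if parameters:
        payload.update(parameters)
    # A timeout keeps a stalled request from blocking the assistant indefinitely.
    resp = requests.post(url, headers=headers, json=payload, timeout=timeout)
    resp.raise_for_status()
    return resp.json()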
```
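
The function above raises on any non-2xx status, which covers step 5 only partly: rate limits and model loading are usually worth retrying. Below is a minimal sketch of a retry wrapper, assuming that HTTP 429 and 503 are the retryable statuses and that a loading response may carry an `estimated_time` field in its JSON body. Both are assumptions to verify against a real call, and `call_with_retry` and its `send` argument are illustrative names.

```python
import time
from typing import Any, Callable, Tuple


def call_with_retry(
    send: Callable[[], Tuple[int, Any]],
    max_attempts: int = 4,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call send() until it returns a 2xx status and return the body.

    `send` performs one request and returns (status_code, parsed_body).
    429 and 503 are retried with exponential backoff; any other error
    status raises immediately.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if 200 <= status < 300:
            return body
        if status not in (429, 503):
            raise RuntimeError(f"request failed with HTTP {status}: {body!r}")
        if attempt < max_attempts - 1:
            # Prefer the server's wait hint when present (assumed field name).
            hint = body.get("estimated_time") if isinstance(body, dict) else None
            delay = hint if isinstance(hint, (int, float)) else base_delay * 2 ** attempt
            sleep(delay)
    raise RuntimeError(f"gave up after {max_attempts} attempts (last status {status})")
```

To wire this to `requests`, `send` could be a closure that posts the payload and returns `(resp.status_code, resp.json())` without calling `raise_for_status()`.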

**Key considerations:**
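- **Cost:** The Inference API may incur charges or be subject to usage quotas; monitor usage.
- **Model readiness:** Some models must be loaded before they can respond; send `{"options":{"wait_for_model":true}}` as a top‑level field of the request payload (the `parameters` argument of `hf_inference` above is merged there).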
- **Output format:** Response structure varies by model; inspect with a test call first.

### 3. Orchestrating External AI as a Tool

Instead of spawning a sub‑agent, the assistant calls external AI within its own reasoning flow.

**Pattern:**

1. **Determine need:** Decide which external model is appropriate (ChatGPT for creative tasks, Claude for analysis, Hugging Face for specialized models).
2. **Prepare the prompt:** Format the prompt with clear instructions, context, and expected output format.
3. **Call the tool:** Use browser automation for web‑based LLMs or API call for Hugging Face.
4. **Integrate the result:** Parse, validate, and incorporate the external response into your own answer.
5. **Fallback:** If the external call fails, continue with your own reasoning or try an alternative.
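
Step 4 (parse and validate) can be sketched for the common case where the prompt asked the external model for JSON. Web-based LLMs often wrap JSON in a markdown code fence, so this sketch strips one if present before parsing. The function name and the `required_keys` parameter are illustrative assumptions.

```python
import json
import re
from typing import Any, Iterable

# The fence marker is built programmatically so this example nests cleanly in markdown.
TICKS = "`" * 3
FENCED = re.compile(TICKS + r"(?:json)?\s*(.*?)" + TICKS, re.DOTALL)


def parse_json_reply(reply: str, required_keys: Iterable[str] = ()) -> Any:
    """Parse JSON from a model reply, unwrapping a code fence if there is one,
    and check that the required top-level keys are present."""
    match = FENCED.search(reply)
    candidate = match.group(1) if match else reply
    data = json.loads(candidate.strip())  # raises a ValueError subclass on bad JSON
    missing = [
        key for key in required_keys
        if not isinstance(data, dict) or key not in data
    ]
    if missing:
        raise ValueError(f"reply is missing keys: {missing}")
    return data
```

A `ValueError` from this function is a natural trigger for step 5: continue with the assistant's own reasoning, or re-ask with a stricter format instruction.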

**Example decision logic:**

```python
def external_ai_assist(task_type, prompt):
    """Route a task to an external model.

    `ask_claude` is assumed to mirror `ask_chatgpt`; note that `hf_inference`
    returns parsed JSON rather than plain text.
    """
    if task_type == "code_review":
        # Use Claude via browser automation
        return ask_claude(prompt)
    elif task_type == "translation":
        # Use Hugging Face translation model
        return hf_inference("Helsinki-NLP/opus-mt-en-de", prompt)
    elif task_type == "creative_writing":
        # Use ChatGPT via browser automation
        return ask_chatgpt(prompt)
    else:
        raise ValueError(f"No external AI configured for {task_type}")
```
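
The decision logic above raises when a tool fails, which leaves the fallback step to the caller. One way to sketch that step is an ordered chain that tries each tool and returns the first usable reply. The helper name `first_successful` and the `(name, callable)` pairing are assumptions for illustration.

```python
import logging
from typing import Callable, Optional, Sequence, Tuple

log = logging.getLogger(__name__)


def first_successful(
    prompt: str,
    tools: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[Optional[str], Optional[str]]:
    """Try each (name, tool) in order and return (name, reply) for the first
    tool that returns a non-empty string, or (None, None) if all fail."""
    for name, tool in tools:
        try:
            reply = tool(prompt)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            log.warning("%s failed: %s", name, exc)
            continue
        if reply and reply.strip():
            return name, reply
        log.warning("%s returned an empty reply", name)
    return None, None
```

A result of `(None, None)` signals that the assistant should continue with its own reasoning, which keeps control in the main session rather than in any external model.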

### 4. Prompt Engineering for External Models

External models may require different prompting styles than the assistant's native model.

- **ChatGPT/Claude:** Use conversational style, system prompts, and markdown formatting.
- **Hugging Face models:** Follow the model's expected input format (e.g., `"Translate English to German: ..."` for T5).
- **Include context:** Provide necessary background, constraints, and examples in the prompt.
- **Specify output format:** Ask for JSON, bullet points, code block