external-ai-integration

ClawSkills · Author: clawskills

Leverage external AI models (ChatGPT, Claude, Hugging Face, etc.) as tools via browser automation (Chrome Relay) and optional Hugging Face API. Use when you need to augment the assistant's capabilities with external LLMs for reasoning, summarization, code generation, or other tasks without spawning isolated sub‑agents.

Install / download

TotalClaw CLI (recommended)

totalclaw install clawskills:clawskills~konscious0beast-external-ai-integration

cURL (direct download, no login required)

curl -fsSL https://skills.taituai.com/api/skills/clawskills%3Aclawskills~konscious0beast-external-ai-integration/file -o konscious0beast-external-ai-integration.md

# External AI Integration Skill

This skill provides patterns for using external AI models as **tools** that the assistant can call on‑demand. It extends existing browser‑automation and API‑integration skills, enabling the assistant to:

- **Automate interactions** with ChatGPT, Claude, Gemini, or other web‑based LLMs via Chrome Relay (browser automation).
- **Call the Hugging Face Inference API** for models hosted on the Hugging Face Hub (text‑generation, summarization, translation, etc.).
- **Integrate external reasoning** into the assistant's own workflow—e.g., asking ChatGPT for a second opinion, using Claude for detailed analysis, or leveraging Hugging Face for domain‑specific tasks.
- **Avoid spawning isolated sub‑agents** by treating external models as tools, keeping control and context within the main assistant session.

## When to use

- You need additional reasoning power, a different model's perspective, or a specialized model (e.g., code generation, translation) that your primary model lacks.
- The task benefits from a second opinion or parallel evaluation (e.g., reviewing code, analyzing strategy).
- You want to use a model with a larger context window, better coding ability, or specific domain knowledge (Claude, ChatGPT, Hugging Face models).
- You are asked to “integrate external AI via browser” or “use ChatGPT/Claude as a tool”.
- You need to call Hugging Face Inference API for a specific model (e.g., summarization, sentiment analysis) and incorporate the result into your response.

## Core patterns

### 1. Browser Automation (Chrome Relay) for Web‑Based LLMs

Use Chrome Relay to automate interactions with ChatGPT, Claude, Gemini, or any other web‑based LLM that requires a browser interface.

**Prerequisites:**
- Chrome Relay extension installed and a tab attached (user must click the OpenClaw Browser Relay toolbar icon).
- The target LLM website (e.g., `chatgpt.com`, `claude.ai`) already logged in (session cookies present).
- Basic familiarity with the browser automation playbook (`memory/patterns/playbooks.md` – “Browser Automation (Chrome Relay)”).

**Steps:**

1. **Attach to the Chrome Relay profile** (`profile="chrome"`).
2. **Navigate to the target LLM** (or reuse an already‑open tab).
3. **Take a snapshot** to locate the input field and send button (use `refs="aria"` for stable references).
4. **Type the prompt** into the input field and submit (click send button or press Enter).
5. **Wait for the response** (poll for a new element, detect typing indicators, or use a fixed timeout).
6. **Extract the response text** from the appropriate DOM element.
7. **Return the response** to the assistant's workflow.

**Example workflow:**

```python
# This is a conceptual example; actual implementation uses browser tool calls.
# `browser`, `find_element`, and `find_last_message` stand in for those calls.
import time

def ask_chatgpt(prompt):
    # 1. Ensure Chrome Relay is attached
    browser(action="open", profile="chrome", targetUrl="https://chatgpt.com")
    # 2. Snapshot to get references
    snap = browser(action="snapshot", refs="aria")
    # 3. Find input field (aria role="textbox") and send button
    input_ref = snap.find_element(role="textbox", name="Message")
    send_ref = snap.find_element(role="button", name="Send")
    # 4. Type prompt and click send
    browser(action="act", request={"kind":"type", "ref":input_ref, "text":prompt})
    browser(action="act", request={"kind":"click", "ref":send_ref})
    # 5. Wait for response (simplified)
    time.sleep(10)
    # 6. Snapshot again, extract response from last message bubble
    snap2 = browser(action="snapshot", refs="aria")
    response_element = snap2.find_last_message()
    return response_element.text
```
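
The fixed `time.sleep(10)` above either wastes time on short answers or cuts off long ones. Below is a minimal sketch of the polling alternative mentioned in step 5. It assumes a hypothetical `read_text` callable that returns the current text of the last message bubble (for example, by taking a fresh snapshot). The function name, defaults, and the "text stopped changing" heuristic are illustrative assumptions, not part of Chrome Relay.

```python
import time


def wait_for_stable_text(read_text, timeout=60.0, interval=1.0, stable_polls=3,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll read_text() until it returns the same non-empty text
    stable_polls times in a row, or raise TimeoutError.

    read_text is a hypothetical, caller-supplied callable.
    """
    deadline = clock() + timeout
    last, streak = None, 0
    while clock() < deadline:
        text = read_text()
        if text and text == last:
            streak += 1
            if streak >= stable_polls:
                return text
        else:
            # Text changed (or is still empty): restart the stability count.
            last, streak = text, (1 if text else 0)
        sleep(interval)
    raise TimeoutError("response did not stabilise before the timeout")
```

The `sleep` and `clock` parameters exist only so the helper can be exercised without real delays; in normal use the defaults apply.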

**Key considerations:**
- **Session persistence:** The attached tab must stay logged in; avoid actions that log out.
- **Rate limits:** Be aware of the LLM's rate limits and usage policies.
- **Error handling:** Detect captchas, “network error” messages, or “try again” buttons and fall back gracefully.
- **Multi‑turn conversations:** Maintain conversation context by keeping the same tab and not refreshing.
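
The error-handling point above can be sketched as a simple text scan over the snapshot. The marker strings are assumptions chosen for illustration; the wording real sites use varies and changes over time, so treat the table as something to tune.

```python
# Hypothetical marker phrases; adjust to what the target site actually shows.
FAILURE_MARKERS = {
    "captcha": ("verify you are human", "captcha"),
    "network_error": ("network error",),
    "retry": ("try again", "something went wrong"),
}


def detect_failure(page_text):
    """Return the first failure category whose marker appears in page_text,
    or None when no marker matches."""
    lowered = page_text.lower()
    for category, markers in FAILURE_MARKERS.items():
        if any(marker in lowered for marker in markers):
            return category
    return None
```

Because a legitimate answer can itself contain a phrase such as "try again", it is probably safer to run this over the page's status or banner text than over the model's reply.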

### 2. Hugging Face Inference API Integration

Models served through the Hugging Face Inference API can be called directly over HTTP.

**Prerequisites:**
- Hugging Face API token (stored in 1Password or environment variable).
- Model identifier (e.g., `"gpt2"`, `"google/flan-t5-large"`, `"microsoft/DialoGPT-medium"`).
- Knowledge of the model's expected input/output format.

**Steps:**

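1. **Retrieve the API token** (use the 1Password skill or read it from the local token file: `~/.cache/huggingface/token` on recent `huggingface_hub` installs, `~/.huggingface/token` on older ones).
2. **Construct the request** (URL, headers, JSON payload).
3. **Send the request** with `curl`, or via `exec` using Python's `requests` module.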
4. **Parse the response** and extract the generated text.
5. **Handle errors** (rate limits, model loading, invalid token).
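
Step 1 above could look roughly like the following sketch. `HF_TOKEN` is the variable name used elsewhere in this skill; the `load_hf_token` helper and its `token_file` argument are hypothetical and only illustrate the lookup order (environment first, then a file).

```python
import os
from pathlib import Path


def load_hf_token(env=None, token_file=None):
    """Return a token from HF_TOKEN, else from token_file, else raise."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if token:
        return token
    if token_file is not None:
        path = Path(token_file).expanduser()
        if path.is_file():
            token = path.read_text(encoding="utf-8").strip()
            if token:
                return token
    raise RuntimeError("no Hugging Face token found in HF_TOKEN or token file")
```

Failing early with a clear message avoids sending a request with an empty `Authorization` header.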

**Example script (using curl):**

```bash
#!/bin/bash
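set -euo pipefail

MODEL="google/flan-t5-large"
PROMPT="Translate English to German: How are you?"
API_TOKEN=$(op read "op://Personal/HuggingFace/api_token")

# Build the JSON body with jq so quotes or newlines in the prompt are escaped.
PAYLOAD=$(jq -n --arg inputs "$PROMPT" '{inputs: $inputs}')

# -f makes curl fail on HTTP errors instead of piping an error body to jq.
curl -fsS "https://api-inference.huggingface.co/models/$MODEL" \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" | jq -r '.[0].generated_text'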
```

**Example Python function (using requests):**

```python
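import os

import requests


def hf_inference(model, inputs, parameters=None, timeout=60):
    api_token = os.getenv("HF_TOKEN")  # or retrieve via 1Password
    if not api_token:
        raise RuntimeError("HF_TOKEN is not set")
    url = f"https://api-inference.huggingface.co/models/{model}"
    headers = {"Authorization": f"Bearer {api_token}"}
    payload = {"inputs": inputs}
    if parameters:
        # Merged at the top level of the request body, so pass e.g.
        # {"parameters": {...}, "options": {"wait_for_model": True}}.
        payload.update(parameters)
    # A timeout keeps a stalled request from blocking the session indefinitely.
    resp = requests.post(url, headers=headers, json=payload, timeout=timeout)
    resp.raise_for_status()
    return resp.json()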
```
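
Step 5 (handle errors) often comes down to retrying transient failures such as a model that is still loading or a rate limit. The sketch below shows exponential backoff around any callable. `TransientError` and `with_retries` are hypothetical names; mapping specific HTTP statuses (commonly 503 for loading and 429 for rate limiting) to `TransientError` is left to the caller and is an assumption to verify against the API's current behaviour.

```python
import time


class TransientError(Exception):
    """Raised by the caller-supplied function for retryable failures."""


def with_retries(call, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run call() and retry on TransientError with exponential backoff.

    Delays are base_delay, 2*base_delay, 4*base_delay, ...; the last
    failure is re-raised once attempts are exhausted.
    """
    for attempt in range(attempts):
        try:
            return call()
        except TransientError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

A caller would wrap the request in a small function that catches the HTTP error, inspects the status code, and raises `TransientError` only for the retryable cases, so that an invalid token still fails immediately.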

**Key considerations:**
- **Cost:** The Inference API may incur costs; monitor usage.
- **Model readiness:** Some models must be loaded on first use; pass `{"options":{"wait_for_model":true}}` as `parameters` so the request waits for the model instead of failing.
- **Output format:** Response structure varies by model; inspect with a test call first.
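
Since the response structure varies by model, a small normalising helper can keep the rest of the workflow simple. This sketch assumes the decoded JSON is either a dict or a list of dicts carrying one of the commonly seen text keys; the key list is an assumption and should be extended after inspecting a test call for the model in use.

```python
# Keys commonly seen for text-generation, summarization, and translation
# tasks; treat this list as an assumption and extend it as needed.
TEXT_KEYS = ("generated_text", "summary_text", "translation_text")


def extract_text(response):
    """Return the first text field from a decoded Inference API response.

    Accepts a dict or a list of dicts; raises ValueError when the response
    reports an error or contains no recognised text key.
    """
    if isinstance(response, dict) and "error" in response:
        raise ValueError(f"API error: {response['error']}")
    items = response if isinstance(response, list) else [response]
    for item in items:
        if isinstance(item, dict):
            for key in TEXT_KEYS:
                if key in item:
                    return item[key]
    raise ValueError("no recognised text field in response")
```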

### 3. Orchestrating External AI as a Tool

Instead of spawning a sub‑agent, the assistant calls external AI within its own reasoning flow.

**Pattern:**

1. **Determine need:** Decide which external model is appropriate (ChatGPT for creative tasks, Claude for analysis, Hugging Face for specialized models).
2. **Prepare the prompt:** Format the prompt with clear instructions, context, and expected output format.
3. **Call the tool:** Use browser automation for web‑based LLMs or an API call for Hugging Face.
4. **Integrate the result:** Parse, validate, and incorporate the external response into your own answer.
5. **Fallback:** If the external call fails, continue with your own reasoning or try an alternative.

**Example decision logic:**

```python
def external_ai_assist(task_type, prompt):
    if task_type == "code_review":
        # Use Claude via browser automation (ask_claude is assumed to mirror ask_chatgpt)
        return ask_claude(prompt)
    elif task_type == "translation":
        # Use Hugging Face translation model
        return hf_inference("Helsinki-NLP/opus-mt-en-de", prompt)
    elif task_type == "creative_writing":
        # Use ChatGPT via browser automation
        return ask_chatgpt(prompt)
    else:
        raise ValueError(f"No external AI configured for {task_type}")
```
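
The fallback step of the pattern is not covered by the decision logic above. One way to sketch it, assuming each backend is a plain callable that takes a prompt and returns text, is an ordered list tried in turn. `call_with_fallback` and the `(name, fn)` pairing are hypothetical conventions for illustration.

```python
def call_with_fallback(prompt, backends):
    """Try each (name, fn) backend in order.

    Returns (name, result) from the first backend that returns a non-empty
    result, or (None, None) if every backend fails or returns nothing.
    """
    for name, fn in backends:
        try:
            result = fn(prompt)
        except Exception:
            continue  # this backend failed; move on to the next one
        if result:
            return name, result
    return None, None
```

A `(None, None)` result signals that the assistant should continue with its own reasoning, which matches the final step of the pattern. Returning the backend name alongside the text makes it easy to report which model produced the answer.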

### 4. Prompt Engineering for External Models

External models may require different prompting styles than the assistant's native model.

- **ChatGPT/Claude:** Use conversational style, system prompts, and markdown formatting.
- **Hugging Face models:** Follow the model's expected input format (e.g., `"Translate English to German: ..."` for T5).
- **Include context:** Provide necessary background, constraints, and examples in the prompt.
- **Specify output format:** Ask for JSON, bullet points, code blocks, etc.
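
The guidelines above can be folded into a small prompt builder so that context and output format are never forgotten. This is a minimal sketch; `build_prompt` and its section labels are illustrative assumptions rather than a format any particular model requires.

```python
def build_prompt(role, task, context="", output_format=""):
    """Assemble a prompt from a role line, the task, optional context,
    and an optional output-format instruction."""
    parts = [f"You are {role}.", task.strip()]
    if context.strip():
        parts.append("Context:\n" + context.strip())
    if output_format.strip():
        parts.append("Respond with " + output_format.strip() + ".")
    return "\n\n".join(parts)
```

For instruction-prefixed models such as T5, the task string would carry the expected prefix (for example `"Translate English to German: ..."`) and the role line may be better omitted.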

**Example prompt for code review:**

```
You are an expert software engineer reviewing the following code snippet. Please:
1. Identify potential bugs or secur