Voice Agent

TotalClaw · by georges91560 · v1.0.0

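Gives Wesley an autonomous voice layer: automatic ElevenLabs setup, voice cloning, text-to-speech, and optional conversational outbound and inbound calling through Twilio.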

Installation / Download

TotalClaw CLI (recommended)
totalclaw install totalclaw:georges91560~voice-agent-v1
cURL (direct download, no login required)
curl -fsSL https://skills.taituai.com/api/skills/totalclaw%3Ageorges91560~voice-agent-v1/file -o voice-agent-v1.md
Git (fetch the source from the repository)
git clone https://github.com/openclaw/skills && git -C skills checkout 88926549fd5ce76a1e826f4479f8bd823dff08e3
---
name: voice-agent
description: |
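  Complete voice layer for the agent using ElevenLabs. Clones the
  principal's voice, generates audio from any text (VSLs, podcasts,
  video scripts, nurture sequences), and deploys conversational AI
  agents for automated inbound and outbound calls via Twilio.
  Self-configures by navigating elevenlabs.io autonomously through
  virtual-desktop, with no manual API key setup required. Use when the
  agent needs to speak, call leads, answer prospects, or produce audio
  content at scale.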
version: 1.0.0
author: Wesley Armando (Georges Andronescu)
license: MIT
metadata:
  openclaw:
    emoji: "🎙️"
    security_level: L2
    always: false
    required_paths:
      read:
        - /workspace/voice/config.json
        - /workspace/voice/scripts/
        - /workspace/voice/samples/
        - /workspace/.learnings/LEARNINGS.md
      write:
        - /workspace/voice/config.json
        - /workspace/voice/scripts/
        - /workspace/voice/output/
        - /workspace/voice/calls/
        - /workspace/.learnings/LEARNINGS.md
        - /workspace/.learnings/ERRORS.md
        - /workspace/AUDIT.md
    network_behavior:
      makes_requests: true
      request_targets:
        - https://elevenlabs.io (dashboard navigation via virtual-desktop)
        - https://api.elevenlabs.io (ElevenLabs REST API — requires ELEVENLABS_API_KEY)
        - https://api.twilio.com (Twilio API — optional, requires TWILIO credentials)
      uses_agent_telegram: true
      telegram_note: >
        Notifies principal when voice clone is ready, audio file generated,
        or call completed. Reports call transcripts and outcomes.
    requires:
      skills:
        - virtual-desktop
      optional_skills:
        - acquisition-master
        - funnel-builder
      bins:
        - python3
        - ffmpeg
---

## Skill Body

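# Voice Agent: Wesley's Autonomous Voice Layer

> "The most credible voice in any room is the one that sounds like you."

An agent should not only write content but also speak it.
This skill gives Wesley a voice, his own voice, and supports deploying it at scale.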

---

## What This Skill Does

```
LAYER 1 — VOICE SETUP (self-configuring)
  Navigates elevenlabs.io autonomously via virtual-desktop
  Logs in via Google OAuth or email/password
  Creates API key, clones voice, configures agent
  Writes all credentials to .env automatically

LAYER 2 — TEXT TO SPEECH
  Converts any text to MP3 using Wesley's cloned voice
  VSL scripts, podcast intros, video narrations
  Email audio versions, social audio clips

LAYER 3 — CONVERSATIONAL AGENT (with Twilio)
  Outbound calls to leads — automated follow-up
  Inbound calls — answers 24/7, qualifies, reports
  Natural turn-taking, handles objections, books calls
```

---

## Phase 1: Self-Configuration

The agent runs this phase automatically on first use.
It navigates ElevenLabs through virtual-desktop and obtains its own credentials.

### Step 1: Login Detection

```
The agent checks /workspace/voice/config.json for existing credentials.

IF ELEVENLABS_API_KEY already in config.json:
  → Skip to Phase 2 (already configured)

IF NOT configured:
  → Run the self-configuration sequence below
```
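
The check above can be sketched in Python. This is a minimal sketch, assuming `config.json` is a flat JSON object; the function name `is_configured` is hypothetical and not part of the skill.

```python
import json
from pathlib import Path

CONFIG_PATH = Path("/workspace/voice/config.json")


def is_configured(config_path: Path = CONFIG_PATH) -> bool:
    """Return True when config.json holds a non-empty ELEVENLABS_API_KEY."""
    try:
        config = json.loads(config_path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        # A missing or unreadable config is treated as "not configured".
        return False
    return isinstance(config, dict) and bool(config.get("ELEVENLABS_API_KEY"))
```

When `is_configured()` returns `False`, the agent would run the self-configuration sequence; otherwise it can skip to Phase 2.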

### Step 2: Connect to ElevenLabs

```
OPTION A — Google OAuth (preferred — zero credentials required)
  Condition: virtual-desktop has an active Google session

  Process:
  1. virtual-desktop opens https://elevenlabs.io/app/sign-in
  2. Clicks "Continue with Google"
  3. Google session is already active in the browser
  4. ElevenLabs dashboard loads automatically
  5. Proceed to API key creation

OPTION B — Email / Password
  Condition: ELEVENLABS_EMAIL and ELEVENLABS_PASSWORD in .env

  Process:
  1. virtual-desktop opens https://elevenlabs.io/app/sign-in
  2. Fills email field with ELEVENLABS_EMAIL
  3. Fills password field with ELEVENLABS_PASSWORD
  4. Clicks "Sign in"
  5. ElevenLabs dashboard loads
  6. Proceed to API key creation

FALLBACK — Manual
  If neither option works:
  Log to AUDIT.md: "ElevenLabs login failed — manual setup required"
  Notify principal via Telegram with exact steps to follow
```
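
The order of preference among the three options can be expressed as a small decision function. This is only a sketch: `choose_login_method` and the `has_google_session` flag are hypothetical, and how virtual-desktop actually detects an active Google session is not specified here.

```python
from typing import Mapping


def choose_login_method(env: Mapping[str, str], has_google_session: bool) -> str:
    """Pick a login path in the documented order: OAuth, email/password, manual."""
    if has_google_session:
        # Option A: no stored credentials needed.
        return "google_oauth"
    if env.get("ELEVENLABS_EMAIL") and env.get("ELEVENLABS_PASSWORD"):
        # Option B: both values must be present in the environment.
        return "email_password"
    # Fallback: log to AUDIT.md and notify the principal.
    return "manual"
```

Passing `os.environ` as `env` would mirror reading the values from `.env` once they are loaded into the process environment.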

### Step 3: Create the API Key

```
Navigation path (2026 ElevenLabs UI):
  Dashboard → bottom-left corner → "Developers"
  → Tab "API Keys"
  → Button "Create API Key"
  → Name: "wesley-agent"
  → Click "Create"
  → Copy the generated key (shown only once)
  → Write to /workspace/voice/config.json:
    { "ELEVENLABS_API_KEY": "sk_..." }
  → Also write to .env:
    ELEVENLABS_API_KEY=sk_...
```
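
The last two writes in the path above could look like the following sketch. It assumes the key has already been copied from the dashboard; `save_api_key` is a hypothetical helper, and the location of `.env` is not given in this document, so the caller supplies it.

```python
import json
from pathlib import Path


def save_api_key(api_key: str, config_path: Path, env_path: Path) -> None:
    """Store the key in config.json and in .env without dropping other entries."""
    # Merge into config.json so existing fields are preserved.
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    config["ELEVENLABS_API_KEY"] = api_key
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")

    # Replace any previous ELEVENLABS_API_KEY line in .env, then append the new one.
    lines = env_path.read_text(encoding="utf-8").splitlines() if env_path.exists() else []
    lines = [line for line in lines if not line.startswith("ELEVENLABS_API_KEY=")]
    lines.append(f"ELEVENLABS_API_KEY={api_key}")
    env_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Because the key is shown only once in the dashboard, writing both files in the same call reduces the chance of ending up with the key in one place but not the other.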

### Step 4: Clone the Voice

```
Requires: 3 MP3 files in /workspace/voice/samples/
  Minimum: 30 seconds each, clear audio, no background noise
  Optimal: 3-5 minutes total, varied sentences

Navigation path:
  Dashboard → "Voices" → "Add Voice"
  → "Voice Clone" → "Instant Voice Clone"
  → Upload files from /workspace/voice/samples/
  → Name: "Wesley"
  → Click "Create Voice Clone"
  → Wait for processing (usually < 30 seconds)
  → Copy the Voice ID from the voice card
  → Write to config.json: { "ELEVENLABS_VOICE_ID": "abc123..." }

IF no MP3 files in /workspace/voice/samples/:
  → Log to AUDIT.md: "Voice samples missing"
  → Notify principal via Telegram:
    "To clone your voice, record 3 audio clips of 30-60 seconds each
     (read any text naturally), save as MP3, and upload to
     /workspace/voice/samples/
     Then run voice-agent again."
  → Pause and wait for samples
```
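
Before opening the dashboard, the agent needs to know whether the samples exist. The sketch below covers only the file-count check; it does not verify duration or audio quality. The helper names and the `REQUIRED_SAMPLES` constant are hypothetical, with the value 3 taken from the requirement stated above.

```python
from pathlib import Path

SAMPLES_DIR = Path("/workspace/voice/samples")
REQUIRED_SAMPLES = 3  # from "Requires: 3 MP3 files"


def find_mp3_samples(samples_dir: Path = SAMPLES_DIR) -> list[Path]:
    """Return the MP3 files in the samples directory, sorted by path."""
    if not samples_dir.is_dir():
        return []
    return sorted(
        p for p in samples_dir.iterdir()
        if p.is_file() and p.suffix.lower() == ".mp3"
    )


def samples_ready(samples_dir: Path = SAMPLES_DIR) -> bool:
    """True when enough MP3 samples are present to start cloning."""
    return len(find_mp3_samples(samples_dir)) >= REQUIRED_SAMPLES
```

If `samples_ready()` is `False`, the agent would log "Voice samples missing" to AUDIT.md, notify the principal, and pause.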

### Step 5: Create the Conversational Agent (optional, for calls)

```
Only runs if TWILIO_ACCOUNT_SID is in .env

Navigation path:
  Dashboard → "Agents" → "Create Agent"
  → Name: "Wesley Sales Agent"
  → Voice: select "Wesley" (the cloned voice)
  → System prompt: read from /workspace/voice/templates/agent_prompt.md
  → Save → Copy Agent ID
  → Write to config.json: { "ELEVENLABS_AGENT_ID": "agent_..." }

Then connect Twilio:
  Dashboard → "Agents" → select "Wesley Sales Agent"
  → "Phone Numbers" tab → "Add Phone Number"
  → Enter TWILIO_ACCOUNT_SID + TWILIO_AUTH_TOKEN
  → Select TWILIO_PHONE_NUMBER
  → ElevenLabs configures Twilio automatically
  → Write to config.json: { "TWILIO_CONFIGURED": true }
```
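
The gate on `TWILIO_ACCOUNT_SID` can be checked up front, together with the other two Twilio values the dashboard asks for later. This is a sketch under assumptions: `twilio_step_plan` and its return labels are hypothetical, and treating a missing token or phone number as a blocking condition is an interpretation, not something the skill states.

```python
from typing import Mapping

TWILIO_VARS = ("TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN", "TWILIO_PHONE_NUMBER")


def twilio_step_plan(env: Mapping[str, str]) -> tuple[str, list[str]]:
    """Decide whether Step 5 should be skipped, run, or is blocked."""
    if not env.get("TWILIO_ACCOUNT_SID"):
        # The step only runs when the account SID is configured.
        return "skip", []
    missing = [name for name in TWILIO_VARS if not env.get(name)]
    return ("blocked" if missing else "run"), missing
```

Reporting the missing names lets the Telegram notification tell the principal exactly which values to add.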

### Configuration Complete

```
When all steps are done, config.json contains:
{
  "ELEVENLABS_API_KEY": "sk_...",
  "ELEVENLABS_VOICE_ID": "...",
  "ELEVENLABS_AGENT_ID": "...",     ← if Twilio configured
  "TWILIO_CONFIGURED": true,         ← if Twilio configured
  "setup_date": "YYYY-MM-DD",
  "voice_name": "Wesley"
}

Telegram notification:
"🎙️ Voice Agent configured and ready.
 Voice: Wesley (cloned)
 TTS: active
 Calls: [active / not configured]"
```
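
The notification text can be derived from the finished config. The sketch below assumes the config has been loaded into a dict with the keys shown above; `format_ready_message` is a hypothetical name, and sending the message through Telegram is left out.

```python
def format_ready_message(config: dict) -> str:
    """Build the setup-complete notification from config.json contents."""
    # Calls are reported as active only when Twilio was configured.
    calls = "active" if config.get("TWILIO_CONFIGURED") else "not configured"
    voice = config.get("voice_name", "Wesley")
    return (
        "🎙️ Voice Agent configured and ready.\n"
        f" Voice: {voice} (cloned)\n"
        " TTS: active\n"
        f" Calls: {calls}"
    )
```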

---

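## Voice Cloning: Complete Reference

This section gives the agent every command and navigation step needed to clone the principal's voice.
Two paths are provided; choose the one that fits the context.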

---

### Before You Start

```
AUDIO SAMPLES — required for voice cloning
  Minimum : 1 file × 30 seconds
  Recommended : 3 files × 1-2 minutes each
  Optimal (Professional Clone) : 30+ minutes total

  Quality requirements :
  → Clear voice, no background noise or music
  → Natural speech rhythm (not reading robotically)
  → Consistent microphone distance
  → Format : MP3, WAV, M4A, FLAC all accepted
  → No multiple speakers in the same file

  Where to put them :
  /workspace/voice/samples/sample_01.mp3
  /workspace/voice/samples/sample_02.mp3
  /workspace/voice/samples/sample_03.mp3

MINIMUM PLAN REQUIRED
  Instant Voice Clone (IVC) : Starter plan ($5/month) or above
  Professional Voice Clone (PVC) : Creator plan ($22/month) or above
```
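
Two of the requirements above, accepted format and minimum length, can be checked mechanically. The sketch below is illustrative: the helper names are hypothetical, applying the 30-second minimum to every file is an interpretation, and `probe_duration_seconds` assumes `ffprobe` (shipped with most ffmpeg installs) is on the PATH.

```python
import subprocess
from pathlib import Path

ACCEPTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac"}
MIN_SECONDS = 30.0


def probe_duration_seconds(path: Path) -> float:
    """Read the container duration with ffprobe (assumed to be installed)."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", str(path)],
        capture_output=True, text=True, check=True,
    )
    return float(result.stdout.strip())


def sample_problems(path: Path, duration_seconds: float) -> list[str]:
    """List the reasons a sample fails the format and length requirements."""
    problems = []
    if path.suffix.lower() not in ACCEPTED_EXTENSIONS:
        problems.append(f"unsupported format: {path.suffix or '(none)'}")
    if duration_seconds < MIN_SECONDS:
        problems.append(f"too short: {duration_seconds:.0f}s")
    return problems
```

Noise level, speaking rhythm, and speaker count cannot be judged this way and still need a listen by the principal.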

---

### Path A: Terminal / API (fastest, no browser required)

Use this path when `ELEVENLABS_API_KEY` is already present in `config.json`.
The agent calls the API directly, without virtual-desktop.

#### Step 1: Install the SDK

```bash
pip install elevenlabs --break-system-packages
pip install requests --break-system-packages
```

#### Step 2: Verify That the API Key Works

```bash
curl -s https://api.elevenlabs.io/v1/user \
  -H "xi-api-key: $ELEVENLABS_API_KEY" | python3 -m json.tool
# Expected: JSON with subscription info
# If 401 error: API key is wrong or expired
```
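
The same check can be made from Python using only the standard library, which is convenient when the agent is already running a script. This is a sketch: `check_api_key` is a hypothetical helper, and the `opener` parameter exists only so the function can be exercised without network access.

```python
import json
import urllib.error
import urllib.request

USER_URL = "https://api.elevenlabs.io/v1/user"


def check_api_key(api_key: str, opener=urllib.request.urlopen) -> dict:
    """Return the parsed /v1/user response, or raise ValueError on HTTP 401."""
    request = urllib.request.Request(USER_URL, headers={"xi-api-key": api_key})
    try:
        with opener(request, timeout=15) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 401:
            # Same meaning as the 401 case in the curl check.
            raise ValueError("API key is wrong or expired") from err
        raise
```

A call such as `check_api_key(os.environ["ELEVENLABS_API_KEY"])` would return the account JSON on success; other HTTP errors are re-raised unchanged so they are not mistaken for a bad key.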

#### Step 3: Clone the Voice with the Python SDK

```python
from elevenlabs.client import ElevenLabs
import json, os

client = ElevenLabs(api_key=os.environ["ELEVENLABS_API_KEY"])

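# Create Instant Voice Clone.
# voices.ivc.create takes file objects (or bytes), so each sample is opened
# in binary mode; a bare path string would be sent as the file content.
sample_paths = [
    "/workspace/voice/samples/sample_01.mp3",
    "/workspace/voice/samples/sample_02.mp3",
    "/workspace/voice/samples/sample_03.mp3",
]
sample_files = [open(path, "rb") for path in sample_paths]
try:
    voice = client.voices.ivc.create(
        name="Wesley",
        description="Wesley Armando, principal voice for autonomous agent",
        files=sample_files,
    )
finally:
    # Close the handles whether or not the upload succeeded.
    for handle in sample_files:
        handle.close()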

print(f"Voice ID: {voice.voice_id}")
print(f"Name: {voice.name}")

# Save to config.json
config_path = "/workspace/voice/c