Voice Agent
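Gives Wesley an autonomous voice layer: self-configuration through ElevenLabs, voice cloning, text-to-speech, and optional conversational outbound and inbound calling via Twilio.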
Installation / download
TotalClaw CLI (recommended):
totalclaw install totalclaw:georges91560~voice-agent-v1
cURL (direct download, no login required):
curl -fsSL https://skills.taituai.com/api/skills/totalclaw%3Ageorges91560~voice-agent-v1/file -o voice-agent-v1.md
Git (get the source from the repository):
git clone https://github.com/openclaw/skills/commit/88926549fd5ce76a1e826f4479f8bd823dff08e3
---
name: voice-agent
description: |
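  Complete voice layer for the agent using ElevenLabs. Clones the
  principal's voice, generates audio from any text (VSL, podcasts,
  video scripts, nurture sequences), and deploys conversational AI
  agents for automated inbound and outbound calls via Twilio.
  Self-configures by navigating elevenlabs.io autonomously via
  virtual-desktop, so no manual API key setup is required. Use when
  the agent needs to speak, call leads, answer prospects, or produce
  audio content at scale.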
version: 1.0.0
author: Wesley Armando (Georges Andronescu)
license: MIT
metadata:
openclaw:
emoji: "🎙️"
security_level: L2
always: false
required_paths:
read:
- /workspace/voice/config.json
- /workspace/voice/scripts/
- /workspace/voice/samples/
- /workspace/.learnings/LEARNINGS.md
write:
- /workspace/voice/config.json
- /workspace/voice/scripts/
- /workspace/voice/output/
- /workspace/voice/calls/
- /workspace/.learnings/LEARNINGS.md
- /workspace/.learnings/ERRORS.md
- /workspace/AUDIT.md
network_behavior:
makes_requests: true
request_targets:
- https://elevenlabs.io (dashboard navigation via virtual-desktop)
- https://api.elevenlabs.io (ElevenLabs REST API — requires ELEVENLABS_API_KEY)
- https://api.twilio.com (Twilio API — optional, requires TWILIO credentials)
uses_agent_telegram: true
telegram_note: >
Notifies principal when voice clone is ready, audio file generated,
or call completed. Reports call transcripts and outcomes.
always: false
requires:
skills:
- virtual-desktop
optional_skills:
- acquisition-master
- funnel-builder
bins:
- python3
- ffmpeg
---
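## Skill body
# Voice Agent: Wesley's autonomous voice layer
> "The most credible voice in any room is the one that sounds like you."
The agent doesn't just write content; it speaks it.
This skill gives Wesley a voice, his own voice, and deploys it at scale.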
---
## What this skill does
```
LAYER 1: VOICE SETUP (self-configuring)
  Navigates elevenlabs.io autonomously via virtual-desktop
  Logs in via Google OAuth or email/password
  Creates API key, clones voice, configures agent
  Writes all credentials to .env automatically

LAYER 2: TEXT TO SPEECH
  Converts any text to MP3 using Wesley's cloned voice
  VSL scripts, podcast intros, video narrations
  Email audio versions, social audio clips

LAYER 3: CONVERSATIONAL AGENT (with Twilio)
  Outbound calls to leads: automated follow-up
  Inbound calls: answers 24/7, qualifies, reports
  Natural turn-taking, handles objections, books calls
```
---
## Phase 1: Self-configuration
The agent runs this phase automatically on first use.
It navigates ElevenLabs through virtual-desktop and obtains its own credentials.
### Step 1: Login detection
```
The agent checks /workspace/voice/config.json for existing credentials.

IF ELEVENLABS_API_KEY already in config.json:
  → Skip to Phase 2 (already configured)
IF NOT configured:
  → Run the self-configuration sequence below
```
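The check above can be sketched in a few lines of Python. This is a minimal sketch, not the skill's actual implementation: the function name `is_configured` is hypothetical, and treating a missing or unparsable file as "not configured" is an assumption.

```python
import json
from pathlib import Path

# Path taken from the skill's required_paths; the function name is hypothetical.
CONFIG_PATH = Path("/workspace/voice/config.json")


def is_configured(config_path: Path = CONFIG_PATH) -> bool:
    """Return True when config.json holds a non-empty ELEVENLABS_API_KEY."""
    try:
        config = json.loads(config_path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Assumption: a missing, unreadable, or invalid file means "not configured".
        return False
    return isinstance(config, dict) and bool(config.get("ELEVENLABS_API_KEY"))
```

A caller would skip to Phase 2 when `is_configured()` returns `True` and run the self-configuration sequence otherwise.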
### Step 2: Connect to ElevenLabs
```
OPTION A: Google OAuth (preferred, zero credentials required)
  Condition: virtual-desktop has an active Google session
  Process:
    1. virtual-desktop opens https://elevenlabs.io/app/sign-in
    2. Clicks "Continue with Google"
    3. Google session is already active in the browser
    4. ElevenLabs dashboard loads automatically
    5. Proceed to API key creation

OPTION B: Email / Password
  Condition: ELEVENLABS_EMAIL and ELEVENLABS_PASSWORD in .env
  Process:
    1. virtual-desktop opens https://elevenlabs.io/app/sign-in
    2. Fills email field with ELEVENLABS_EMAIL
    3. Fills password field with ELEVENLABS_PASSWORD
    4. Clicks "Sign in"
    5. ElevenLabs dashboard loads
    6. Proceed to API key creation

FALLBACK: Manual
  If neither option works:
    Log to AUDIT.md: "ElevenLabs login failed - manual setup required"
    Notify principal via Telegram with exact steps to follow
```
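The order of preference among the three login options can be expressed as a small selection function. This sketch covers only the preconditions listed above, not a login that fails at runtime. The name `choose_login_method`, its return values, and the `has_google_session` flag are hypothetical; how virtual-desktop reports an active Google session is not specified in this document.

```python
from typing import Mapping


def choose_login_method(has_google_session: bool, env: Mapping[str, str]) -> str:
    """Pick a login path from the stated preconditions (hypothetical helper)."""
    if has_google_session:
        # Option A: preferred, needs no stored credentials.
        return "google_oauth"
    if env.get("ELEVENLABS_EMAIL") and env.get("ELEVENLABS_PASSWORD"):
        # Option B: both variables must be present and non-empty.
        return "email_password"
    # Fallback: log to AUDIT.md and notify the principal.
    return "manual"
```

Passing `os.environ` as `env` would work once the `.env` file has been loaded into the process environment.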
### Step 3: Create the API key
```
Navigation path (2026 ElevenLabs UI):
  Dashboard → bottom-left corner → "Developers"
  → Tab "API Keys"
  → Button "Create API Key"
  → Name: "wesley-agent"
  → Click "Create"
  → Copy the generated key (shown only once)
  → Write to /workspace/voice/config.json:
      { "ELEVENLABS_API_KEY": "sk_..." }
  → Also write to .env:
      ELEVENLABS_API_KEY=sk_...
```
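The last two arrows (persisting the key to `config.json` and `.env`) might look like the following. This is a sketch under assumptions: `save_api_key` is a hypothetical helper, the location of `.env` is not given in this document and is therefore passed in, and merging with existing entries rather than overwriting the files is a design choice of the sketch.

```python
import json
from pathlib import Path


def save_api_key(api_key: str, config_path: Path, env_path: Path) -> None:
    """Store the key in config.json and .env, keeping other entries (hypothetical helper)."""
    # Merge into any existing config.json so other keys survive.
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    config["ELEVENLABS_API_KEY"] = api_key
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")

    # Drop any previous ELEVENLABS_API_KEY line from .env, then append the new one.
    lines = env_path.read_text(encoding="utf-8").splitlines() if env_path.exists() else []
    lines = [line for line in lines if not line.startswith("ELEVENLABS_API_KEY=")]
    lines.append(f"ELEVENLABS_API_KEY={api_key}")
    env_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Because the key is shown only once in the dashboard, writing it to both files in the same call reduces the chance of losing it between steps.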
### Step 4: Clone the voice
```
Requires: 3 MP3 files in /workspace/voice/samples/
  Minimum: 30 seconds each, clear audio, no background noise
  Optimal: 3-5 minutes total, varied sentences

Navigation path:
  Dashboard → "Voices" → "Add Voice"
  → "Voice Clone" → "Instant Voice Clone"
  → Upload files from /workspace/voice/samples/
  → Name: "Wesley"
  → Click "Create Voice Clone"
  → Wait for processing (usually < 30 seconds)
  → Copy the Voice ID from the voice card
  → Write to config.json: { "ELEVENLABS_VOICE_ID": "abc123..." }

IF no MP3 files in /workspace/voice/samples/:
  → Log to AUDIT.md: "Voice samples missing"
  → Notify principal via Telegram:
    "To clone your voice, record 3 audio clips of 30-60 seconds each
    (read any text naturally), save as MP3, and upload to
    /workspace/voice/samples/
    Then run voice-agent again."
  → Pause and wait for samples
```
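The "no MP3 files" branch can be detected before any browser navigation starts. The sketch below assumes hypothetical helper names (`find_samples`, `next_action`) and only mirrors the branch condition stated above; it does not check duration or audio quality.

```python
from pathlib import Path


def find_samples(samples_dir: Path) -> list[Path]:
    """Return the MP3 files in the samples directory, sorted by name."""
    if not samples_dir.is_dir():
        return []
    return sorted(
        p for p in samples_dir.iterdir()
        if p.is_file() and p.suffix.lower() == ".mp3"
    )


def next_action(samples_dir: Path) -> str:
    """'clone' when at least one MP3 exists, else 'wait_for_samples' (hypothetical labels)."""
    return "clone" if find_samples(samples_dir) else "wait_for_samples"
```

With `next_action(Path("/workspace/voice/samples"))`, the agent would either proceed to the upload or log "Voice samples missing" and notify the principal.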
### Step 5: Create the conversational agent (optional, for calls)
```
Only runs if TWILIO_ACCOUNT_SID is in .env

Navigation path:
  Dashboard → "Agents" → "Create Agent"
  → Name: "Wesley Sales Agent"
  → Voice: select "Wesley" (the cloned voice)
  → System prompt: read from /workspace/voice/templates/agent_prompt.md
  → Save → Copy Agent ID
  → Write to config.json: { "ELEVENLABS_AGENT_ID": "agent_..." }

Then connect Twilio:
  Dashboard → "Agents" → select "Wesley Sales Agent"
  → "Phone Numbers" tab → "Add Phone Number"
  → Enter TWILIO_ACCOUNT_SID + TWILIO_AUTH_TOKEN
  → Select TWILIO_PHONE_NUMBER
  → ElevenLabs configures Twilio automatically
  → Write to config.json: { "TWILIO_CONFIGURED": true }
```
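Since this step is gated on `TWILIO_ACCOUNT_SID` and later needs two more Twilio variables, a pre-check can report what is missing before the agent opens the dashboard. This is a sketch only: `twilio_step` is a hypothetical helper, and checking all three variables up front is an assumption rather than documented behavior.

```python
from typing import Mapping

# The three Twilio variables named in the navigation steps above.
TWILIO_VARS = ("TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN", "TWILIO_PHONE_NUMBER")


def twilio_step(env: Mapping[str, str]) -> tuple[bool, list[str]]:
    """Return (should_run, missing_vars) for the optional calling step."""
    if not env.get("TWILIO_ACCOUNT_SID"):
        # The step is skipped entirely when the account SID is absent.
        return False, []
    missing = [name for name in TWILIO_VARS if not env.get(name)]
    return True, missing
```

A result of `(True, [])` means the step can run end to end; a non-empty list names the variables to request from the principal.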
### Configuration complete
```
When all steps are done, config.json contains:
  {
    "ELEVENLABS_API_KEY": "sk_...",
    "ELEVENLABS_VOICE_ID": "...",
    "ELEVENLABS_AGENT_ID": "...",   ← if Twilio configured
    "TWILIO_CONFIGURED": true,      ← if Twilio configured
    "setup_date": "YYYY-MM-DD",
    "voice_name": "Wesley"
  }

Telegram notification:
  "🎙️ Voice Agent configured and ready.
  Voice: Wesley (cloned)
  TTS: active
  Calls: [active / not configured]"
```
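The notification text can be derived directly from the finished `config.json`. The following is a minimal sketch with a hypothetical function name; sending the message through Telegram is outside its scope, and defaulting the voice name to "Wesley" is an assumption.

```python
def build_ready_message(config: dict) -> str:
    """Format the 'configured and ready' notification from config.json values."""
    # Calls are reported as active only when Twilio was configured in Step 5.
    calls = "active" if config.get("TWILIO_CONFIGURED") else "not configured"
    voice = config.get("voice_name", "Wesley")
    return "\n".join([
        "🎙️ Voice Agent configured and ready.",
        f"Voice: {voice} (cloned)",
        "TTS: active",
        f"Calls: {calls}",
    ])
```

Building the text from the stored config keeps the "Calls" line consistent with what was actually set up.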
---
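## Voice cloning: complete reference
This section gives the agent every command and navigation step needed to clone the principal's voice.
Two paths are available; choose the one that fits the context.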
---
### What you need before starting
```
AUDIO SAMPLES (required for voice cloning)
  Minimum: 1 file × 30 seconds
  Recommended: 3 files × 1-2 minutes each
  Optimal (Professional Clone): 30+ minutes total

  Quality requirements:
    → Clear voice, no background noise or music
    → Natural speech rhythm (not reading robotically)
    → Consistent microphone distance
    → Format: MP3, WAV, M4A, FLAC all accepted
    → No multiple speakers in the same file

  Where to put them:
    /workspace/voice/samples/sample_01.mp3
    /workspace/voice/samples/sample_02.mp3
    /workspace/voice/samples/sample_03.mp3

MINIMUM PLAN REQUIRED
  Instant Voice Clone (IVC): Starter plan ($5/month) or above
  Professional Voice Clone (PVC): Creator plan ($22/month) or above
```
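The format and duration thresholds listed above lend themselves to a quick pre-flight check. This sketch uses hypothetical names and labels and takes durations in seconds as input; measuring them (for example with ffprobe, since ffmpeg is a required binary) is left out. Reading "Minimum: 1 file × 30 seconds" as "at least one file of 30 seconds or more" is an interpretation.

```python
from pathlib import PurePosixPath

# Formats listed as accepted in the requirements above.
ACCEPTED_SUFFIXES = {".mp3", ".wav", ".m4a", ".flac"}


def unsupported_files(paths: list[str]) -> list[str]:
    """Return the paths whose extension is not in the accepted list."""
    return [p for p in paths if PurePosixPath(p).suffix.lower() not in ACCEPTED_SUFFIXES]


def classify_samples(durations_s: list[float]) -> str:
    """Map sample durations (seconds) to a readiness label (hypothetical labels)."""
    if not durations_s or max(durations_s) < 30:
        return "too_short"   # no file reaches the 30-second minimum
    if sum(durations_s) >= 30 * 60:
        return "pvc_ready"   # 30+ minutes total, the Professional Clone range
    return "ivc_ready"       # enough for an Instant Voice Clone
```

For three two-minute clips, `classify_samples([120, 120, 120])` falls in the Instant Voice Clone range.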
---
### Path A: Terminal / API (fastest, no browser needed)
Use this path when `config.json` already contains `ELEVENLABS_API_KEY`.
The agent calls the API directly, without virtual-desktop.
#### Step 1: Install the SDK
```bash
pip install elevenlabs --break-system-packages
pip install requests --break-system-packages
```
#### Step 2: Verify that the API key works
```bash
curl -s https://api.elevenlabs.io/v1/user -H "xi-api-key: $ELEVENLABS_API_KEY" | python3 -m json.tool
# Expected: JSON with subscription info
# If 401 error: API key is wrong or expired
```
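The same check can be done from Python using only the standard library, which helps when the agent needs to branch on the result. This is a sketch, not part of the skill: the helper names are hypothetical, and raising `ValueError` on a 401 is a choice made here for illustration.

```python
import json
import urllib.error
import urllib.request

USER_URL = "https://api.elevenlabs.io/v1/user"


def build_user_request(api_key: str) -> urllib.request.Request:
    """Build the GET request that the curl command above sends."""
    return urllib.request.Request(USER_URL, headers={"xi-api-key": api_key})


def check_api_key(api_key: str, timeout: float = 10.0) -> dict:
    """Return the parsed user JSON, or raise ValueError when the key is rejected."""
    try:
        with urllib.request.urlopen(build_user_request(api_key), timeout=timeout) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 401:
            raise ValueError("API key is wrong or expired") from err
        raise
```

Calling `check_api_key(os.environ["ELEVENLABS_API_KEY"])` performs a real network request, so it belongs in the setup flow rather than in import-time code.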
#### Step 3: Clone the voice with the Python SDK
```python
from elevenlabs.client import ElevenLabs
import json, os
client = ElevenLabs(api_key=os.environ["ELEVENLABS_API_KEY"])
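# Create an Instant Voice Clone.
# Pass opened binary files, not path strings, so the audio itself is uploaded.
sample_paths = [
    "/workspace/voice/samples/sample_01.mp3",
    "/workspace/voice/samples/sample_02.mp3",
    "/workspace/voice/samples/sample_03.mp3",
]
voice = client.voices.ivc.create(
    name="Wesley",
    description="Wesley Armando - principal voice for autonomous agent",
    files=[open(path, "rb") for path in sample_paths],
)
print(f"Voice ID: {voice.voice_id}")
# The create response carries the new voice_id; the name is the one passed in above.
print("Name: Wesley")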
# Save to config.json
config_path = "/workspace/voice/c