webchat-voice-full-stack

TotalClaw author: totalclaw

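One-step full-stack installer for OpenClaw WebChat voice input with local speech-to-text. Orchestrates three focused skills in order: the local STT backend (faster-whisper-local-service), the HTTPS/WSS reverse proxy (webchat-https-proxy), and the voice UI microphone controls (webchat-voice-gui). Includes push-to-talk, continuous-recording shortcuts, a VU meter, and a localized UI (EN/DE/ZH). Designed for transparent, user-level deployment with explicit, reversible changes only (systemd user services, Control UI asset injection, gateway allowed-origins update). SHA256 integrity verification of all sub-skill scripts runs before execution; deployment aborts on any checksum mismatch. No external telemetry after the initial model download, and no recurring API costs. Keywords: voice input, microphone, WebChat, speech-to-text, STT, local transcription, whisper, full stack, one-click, voice button, push-to-talk, PTT, keyboard shortcuts, i18n, HTTPS, WSS, integrity verification, checksum.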

Install / Download

TotalClaw CLI (recommended)

totalclaw install totalclaw:totalclaw~neldar-webchat-voice-full-stack

cURL direct download, no login required

curl -fsSL https://skills.taituai.com/api/skills/totalclaw%3Atotalclaw~neldar-webchat-voice-full-stack/file -o neldar-webchat-voice-full-stack.md

# WebChat Voice Full Stack

Meta-installer that orchestrates three standalone skills in the correct order:

1. **`faster-whisper-local-service`** — local STT backend (HTTP on 127.0.0.1:18790)
2. **`webchat-https-proxy`** — HTTPS/WSS reverse proxy for Control UI + WebSocket + transcription
3. **`webchat-voice-gui`** — mic button, VU meter, keyboard shortcuts, i18n for WebChat

## Prerequisites

All three skills must be installed before running this meta-installer:

```bash
npx clawhub install faster-whisper-local-service
npx clawhub install webchat-https-proxy
npx clawhub install webchat-voice-gui
```

Additionally required on the system:
- Python 3.10+
- `gst-launch-1.0` (GStreamer, from OS packages)
- Internet access on first run (model download ~1.5 GB for `medium`)
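
The system requirements above can be checked before deploying. The following is a minimal preflight sketch, not part of the skill's own scripts; the function names `python_version_ok` and `preflight` are hypothetical and exist only for illustration.

```bash
#!/usr/bin/env bash
# Preflight sketch (illustrative only, not shipped with this skill).

# Succeeds if the given "major.minor[.patch]" version string is at least 3.10.
python_version_ok() {
  local major minor rest
  major="${1%%.*}"
  rest="${1#*.}"
  minor="${rest%%.*}"
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 10 ]; }
}

# Prints one line per missing prerequisite; returns non-zero if any is missing.
preflight() {
  local missing=0 version
  if command -v python3 >/dev/null 2>&1; then
    version="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
    python_version_ok "$version" || { echo "python3 $version is older than 3.10"; missing=1; }
  else
    echo "python3 not found"
    missing=1
  fi
  command -v gst-launch-1.0 >/dev/null 2>&1 || { echo "gst-launch-1.0 not found"; missing=1; }
  return "$missing"
}
```

Calling `preflight && bash scripts/deploy.sh` would then run the deployment only when both tools are present. The sketch does not check network access or free disk space for the model download.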

## Deploy

```bash
bash scripts/deploy.sh
```

Optional overrides (passed through to downstream scripts):

```bash
VOICE_HOST=10.0.0.42 VOICE_HTTPS_PORT=8443 TRANSCRIBE_PORT=18790 WHISPER_LANGUAGE=auto bash scripts/deploy.sh
```
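
Variables written in front of a command, as above, are placed in that command's environment for that one invocation only. Scripts commonly read such overrides with a fallback when the variable is unset. The sketch below illustrates that pattern under the assumption that the downstream scripts behave this way; `setting` and `print_overrides` are hypothetical helpers, and the actual default values live in the sub-skill scripts and are not reproduced here.

```bash
#!/usr/bin/env bash
# Illustrative helpers, not part of the skill's scripts.

# Prints the value of the named environment variable,
# or the fallback if the variable is unset or empty.
setting() {
  local name="$1" fallback="$2"
  printf '%s\n' "${!name:-$fallback}"
}

# Lists the four override variables from the example above,
# marking those that would fall back to whatever the scripts default to.
print_overrides() {
  local name
  for name in VOICE_HOST VOICE_HTTPS_PORT TRANSCRIBE_PORT WHISPER_LANGUAGE; do
    echo "$name=$(setting "$name" '<script default>')"
  done
}
```

Running `VOICE_HOST=10.0.0.42 print_overrides` would show the host override and `<script default>` for every variable that was not set.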

## What this does (via downstream scripts)

This skill does **not** contain deployment logic itself. It calls `deploy.sh` from each sub-skill:

### Step 1: faster-whisper-local-service
- Creates Python venv, installs `faster-whisper==1.1.1`
- Writes `transcribe-server.py` with input validation (magic-byte check, size limit)
- Creates systemd user service `openclaw-transcribe.service`
- Downloads model weights on first run (~1.5 GB for medium)
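
To make the "magic-byte check, size limit" idea concrete, here is a small shell sketch of the technique. The real validation lives in `transcribe-server.py` and may differ; the accepted signatures (WAV, Ogg, WebM) and the function names `magic_hex` and `looks_like_audio` are assumptions chosen for illustration.

```bash
#!/usr/bin/env bash
# Illustrative sketch of magic-byte and size validation (not the server's code).

# Prints the first four bytes of a file as lowercase hex, e.g. "52494646" for "RIFF".
magic_hex() {
  od -An -tx1 -N4 "$1" | tr -d ' \n'
}

# Succeeds if the file is non-empty, no larger than max_bytes,
# and starts with one of the signatures listed below.
looks_like_audio() {
  local file="$1" max_bytes="$2" size
  size="$(wc -c <"$file")"
  [ "$size" -gt 0 ] && [ "$size" -le "$max_bytes" ] || return 1
  case "$(magic_hex "$file")" in
    52494646) return 0 ;;  # "RIFF" (WAV container)
    4f676753) return 0 ;;  # "OggS" (Ogg container)
    1a45dfa3) return 0 ;;  # EBML header (WebM/Matroska)
    *) return 1 ;;
  esac
}
```

Checking the leading bytes rather than the file extension or the declared content type means a renamed or mislabeled upload is rejected before it reaches the transcription model.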

### Step 2: webchat-https-proxy
- Copies `https-server.py` to workspace
- Adds HTTPS origin to `gateway.controlUi.allowedOrigins`
- Creates systemd user service `openclaw-voice-https.service`
- Auto-generates self-signed TLS cert (TLS 1.2+ enforced)

### Step 3: webchat-voice-gui
- Copies `voice-input.js` and injects `<script>` tag into Control UI
- Installs gateway startup hook for update safety
- Optional interactive language selection

For full details, security notes, and uninstall instructions, see each skill's SKILL.md.
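
The three steps above amount to running each sub-skill's `deploy.sh` in a fixed order and stopping at the first failure. A minimal sketch of that orchestration follows; it assumes each sub-skill keeps its script at `<skills dir>/<skill>/scripts/deploy.sh` (the layout the uninstall paths in this README suggest), and `deploy_in_order` is a hypothetical name. The checksum gate that the real `deploy.sh` runs first is left out here.

```bash
#!/usr/bin/env bash
# Orchestration sketch (illustrative; the shipped deploy.sh may differ).

# Sub-skills in the order given by this README.
SUB_SKILLS="faster-whisper-local-service webchat-https-proxy webchat-voice-gui"

# Runs scripts/deploy.sh of each sub-skill under the given skills directory,
# stopping at the first missing or failing script so later steps never run
# on top of a broken earlier step.
deploy_in_order() {
  local skills_dir="$1" skill script
  for skill in $SUB_SKILLS; do
    script="$skills_dir/$skill/scripts/deploy.sh"
    [ -f "$script" ] || { echo "missing: $script" >&2; return 1; }
    echo "deploying $skill"
    bash "$script" || { echo "failed: $skill" >&2; return 1; }
  done
}
```

The order matters because the proxy forwards to the transcription backend and the GUI is served through the proxy, so a failure in an earlier step would leave later steps pointing at something that is not running.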

## Security posture (why these changes are expected)

This is a **meta-installer**, so it coordinates downstream skills and applies only the minimum required local changes:

- **Persistence:** creates user-level systemd services so STT/proxy survive reboot (`openclaw-transcribe`, `openclaw-voice-https`)
- **UI enablement:** injects one explicit `<script>` tag for `voice-input.js` in Control UI
- **Gateway compatibility:** appends one HTTPS origin to `gateway.controlUi.allowedOrigins`

Safety characteristics:
- all changes are documented and reversible via uninstall scripts
- no root/sudo required (user scope only)
- no hidden background tasks beyond documented services
- no outbound telemetry or data exfiltration behavior

### Integrity verification

Before executing any sub-skill script, `deploy.sh` verifies SHA256 checksums of **all** sub-skill scripts against `scripts/checksums.sha256`. If any script was modified after installation (e.g. by a registry update or tampering), deployment **aborts** with a clear error.

**Workflow:**
1. `npx clawhub install <sub-skill>` — fetch from registry
2. Audit the scripts manually or via code review
3. `bash scripts/rehash.sh` — record trusted checksums
4. `bash scripts/deploy.sh` — verify checksums, then deploy

**Dry-run verification** (no deployment):
```bash
VERIFY_ONLY=true bash scripts/deploy.sh
```

**After a sub-skill update:**
1. Review the changed scripts
2. `bash scripts/rehash.sh` to update the trusted baseline
3. Commit the updated `checksums.sha256`
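
The record-then-verify cycle described in this section can be sketched with `sha256sum` from GNU coreutils. This is an illustration of the technique, assuming `checksums.sha256` uses the standard `sha256sum` manifest format; `record_checksums` and `verify_checksums` are hypothetical names, not the contents of `rehash.sh` or `deploy.sh`.

```bash
#!/usr/bin/env bash
# Checksum baseline sketch (illustrative; not the skill's rehash.sh or deploy.sh).

# Records SHA256 checksums of the given files into a manifest.
# This corresponds to the "record trusted checksums" step.
record_checksums() {
  local manifest="$1"
  shift
  sha256sum "$@" >"$manifest"
}

# Verifies every file listed in the manifest. Returns non-zero if any file
# is missing or its content no longer matches the recorded checksum.
verify_checksums() {
  sha256sum -c "$1" >/dev/null 2>&1
}
```

A deployment gate then reduces to `verify_checksums scripts/checksums.sha256 || exit 1` ahead of any sub-skill call. Because the baseline is recorded locally after review, it protects against changes made after the audit, not against a script that was already malicious when it was audited.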

## Verify

```bash
bash scripts/status.sh
```
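
What `status.sh` checks is not described here. As a manual cross-check, the two documented user services can be queried directly with `systemctl --user is-active`. The sketch below wraps that call; `check_units` and the `SYSTEMCTL` override are hypothetical and only there to make the function easy to try out.

```bash
#!/usr/bin/env bash
# Manual status sketch (illustrative; not the skill's status.sh).

# Unit names as documented in this README.
VOICE_UNITS="openclaw-transcribe.service openclaw-voice-https.service"

# Prints "<unit>: <state>" for each unit and returns non-zero if any unit
# is not active. SYSTEMCTL may be set to a stand-in command for testing.
check_units() {
  local unit state failed=0
  for unit in $VOICE_UNITS; do
    state="$(${SYSTEMCTL:-systemctl} --user is-active "$unit" 2>/dev/null)" || failed=1
    echo "$unit: ${state:-unknown}"
  done
  return "$failed"
}
```

If a unit reports anything other than `active`, `journalctl --user -u <unit>` is the usual place to look for the reason.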

## Uninstall

Uninstall each skill separately (in reverse order):

```bash
# 1. Voice GUI (hook, UI injection, workspace files)
bash skills/webchat-voice-gui/scripts/uninstall.sh

# 2. HTTPS Proxy (service, gateway config, certs)
bash skills/webchat-https-proxy/scripts/uninstall.sh

# 3. STT Backend (removes the service; these commands leave the venv in place)
systemctl --user stop openclaw-transcribe.service
systemctl --user disable openclaw-transcribe.service
rm -f ~/.config/systemd/user/openclaw-transcribe.service
systemctl --user daemon-reload
```

## Notes

- This meta-skill is a convenience wrapper. All actual logic lives in the three sub-skills.
- Review each sub-skill's scripts and security notes before running.
- The `WORKSPACE` and `SKILLS_DIR` paths are configurable via environment variables.