virtual-remote-desktop

TotalClaw 作者 totalclaw

基于 KasmVNC 的无头 Linux 虚拟桌面,具有人工智能优先的自动化和人工切换。当大多数步骤都是自动化的,但用户必须手动干预验证码/风险控制/登录批准,然后返回自动化时使用。包括移动/桌面接管和浏览器移动/桌面渲染的需求驱动设置。

安装 / 下载方式

TotalClaw CLI推荐
totalclaw install totalclaw:totalclaw~zhangxin15435-virtual-remote-desktop
cURL直接下载,无需登录
curl -fsSL https://skills.taituai.com/api/skills/totalclaw%3Atotalclaw~zhangxin15435-virtual-remote-desktop/file -o zhangxin15435-virtual-remote-desktop.md
# Virtual Remote Desktop (KasmVNC edition)

## What this skill is for

Use this when the workflow is:
1. AI runs browser automation most of the time
2. Captcha / risk-control / MFA appears
3. User takes over remotely for 1-3 minutes
4. AI continues automatically

This version replaces x11vnc+noVNC with **KasmVNC** and keeps computer-use style action scripts for AI control.

## Core commands

### 0) Install KasmVNC (one-time)

```bash
bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/install_kasmvnc.sh
```

This installer also prepares required runtime tools:
- `fluxbox` (lightweight desktop)
- `xdotool` + `scrot` (computer-use actions)
- `xauth`

### 1) Requirement-driven start (recommended)

Before starting, always confirm these user requirements:
1. takeover device: **手机** or **电脑**
2. website render mode: **手机网页** or **桌面网页**
3. access mode: **本地隧道** (`127.0.0.1`) or **临时公网** (`0.0.0.0`)
4. network quality: **弱网 / 普通 / 良好**

Use guided script (interactive Q&A):

```bash
bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/start_vrd_guided.sh
```

Preview config without starting:

```bash
bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/start_vrd_guided.sh --dry-run
```

### 1.1) Direct start (manual env)

```bash
bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/start_vrd.sh
```

Important env vars:
- `AUTO_LAUNCH_URL` (optional): open target page automatically
- `KASM_BIND` (default `127.0.0.1`, safer)
- `AUTO_STOP_IDLE_SECS` (default `900`)
- `BROWSER_MOBILE_MODE=1` (launch browser with mobile emulation)
- `BROWSER_DEVICE=iphone14pro|pixel7|ipad`

Example:

```bash
AUTO_LAUNCH_URL="https://example.com/login" \
AUTO_STOP_IDLE_SECS=1200 \
bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/start_vrd.sh
```

Mobile-friendly VNC stream example (better phone takeover UX):

```bash
MOBILE_MODE=1 MOBILE_PRESET=phone \
AUTO_STOP_IDLE_SECS=900 \
bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/start_vrd.sh
```

Mobile stream options:
- `MOBILE_MODE=1` enables mobile defaults
- `MOBILE_PRESET=phone|tablet` sets default resolution (`960x540` / `1280x720`)
- `KASM_MAX_FPS` can be lowered further (e.g. `18`) on weak networks

Browser mobile emulation (website renders as mobile page):

```bash
AUTO_LAUNCH_URL="https://example.com" \
BROWSER_MOBILE_MODE=1 BROWSER_DEVICE=iphone14pro \
bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/start_vrd.sh
```

Notes:
- Browser mobile emulation changes UA + viewport + touch behavior.
- It is different from `MOBILE_MODE` (which optimizes VNC stream size).

### 2) Check status / health

```bash
bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/status_vrd.sh
bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/health_vrd.sh
```

### 3) Stop desktop

```bash
bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/stop_vrd.sh
```

---

## AI automation actions (computer-use style)

All actions run on the active virtual display from `pids.env`.

```bash
# screenshot (base64)
bash scripts/action_screenshot.sh

# click / type / key / scroll
bash scripts/action_click.sh 500 420 left
bash scripts/action_type.sh "hello"
bash scripts/action_key.sh "ctrl+l"
bash scripts/action_scroll.sh down 4

# helpers
bash scripts/action_mouse_move.sh 800 300
bash scripts/action_cursor_position.sh
bash scripts/action_wait.sh 2
```

Recommended loop:
1. `action_screenshot.sh`
2. analyze
3. `action_click/type/key/...`
4. screenshot verify
5. repeat

---

## Best handoff pattern (AI ↔ User)

When captcha/risk-control appears:
1. AI sends user the KasmVNC URL + username/password
2. User manually solves challenge
3. User replies “done”
4. AI runs screenshot + validation step
5. AI resumes automation

This avoids full manual operation while keeping recovery fast.

---

## UX presets

### Preset A: safe local tunnel (recommended)
- `KASM_BIND=127.0.0.1`
- user connects via SSH tunnel
- best for security

### Preset B: temporary public takeover
- `KASM_BIND=0.0.0.0`
- short `AUTO_STOP_IDLE_SECS` (e.g. 300)
- use only for urgent remote intervention

---

## Notes

- KasmVNC uses HTTPS + per-user auth (username/password).
- This skill stores runtime files in `~/.openclaw/vrd-data` by default.
- Browser profile persistence is controlled by `CHROME_PROFILE_DIR`.
- If captcha frequency is high, keep one long-lived profile to reduce repeated challenges.