Keldron — GPU & Hardware Monitor Agent
GPU monitoring with risk intelligence. Local + cloud fleet monitoring, health tracking, proactive alerts, and AI-powered fleet analytics.
Install / download options

- TotalClaw CLI (recommended): `totalclaw install skilldb:horne-ra~keldron-agent`
- cURL (direct download, no login required): `curl -fsSL https://skills.taituai.com/api/skills/skilldb%3Ahorne-ra~keldron-agent/file -o keldron-agent.md`
- Git (fetch the source from the repository): `git clone https://github.com/openclaw/skills/commit/86d204a451a2eefb1e5697fff21c88ef6533f70d`

# Keldron Agent — GPU Monitoring with Risk Intelligence
## 1. Overview
Keldron Agent is a vendor-neutral GPU monitoring agent that runs locally and exposes real-time telemetry and risk scores via a Prometheus endpoint. It supports Apple Silicon (M1–M5), NVIDIA consumer GPUs (RTX 3090/4090/5090), NVIDIA datacenter GPUs (H100/B200), AMD GPUs, and any other Linux machine (generic thermal monitoring).
**Dual mode:** Use the **local agent** on `localhost:9100` for fast, real-time, single-device queries (works offline). Use **Keldron Cloud** (`https://api.keldron.ai`) for fleet overview, historical telemetry, analytics, and proactive fleet monitoring when an API key is configured.
**No sudo required on any platform** for the agent binary. On Linux, Docker may require `sudo` or membership in the `docker` group — see [Docker post-install](https://docs.docker.com/engine/install/linux-postinstall/) or rootless Docker if you hit permission errors.
Use this skill when the user wants to:
- Monitor GPU temperature, power, utilization, or memory
- Get risk assessments for their GPU
- Track power costs
- Set up alerts or watch a fleet (via cloud polling)
- Open the real dashboard (local or cloud)
- Ask fleet, history, or analytics questions
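Because the local agent publishes its telemetry through a Prometheus endpoint, a single reading can be pulled out of the text exposition format with a small filter. The sketch below is illustrative only: the `/metrics` path and the metric name `keldron_gpu_temperature_celsius` are assumptions that follow common Prometheus conventions and are not confirmed by this document, and `prom_value` is a hypothetical helper name.

```bash
# prom_value NAME
# Reads Prometheus text-format metrics on stdin and prints the value of the
# first sample whose metric name is exactly NAME. Exits 1 if none is found.
prom_value() {
  awk -v name="$1" '
    /^#/ { next }                     # skip HELP/TYPE comment lines
    {
      line = $0
      metric = line
      sub(/[{ ].*/, "", metric)       # metric name ends at the first "{" or space
      if (metric != name) next
      sub(/^[^{ ]+/, "", line)        # drop the metric name
      sub(/^[{].*[}]/, "", line)      # drop the label set, if any
      split(line, f, " ")             # f[1] = value, f[2] = optional timestamp
      print f[1]
      found = 1
      exit
    }
    END { exit !found }
  '
}

# Hypothetical usage (path and metric name are assumptions):
#   curl -sf localhost:9100/metrics | prom_value keldron_gpu_temperature_celsius
```

Stripping the label set before splitting keeps the filter correct when a label value contains spaces, such as a device name.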
### Mode detection (run at the start of an interaction)
Field names **differ by endpoint** — see [Cloud API field names](#cloud-api-field-names-jq-reference) before writing `jq` filters.
```bash
# Check 1: Is the local agent running?
LOCAL_AGENT=$(curl -sf localhost:9100/healthz 2>/dev/null | jq -r '.status' 2>/dev/null)
# Check 2: Is cloud configured? (env → ~/.keldron/credentials from login → YAML)
CLOUD_KEY="${KELDRON_CLOUD_API_KEY:-}"
CLOUD_ENDPOINT=""
if [ -z "$CLOUD_KEY" ] && [ -f ~/.keldron/credentials ] && command -v jq &>/dev/null; then
  CLOUD_KEY=$(jq -r '.api_key // ""' ~/.keldron/credentials 2>/dev/null)
  CLOUD_ENDPOINT=$(jq -r '.endpoint // ""' ~/.keldron/credentials 2>/dev/null)
fi
if [ -z "$CLOUD_KEY" ]; then
  if command -v yq &>/dev/null; then
    CLOUD_KEY=$(yq '.cloud.api_key // ""' ~/.config/keldron/keldron-agent.yaml 2>/dev/null)
    CLOUD_ENDPOINT=$(yq '.cloud.endpoint // ""' ~/.config/keldron/keldron-agent.yaml 2>/dev/null)
  else
    CLOUD_KEY=$(grep -A3 'cloud:' ~/.config/keldron/keldron-agent.yaml 2>/dev/null \
      | grep 'api_key:' | awk '{print $2}' | tr -d "\"'" | xargs 2>/dev/null)
    CLOUD_ENDPOINT=$(grep -A3 'cloud:' ~/.config/keldron/keldron-agent.yaml 2>/dev/null \
      | grep 'endpoint:' | awk '{print $2}' | tr -d "\"'" | xargs 2>/dev/null)
  fi
fi
CLOUD_ENDPOINT="${CLOUD_ENDPOINT:-https://api.keldron.ai}"
if [ -z "$CLOUD_KEY" ]; then
  echo "No cloud API key found. Run keldron-agent login, set KELDRON_CLOUD_API_KEY (non-interactive login or streaming), or add cloud.api_key to ~/.config/keldron/keldron-agent.yaml. Sign up at https://app.keldron.ai"
fi
# Check 3: Does cloud respond? (only if we have a key)
CLOUD_OK=""
if [ -n "$CLOUD_KEY" ]; then
  CLOUD_OK=$(curl -sf "${CLOUD_ENDPOINT}/health" 2>/dev/null | jq -r '.status' 2>/dev/null)
fi
```
**Mode priority:**
- **Both** (healthy local + cloud) → Cloud for fleet, history, analytics, proactive fleet loops; local for instantaneous single-device metrics.
- **Cloud only** → Use cloud for everything that needs the API; mention local agent if they want lower-latency realtime on that machine.
- **Local only** → Use `localhost:9100` for realtime; naturally mention cloud for history, fleet, and alerts: *"That needs historical data — connect at app.keldron.ai."*
- **Neither** → Run the [Auto-setup flow](#3-auto-setup-flow).
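The priority rules above can be sketched as one small function that turns the two probe results into a mode name. This is a minimal sketch, not the agent's own logic: `select_mode` is a hypothetical helper, local health is taken to mean a status of `healthy` (as in the auto-setup check), and cloud is treated as reachable whenever the probe returned any non-empty, non-`null` status, which is an assumption because the cloud status value is not stated here.

```bash
# select_mode LOCAL_STATUS CLOUD_STATUS
# Prints one of: both, cloud, local, neither.
select_mode() {
  local local_status="$1" cloud_status="$2"
  local has_local=no has_cloud=no

  # Local agent: Step 2 of the auto-setup flow checks for status "healthy".
  if [ "$local_status" = "healthy" ]; then
    has_local=yes
  fi

  # Cloud: assumption, any non-empty status other than "null" counts as up.
  if [ -n "$cloud_status" ] && [ "$cloud_status" != "null" ]; then
    has_cloud=yes
  fi

  case "$has_local/$has_cloud" in
    yes/yes) echo both ;;
    no/yes)  echo cloud ;;
    yes/no)  echo local ;;
    *)       echo neither ;;
  esac
}
```

With the variables from the mode detection block, a call would look like `MODE=$(select_mode "$LOCAL_AGENT" "$CLOUD_OK")`.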
---
## 2. Installation
Prefer a [GitHub release](https://github.com/keldron-ai/keldron-agent/releases) binary (`keldron-agent`). To build from source, clone the repo and run `make build` (requires Go and Node.js for the full dashboard).
### Mac (Apple Silicon)
```bash
curl -sfL https://github.com/keldron-ai/keldron-agent/releases/latest/download/keldron-agent-darwin-arm64 -o keldron-agent
chmod +x keldron-agent
```
### Linux (AMD64)
```bash
curl -sfL https://github.com/keldron-ai/keldron-agent/releases/latest/download/keldron-agent-linux-amd64 -o keldron-agent
chmod +x keldron-agent
```
### Linux (ARM64)
```bash
curl -sfL https://github.com/keldron-ai/keldron-agent/releases/latest/download/keldron-agent-linux-arm64 -o keldron-agent
chmod +x keldron-agent
```
### Linux (Docker)
```bash
docker rm -f keldron-agent 2>/dev/null || true
docker run -d --name keldron-agent --restart unless-stopped \
  -p 9100:9100 -p 9200:9200 -p 8081:8081 \
  -e KELDRON_OUTPUT_PROMETHEUS_HOST=0.0.0.0 \
  -e KELDRON_API_HOST=0.0.0.0 \
  -e KELDRON_HEALTH_BIND=0.0.0.0:8081 \
  ghcr.io/keldron-ai/keldron-agent:latest
```
### Verify installation
```bash
./keldron-agent --version
```
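The per-platform downloads above differ only in the asset name, so the choice can be folded into one lookup. The following is a sketch under the assumption that the three assets named in this section are the only ones published; `release_asset` is a hypothetical helper, and it deliberately fails on any other platform rather than guessing.

```bash
# release_asset OS ARCH
# Maps `uname -s` / `uname -m` output to a release asset name from this section.
# Returns 1 for platforms that have no asset listed here.
release_asset() {
  case "$1/$2" in
    Darwin/arm64)              echo keldron-agent-darwin-arm64 ;;
    Linux/x86_64)              echo keldron-agent-linux-amd64 ;;
    Linux/aarch64|Linux/arm64) echo keldron-agent-linux-arm64 ;;
    *)
      echo "unsupported platform: $1/$2" >&2
      return 1
      ;;
  esac
}

# Example:
#   BINARY=$(release_asset "$(uname -s)" "$(uname -m)") || exit 1
#   curl -sfL "https://github.com/keldron-ai/keldron-agent/releases/latest/download/${BINARY}" -o keldron-agent
```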
---
## 3. Auto-setup flow
**Trigger phrases:** "monitor my hardware", "set up monitoring", "install keldron", "get started", "help me set up"
### Step 1: Detect environment
```bash
OS=$(uname -s)
if [ "$OS" = "Darwin" ]; then
  CHIP=$(sysctl -n machdep.cpu.brand_string 2>/dev/null || echo "unknown")
  echo "Detected: macOS with $CHIP"
elif command -v nvidia-smi &>/dev/null; then
  GPU=$(nvidia-smi --query-gpu=name --format=csv,noheader 2>/dev/null | head -1)
  echo "Detected: Linux with NVIDIA $GPU"
else
  echo "Detected: Linux (generic thermal monitoring)"
fi
```
### Step 2: Check if agent is running
```bash
if curl -sf localhost:9100/healthz | jq -e '.status == "healthy"' &>/dev/null; then
  echo "Keldron agent is already running. Skipping to Step 4."
  exit 0
fi
```
### Step 3: Install agent
Download the release binary for the OS/arch into the current directory, then start in local mode. Example:
```bash
# Reuse OS from Step 1, or detect it if this block runs on its own.
OS="${OS:-$(uname -s)}"
ARCH=$(uname -m)

# Pick the release asset for this platform.
if [ "$OS" = "Darwin" ]; then
  BINARY="keldron-agent-darwin-arm64"
elif [ "$ARCH" = "x86_64" ]; then
  BINARY="keldron-agent-linux-amd64"
else
  BINARY="keldron-agent-linux-arm64"
fi

# macOS: download the binary and start it in local mode.
if [ "$OS" = "Darwin" ]; then
  curl -sfL "https://github.com/keldron-ai/keldron-agent/releases/latest/download/${BINARY}" -o keldron-agent
  chmod +x keldron-agent
  ./keldron-agent --local &
  sleep 3
fi

# Linux: prefer Docker when available, otherwise fall back to the binary.
if [ "$OS" = "Linux" ]; then
  if command -v docker &>/dev/null; then
    docker rm -f keldron-agent 2>/dev/null || true
    if ! docker run -d --name keldron-agent --restart unless-stopped \
      -p 9100:9100 -p 9200:9200 -p 8081:8081 \
      -e KELDRON_OUTPUT_PROMETHEUS_HOST=0.0.0.0 \
      -e KELDRON_API_HOST=0.0.0.0 \
      -e KELDRON_HEALTH_BIND=0.0.0.0:8081 \
      ghcr.io/keldron-ai/keldron-agent:latest; then
      echo "Error: Failed to start keldron-agent container. Check Docker permissions and network."
      exit 1
    fi
  else
    curl -sfL "https://github.com/keldron-ai/keldron-agent/releases/latest/download/${BINARY}" -o keldron-agent
    chmod +x keldron-agent
    ./keldron-agent --local &
  fi
  sleep 3
fi

# Confirm the agent reports healthy.
curl -sf localhost:9100/healthz | jq -e '.status == "healthy"'
```
Report initial readings (temperature, utilization, risk score) from [Quick status queries](#8-quick-status-queries-local-real-time) once healthy.
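A fixed `sleep 3` may be too short on a slow machine or while a container image is still being pulled, so a bounded polling loop is one way to wait for the health check before reporting readings. This is a sketch rather than part of the agent: `wait_for_healthy` and `probe_local_health` are hypothetical helpers, and `HEALTH_PROBE` is a hypothetical override hook that only exists so the loop can be exercised without a running agent.

```bash
# probe_local_health
# Prints the status reported by the local health endpoint used in this section.
probe_local_health() {
  curl -sf localhost:9100/healthz 2>/dev/null | jq -r '.status' 2>/dev/null
}

# wait_for_healthy [ATTEMPTS] [DELAY_SECONDS]
# Polls until the probe prints "healthy". Returns 0 on success and 1 once
# ATTEMPTS probes have been made without success.
wait_for_healthy() {
  local attempts="${1:-10}" delay="${2:-1}"
  local i=1 status
  while [ "$i" -le "$attempts" ]; do
    status=$("${HEALTH_PROBE:-probe_local_health}") || status=""
    if [ "$status" = "healthy" ]; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, `wait_for_healthy 15 2 || echo "Agent did not become healthy"` allows roughly half a minute before giving up.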
### Step 4: Offer cloud connection
Guide the user conversationally:
1. **Account:** Ask: *Do you have a Keldron Cloud account? You can sign up free at https://app.keldron.ai*
2. **If yes:** *Run `keldron-agent login` — you can use email/password or paste your API key (option 2 in the menu).*
3. **If no:** *Sign up at https://app.keldron.ai (GitHub login available), then run `keldron-agent login`.*
4. **Verify:** *Run `keldron-agent whoami` to confirm you're connected.*
5. **Restart** the agent so it picks up credentials and begins streaming.
If they are not yet interested, summarize value:
```bash
if [ -n "$CLOUD_KEY" ]; then
  echo "Cloud already connected."
else
  echo "Want to connect to Keldron Cloud?"
  echo " • 180-day telemetry history"
  echo " • Fleet analytics and device comparison"
  echo " • Device health tracking"
  echo " • Proactive fleet alerts"
  echo " • Dashboard at app.keldron.ai"
fi
```
### Step 5 optional: env o