grok-imagine-image-pro

ClawSkills 作者 clawskills

Generiert hochwertige Bilder mit xAI Grok/Flux API. Use when user asks for image generation ("mach a Bild von...", "generier PNG...", "Bild erstellen") or image editing ("ändere das Bild", "mach daraus..."). Handles Prompts, styles, aspect ratios, edits, batch generation. Outputs PNG via base64 or file.

安装 / 下载方式

TotalClaw CLI推荐

totalclaw install clawskills:clawskills~nixeifoit-grok-imagine-image-pro

cURL直接下载，无需登录

curl -fsSL https://skills.taituai.com/api/skills/clawskills%3Aclawskills~nixeifoit-grok-imagine-image-pro/file -o nixeifoit-grok-imagine-image-pro.md

# Grok Imagine Image Pro

API Key: `$XAI_API_KEY` (already configured)
Save dir: `~/.openclaw/media/` (resolves to `/data/.openclaw/media/` — allowed for Telegram sending)

## Available Models

- `grok-imagine-image` — standard quality, faster
- `grok-imagine-image-pro` — higher quality (default for generation)

## 1. Image Generation

```bash
curl -s https://api.x.ai/v1/images/generations \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "grok-imagine-image-pro",
    "prompt": "<PROMPT>",
    "n": 1,
    "response_format": "b64_json"
  }' | python3 -c "
import json, sys, base64, os, time
os.makedirs(os.path.expanduser('~/.openclaw/media'), exist_ok=True)
r = json.load(sys.stdin)
ts = int(time.time())
for i, img in enumerate(r['data']):
    img_data = base64.b64decode(img['b64_json'])
    fpath = os.path.expanduser(f'~/.openclaw/media/generated_{ts}_{i}.png')
    with open(fpath, 'wb') as f:
        f.write(img_data)
    print(fpath)
"
```

### Aspect Ratios

Add `"aspect_ratio": "<ratio>"` to the JSON body. Supported values:

| Ratio | Use case |
|-------|----------|
| `1:1` | Social media, thumbnails |
| `16:9` / `9:16` | Widescreen, mobile stories |
| `4:3` / `3:4` | Presentations, portraits |
| `3:2` / `2:3` | Photography |
| `2:1` / `1:2` | Banners, headers |
| `auto` | Model picks best ratio (default) |

### Batch Generation

Set `"n": <count>` (1-10) to generate multiple images in one request.

## 2. Image Editing / Style Transfer

Edit an existing image by providing a source image plus an edit prompt.
Uses the same `/v1/images/generations` endpoint with an added `image_url` field.

**Do NOT use `/v1/images/edits` with multipart — xAI requires JSON.**

**IMPORTANT:** For local files, use Python to build the payload JSON file, then curl with `@file`.
Inline base64 in curl args causes "Argument list too long" for images >~100KB.

**NOTE:** This is NOT true image editing — the API generates a new image inspired by the source.
It cannot make pixel-precise edits (e.g. changing only a car's color while keeping everything else identical).

### Edit from local file (recommended approach):

```bash
python3 -c "
import json, base64
with open('<SOURCE_PATH>', 'rb') as f:
    b64 = base64.b64encode(f.read()).decode()
payload = {
    'model': 'grok-imagine-image',
    'prompt': '<EDIT_PROMPT>',
    'image_url': f'data:image/png;base64,{b64}',
    'n': 1,
    'response_format': 'b64_json'
}
with open('/tmp/img_edit_payload.json', 'w') as f:
    json.dump(payload, f)
print('Payload ready')
" && \
curl -s https://api.x.ai/v1/images/generations \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d @/tmp/img_edit_payload.json | python3 -c "
import json, sys, base64, os, time
os.makedirs(os.path.expanduser('~/.openclaw/media'), exist_ok=True)
r = json.load(sys.stdin)
img_data = base64.b64decode(r['data'][0]['b64_json'])
fpath = os.path.expanduser(f'~/.openclaw/media/edited_{int(time.time())}.png')
with open(fpath, 'wb') as f:
    f.write(img_data)
print(fpath)
"
```

### Edit from URL:

```bash
curl -s https://api.x.ai/v1/images/generations \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "grok-imagine-image",
    "prompt": "<EDIT_PROMPT>",
    "image_url": "<PUBLIC_IMAGE_URL>",
    "n": 1,
    "response_format": "b64_json"
  }' | python3 -c "
import json, sys, base64, os, time
os.makedirs(os.path.expanduser('~/.openclaw/media'), exist_ok=True)
r = json.load(sys.stdin)
img_data = base64.b64decode(r['data'][0]['b64_json'])
fpath = os.path.expanduser(f'~/.openclaw/media/edited_{int(time.time())}.png')
with open(fpath, 'wb') as f:
    f.write(img_data)
print(fpath)
"
```

### Style Transfer Examples

Use editing with a style prompt, e.g.:
- "Render this as an oil painting in impressionist style"
- "Make this a pencil sketch with detailed shading"
- "Convert to pop art with bold colors"
- "Watercolor painting with soft edges"

## 3. Sending to Telegram

```
message tool: action=send, channel=telegram, target=<id>,
  message="<caption>", filePath=~/.openclaw/media/<file>.png
```

- Always include `message` field (required even for media-only sends)
- Allowed media paths: `/tmp/`, `~/.openclaw/media/`, `~/.openclaw/agents/`

## Notes

- Do NOT pass `size` parameter — returns 400
- Aspect ratio: pass `aspect_ratio` in JSON body (not `size`)
- Editing: use `image_url` field in the generations endpoint (NOT the edits endpoint with multipart)
- **Always use `"response_format": "b64_json"`** — URL format returns temporary URLs that often 403
- For large images: build payload with Python → save to `/tmp/` → curl with `@file` syntax
- Max 10 images per request
- Images are subject to content moderation
- Editing is style-transfer/reimagination, NOT pixel-precise inpainting