image-gen
Generate images using multiple AI models — Midjourney (via Legnext.ai), Flux, Nano Banana Pro (Gemini), Ideogram, Recraft, and more via fal.ai. Intelligently routes to the best model based on use case.
Install / download
TotalClaw CLI (recommended): `totalclaw install clawskills:clawskills~wells1137-image-gen`
cURL (direct download, no login required): `curl -fsSL https://skills.taituai.com/api/skills/clawskills%3Aclawskills~wells1137-image-gen/file -o wells1137-image-gen.md`
# Image Generation Skill
This skill generates images using the best AI model for each use case. **Model selection is the most important decision** — read the dispatch logic carefully before generating.
---
## 🧠 Intelligent Dispatch Logic
**Always select the model based on the user's actual need, not just the request surface.**
### Decision Tree
```
Does the request involve MULTIPLE images that share characters, scenes, or story continuity?
├─ YES → Use NANO BANANA (Gemini)
│        Reason: Gemini understands context holistically; supports reference_images
│        for character/scene consistency across a series (storyboard, comic, sequence)
│
└─ NO → Is it a SINGLE standalone image?
    ├─ Artistic / cinematic / painterly / highly detailed?
    │    → Use MIDJOURNEY
    │
    ├─ Photorealistic / portrait / product photo?
    │    → Use FLUX PRO
    │
    ├─ Contains TEXT (logo, poster, sign, infographic)?
    │    → Use IDEOGRAM
    │
    ├─ Vector / icon / flat design / brand asset?
    │    → Use RECRAFT
    │
    ├─ Quick draft / fast iteration (speed priority)?
    │    → Use FLUX SCHNELL (<2s)
    │
    └─ General purpose / balanced?
         → Use FLUX DEV
```
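The decision tree above can be sketched as a small routing function. This is a minimal illustration, not the skill's actual dispatch code: the request descriptor and its boolean field names (`needsContinuity`, `artistic`, and so on) are hypothetical, and deciding which flags apply to a given user request is still left to the caller. The checks run in the same order as the tree, so continuity always takes priority.

```javascript
// Hypothetical request descriptor: none of these field names come from the skill.
// Each flag corresponds to one branch of the decision tree, checked in tree order.
function selectModel(req) {
  // Multi-image series sharing characters, scenes, or story continuity.
  if (req.needsContinuity) return "nano-banana";
  // Single standalone image: pick by dominant need.
  if (req.artistic) return "midjourney";
  if (req.photorealistic) return "flux-pro";
  if (req.containsText) return "ideogram";
  if (req.vector) return "recraft";
  if (req.speedPriority) return "flux-schnell";
  // General purpose / balanced fallback.
  return "flux-dev";
}
```

For example, `selectModel({ containsText: true })` returns `"ideogram"`, while `selectModel({ needsContinuity: true, artistic: true })` returns `"nano-banana"` because the continuity check comes first.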
### Model Capability Matrix
| Model | ID | Artistic | Photorealism | Text | Context Continuity | Speed | Cost |
|---|---|---|---|---|---|---|---|
| **Midjourney** | `midjourney` | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐ | ❌ (no context) | ~30s | ~$0.05 |
| **Nano Banana Pro** | `nano-banana` | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ~20s | $0.15 |
| **Flux Pro** | `flux-pro` | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ❌ | ~5s | ~$0.05 |
| **Flux Dev** | `flux-dev` | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐ | ❌ | ~8s | ~$0.03 |
| **Flux Schnell** | `flux-schnell` | ⭐⭐ | ⭐⭐⭐ | ⭐⭐ | ❌ | <2s | ~$0.003 |
| **Ideogram v3** | `ideogram` | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ❌ | ~10s | ~$0.08 |
| **Recraft v3** | `recraft` | ⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐ | ❌ | ~8s | ~$0.04 |
| **SDXL Lightning** | `sdxl` | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐ | ❌ | ~3s | ~$0.01 |
### When to Use Nano Banana (Critical)
Use **Nano Banana** whenever the user's request involves:
- **Storyboard / 分镜图**: Multiple frames that tell a story with the same characters
- **Comic strip / 漫画**: Sequential panels with consistent characters
- **Character series**: Multiple images of the same person/character in different poses or scenes
- **Scene continuation**: "Now show the same girl in the forest" (referencing a previous image)
- **Style consistency**: A set of images that must share the same visual style/world
Nano Banana uses Google's Gemini 3 Pro multimodal architecture, which understands context holistically rather than keyword-matching. It supports up to 14 reference images for maintaining character and scene consistency.
---
## How to Use This Skill
1. **Analyze the request**: Is it a single image or a series? Does it need context continuity?
2. **Select model**: Use the decision tree above.
3. **Enhance the prompt**: Add style, lighting, and quality descriptors appropriate for the model.
4. **Inform the user**: Tell them which model you're using and why, and that generation has started.
5. **Run the script**: Use `exec` tool with sufficient timeout.
6. **Deliver the result**: Send image URL(s) to the user.
---
## Calling the Generation Script
```bash
node {baseDir}/generate.js \
  --model <model_id> \
  --prompt "<enhanced prompt>" \
  [--aspect-ratio <ratio>] \
  [--num-images <1-4>] \
  [--negative-prompt "<negative prompt>"] \
  [--reference-images "<url1,url2,...>"] \
  [--mode <turbo|fast|relax>]
```
**Parameters:**
- `--model`: One of `midjourney`, `flux-pro`, `flux-dev`, `flux-schnell`, `sdxl`, `nano-banana`, `ideogram`, `recraft`
- `--prompt`: The image generation prompt (required)
- `--aspect-ratio`: e.g. `16:9`, `1:1`, `9:16`, `4:3`, `3:4` (default: `1:1`)
- `--num-images`: 1-4 (default: `1`; Midjourney always returns 4 regardless)
- `--negative-prompt`: Things to avoid (not supported by Midjourney)
- `--reference-images`: Comma-separated image URLs for context/character consistency (**Nano Banana only**)
- `--mode`: Midjourney speed: `turbo` (default, ~20-40s), `fast` (~30-60s), `relax` (free but slow)
**exec timeout**: Set at least **120 seconds** for Midjourney and Nano Banana; 30 seconds is sufficient for Flux Schnell.
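The parameter rules and timeout guidance above could be checked before the script is invoked. The following is a minimal sketch under stated assumptions: the options object and its field names (`aspectRatio`, `numImages`, `negativePrompt`, `referenceImages`) are hypothetical, silently dropping the negative prompt for Midjourney is one possible design choice, and the 120-second fallback for models without a stated timeout is an assumption rather than a figure from this skill.

```javascript
// Model IDs accepted by --model, as listed in the parameter description.
const MODELS = [
  "midjourney", "flux-pro", "flux-dev", "flux-schnell",
  "sdxl", "nano-banana", "ideogram", "recraft",
];

// Builds the argument list for generate.js from a hypothetical options object.
function buildArgs(opts) {
  if (!MODELS.includes(opts.model)) throw new Error(`Unknown model: ${opts.model}`);
  if (!opts.prompt) throw new Error("--prompt is required");
  const args = ["--model", opts.model, "--prompt", opts.prompt];
  if (opts.aspectRatio) args.push("--aspect-ratio", opts.aspectRatio);
  if (opts.numImages !== undefined) {
    if (!Number.isInteger(opts.numImages) || opts.numImages < 1 || opts.numImages > 4) {
      throw new RangeError("--num-images must be an integer from 1 to 4");
    }
    args.push("--num-images", String(opts.numImages));
  }
  // Midjourney does not support negative prompts, so the flag is dropped for it.
  if (opts.negativePrompt && opts.model !== "midjourney") {
    args.push("--negative-prompt", opts.negativePrompt);
  }
  if (opts.referenceImages && opts.referenceImages.length > 0) {
    // Reference images are a Nano Banana-only feature, capped at 14.
    if (opts.model !== "nano-banana") {
      throw new Error("--reference-images is only supported by nano-banana");
    }
    if (opts.referenceImages.length > 14) {
      throw new RangeError("at most 14 reference images are supported");
    }
    args.push("--reference-images", opts.referenceImages.join(","));
  }
  return args;
}

// Suggested exec timeout in seconds for a given model.
function execTimeoutSeconds(model) {
  if (model === "midjourney" || model === "nano-banana") return 120;
  if (model === "flux-schnell") return 30;
  return 120; // Assumption: no figure is given for other models, so err on the long side.
}
```

The returned array is meant to be appended after the script path when spawning the process, which avoids shell-quoting problems with prompts that contain quotes or spaces.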
---
## ⚡ Midjourney Workflow (Sync Mode — No --async)
Always use sync mode (no `--async`). The script waits internally until complete.
```bash
node {baseDir}/generate.js \
--model midjourney \
--prompt "<enhanced prompt>" \
--aspect-ratio 16:9
```
### Understanding Midjourney Output
```json
{
"success": true,
"model": "midjourney",
"jobId": "xxxxxxxx-...",
"imageUrl": "https://cdn.legnext.ai/temp/....png",
"imageUrls": [
"https://cdn.legnext.ai/mj/xxxx_0.png",
"https://cdn.legnext.ai/mj/xxxx_1.png",
"https://cdn.legnext.ai/mj/xxxx_2.png",
"https://cdn.legnext.ai/mj/xxxx_3.png"
]
}
```
**CRITICAL — image field meanings:**
| Field | What it is | When to use |
|---|---|---|
| `imageUrl` | A **2×2 grid composite** of all 4 images | Send as **preview** so user can see all options |
| `imageUrls[0]` | Image 1 (top-left) | Send when user wants image 1 |
| `imageUrls[1]` | Image 2 (top-right) | Send when user wants image 2 |
| `imageUrls[2]` | Image 3 (bottom-left) | Send when user wants image 3 |
| `imageUrls[3]` | Image 4 (bottom-right) | Send when user wants image 4 |
**When the user says "放大第N张" ("upscale image N"), "要第N张" ("I want image N"), or "give me image N", send `imageUrls[N-1]` directly. Do NOT call generate.js again.**
### Midjourney Interaction Flow
**After generation:**
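> 🎨 Generation complete! Here is a preview of all 4 images:
> [Preview](imageUrl)
> Which one do you like? Reply 1, 2, 3, or 4 and I'll send you that image on its own in full resolution.
**When user picks image N:**
> Here is image N on its own in full resolution:
> [Image N](imageUrls[N-1])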
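The selection step in this flow amounts to an array lookup on the stored result, with no second generation call. A minimal sketch follows; both function names are hypothetical, and the reply parser is only a simple heuristic that looks for a standalone digit from 1 to 4 in the user's message.

```javascript
// Heuristic (assumption): extract a standalone digit 1-4 from a user reply.
// Returns null when no such digit is present.
function parseChoice(text) {
  const match = text.match(/(?<!\d)[1-4](?!\d)/);
  return match ? Number(match[0]) : null;
}

// Returns the single-image URL for choice n (1-based) from a Midjourney result.
function pickMidjourneyImage(result, n) {
  if (!result || !Array.isArray(result.imageUrls)) {
    throw new Error("result has no imageUrls array");
  }
  if (!Number.isInteger(n) || n < 1 || n > result.imageUrls.length) {
    throw new RangeError(`choice must be between 1 and ${result.imageUrls.length}`);
  }
  return result.imageUrls[n - 1];
}
```

With the sample output shown earlier, `pickMidjourneyImage(result, parseChoice("give me image 2"))` returns the second entry of `imageUrls`, the top-right image of the grid.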
---
## 🤖 Nano Banana (Gemini) Workflow
Use for storyboards, character series, and any context-dependent multi-image generation.
### Single image (no reference)
```bash
node {baseDir}/generate.js \
--model nano-banana \
--prompt "<detailed scene description>" \
--aspect-ratio 16:9
```
### With reference images (character/scene consistency)
```bash
node {baseDir}/generate.js \
--model nano-banana \
--prompt "<scene description, referencing the character/style from the reference images>" \
--aspect-ratio 16:9 \
--reference-images "https://url-of-previous-image-1.png,https://url-of-previous-image-2.png"
```
**How to build a storyboard series:**
1. Generate the **first frame** without reference images (establishes the character/scene)
2. Use the first frame's URL as `--reference-images` for the **second frame**
3. For subsequent frames, use the most recent 1-3 images as references to maintain consistency
4. Keep the character description consistent across all prompts
**Example storyboard workflow:**
```
Frame 1: node generate.js --model nano-banana --prompt "A young girl with red hair, wearing a blue dress, sitting under a magical treehouse in an enchanted forest, warm golden light, storybook illustration style" --aspect-ratio 16:9
Frame 2: node generate.js --model nano-banana --prompt "The same red-haired girl in blue dress climbing the rope ladder up to the treehouse, excited expression, enchanted forest background, same storybook illustration style" --aspect-ratio 16:9 --reference-images "<frame1_url>"
Frame 3: node generate.js --model nano-banana --prompt "Inside the magical treehouse, the red-haired girl discovers a glowing book on a wooden shelf, wonder on her face, warm candlelight, same storybook illustration style" --aspect-ratio 16:9 --reference-images "<frame1_url>,<frame2_url>"
```
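The reference-image rule in the storyboard steps (none for the first frame, then the most recent one to three frames) can be sketched as a rolling window over the URLs generated so far. The function name and the `windowSize` parameter are hypothetical; the default of 3 follows the upper end of the range suggested above.

```javascript
// Returns the value for --reference-images for the next frame,
// or null when there are no earlier frames (the first frame uses no references).
function referenceArgForFrame(previousFrameUrls, windowSize = 3) {
  if (!Number.isInteger(windowSize) || windowSize < 1) {
    throw new RangeError("windowSize must be a positive integer");
  }
  if (previousFrameUrls.length === 0) return null;
  // Keep only the most recent frames, oldest first, joined as the CLI expects.
  return previousFrameUrls.slice(-windowSize).join(",");
}
```

For a fifth frame, passing the four earlier URLs yields the URLs of frames 2, 3, and 4, joined with commas.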
### Nano Banana Output
```json
{
"success": true,
"model": "nano-banana",
"images": ["https://v3b.fal.media/files/...png"],
"imageUrl": "https://v3b.fal.media/files/...png"
}
```
Send `imageUrl` directly to the user (no grid, single image).
---
## Other Models
### Flux Pro / Dev / Schnell
Best for photorealistic standalone images. Output format same as Nano Banana (s