Monet AI
Monet AI - Comprehensive AI content generation API for AI agents. Video generation (Sora, Veo, Doubao Seedance, Wan, Hailuo, Kling), image generation (GPT-4o, Nano Banana, Seedream, Flux, Imagen, Ideogram), and music generation (MiniMax Music). Build intelligent workflows with multi-model AI generation capabilities.
Installation / Download
TotalClaw CLI (recommended)
totalclaw install skilldb:seekton~monet-ai
cURL direct download, no login required
curl -fsSL https://skills.taituai.com/api/skills/skilldb%3Aseekton~monet-ai/file -o monet-ai.md
Git repository (source code)
git clone https://github.com/openclaw/skills/commit/ee7d4aa047719a83311c5e8cd973a79cf86c145b
# Monet AI Skill
Comprehensive AI content generation API designed for AI agents. Monet AI provides unified access to state-of-the-art AI generation models for video (Sora, Veo, Doubao Seedance, Wan, Hailuo, Kling), image (GPT-4o, Nano Banana, Seedream, Flux, Imagen, Ideogram), and music (MiniMax Music) generation. Build intelligent workflows that combine multiple AI capabilities for automated content creation pipelines.
## When to Use
Use this skill for:
- **Video Generation**: Create AI-generated videos from text prompts using state-of-the-art models
  - Sora: OpenAI's video generation model for high-quality, realistic videos
  - Veo: Google's video generation model
  - Doubao Seedance: ByteDance's AI video model with audio-visual sync
  - Wan: Alibaba's video generation model with excellent localization support
  - Hailuo: Fast video generation with good quality-speed balance
  - Kling: Kuaishou's video generation model
- **Image Generation**: Generate images from text descriptions in various artistic styles
  - GPT-4o: OpenAI's multimodal model for image generation
  - Nano Banana: Google's image model with ultra-high character consistency
  - Seedream: ByteDance's intelligent visual reasoning model
  - Wan: Alibaba's visual model for high-quality and expressive image generation
  - Flux: High-quality photorealistic and artistic image generation
  - Imagen: Google's text-to-image model
  - Ideogram: Specialized in text rendering and precise composition
- **Music Generation**: Create original music and audio from text descriptions
  - MiniMax Music: AI music generation with support for custom lyrics and text-to-music conversion
- **AI Agent Integration**: Build intelligent workflows that combine multiple AI generation capabilities for automated content creation pipelines
## Getting API Key
1. Visit https://monet.vision to register an account
2. After login, go to https://monet.vision/skills/keys to create an API Key
3. Configure the API Key in environment variables or code
If you don't have an API Key, ask your owner to apply at monet.vision.
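Step 3 above can be sketched as follows. This is a minimal sketch, assuming the key is stored in an environment variable named `MONET_API_KEY` (the name used in the Quick Start examples). The `authHeaders` helper is hypothetical and is not part of any Monet SDK.

```typescript
// Hypothetical helper: builds the request headers used by the examples in this document.
// It fails fast when the key is missing so that no unauthenticated request is sent.
function authHeaders(apiKey: string | undefined): Record<string, string> {
  if (!apiKey || apiKey.trim() === "") {
    throw new Error(
      "MONET_API_KEY is not set; create a key at https://monet.vision/skills/keys",
    );
  }
  return {
    "Content-Type": "application/json",
    Authorization: `Bearer ${apiKey.trim()}`,
  };
}

// Usage (Node.js): const headers = authHeaders(process.env.MONET_API_KEY);
```

Passing the key in as an argument, rather than reading the environment inside the helper, keeps the function easy to test and leaves the choice of configuration source to the caller.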
## Quick Start
### Create a Video Generation Task
```bash
curl -X POST https://monet.vision/api/v1/tasks/async \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MONET_API_KEY" \
  -d '{
    "type": "video",
    "input": {
      "model": "sora-2",
      "prompt": "A cat running in the park",
      "duration": 10,
      "aspect_ratio": "16:9"
    },
    "idempotency_key": "unique-key-123"
  }'
```
> ⚠️ **Important**: `idempotency_key` is **required**. Use a unique value (e.g., UUID) to prevent duplicate task creation if the request is retried.
Response:
```json
{
  "id": "task_abc123",
  "status": "pending",
  "type": "video",
  "created_at": "2026-02-27T10:00:00Z"
}
```
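The create request, including the required `idempotency_key`, could be issued from TypeScript roughly as shown here. This is a sketch under stated assumptions: `newTaskBody` and `createTask` are hypothetical helper names, the retry policy (retry only on network errors and 5xx responses, up to three attempts) is an illustrative choice, and the way the server treats a repeated key is inferred only from the note above.

```typescript
import { randomUUID } from "node:crypto";

const CREATE_TASK_URL = "https://monet.vision/api/v1/tasks/async";

interface CreateTaskBody {
  type: string; // e.g. "video", as in the curl example
  input: Record<string, unknown>; // model-specific parameters
  idempotency_key: string;
}

// Generate the key once per logical task and reuse the same body on every retry.
function newTaskBody(type: string, input: Record<string, unknown>): CreateTaskBody {
  return { type, input, idempotency_key: randomUUID() };
}

// Hypothetical helper: POSTs the body, retrying transient failures with the same key.
async function createTask(
  body: CreateTaskBody,
  apiKey: string,
  maxAttempts = 3,
): Promise<unknown> {
  let lastError: unknown = new Error("no attempt was made");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    let response: Response;
    try {
      response = await fetch(CREATE_TASK_URL, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${apiKey}`,
        },
        body: JSON.stringify(body), // identical idempotency_key on each attempt
      });
    } catch (error) {
      lastError = error; // network failure: retry with the unchanged key
      continue;
    }
    if (response.ok) return response.json();
    lastError = new Error(`Create task failed with HTTP ${response.status}`);
    if (response.status < 500) break; // client errors will not succeed on retry
  }
  throw lastError;
}
```

The key is generated when the body is built, not inside the retry loop, so a retried request carries the same key as the first attempt.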
### Get Task Status and Result
Task processing is asynchronous. Poll the task status until it becomes `success` or `failed`. **Recommended polling interval: 5 seconds**.
```bash
curl https://monet.vision/api/v1/tasks/task_abc123 \
-H "Authorization: Bearer $MONET_API_KEY"
```
Response when completed:
```json
{
  "id": "task_abc123",
  "status": "success",
  "type": "video",
  "outputs": [
    {
      "model": "sora-2",
      "status": "success",
      "progress": 100,
      "url": "https://files.monet.vision/..."
    }
  ],
  "created_at": "2026-02-27T10:00:00Z",
  "updated_at": "2026-02-27T10:01:30Z"
}
```
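For typed code, the two sample responses can be described with interfaces along these lines. The field names come from the samples above; which fields are optional, and the `outputUrls` helper itself, are assumptions, and the real API may return additional fields.

```typescript
// Shapes inferred from the sample responses in this document.
interface TaskOutput {
  model: string;
  status: string;
  progress: number;
  url?: string; // assumed to be absent until the output is ready
}

interface Task {
  id: string;
  status: string; // "pending", "success", and "failed" appear in this document
  type: string;
  outputs?: TaskOutput[]; // not present in the sample "pending" response
  created_at: string;
  updated_at?: string;
}

// Hypothetical helper: collects the URLs of outputs that finished successfully.
function outputUrls(task: Task): string[] {
  return (task.outputs ?? [])
    .filter((output) => output.status === "success" && typeof output.url === "string")
    .map((output) => output.url as string);
}

// The completed sample response from above, reused as test data.
const completedTask: Task = {
  id: "task_abc123",
  status: "success",
  type: "video",
  outputs: [
    { model: "sora-2", status: "success", progress: 100, url: "https://files.monet.vision/..." },
  ],
  created_at: "2026-02-27T10:00:00Z",
  updated_at: "2026-02-27T10:01:30Z",
};

console.log(outputUrls(completedTask)); // logs the single output URL
```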
**Example: Poll until completion**
```typescript
const TASK_ID = "task_abc123";
const MONET_API_KEY = process.env.MONET_API_KEY;

// Poll the task endpoint every 5 seconds until it reaches a terminal status.
async function pollTask() {
  while (true) {
    const response = await fetch(
      `https://monet.vision/api/v1/tasks/${TASK_ID}`,
      {
        headers: {
          Authorization: `Bearer ${MONET_API_KEY}`,
        },
      },
    );
    // Stop on HTTP errors instead of parsing an error body as a task.
    if (!response.ok) {
      throw new Error(`Request failed with HTTP ${response.status}`);
    }

    const data = await response.json();
    const status = data.status;

    if (status === "success") {
      console.log("Task completed successfully!");
      console.log(JSON.stringify(data, null, 2));
      break;
    } else if (status === "failed") {
      console.log("Task failed!");
      console.log(JSON.stringify(data, null, 2));
      break;
    } else {
      console.log(`Task status: ${status}, waiting...`);
      await new Promise((resolve) => setTimeout(resolve, 5000)); // Wait 5 seconds
    }
  }
}

pollTask().catch((error) => console.error(error));
```
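A loop that polls until a terminal status never returns if a task stays pending, so an agent may want an upper bound on attempts. The variant below is a sketch, not part of the Monet API: `pollUntilDone` is a hypothetical helper, the default of 120 attempts (about 10 minutes at the recommended 5-second interval) is an arbitrary choice, and the task fetcher is passed in as a function so the loop can be exercised without network access.

```typescript
interface TaskSnapshot {
  status: string;
  [key: string]: unknown;
}

interface PollOptions {
  intervalMs?: number; // default 5000, the recommended polling interval
  maxAttempts?: number; // default 120, an arbitrary upper bound
}

// Calls getTask() until the status is "success" or "failed", or gives up after maxAttempts.
async function pollUntilDone(
  getTask: () => Promise<TaskSnapshot>,
  { intervalMs = 5000, maxAttempts = 120 }: PollOptions = {},
): Promise<TaskSnapshot> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const task = await getTask();
    if (task.status === "success" || task.status === "failed") {
      return task; // terminal status: hand the full task back to the caller
    }
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Task did not finish after ${maxAttempts} attempts`);
}

// Usage, with url and headers defined by the caller:
// const task = await pollUntilDone(() =>
//   fetch(url, { headers }).then((response) => response.json()),
// );
```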
## Supported Models
### Video Generation
#### Sora (OpenAI)
**sora-2** - Sora 2
_OpenAI's latest video generation model_
- 🎯 **Use Cases**: Video projects requiring OpenAI's latest technology
- ⏱️ **Duration**: 10 or 15 seconds
- 🎵 **Features**: Audio generation support, reference image support
```typescript
{
  model: "sora-2",
  prompt: string, // Required
  images?: string[], // Optional: Reference images
  duration?: 10 | 15, // Optional, default: 10
  aspect_ratio?: "16:9" | "9:16"
}
```
**sora-2-pro** - Sora 2 Pro
_Perfect quality for cinematic scenes_
- 🎯 **Use Cases**: Professional film, advertising, and high-end production
- ⏱️ **Duration**: 15 or 25 seconds
- 🎵 **Features**: Audio generation support, reference image support
```typescript
{
  model: "sora-2-pro",
  prompt: string,
  images?: string[],
  duration?: 15 | 25, // Optional, default: 15
  aspect_ratio?: "16:9" | "9:16"
}
```
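Because the two Sora models accept different `duration` values, a client-side check before submitting a task can catch mismatches early. The sketch below copies the allowed values from the listings above; `validateSoraInput` is a hypothetical helper, and whether the API rejects or adjusts other durations is not stated in this document.

```typescript
// Allowed durations per model, copied from the parameter listings above.
const SORA_DURATIONS = {
  "sora-2": [10, 15],
  "sora-2-pro": [15, 25],
} as const;

type SoraModel = keyof typeof SORA_DURATIONS;

interface SoraInput {
  model: SoraModel;
  prompt: string;
  images?: string[];
  duration?: number;
  aspect_ratio?: "16:9" | "9:16";
}

// Returns a list of problems; an empty list means the input matches the listings.
function validateSoraInput(input: SoraInput): string[] {
  const problems: string[] = [];
  if (input.prompt.trim() === "") {
    problems.push("prompt is required");
  }
  const allowed: readonly number[] = SORA_DURATIONS[input.model];
  if (input.duration !== undefined && !allowed.includes(input.duration)) {
    problems.push(`duration for ${input.model} must be one of ${allowed.join(", ")}`);
  }
  return problems;
}

console.log(validateSoraInput({ model: "sora-2", prompt: "A cat running in the park", duration: 5 }));
```

The final call reports one problem, since 5 is not a listed duration for `sora-2`.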
#### Veo (Google)
**veo-3-1-fast** - Google Veo 3.1 Fast
_Ultra-fast video generation_
- 🎯 **Use Cases**: Video projects requiring fast generation
- ⏱️ **Duration**: 8 seconds
- 📺 **Resolution**: 1080p with audio generation support
```typescript
{
  model: "veo-3-1-fast",
  prompt: string,
  images?: string[], // Reference images
  aspect_ratio?: "16:9" | "9:16"
}
```
**veo-3-1** - Google Veo 3.1
_Advanced AI video with sound_
- 🎯 **Use Cases**: Professional-grade video production
- ⏱️ **Duration**: 8 seconds
- 📺 **Resolution**: 1080p with audio generation support
```typescript
{
  model: "veo-3-1",
  prompt: string,
  images?: string[],
  aspect_ratio?: "16:9" | "9:16"
}
```
**veo-3-fast** - Google Veo 3 Fast
_30% faster than standard Veo 3_
- 🎯 **Use Cases**: Video projects requiring rapid iteration
- ⏱️ **Duration**: 8 seconds
- 📺 **Resolution**: 1080p, supports negative prompts
```typescript
{
  model: "veo-3-fast",
  prompt: string,
  images?: string[],
  negative_prompt?: string // Specify unwanted content
}
```
**veo-3** - Google Veo 3
_High-quality video generation_
- 🎯 **Use Cases**: Standard high-quality video production
- ⏱️ **Duration**: 8 seconds
- 📺 **Resolution**: 1080p, supports negative prompts
```typescript
{
  model: "veo-3",
  prompt: string,
  images?: string[],
  negative_prompt?: string
}
```
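In the listings above, the Veo 3.1 models list `aspect_ratio` while the Veo 3 models list `negative_prompt` instead. A small builder can keep an agent from mixing the two sets. This is a sketch: `buildVeoInput` is a hypothetical helper, the field table is copied from the listings, and how the API handles unlisted fields (ignore or reject) is not stated here.

```typescript
// Optional fields listed for each Veo model in this document.
const VEO_OPTIONAL_FIELDS: Record<string, readonly string[]> = {
  "veo-3-1-fast": ["images", "aspect_ratio"],
  "veo-3-1": ["images", "aspect_ratio"],
  "veo-3-fast": ["images", "negative_prompt"],
  "veo-3": ["images", "negative_prompt"],
};

// Builds the `input` object for a Veo task, refusing options the model does not list.
function buildVeoInput(
  model: string,
  prompt: string,
  options: Record<string, unknown> = {},
): Record<string, unknown> {
  const allowed = VEO_OPTIONAL_FIELDS[model];
  if (!allowed) {
    throw new Error(`Unknown Veo model: ${model}`);
  }
  const input: Record<string, unknown> = { model, prompt };
  for (const [key, value] of Object.entries(options)) {
    if (!allowed.includes(key)) {
      throw new Error(`${model} does not list option "${key}"`);
    }
    input[key] = value;
  }
  return input;
}

console.log(buildVeoInput("veo-3", "A lighthouse at dusk", { negative_prompt: "blurry" }));
```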
#### Wan
**wan-2-6** - Wan 2.6
_Multi-shot and automatic audio_
- 🎯 **Use Cases**: Video production requiring multi-shot switching
- ⏱️ **Duration**: 5, 10, or 15 seconds
- 📺 **Resolution**: 720p-1080p with audio generation support
```typescript
{
  model: "wan-2-6",
  prompt: string,
  images?: string[],
  duration?: 5 | 10 | 15,
  resolution?: "720p" | "1080p",
  aspect_ratio?: "16:9" | "9:16" | "4:3" | "3:4" | "1:1",
  shot_type?: "single" | "multi" // Single/multi-shot switching
}
```
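As an illustration, a multi-shot `wan-2-6` request body might be assembled as follows. This is a sketch under assumptions: the envelope (`type`, `input`, `idempotency_key`) follows the Quick Start example, `type: "video"` is assumed to apply to Wan as it does to Sora there, and `wan26TaskBody`, the prompt text, and the key value are made up for the example.

```typescript
// Parameters listed for wan-2-6 above.
interface Wan26Input {
  model: "wan-2-6";
  prompt: string;
  images?: string[];
  duration?: 5 | 10 | 15;
  resolution?: "720p" | "1080p";
  aspect_ratio?: "16:9" | "9:16" | "4:3" | "3:4" | "1:1";
  shot_type?: "single" | "multi";
}

// Hypothetical helper: wraps wan-2-6 parameters in the task envelope from the Quick Start.
function wan26TaskBody(params: Omit<Wan26Input, "model">, idempotencyKey: string) {
  const input: Wan26Input = { model: "wan-2-6", ...params };
  return { type: "video", input, idempotency_key: idempotencyKey };
}

const wanBody = wan26TaskBody(
  {
    prompt: "A chef plating a dessert, then a close-up of the finished dish",
    duration: 15,
    resolution: "1080p",
    aspect_ratio: "16:9",
    shot_type: "multi", // request multi-shot switching
  },
  "unique-key-456",
);

console.log(JSON.stringify(wanBody, null, 2));
```

Typing `duration`, `resolution`, and `shot_type` as literal unions lets the compiler flag values the listing does not include, such as a 480p resolution.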
**wan-2-5** - Wan 2.5
_Supports automatic audio generation_
- 🎯 **Use Cases**: Quickly generating videos with audio
- ⏱️ **Duration**: 5 or 10 seconds
- 📺 **Resolution**: 480p-1080p with audio support
```typescript
{
  model: "wan-2-5",
  prompt: string,
  images?: string[],
  duration?: 5 | 10,
  resolution?: "480p" | "720p" | "1080p",
  aspect_ratio?: "16:9" | "9:16" | "4:3" | "3:4" | "1:1"
}
```
**wan-2-2-flash** - Wan 2.2 Flash
_Instruction understanding, controllable camera movement_
- 🎯 **Use Cases**: Scenarios requiring precise camera movement control
- ⏱️ **Duration**: 5 or 10 seconds
- 📺 **Resolution**: 480p-1080p
```typescript
{
  model: "wan-2-2-flash",
  prompt: string,
  images?: string[],
  duration?: 5 | 10,
  resolution?: "480p" | "720p" | "1080p",
  negative_prompt?: string
}
```
**wan-2-2**