Firecrawl

ClawSkills · Author: maton · v1.0.0

Firecrawl API integration with managed authentication. Scrape, crawl, map, and search web content. Use this skill when users want to extract content from websites, crawl entire sites, map URLs, or search the web. For other third-party apps, use the api-gateway skill (https://clawhub.ai/byungkyu/api-gateway).


Installation / Download

TotalClaw CLI (recommended)
totalclaw install clawskills:byungkyu~firecrawl-api
cURL (direct download, no login required)
curl -fsSL https://skills.taituai.com/api/skills/clawskills%3Abyungkyu~firecrawl-api/file -o firecrawl-api.md
Git (fetch the source from the repository)
git clone https://github.com/openclaw/skills/commit/ee2a37b8a2b01854f8068e6484a2d49115a7547b
# Firecrawl

Access the Firecrawl API with managed authentication. Scrape webpages, crawl entire websites, map site URLs, and search the web with full content extraction.

## Quick Start

```bash
# Scrape a webpage
python <<'EOF'
import urllib.request, os, json
data = json.dumps({"url": "https://example.com", "formats": ["markdown"]}).encode()
req = urllib.request.Request('https://gateway.maton.ai/firecrawl/v2/scrape', data=data, method='POST')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
req.add_header('Content-Type', 'application/json')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
```

## Base URL

```
https://gateway.maton.ai/firecrawl/{native-api-path}
```

Replace `{native-api-path}` with the actual Firecrawl API endpoint path (for example, `v2/scrape`). The gateway proxies requests to `api.firecrawl.dev` and automatically injects your Firecrawl API key, so your requests only carry the Maton API key.
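
As a minimal sketch of the path substitution described above, the helper below joins the gateway base with a native Firecrawl path. `gateway_url` is a hypothetical convenience function, not part of any Maton or Firecrawl SDK.

```python
GATEWAY_BASE = "https://gateway.maton.ai/firecrawl"  # base URL given above


def gateway_url(native_path: str) -> str:
    """Join the gateway base with a native Firecrawl API path.

    Hypothetical helper for illustration; not part of any SDK.
    """
    # Accept paths written with or without a leading slash.
    return f"{GATEWAY_BASE}/{native_path.lstrip('/')}"


print(gateway_url("/v2/scrape"))
print(gateway_url("v2/crawl"))
```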

## Authentication

All requests require the Maton API key in the Authorization header:

```
Authorization: Bearer $MATON_API_KEY
```

**Environment Variable:** Set your API key as `MATON_API_KEY`:

```bash
export MATON_API_KEY="YOUR_API_KEY"
```
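
The examples in this document read the key with `os.environ["MATON_API_KEY"]`, which raises a bare `KeyError` when the variable is unset. One possible way to fail with a clearer message is sketched below; `auth_headers` is a hypothetical helper, and the `env` parameter exists only so the function can be exercised without a real key.

```python
import os


def auth_headers(env=os.environ) -> dict:
    """Return the Authorization header, failing early if the key is missing."""
    key = env.get("MATON_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "MATON_API_KEY is not set; export it before calling the gateway"
        )
    return {"Authorization": f"Bearer {key}"}


# Demo with an explicit mapping, so no real key is needed here.
print(auth_headers({"MATON_API_KEY": "YOUR_API_KEY"}))
```

The returned dictionary can be passed as the `headers` argument of `urllib.request.Request`.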

### Getting Your API Key

1. Sign in or create an account at [maton.ai](https://maton.ai)
2. Go to [maton.ai/settings](https://maton.ai/settings)
3. Copy your API key

## Connection Management

Manage your Firecrawl connections at `https://ctrl.maton.ai`.

### List Connections

```bash
python <<'EOF'
import urllib.request, os, json
req = urllib.request.Request('https://ctrl.maton.ai/connections?app=firecrawl&status=ACTIVE')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
```

### Create Connection

```bash
python <<'EOF'
import urllib.request, os, json
data = json.dumps({'app': 'firecrawl'}).encode()
req = urllib.request.Request('https://ctrl.maton.ai/connections', data=data, method='POST')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
req.add_header('Content-Type', 'application/json')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
```

### Get Connection

```bash
python <<'EOF'
import urllib.request, os, json
connection_id = "b5449045-2dcd-4e99-816f-65f80511affb"  # example ID; replace with your own
req = urllib.request.Request(f'https://ctrl.maton.ai/connections/{connection_id}')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
```

**Response:**
```json
{
  "connection": {
    "connection_id": "b5449045-2dcd-4e99-816f-65f80511affb",
    "status": "ACTIVE",
    "creation_time": "2026-03-11T09:49:09.917114Z",
    "last_updated_time": "2026-03-11T09:49:27.616143Z",
    "url": "https://connect.maton.ai/?session_token=...",
    "app": "firecrawl",
    "metadata": {},
    "method": "API_KEY"
  }
}
```
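
Before sending requests through a specific connection, it can be useful to confirm that the connection is active. The sketch below checks the `app` and `status` fields of a response shaped like the one above. `usable_connection_id` is a hypothetical helper, and the sample is a trimmed copy of that response.

```python
import json

# Trimmed copy of the sample response above.
SAMPLE = json.loads("""
{
  "connection": {
    "connection_id": "b5449045-2dcd-4e99-816f-65f80511affb",
    "status": "ACTIVE",
    "app": "firecrawl",
    "method": "API_KEY"
  }
}
""")


def usable_connection_id(response: dict):
    """Return the connection ID for an ACTIVE Firecrawl connection, else None."""
    conn = response.get("connection", {})
    if conn.get("app") == "firecrawl" and conn.get("status") == "ACTIVE":
        return conn.get("connection_id")
    return None


print(usable_connection_id(SAMPLE))
```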

### Delete Connection

```bash
python <<'EOF'
import urllib.request, os, json
connection_id = "b5449045-2dcd-4e99-816f-65f80511affb"  # example ID; replace with your own
req = urllib.request.Request(f'https://ctrl.maton.ai/connections/{connection_id}', method='DELETE')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
```

### Specifying Connection

If you have multiple Firecrawl connections, specify which one to use with the `Maton-Connection` header:

```bash
python <<'EOF'
import urllib.request, os, json
data = json.dumps({"url": "https://example.com"}).encode()
req = urllib.request.Request('https://gateway.maton.ai/firecrawl/v2/scrape', data=data, method='POST')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
req.add_header('Content-Type', 'application/json')
req.add_header('Maton-Connection', 'b5449045-2dcd-4e99-816f-65f80511affb')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
```

If omitted, the gateway uses the default (oldest) active connection.
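
A minimal sketch of making the header optional, so the same code path works with one connection or several. `scrape_request` is a hypothetical helper; it only builds the request and sends nothing.

```python
import json
import urllib.request


def scrape_request(api_key, page_url, connection_id=None):
    """Build (but do not send) a scrape request, optionally pinned to a connection."""
    body = json.dumps({"url": page_url}).encode()
    req = urllib.request.Request(
        "https://gateway.maton.ai/firecrawl/v2/scrape", data=body, method="POST"
    )
    req.add_header("Authorization", f"Bearer {api_key}")
    req.add_header("Content-Type", "application/json")
    if connection_id:
        # Without this header the gateway falls back to the default connection.
        req.add_header("Maton-Connection", connection_id)
    return req


pinned = scrape_request(
    "YOUR_API_KEY", "https://example.com", "b5449045-2dcd-4e99-816f-65f80511affb"
)
default = scrape_request("YOUR_API_KEY", "https://example.com")
# urllib stores header names via str.capitalize(), hence "Maton-connection".
print(pinned.has_header("Maton-connection"), default.has_header("Maton-connection"))
```

Pass the returned object to `urllib.request.urlopen` to send it.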

## API Reference

### Scrape

```bash
POST /firecrawl/v2/scrape
```

Extract content from a single webpage.

**Required Parameters:**
- `url` (string): The webpage URL to scrape

**Optional Parameters:**
- `formats` (array): Output formats - "markdown", "html", "json", "screenshot", "links" (default: ["markdown"])
- `onlyMainContent` (boolean): Extract only main content, exclude headers/footers (default: true)
- `includeTags` (array): HTML tags to include
- `excludeTags` (array): HTML tags to exclude
- `waitFor` (integer): Milliseconds to wait before scraping (default: 0)
- `timeout` (integer): Request timeout in ms (default: 30000, max: 300000)
- `mobile` (boolean): Emulate mobile device (default: false)
- `actions` (array): Browser actions to perform before scraping
- `headers` (object): Custom HTTP headers
- `blockAds` (boolean): Block ads and cookie banners (default: true)
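
The limits listed above can be checked locally before a request is sent. The sketch below is one possible validation, assuming `formats` entries are the plain strings listed above. `scrape_payload` and its constants are hypothetical names, not part of the Firecrawl API.

```python
ALLOWED_FORMATS = {"markdown", "html", "json", "screenshot", "links"}
MAX_TIMEOUT_MS = 300_000  # documented maximum for `timeout`


def scrape_payload(url: str, **options) -> dict:
    """Build a scrape request body, rejecting values outside the documented limits."""
    formats = options.get("formats", ["markdown"])
    unknown = set(formats) - ALLOWED_FORMATS
    if unknown:
        raise ValueError(f"unsupported formats: {sorted(unknown)}")
    timeout = options.get("timeout")
    if timeout is not None and not 0 < timeout <= MAX_TIMEOUT_MS:
        raise ValueError(f"timeout must be between 1 and {MAX_TIMEOUT_MS} ms")
    # Options are passed through unchanged; omitted ones use the API defaults.
    return {"url": url, **options}


print(scrape_payload("https://example.com", formats=["markdown", "links"], timeout=60_000))
```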

**Example:**
```bash
python <<'EOF'
import urllib.request, os, json
data = json.dumps({
    "url": "https://docs.firecrawl.dev",
    "formats": ["markdown", "html"],
    "onlyMainContent": True,
    "waitFor": 1000
}).encode()
req = urllib.request.Request('https://gateway.maton.ai/firecrawl/v2/scrape', data=data, method='POST')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
req.add_header('Content-Type', 'application/json')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
```

**Response:**
```json
{
  "success": true,
  "data": {
    "markdown": "# Example Domain\n\nThis domain is for use in documentation...",
    "metadata": {
      "title": "Example Domain",
      "language": "en",
      "sourceURL": "https://example.com",
      "url": "https://example.com/",
      "statusCode": 200,
      "contentType": "text/html",
      "creditsUsed": 1
    }
  }
}
```
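
A short sketch of pulling the useful fields out of a response shaped like the one above. `scrape_result` is a hypothetical helper, and the sample dictionary is a trimmed copy of that response.

```python
def scrape_result(response: dict):
    """Return (markdown, metadata) from a scrape response, or raise on failure."""
    if not response.get("success"):
        raise RuntimeError(f"scrape failed: {response}")
    data = response.get("data", {})
    return data.get("markdown", ""), data.get("metadata", {})


# Trimmed copy of the sample response above.
sample = {
    "success": True,
    "data": {
        "markdown": "# Example Domain\n\nThis domain is for use in documentation...",
        "metadata": {
            "title": "Example Domain",
            "sourceURL": "https://example.com",
            "statusCode": 200,
            "creditsUsed": 1,
        },
    },
}

markdown, meta = scrape_result(sample)
print(meta["title"], meta["statusCode"], meta["creditsUsed"])
print(markdown.splitlines()[0])
```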

### Crawl (Start)

```bash
POST /firecrawl/v2/crawl
```

Start crawling an entire website. Returns a crawl ID for status polling.

**Required Parameters:**
- `url` (string): The base URL to start crawling from

**Optional Parameters:**
- `limit` (integer): Maximum pages to crawl (default: 10000)
- `maxDepth` (integer): Maximum crawl depth
- `includePaths` (array): Regex patterns for URLs to include
- `excludePaths` (array): Regex patterns for URLs to exclude
- `allowSubdomains` (boolean): Enable subdomain crawling
- `allowExternalLinks` (boolean): Follow external links
- `scrapeOptions` (object): Options for each page scrape (formats, onlyMainContent, etc.)
- `webhook` (string): Webhook URL for completion notification

**Example:**
```bash
python <<'EOF'
import urllib.request, os, json
data = json.dumps({
    "url": "https://example.com",
    "limit": 10,
    "scrapeOptions": {
        "formats": ["markdown"]
    }
}).encode()
req = urllib.request.Request('https://gateway.maton.ai/firecrawl/v2/crawl', data=data, method='POST')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
req.add_header('Content-Type', 'application/json')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
```

**Response:**
```json
{
  "success": true,
  "id": "019cdc53-0acf-76ec-a80c-3ead753b2730",
  "url": "https://api.firecrawl.dev/v1/crawl/019cdc53-0acf-76ec-a80c-3ead753b2730"
}
```
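
Note that the `url` field in this sample points directly at `api.firecrawl.dev`, not at the gateway. Assuming status polling should go through the gateway like every other call in this document, a small sketch can rebuild the status URL from the returned `id`. `status_url` is a hypothetical helper.

```python
GATEWAY_CRAWL = "https://gateway.maton.ai/firecrawl/v2/crawl"


def status_url(start_response: dict) -> str:
    """Build the gateway status URL from a crawl-start response."""
    if not start_response.get("success") or "id" not in start_response:
        raise RuntimeError(f"crawl did not start: {start_response}")
    # Use the job ID only; the response's own `url` bypasses the gateway.
    return f"{GATEWAY_CRAWL}/{start_response['id']}"


sample = {
    "success": True,
    "id": "019cdc53-0acf-76ec-a80c-3ead753b2730",
    "url": "https://api.firecrawl.dev/v1/crawl/019cdc53-0acf-76ec-a80c-3ead753b2730",
}
print(status_url(sample))
```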

### Crawl (Get Status)

```bash
GET /firecrawl/v2/crawl/{id}
```

Get the status and results of a crawl job.

**Path Parameters:**
- `id` (string): The crawl job ID

**Example:**
```bash
python <<'EOF'
import urllib.request, os, json
crawl_id = "019cdc53-0acf-76ec-a80c-3ead753b2730"
req = urllib.request.Request(f'https://gateway.maton.ai/firecrawl/v2/crawl/{crawl_id}')
req.add_header('Authorization', f'Bearer {os.environ["MATON_API_KEY"]}')
print(json.dumps(json.load(urllib.request.urlopen(req)), indent=2))
EOF
```
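
Because a crawl runs asynchronously, the status endpoint is usually called in a loop. The sketch below shows one possible polling loop. Only the `completed` status appears in this document; treating `failed` and `cancelled` as terminal is an assumption, and `wait_for_crawl`, `interval`, and `max_polls` are hypothetical names. The fetch step is passed in as a function, so the loop can be demonstrated offline.

```python
import time


def wait_for_crawl(fetch_status, interval=2.0, max_polls=30, sleep=time.sleep):
    """Call fetch_status() until the crawl completes, fails, or polls run out."""
    for _ in range(max_polls):
        result = fetch_status()
        status = result.get("status")
        if status == "completed":
            return result
        if status in ("failed", "cancelled"):  # assumed terminal statuses
            raise RuntimeError(f"crawl ended with status {status!r}")
        sleep(interval)  # still in progress; wait before the next poll
    raise TimeoutError(f"crawl not finished after {max_polls} polls")


# Offline demo: canned responses stand in for real GET requests.
responses = iter([
    {"status": "scraping", "completed": 1, "total": 2},
    {"status": "completed", "completed": 2, "total": 2},
])
final = wait_for_crawl(lambda: next(responses), sleep=lambda seconds: None)
print(final["completed"], "of", final["total"], "pages")
```

In real use, `fetch_status` would wrap the GET request from the example above and return the decoded JSON body; a single poll returns a body like the one shown under Response.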

**Response:**
```json
{
  "success": true,
  "status": "completed",
  "completed": 2,
  "total": 2,
  "creditsUsed": 2,
  "expiresAt": "2026-03-12T09:56:00.000Z",
  "data": [
    {
      "markdown": "# Example Domain\n\nThis domain is for use in documentation...",
      "metadata": {
        "title": "Example Domain",
        "sourceURL": "https://example.com",
        "statusCode": 200
      }
    }