Veryfi Documents AI
Real-time OCR and data extraction API by Veryfi (https://veryfi.com). Extract structured data from receipts, invoices, bank statements, W-9s, purchase orders, bills of lading, and any other document. Use it when you need to OCR documents, extract fields, parse receipts, invoices, or bank statements, classify documents, detect fraud, or get raw OCR text from any document.
Installation / Download
- TotalClaw CLI (recommended): `totalclaw install skilldb:dbirulia~documents-ai`
- cURL (direct download, no login required): `curl -fsSL https://skills.taituai.com/api/skills/skilldb%3Adbirulia~documents-ai/file -o documents-ai.md`
- Git repository (source code): `git clone https://github.com/openclaw/skills/commit/392584f13cee99004ffdd52101c9be193c2c51d1`
# Documents AI by Veryfi
Real-time OCR and data extraction API — extract structured data from receipts, invoices, bank statements, W-9s, purchase orders, and more, with document classification, fraud detection, and raw OCR text output.
> **Get your API key:** https://app.veryfi.com/api/settings/keys/
> **Learn more:** https://veryfi.com
## Quick Start
For Receipts and Invoices:
```bash
curl -X POST "https://api.veryfi.com/api/v8/partner/documents/" \
-H "Content-Type: multipart/form-data" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-F "file=@invoice.pdf"
```
Response:
```json
{
"id": 62047612,
"created_date": "2026-02-19",
"currency_code": "USD",
"date": "2026-02-18 14:22:00",
"document_type": "receipt",
"category": "Meals & Entertainment",
"is_duplicate": false,
"vendor": {
"name": "Starbucks",
"address": "123 Main St, San Francisco, CA 94105"
},
"line_items": [
{
"id": 1,
"order": 0,
"description": "Caffe Latte Grande",
"quantity": 1,
"price": 5.95,
"total": 5.95,
"type": "food"
}
],
"subtotal": 5.95,
"tax": 0.52,
"total": 6.47,
"payment": {
"type": "visa",
"card_number": "1234"
},
"ocr_text": "STARBUCKS\n123 Main St...",
"img_url": "https://scdn.veryfi.com/documents/...",
"pdf_url": "https://scdn.veryfi.com/documents/..."
}
```
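A quick sanity check on an extracted receipt is to confirm that `subtotal` plus `tax` equals `total`. The sketch below shows one way to do that in the shell. The function name `totals_consistent` and the half-cent tolerance are assumptions made for illustration; neither is part of the Veryfi API.

```bash
# Hypothetical helper: succeeds when subtotal + tax matches total
# within half a cent (the tolerance absorbs floating-point noise).
totals_consistent() {
  awk -v s="$1" -v t="$2" -v total="$3" 'BEGIN {
    diff = s + t - total
    if (diff < 0) diff = -diff
    exit (diff < 0.005 ? 0 : 1)
  }'
}

# Values taken from the sample response above.
if totals_consistent 5.95 0.52 6.47; then
  echo "totals consistent"
else
  echo "totals mismatch"
fi
```

A mismatch does not necessarily mean the extraction is wrong (tips, discounts, or rounding can account for it), so treat a failure as a prompt for review rather than a hard error.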
For Bank Statements:
```bash
curl -X POST "https://api.veryfi.com/api/v8/partner/bank-statements/" \
-H "Content-Type: multipart/form-data" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-F "file=@bank-statement.pdf"
```
Response:
```json
{
"id": 4820193,
"created_date": "2026-02-19T12:45:00.000000Z",
"bank_name": "Chase",
"bank_address": "270 Park Avenue, New York, NY 10017",
"account_holder_name": "Jane Doe",
"account_holder_address": "456 Oak Ave, San Francisco, CA 94110",
"account_number": "****7890",
"account_type": "Checking",
"routing_number": "021000021",
"currency_code": "USD",
"statement_date": "2026-01-31",
"period_start_date": "2026-01-01",
"period_end_date": "2026-01-31",
"beginning_balance": 12500.00,
"ending_balance": 11835.47,
"accounts": [
{
"number": "****7890",
"beginning_balance": 12500.00,
"ending_balance": 11835.47,
"summaries": [
{ "name": "Total Deposits", "total": 3200.00 },
{ "name": "Total Withdrawals", "total": 3864.53 }
],
"transactions": [
{
"order": 0,
"date": "2026-01-05",
"description": "Direct Deposit - ACME Corp Payroll",
"credit_amount": 3200.00,
"debit_amount": null,
"balance": 15700.00,
"category": "Income"
},
{
"order": 1,
"date": "2026-01-12",
"description": "Rent Payment - 456 Oak Ave",
"credit_amount": null,
"debit_amount": 2800.00,
"balance": 12900.00,
"category": "Housing"
},
{
"order": 2,
"date": "2026-01-20",
"description": "PG&E Utility Bill",
"credit_amount": null,
"debit_amount": 1064.53,
"balance": 11835.47,
"category": "Utilities"
}
]
}
],
"pdf_url": "https://scdn.veryfi.com/bank-statements/...",
"img_url": "https://scdn.veryfi.com/bank-statements/..."
}
```
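The transactions in a bank statement response can be cross-checked against the reported balances: starting from `beginning_balance`, each row's credit is added and its debit subtracted, and the result should match that row's `balance`. The following is a minimal sketch of that check. The helper name `running_balance_ok` and its whitespace-separated input format are assumptions for illustration.

```bash
# Hypothetical helper: reads "credit debit balance" rows on stdin
# (whitespace separated, 0 for a missing amount) and checks each running
# balance, starting from the beginning balance given as the first argument.
running_balance_ok() {
  awk -v bal="$1" '{
    bal = bal + $1 - $2
    diff = bal - $3
    if (diff < 0) diff = -diff
    if (diff >= 0.005) { printf "mismatch on row %d\n", NR; bad = 1 }
  }
  END { exit (bad ? 1 : 0) }'
}

# Rows from the sample response above (null amounts written as 0).
running_balance_ok 12500.00 <<'EOF'
3200.00 0 15700.00
0 2800.00 12900.00
0 1064.53 11835.47
EOF
echo "exit status: $?"
```

If `jq` is installed (an assumption, it is not required by the API), the rows could be produced from a saved response with a filter such as `.accounts[0].transactions[] | [.credit_amount // 0, .debit_amount // 0, .balance] | @tsv` run with `jq -r`.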
## Setup
### 1. Get Your API Key
```bash
# Visit the API Auth Credentials page in your browser:
# https://app.veryfi.com/api/settings/keys/
```
Export your credentials as environment variables:
```bash
export VERYFI_CLIENT_ID="your_client_id_here"
export VERYFI_USERNAME="your_username_here"
export VERYFI_API_KEY="your_api_key_here"
```
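Because every request in this document depends on these three variables, it can help to verify they are set before calling the API. The sketch below is one possible check. The function name `check_veryfi_env` is hypothetical, and the function deliberately prints only variable names, never their values.

```bash
# Hypothetical helper: reports any missing credential variable and fails.
check_veryfi_env() {
  missing=0
  for name in VERYFI_CLIENT_ID VERYFI_USERNAME VERYFI_API_KEY; do
    # Indirect lookup of the variable named in $name (works in POSIX sh and bash).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  unset value
  return "$missing"
}
```

A script could then start with `check_veryfi_env || exit 1` so that a missing credential fails fast with a clear message instead of an authentication error from the API.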
### 2. OpenClaw Configuration (Optional)
**Recommended: Use environment variables** (most secure):
```json5
{
skills: {
entries: {
"veryfi-documents-ai": {
enabled: true,
// Keys loaded from environment variables:
// VERYFI_CLIENT_ID, VERYFI_USERNAME, VERYFI_API_KEY
},
},
},
}
```
**Alternative: Store in config file** (use with caution):
```json5
{
skills: {
entries: {
"veryfi-documents-ai": {
enabled: true,
env: {
VERYFI_CLIENT_ID: "your_client_id_here",
VERYFI_USERNAME: "your_username_here",
VERYFI_API_KEY: "your_api_key_here",
},
},
},
},
}
```
**Security Note:** If storing API keys in `~/.openclaw/openclaw.json`:
- Set file permissions: `chmod 600 ~/.openclaw/openclaw.json`
- Never commit this file to version control
- Prefer environment variables or your agent's secret store when possible
- Rotate keys regularly and limit API key permissions if supported
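The permission step above can be scripted so that it both applies and verifies the restriction. This is a minimal sketch; the helper name `lock_down_config` is an assumption for illustration, and it relies only on the standard `chmod`, `ls`, and `cut` utilities.

```bash
# Hypothetical helper: restricts a config file to its owner and verifies the result.
lock_down_config() {
  config_file="$1"
  chmod 600 "$config_file" || return 1
  # The first ten characters of `ls -ld` are the file type and permission bits.
  perms=$(ls -ld "$config_file" | cut -c1-10)
  [ "$perms" = "-rw-------" ]
}
```

For example, `lock_down_config ~/.openclaw/openclaw.json || echo "permissions not applied" >&2` would flag a filesystem where the mode change did not take effect.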
## Common Tasks
### Extract data from a Receipt or Invoice (file upload)
```bash
curl -X POST "https://api.veryfi.com/api/v8/partner/documents/" \
-H "Content-Type: multipart/form-data" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-F "file=@invoice.pdf"
```
### Extract data from a Receipt or Invoice (base64)
When your agent already has the document as base64-encoded content (e.g., received via API, email attachment, or tool output), use `file_data` instead of uploading a file:
```bash
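# Encode the file first; strip newlines so the value stays a valid JSON string
# (GNU base64 wraps its output at 76 characters by default)
BASE64_DATA=$(base64 < invoice.pdf | tr -d '\n')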
curl -X POST "https://api.veryfi.com/api/v8/partner/documents/" \
-H "Content-Type: application/json" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-d "{
\"file_name\": \"invoice.pdf\",
\"file_data\": \"$BASE64_DATA\"
}"
```
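For larger documents, inlining the base64 string in the `-d` argument can exceed the operating system's command-line length limit. One way around this, sketched below, is to write the JSON body to a file and let curl read it. The helper name `build_veryfi_payload` is hypothetical, and the sketch assumes the file name contains no double quotes or backslashes, since it does no JSON escaping.

```bash
# Hypothetical helper: prints a JSON request body for the given file.
# Assumes the file name contains no double quotes or backslashes.
build_veryfi_payload() {
  file_path="$1"
  file_name=$(basename "$file_path")
  # Strip newlines so wrapped base64 output stays a valid JSON string.
  file_data=$(base64 < "$file_path" | tr -d '\n')
  printf '{"file_name":"%s","file_data":"%s"}' "$file_name" "$file_data"
}
```

With this in place, `build_veryfi_payload invoice.pdf > payload.json` produces the body, and the request above can send it with `--data-binary @payload.json` in place of the inline `-d` string.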
### Extract data from a URL
```bash
curl -X POST "https://api.veryfi.com/api/v8/partner/documents/" \
-H "Content-Type: application/json" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-d '{
"file_url": "https://example.com/invoice.pdf"
}'
```
### Extract data from a Passport
```bash
curl -X POST "https://api.veryfi.com/api/v8/partner/any-documents/" \
-H "Content-Type: multipart/form-data" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-F "file=@passport.jpg" \
-F "blueprint_name=passport"
```
### Extract data from Checks
```bash
curl -X POST "https://api.veryfi.com/api/v8/partner/checks/" \
-H "Content-Type: multipart/form-data" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-F "file=@check.jpg"
```
### Extract data from W-9s
```bash
curl -X POST "https://api.veryfi.com/api/v8/partner/w9s/" \
-H "Content-Type: multipart/form-data" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-F "file=@w9.pdf"
```
### Extract data from W-2s and W-8s
W-2 and W-8 forms do not have dedicated endpoints. Use the `any-documents` endpoint with the appropriate blueprint:
```bash
# W-2
curl -X POST "https://api.veryfi.com/api/v8/partner/any-documents/" \
-H "Content-Type: multipart/form-data" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-F "file=@w2.pdf" \
-F "blueprint_name=w2"
# W-8
curl -X POST "https://api.veryfi.com/api/v8/partner/any-documents/" \
-H "Content-Type: multipart/form-data" \
-H "Client-Id: $VERYFI_CLIENT_ID" \
-H "Authorization: apikey $VERYFI_USERNAME:$VERYFI_API_KEY" \
-F "file=@w8.pdf" \
-F "blueprint_name=w8"
```
> **Note:** W-2 and W-8 appear as classification types (via `/classify/`) but their extraction is handled through the Any Document endpoint. Do **not** POST to `/api/v8/partner/w2s/` or `/api/v8/partner/w8s/` — those endpoints do not exist.
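The endpoint choices described so far can be collected into a small lookup, which makes it harder to POST to a path that does not exist. The sketch below only restates the mappings shown in this document. The function name `veryfi_route` and the type labels it accepts (such as `bank-statement`) are assumptions for illustration, not API values.

```bash
# Hypothetical helper: prints "<endpoint path> <blueprint name>" for a document
# type, using "-" when no blueprint applies. Type labels are illustrative.
veryfi_route() {
  case "$1" in
    receipt|invoice) echo "documents/ -" ;;
    bank-statement)  echo "bank-statements/ -" ;;
    check)           echo "checks/ -" ;;
    w9)              echo "w9s/ -" ;;
    passport|w2|w8)  echo "any-documents/ $1" ;;
    *)               echo "unknown document type: $1" >&2; return 1 ;;
  esac
}

veryfi_route w2       # prints "any-documents/ w2"
veryfi_route receipt  # prints "documents/ -"
```

The first field is appended to `https://api.veryfi.com/api/v8/partner/`, and the second, when it is not `-`, is sent as the `blueprint_name` form field.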
### Get Raw OCR Text from a Document
All extraction endpoints return an `ocr_text` field in the response containing the raw text content of the document as a plain string. This is usef