ocr

TotalClaw · Author: totalclaw

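An optical character recognition (OCR) tool that extracts Chinese and English text from PDFs and images. Use cases: (1) extracting text from scanned PDFs, (2) recognizing text in images, (3) extracting text content from invoices, contracts, and other documents.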

Installation / Download

TotalClaw CLI (recommended)

totalclaw install totalclaw:totalclaw~roamerxv-ocr-python

cURL (direct download, no login required)

curl -fsSL https://skills.taituai.com/api/skills/totalclaw%3Atotalclaw~roamerxv-ocr-python/file -o roamerxv-ocr-python.md

## Original Text

# OCR Text Recognition

This skill uses PaddleOCR for text recognition, supporting both Chinese and English.

## Quick Start

### Basic Usage

Run OCR directly on an image or PDF file:

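```python
from paddleocr import PaddleOCR

# lang='ch' selects the Chinese model, which also recognizes English text
ocr = PaddleOCR(lang='ch')

# Replace the placeholder with the path to your own file
result = ocr.predict("file_path.jpg")
```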

## Dependency Installation

Install dependencies before first use:

```bash
pip3 install paddlepaddle paddleocr
```

## Output Format

Recognition results are returned as JSON with the following fields:
- `rec_texts`: List of recognized text strings
- `rec_scores`: Confidence score for each recognized string, in the same order as `rec_texts`
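
Because the two fields are parallel lists, they can be paired up and filtered by confidence. The following is a minimal sketch, not part of the skill itself: `extract_lines` and `min_score` are hypothetical names, the sample data is invented for illustration, and it assumes each result can be indexed like a dictionary with `rec_texts` and `rec_scores` at the top level, which may differ between PaddleOCR versions.

```python
def extract_lines(res, min_score=0.5):
    """Pair each recognized text with its score and drop low-confidence lines.

    `res` is assumed to behave like a dict exposing the `rec_texts` and
    `rec_scores` fields described above (an assumption, not a documented
    guarantee for every PaddleOCR version).
    """
    texts = res["rec_texts"]
    scores = res["rec_scores"]
    return [
        (text, float(score))
        for text, score in zip(texts, scores)
        if score >= min_score
    ]


# Invented stand-in for one recognition result; real values come from ocr.predict(...)
sample = {
    "rec_texts": ["Invoice No. 001", "Tota1", "Amount: 128.00"],
    "rec_scores": [0.98, 0.41, 0.93],
}

for text, score in extract_lines(sample, min_score=0.5):
    print(f"{score:.2f}  {text}")
```

The threshold of 0.5 is an arbitrary starting point; noisy scans of invoices or contracts may need a different value.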

## Typical Use Cases

1. **PDF Scans**: Use PyMuPDF to extract the page images first, then run OCR on them
2. **Image Text Recognition**: Run OCR directly on images
3. **Multi-page PDFs**: Process the document one page at a time
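
The PDF cases above can be sketched roughly as follows. This is one possible approach under stated assumptions, not the skill's own script: the function names (`render_pdf_pages`, `ocr_pdf`, `format_pages`), the 200 dpi setting, and the use of temporary PNG files are all illustrative choices, and each result is again assumed to expose `rec_texts` through dictionary-style access.

```python
import tempfile
from pathlib import Path


def render_pdf_pages(pdf_path, out_dir, dpi=200):
    """Render every PDF page to a PNG with PyMuPDF and return the image paths."""
    import fitz  # PyMuPDF, imported lazily so the helpers below work without it

    out_dir = Path(out_dir)
    image_paths = []
    doc = fitz.open(pdf_path)
    try:
        for index, page in enumerate(doc, start=1):
            pix = page.get_pixmap(dpi=dpi)  # rasterize the page
            image_path = out_dir / f"page_{index:04d}.png"
            pix.save(str(image_path))
            image_paths.append(image_path)
    finally:
        doc.close()
    return image_paths


def ocr_pdf(pdf_path, ocr, dpi=200):
    """Run OCR page by page; returns one list of recognized lines per page.

    `ocr` is a PaddleOCR instance created as in Basic Usage.
    """
    pages = []
    with tempfile.TemporaryDirectory() as tmp:
        for image_path in render_pdf_pages(pdf_path, tmp, dpi=dpi):
            lines = []
            for res in ocr.predict(str(image_path)):
                lines.extend(res["rec_texts"])  # assumed dict-style access
            pages.append(lines)
    return pages


def format_pages(pages):
    """Join per-page lines into one text with a marker before each page."""
    blocks = []
    for number, lines in enumerate(pages, start=1):
        blocks.append(f"--- Page {number} ---\n" + "\n".join(lines))
    return "\n\n".join(blocks)


# Formatting demo on invented data; real input would come from ocr_pdf(...)
print(format_pages([["Contract", "Party A"], ["Signature"]]))
```

With both libraries installed, a call such as `format_pages(ocr_pdf("scan.pdf", PaddleOCR(lang='ch')))` would process a scanned file end to end; `scan.pdf` is a placeholder name. Rendering one page at a time keeps only a single page image in flight per OCR call, which matters for long documents.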

## Scripts

Common scripts are located in the `scripts/` directory.