nano-pdf-edit

TotalClaw | Author: totalclaw | v1.0.1

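Edit PDF files visually with natural language using the nano-pdf CLI tool, powered by Google's Gemini 3 Pro Image (Nano Banana). Use this skill whenever the user wants to edit, modify, or update PDF slides or pages with AI, including fixing typos, updating charts/graphs, changing colors or branding, adding new slides, modifying text, or making any visual change to a PDF deck or report. It also triggers when the user mentions "nano-pdf", "nano pdf", "edit my pdf", "update my slides", "fix my slides", or wants AI-powered changes to PDF content. The skill applies even if the user only says "change the title on page 3" or "fix the typo on slide 5" about a PDF file. Do not use it for extracting text, merging/splitting PDFs, filling forms, or other non-visual PDF operations.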

Installation / Download

TotalClaw CLI (recommended)
totalclaw install totalclaw:totalclaw~ps06756-nano-banana-pdf-skill
cURL (direct download, no login required)
curl -fsSL https://skills.taituai.com/api/skills/totalclaw%3Atotalclaw~ps06756-nano-banana-pdf-skill/file -o ps06756-nano-banana-pdf-skill.md

# Nano PDF Editing Skill

Edit PDF files with natural language prompts using the **nano-pdf** CLI tool.

Nano-PDF converts PDF pages to images, sends them to Google's Gemini 3 Pro Image with your edit instructions, then stitches the AI-edited pages back into the PDF — preserving searchable text via OCR re-hydration.

**Source**: https://github.com/gavrielc/Nano-PDF

## Prerequisites

Before running any nano-pdf command, ensure the following dependencies are installed. If any are missing, install them before proceeding:

1. **nano-pdf** — `pip install nano-pdf` (or use `uvx nano-pdf` to run without installing)
2. **poppler** — PDF-to-image rendering (`brew install poppler` on macOS / `sudo apt-get install poppler-utils` on Linux)
3. **tesseract** — OCR for text layer restoration (`brew install tesseract` on macOS / `sudo apt-get install tesseract-ocr` on Linux)
4. **GEMINI_API_KEY** — A **paid** Google Gemini API key (free tier does not support image generation). Get one at https://aistudio.google.com/api-keys — then `export GEMINI_API_KEY="your_key"`
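
The checks above can be sketched as a small shell function. This is a minimal sketch, not part of nano-pdf: the name `missing_deps` is hypothetical, and it assumes that finding `pdftoppm` on the `PATH` is a reasonable proxy for poppler being installed.

```bash
# Hypothetical helper: print the name of each missing prerequisite, one per line.
# Prints nothing when everything is in place.
missing_deps() {
  # nano-pdf can be installed, or run on demand through uvx
  if ! command -v nano-pdf >/dev/null 2>&1 && ! command -v uvx >/dev/null 2>&1; then
    echo "nano-pdf"
  fi
  # Assumption: pdftoppm (shipped with poppler) stands in for poppler itself
  command -v pdftoppm >/dev/null 2>&1 || echo "poppler"
  command -v tesseract >/dev/null 2>&1 || echo "tesseract"
  # Only checks that the key is set and non-empty, not that it is a paid key
  [ -n "${GEMINI_API_KEY:-}" ] || echo "GEMINI_API_KEY"
  return 0
}
```

If `missing_deps` prints anything, the listed items need attention before any nano-pdf command is run.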

## Two Commands

### `nano-pdf edit` — Modify existing pages

```bash
nano-pdf edit <file.pdf> <page> "<prompt>" [<page> "<prompt>" ...] [options]
```

Pages are 1-indexed. Multiple page+prompt pairs can be provided and are processed in parallel.

### `nano-pdf add` — Insert new AI-generated slides

```bash
nano-pdf add <file.pdf> <position> "<prompt>" [options]
```

Position 0 inserts at the beginning. The new slide automatically matches the visual style of the existing deck. Document context is enabled by default for `add`.
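
To append a slide at the end of a deck, one option is to read the page count with poppler's `pdfinfo` and pass it as the position. This is a sketch under an assumption: it treats position N as "insert after page N", which matches the summary-slide example in this document but is not spelled out for the last page. The helper names are hypothetical.

```bash
# Read `pdfinfo` output on stdin and print the page count.
pages_from_pdfinfo() {
  awk '/^Pages:/ { print $2; exit }'
}

# Hypothetical helper: print the position that should append after the last page,
# assuming position N means "insert after page N".
append_position() {
  pdfinfo "$1" | pages_from_pdfinfo
}

# Possible usage (not run here):
#   nano-pdf add deck.pdf "$(append_position deck.pdf)" "Closing slide: 'Thank you'"
```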

## Options Reference

For full details on all available flags, read `references/options.md` in this skill directory.

Key flags:
- `--output "new.pdf"` — Output filename (default: `edited_<original>.pdf`)
- `--resolution "4K"` — `4K` (default), `2K`, or `1K`
- `--style-refs "1,5"` — Pages to use as style references
- `--use-context` / `--no-use-context` — Include full PDF text as model context
- `--disable-google-search` — Prevent model from using Google Search

## Workflow

When a user asks to edit a PDF:

1. **Check dependencies** — Ensure nano-pdf, poppler, tesseract, and GEMINI_API_KEY are available. If any are missing, tell the user what to install and stop.
2. **Identify the edit** — Determine which page(s) need changes and what the prompt should be
3. **Choose the right command** — `edit` for modifying existing pages, `add` for inserting new ones
4. **Pick appropriate options**:
   - Use `--style-refs` if the user wants a specific visual style from certain pages
   - Use `--use-context` when editing multiple pages that need to be consistent
   - Use `--resolution "2K"` if speed matters more than quality
5. **Run nano-pdf** and present the output PDF to the user
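
Steps 3 and 4 can be sketched as a helper that turns a few hints into a command line. The function `plan_nano_pdf` and its hint words (`fast`, `consistent`, `style=`) are hypothetical and exist only for illustration; the flags they map to are the ones listed under Key flags. The helper prints the command for review and does not run it.

```bash
# Hypothetical helper: print a nano-pdf command line for review (nothing is executed).
# Arguments: <edit|add> <file.pdf> <page-or-position> <prompt> [hint...]
plan_nano_pdf() {
  mode=$1 file=$2 where=$3 prompt=$4
  shift 4
  case "$mode" in
    edit|add) ;;
    *) echo "mode must be edit or add" >&2; return 1 ;;
  esac
  opts=""
  for hint in "$@"; do
    case "$hint" in
      fast)       opts="$opts --resolution \"2K\"" ;;            # speed over quality
      consistent) opts="$opts --use-context" ;;                  # share full PDF text
      style=*)    opts="$opts --style-refs \"${hint#style=}\"" ;; # pages to imitate
    esac
  done
  printf 'nano-pdf %s %s %s "%s"%s\n' "$mode" "$file" "$where" "$prompt" "$opts"
}
```

For example, `plan_nano_pdf edit deck.pdf 3 "Fix the typo" fast style=1` prints `nano-pdf edit deck.pdf 3 "Fix the typo" --resolution "2K" --style-refs "1"`. The printed line is meant for a human to check, so prompts containing double quotes would need re-quoting before being pasted into a shell.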

## Prompt Writing Tips

The quality of the edit depends heavily on the prompt. Follow these guidelines:

- **Be specific**: "Change the title from 'Overview' to 'Q3 Summary'" beats "update the title"
- **Reference visible elements**: "The bar chart on the left side" helps the model locate what to change
- **One focused change per prompt**: For complex edits, use multiple page+prompt pairs
- **Mention what to preserve**: "Keep the layout the same but change the header color to blue"
- **Use style refs for consistency**: When updating branding across pages, point at a reference page
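
As an illustration of the first and fourth tips, a prompt can be assembled from the exact old and new text plus a note on what to preserve. The helper `title_prompt` is hypothetical and only builds a string; the wording is one possible phrasing, not a required format.

```bash
# Hypothetical helper: build a specific title-change prompt that also
# states what should stay untouched.
title_prompt() {
  printf "Change the title from '%s' to '%s'. Keep the layout the same." "$1" "$2"
}

# Possible usage (not run here):
#   nano-pdf edit report.pdf 1 "$(title_prompt 'Overview' 'Q3 Summary')"
```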

## Examples

For a comprehensive set of examples covering common use cases (typos, charts, branding, adding slides, batch edits), read `references/examples.md` in this skill directory.

Quick reference:

```bash
# Fix a typo on page 3
nano-pdf edit report.pdf 3 "Fix 'recieve' to 'receive'"

# Update chart data
# (the dollar sign is escaped so the shell does not expand $2 inside double quotes)
nano-pdf edit deck.pdf 12 "Update the revenue chart to show Q3 at \$2.5M"

# Multi-page branding update
nano-pdf edit slides.pdf \
  1 "Change header background to dark blue, text to white" \
  2 "Update the logo to show 'NewCorp' instead of 'OldCorp'" \
  --style-refs "1" --output branded.pdf

# Add a new title slide at the beginning
nano-pdf add deck.pdf 0 "Title slide: 'Annual Review 2025' with subtitle 'Building the Future'"

# Add a summary slide after page 5 using document context
nano-pdf add deck.pdf 5 "Summary slide with key takeaways as bullet points"
```

## Troubleshooting

| Issue | Solution |
|-------|----------|
| `Missing system dependencies` | Install the missing dependencies (see Prerequisites above), then restart the terminal |
| `GEMINI_API_KEY not found` | `export GEMINI_API_KEY="your_key"` |
| `PAID API key required` | Enable billing at https://aistudio.google.com/api-keys |
| Style mismatch | Use `--style-refs "1,3"` pointing at pages with desired style |
| Slow processing | Use `--resolution "2K"` or `"1K"` |
| Bad OCR / text layer | Use `--resolution "4K"` (the default) for better OCR accuracy |
| Model ignores part of prompt | Break into smaller, focused edits across multiple runs |
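
The last row, splitting a large edit into several runs, can be sketched by feeding each run's `--output` file into the next run. The helper `chain_edits` and the `stepN.pdf` file names are hypothetical; the helper only prints the commands so they can be reviewed before running.

```bash
# Hypothetical helper: print one nano-pdf command per prompt, chaining each
# run's --output into the next run's input. Nothing is executed.
# Arguments: <file.pdf> <page> <prompt>...
chain_edits() {
  input=$1 page=$2
  shift 2
  step=1
  for prompt in "$@"; do
    output="step${step}.pdf"
    printf 'nano-pdf edit %s %s "%s" --output %s\n' "$input" "$page" "$prompt" "$output"
    input=$output          # the next run starts from this run's result
    step=$((step + 1))
  done
}
```

Calling `chain_edits deck.pdf 4 "Change the header to blue" "Replace the logo"` prints two commands: the first reads `deck.pdf` and writes `step1.pdf`, the second reads `step1.pdf` and writes `step2.pdf`.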