nano-pdf-edit
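Edit PDF files visually using natural language with the nano-pdf CLI tool, powered by Google's Gemini 3 Pro Image (Nano Banana). Use this skill whenever the user wants to edit, modify, or update PDF slides or pages with AI, including fixing typos, updating charts/graphs, changing colors or branding, adding new slides, modifying text, or making any visual change to a PDF deck or report. It also triggers when the user mentions "nano-pdf", "nano pdf", "edit my pdf", "update my slides", "fix my slides", or wants AI-driven changes to PDF content. The skill applies even when the user simply says "change the title on page 3" or "fix the typo on slide 5" about a PDF file. Do not use it for text extraction, merging/splitting PDFs, form filling, or other non-visual PDF operations.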
Install / Download
TotalClaw CLI (recommended)
totalclaw install totalclaw:totalclaw~ps06756-nano-banana-pdf-skill
cURL (direct download, no login required)
curl -fsSL https://skills.taituai.com/api/skills/totalclaw%3Atotalclaw~ps06756-nano-banana-pdf-skill/file -o ps06756-nano-banana-pdf-skill.md

# Nano PDF Editing Skill

Edit PDF files with natural language prompts using the **nano-pdf** CLI tool. Nano-PDF converts PDF pages to images, sends them to Google's Gemini 3 Pro Image with your edit instructions, then stitches the AI-edited pages back into the PDF, preserving searchable text via OCR re-hydration.

**Source**: https://github.com/gavrielc/Nano-PDF

## Prerequisites

Before running any nano-pdf command, ensure the following dependencies are installed. If any are missing, install them before proceeding:

1. **nano-pdf**: `pip install nano-pdf` (or use `uvx nano-pdf` to run without installing)
2. **poppler**: PDF-to-image rendering (`brew install poppler` on macOS / `sudo apt-get install poppler-utils` on Linux)
3. **tesseract**: OCR for text layer restoration (`brew install tesseract` on macOS / `sudo apt-get install tesseract-ocr` on Linux)
4. **GEMINI_API_KEY**: A **paid** Google Gemini API key (the free tier does not support image generation). Get one at https://aistudio.google.com/api-keys, then run `export GEMINI_API_KEY="your_key"`

## Two Commands

### `nano-pdf edit`: Modify existing pages

```bash
nano-pdf edit <file.pdf> <page> "<prompt>" [<page> "<prompt>" ...] [options]
```

Pages are 1-indexed. Multiple page+prompt pairs can be provided and are processed in parallel.

### `nano-pdf add`: Insert new AI-generated slides

```bash
nano-pdf add <file.pdf> <position> "<prompt>" [options]
```

Position 0 inserts at the beginning.
The new slide automatically matches the visual style of the existing deck. Document context is enabled by default for `add`.

## Options Reference

For full details on all available flags, read `references/options.md` in this skill directory. Key flags:

- `--output "new.pdf"`: Output filename (default: `edited_<original>.pdf`)
- `--resolution "4K"`: `4K` (default), `2K`, or `1K`
- `--style-refs "1,5"`: Pages to use as style references
- `--use-context` / `--no-use-context`: Include full PDF text as model context
- `--disable-google-search`: Prevent the model from using Google Search

## Workflow

When a user asks to edit a PDF:

1. **Check dependencies**: Ensure nano-pdf, poppler, tesseract, and GEMINI_API_KEY are available. If any are missing, tell the user what to install and stop.
2. **Identify the edit**: Determine which page(s) need changes and what the prompt should be.
3. **Choose the right command**: `edit` for modifying existing pages, `add` for inserting new ones.
4. **Pick appropriate options**:
   - Use `--style-refs` if the user wants a specific visual style from certain pages
   - Use `--use-context` when editing multiple pages that need to be consistent
   - Use `--resolution "2K"` if speed matters more than quality
5. **Run nano-pdf** and present the output PDF to the user.

## Prompt Writing Tips

The quality of the edit depends heavily on the prompt.
Follow these guidelines:

- **Be specific**: "Change the title from 'Overview' to 'Q3 Summary'" beats "update the title"
- **Reference visible elements**: "The bar chart on the left side" helps the model locate what to change
- **One focused change per prompt**: For complex edits, use multiple page+prompt pairs
- **Mention what to preserve**: "Keep the layout the same but change the header color to blue"
- **Use style refs for consistency**: When updating branding across pages, point at a reference page

## Examples

For a comprehensive set of examples covering common use cases (typos, charts, branding, adding slides, batch edits), read `references/examples.md` in this skill directory. Quick reference:

```bash
# Fix a typo on page 3
nano-pdf edit report.pdf 3 "Fix 'recieve' to 'receive'"

# Update chart data (the dollar sign is escaped so the shell does not expand $2)
nano-pdf edit deck.pdf 12 "Update the revenue chart to show Q3 at \$2.5M"

# Multi-page branding update
nano-pdf edit slides.pdf \
  1 "Change header background to dark blue, text to white" \
  2 "Update the logo to show 'NewCorp' instead of 'OldCorp'" \
  --style-refs "1" --output branded.pdf

# Add a new title slide at the beginning
nano-pdf add deck.pdf 0 "Title slide: 'Annual Review 2025' with subtitle 'Building the Future'"

# Add a summary slide after page 5 using document context
nano-pdf add deck.pdf 5 "Summary slide with key takeaways as bullet points"
```

## Troubleshooting

| Issue | Solution |
|-------|----------|
| `Missing system dependencies` | Install missing deps (see Prerequisites above), restart terminal |
| `GEMINI_API_KEY not found` | `export GEMINI_API_KEY="your_key"` |
| `PAID API key required` | Enable billing at https://aistudio.google.com/api-keys |
| Style mismatch | Use `--style-refs "1,3"` pointing at pages with desired style |
| Slow processing | Use `--resolution "2K"` or `"1K"` |
| Bad OCR / text layer | Use `--resolution "4K"` for better OCR accuracy |
| Model ignores part of prompt | Break into smaller, focused edits across multiple runs |
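
The dependency check in step 1 of the workflow could be sketched as a small shell function. This is only a minimal sketch and not part of nano-pdf: the function name `missing_deps` is hypothetical, it assumes poppler can be detected through its `pdftoppm` binary, and it assumes that having `uvx` on the PATH is an acceptable substitute for an installed `nano-pdf`.

```bash
# Hypothetical helper (not part of nano-pdf): print each missing
# prerequisite on its own line; print nothing when all are present.
missing_deps() {
  # nano-pdf may be installed directly or run through uvx
  # (assumption: either one is good enough).
  if ! command -v nano-pdf >/dev/null 2>&1 && ! command -v uvx >/dev/null 2>&1; then
    echo "nano-pdf"
  fi
  # Assumption: pdftoppm, which ships with poppler, signals that poppler is installed.
  command -v pdftoppm >/dev/null 2>&1 || echo "poppler"
  command -v tesseract >/dev/null 2>&1 || echo "tesseract"
  # The API key must be set and non-empty.
  [ -n "${GEMINI_API_KEY:-}" ] || echo "GEMINI_API_KEY"
}
```

A caller might capture the result with `missing="$(missing_deps)"` and, when it is non-empty, report the listed items and stop before running any `nano-pdf` command. The check only confirms that the key is set, not that it belongs to a paid account.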