research-paper-writing
Write ML papers for NeurIPS/ICML/ICLR: design→submit.
Installation / Download
TotalClaw CLI (recommended): `totalclaw install hermes:hermes~research-paper-writing`
cURL (direct download, no login required): `curl -fsSL https://skills.taituai.com/api/skills/hermes%3Ahermes~research-paper-writing/file -o research-paper-writing.md`

# Research Paper Writing Pipeline
End-to-end pipeline for producing publication-ready ML/AI research papers targeting **NeurIPS, ICML, ICLR, ACL, AAAI, and COLM**. This skill covers the full research lifecycle: experiment design, execution, monitoring, analysis, paper writing, review, revision, and submission.
This is **not a linear pipeline** — it is an iterative loop. Results trigger new experiments. Reviews trigger new analysis. The agent must handle these feedback loops.
<!-- ascii-guard-ignore -->
```
┌─────────────────────────────────────────────────────────────┐
│                   RESEARCH PAPER PIPELINE                   │
│                                                             │
│  Phase 0: Project Setup ──► Phase 1: Literature Review      │
│          │                          │                       │
│          ▼                          ▼                       │
│  Phase 2: Experiment        Phase 5: Paper Drafting ◄──┐    │
│           Design                    │                  │    │
│          │                          ▼                  │    │
│          ▼                  Phase 6: Self-Review       │    │
│  Phase 3: Execution &             & Revision ──────────┘    │
│           Monitoring                │                       │
│          │                          ▼                       │
│          ▼                  Phase 7: Submission             │
│  Phase 4: Analysis ─────► (feeds back to Phase 2 or 5)      │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
<!-- ascii-guard-ignore-end -->
---
## When To Use This Skill
Use this skill when:
- **Starting a new research paper** from an existing codebase or idea
- **Designing and running experiments** to support paper claims
- **Writing or revising** any section of a research paper
- **Preparing for submission** to a specific conference or workshop
- **Responding to reviews** with additional experiments or revisions
- **Converting** a paper between conference formats
- **Writing non-empirical papers** — theory, survey, benchmark, or position papers (see [Paper Types Beyond Empirical ML](#paper-types-beyond-empirical-ml))
- **Designing human evaluations** for NLP, HCI, or alignment research
- **Preparing post-acceptance deliverables** — posters, talks, code releases
## Core Philosophy
1. **Be proactive.** Deliver complete drafts, not questions. Scientists are busy — produce something concrete they can react to, then iterate.
2. **Never hallucinate citations.** AI-generated citations have a ~40% error rate. Always fetch citation metadata programmatically. Mark unverifiable citations as `[CITATION NEEDED]`.
3. **A paper is a story, not a collection of experiments.** Every paper needs one clear contribution stated in a single sentence. If you can't do that, the paper isn't ready.
4. **Experiments serve claims.** Every experiment must explicitly state which claim it supports. Never run experiments that don't connect to the paper's narrative.
5. **Commit early, commit often.** Every completed experiment batch, every paper draft update — commit with descriptive messages. Git log is the experiment history.
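The `[CITATION NEEDED]` rule in item 2 can be enforced mechanically on a draft. The sketch below is one possible approach, not part of this skill's tooling: `flag_unverified_citations` is a hypothetical helper, and it assumes you already hold a set of BibTeX keys whose metadata was fetched and checked programmatically. The keys in the usage line are placeholders.

```python
import re

# Matches \cite{...}, \citep{...}, and \citet{...}; group 1 is the command,
# group 2 is the comma-separated key list.
CITE_PATTERN = re.compile(r"(\\cite[tp]?)\{([^}]*)\}")


def flag_unverified_citations(latex: str, verified_keys: set[str]) -> str:
    """Keep verified keys and mark every citation with unverified keys."""

    def replace(match: re.Match) -> str:
        keys = [k.strip() for k in match.group(2).split(",") if k.strip()]
        good = [k for k in keys if k in verified_keys]
        if len(good) == len(keys):
            return match.group(0)  # every key is verified; leave untouched
        # Keep whatever was verified, then flag the gap for a human to fill.
        kept = match.group(1) + "{" + ",".join(good) + "} " if good else ""
        return kept + "[CITATION NEEDED]"

    return CITE_PATTERN.sub(replace, latex)


# Placeholder keys, for illustration only.
draft = r"Prior work \citep{verified2020} disagrees with \cite{unverified2024}."
print(flag_unverified_citations(draft, {"verified2020"}))
```

Running this before each commit keeps unverified keys from silently reaching the compiled PDF.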
### Proactivity and Collaboration
**Default: Be proactive. Draft first, ask with the draft.**
| Confidence Level | Action |
|-----------------|--------|
| **High** (clear repo, obvious contribution) | Write full draft, deliver, iterate on feedback |
| **Medium** (some ambiguity) | Write draft with flagged uncertainties, continue |
| **Low** (major unknowns) | Ask 1-2 targeted questions via `clarify`, then draft |

Section-level defaults, with the note to attach to each draft:

| Section | Draft Autonomously? | Flag With Draft |
|---------|-------------------|-----------------|
| Abstract | Yes | "Framed contribution as X — adjust if needed" |
| Introduction | Yes | "Emphasized problem Y — correct if wrong" |
| Methods | Yes | "Included details A, B, C — add missing pieces" |
| Experiments | Yes | "Highlighted results 1, 2, 3 — reorder if needed" |
| Related Work | Yes | "Cited papers X, Y, Z — add any I missed" |
**Block for input only when**: target venue unclear, multiple contradictory framings, results seem incomplete, explicit request to review first.
---
## Phase 0: Project Setup
**Goal**: Establish the workspace, understand existing work, identify the contribution.
### Step 0.1: Explore the Repository
```bash
# Understand project structure
ls -la
find . -name "*.py" | head -30
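# Group the -name tests so both patterns are printed; -print0 / -0 handle file names with spaces
find . \( -name "*.md" -o -name "*.txt" \) -print0 | xargs -0 grep -l -i -E "result|conclusion|finding"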
```
Look for:
- `README.md` — project overview and claims
- `results/`, `outputs/`, `experiments/` — existing findings
- `configs/` — experimental settings
- `.bib` files — existing citations
- Draft documents or notes
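When the shell commands are not convenient (for example inside a Python-driven agent), the same survey can be sketched with `pathlib`. This is an illustrative sketch: `survey_repo` and its return layout are hypothetical names, and the directory and keyword lists simply mirror the ones named above.

```python
from pathlib import Path

# Directory names and keywords taken from the checklist above.
RESULT_DIRS = ("results", "outputs", "experiments")
KEYWORDS = ("result", "conclusion", "finding")


def survey_repo(root: str = ".") -> dict[str, list[str]]:
    """Collect paths worth reading before drafting, grouped by kind."""
    base = Path(root)
    found = {
        "readme": [p.name for p in [base / "README.md"] if p.is_file()],
        "result_dirs": [d for d in RESULT_DIRS if (base / d).is_dir()],
        "configs": ["configs"] if (base / "configs").is_dir() else [],
        "bib": sorted(str(p.relative_to(base)) for p in base.rglob("*.bib")),
        "notes": [],
    }
    # Draft documents or notes: .md/.txt files that mention findings.
    for path in sorted(base.rglob("*")):
        if not path.is_file() or path.suffix not in {".md", ".txt"}:
            continue
        if ".git" in path.relative_to(base).parts:
            continue  # skip version-control internals
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        if any(word in text for word in KEYWORDS):
            found["notes"].append(str(path.relative_to(base)))
    return found


if __name__ == "__main__":
    for kind, paths in survey_repo(".").items():
        print(f"{kind}: {paths}")
```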
### Step 0.2: Organize the Workspace
Establish a consistent workspace structure:
```
workspace/
  paper/         # LaTeX source, figures, compiled PDFs
  experiments/   # Experiment runner scripts
  code/          # Core method implementation
  results/       # Raw experiment results (auto-generated)
  tasks/         # Task/benchmark definitions
  human_eval/    # Human evaluation materials (if needed)
```
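The layout above can be created in one step. A minimal sketch, assuming the directory names shown in the tree; `create_workspace` and its `include_human_eval` flag are hypothetical names introduced here.

```python
from pathlib import Path

# Directory names copied from the workspace tree above.
WORKSPACE_DIRS = ["paper", "experiments", "code", "results", "tasks", "human_eval"]


def create_workspace(root: str = "workspace", include_human_eval: bool = False) -> list[Path]:
    """Create the workspace directories and return the paths that now exist."""
    base = Path(root)
    created = []
    for name in WORKSPACE_DIRS:
        if name == "human_eval" and not include_human_eval:
            continue  # only needed when the paper includes a human evaluation
        target = base / name
        target.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(target)
    return created


if __name__ == "__main__":
    for path in create_workspace():
        print(path)
```

Because `exist_ok=True` is set, calling it on an existing project only fills in missing directories.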
### Step 0.3: Set Up Version Control
```bash
git init # if not already
git remote add origin <repo-url>
git checkout -b paper-draft # or main
```
**Git discipline**: Every completed experiment batch gets committed with a descriptive message. Example:
```
Add Monte Carlo constrained results (5 runs, Sonnet 4.6, policy memo task)
Add Haiku baseline comparison: autoreason vs refinement baselines at cheap model tier
```
### Step 0.4: Identify the Contribution
Before writing anything, articulate:
- **The What**: What is the single thing this paper contributes?
- **The Why**: What evidence supports it?
- **The So What**: Why should readers care?
> Propose to the scientist: "Based on my understanding, the main contribution is: [one sentence]. The key results show [Y]. Is this the framing you want?"
### Step 0.5: Create a TODO List
Use the `todo` tool to create a structured project plan:
```
Research Paper TODO:
- [ ] Define one-sentence contribution
- [ ] Literature review (related work + baselines)
- [ ] Design core experiments
- [ ] Run experiments
- [ ] Analyze results
- [ ] Write first draft
- [ ] Self-review (simulate reviewers)
- [ ] Revise based on review
- [ ] Submission prep
```
Update this throughout the project. It serves as the persistent state across sessions.
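If the checklist is also kept as plain text in the repository, progress can be summarized at the start of each session. The sketch below only parses Markdown checkboxes. It does not model the `todo` tool's own interface, and `todo_progress` is a hypothetical helper.

```python
import re

# Matches "- [ ] item" and "- [x] item" (case-insensitive x).
ITEM_PATTERN = re.compile(r"^\s*- \[( |x|X)\] (.+)$")


def todo_progress(markdown: str) -> tuple[list[str], list[str]]:
    """Split checklist items into (done, pending), preserving order."""
    done, pending = [], []
    for line in markdown.splitlines():
        match = ITEM_PATTERN.match(line)
        if match is None:
            continue  # headings and free text are ignored
        bucket = done if match.group(1).lower() == "x" else pending
        bucket.append(match.group(2).strip())
    return done, pending


checklist = """Research Paper TODO:
- [x] Define one-sentence contribution
- [ ] Literature review (related work + baselines)
- [ ] Design core experiments
"""
done, pending = todo_progress(checklist)
print(f"{len(done)} done, next up: {pending[0]}")
```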
### Step 0.6: Estimate Compute Budget
Before running experiments, estimate total cost and time:
```
Compute Budget Checklist:
- [ ] API costs: (model price per token) × (estimated tokens per run) × (number of runs)
- [ ] GPU hours: (time per experiment) × (number of experiments) × (number of seeds)
- [ ] Human evaluation costs: (annotators) × (hours) × (hourly rate)
- [ ] Total budget ceiling and contingency (add 30-50% for reruns)
```
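The checklist arithmetic can be written out as a small calculator. This is a sketch under stated assumptions: the field names are hypothetical, API prices are quoted per million tokens, and a GPU hourly rate (not in the checklist) is added to convert GPU hours into dollars. The default contingency of 40% sits inside the 30-50% range above.

```python
from dataclasses import dataclass


@dataclass
class BudgetInputs:
    price_per_million_tokens_usd: float  # assumed pricing unit
    tokens_per_run: int
    num_runs: int
    hours_per_experiment: float
    num_experiments: int
    num_seeds: int
    gpu_hourly_rate_usd: float  # assumption: lets GPU hours be priced
    annotators: int
    hours_per_annotator: float
    annotator_hourly_rate_usd: float
    contingency: float = 0.4  # extra fraction reserved for reruns


def estimate_budget(b: BudgetInputs) -> dict[str, float]:
    """Apply the checklist formulas and return each line item in USD."""
    api = b.price_per_million_tokens_usd * b.tokens_per_run * b.num_runs / 1_000_000
    gpu_hours = b.hours_per_experiment * b.num_experiments * b.num_seeds
    gpu = gpu_hours * b.gpu_hourly_rate_usd
    human = b.annotators * b.hours_per_annotator * b.annotator_hourly_rate_usd
    subtotal = api + gpu + human
    return {
        "api_usd": api,
        "gpu_hours": gpu_hours,
        "gpu_usd": gpu,
        "human_eval_usd": human,
        "subtotal_usd": subtotal,
        "ceiling_usd": subtotal * (1 + b.contingency),
    }


if __name__ == "__main__":
    # Illustrative numbers only, not real prices.
    example = BudgetInputs(3.0, 200_000, 50, 2.0, 10, 3, 2.0, 3, 4.0, 20.0)
    for item, value in estimate_budget(example).items():
        print(f"{item}: {value:.2f}")
```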
Track actual spend as experiments run:
```python
# Simple cost tracker pattern: append one JSON line per logged call or run
import json
import os
from datetime import datetime

COST_LOG = "results/cost_log.jsonl"


def log_cost(experiment: str, model: str, input_tokens: int, output_tokens: int, cost_usd: float):
    """Append one cost record to COST_LOG."""
    entry = {
        "timestamp": datetime.now().isoformat(),
        "experiment": experiment,
        "model": model,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "cost_usd": cost_usd,
    }
    # Create results/ on first use so the append below cannot fail
    os.makedirs(os.path.dirname(COST_LOG), exist_ok=True)
    with open(COST_LOG, "a") as f:
        f.write(json.dumps(entry) + "\n")
```
**When budget is tight**: Run pilot experiments (1-2 seeds, subset of tasks) before committing to full sweeps. Use cheaper models for debugging pipelines, then switch to target models for final runs.
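To compare actual spend against the ceiling, the JSONL log written by `log_cost` can be summed per experiment. A minimal sketch: `summarize_spend` and its `budget_ceiling_usd` parameter are hypothetical names, and it assumes every line carries the `experiment` and `cost_usd` fields shown above.

```python
import json
from collections import defaultdict
from typing import Optional


def summarize_spend(log_path: str = "results/cost_log.jsonl",
                    budget_ceiling_usd: Optional[float] = None) -> dict:
    """Total the logged cost per experiment and report what is left."""
    by_experiment = defaultdict(float)
    with open(log_path) as f:
        for line in f:
            if not line.strip():
                continue  # tolerate blank lines in the log
            entry = json.loads(line)
            by_experiment[entry["experiment"]] += entry["cost_usd"]
    total = sum(by_experiment.values())
    remaining = None if budget_ceiling_usd is None else budget_ceiling_usd - total
    return {
        "by_experiment": dict(by_experiment),
        "total_usd": total,
        "remaining_usd": remaining,
    }
```

Calling `summarize_spend(budget_ceiling_usd=...)` after each batch shows how much is left; a negative `remaining_usd` means the ceiling has been exceeded and the next sweep should be scaled down to a pilot.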
### Step 0.7: Multi-Author Coordination
Most papers have 3-10 authors. Establish workflows early:
| Workflow | Tool | When to Use |
|----------|------|-------------|
| **Overleaf** | Browser-based | Multiple authors editing simultaneously, no git experience |
| **