long-research

ClawSkills 作者 clawskills

[BETA] Deep research that actually reads pages instead of summarizing search results. Tell it how long to research (10 min, 2 hours, all night) and it works the full duration — searching, reading every result, following leads, cracking forums, cross-verifying findings, and writing progressively to a research file. Tree-style exploration: each page read spawns new searches, like a human researcher. Enforced read-to-search ratio prevents shallow search-spamming. Wall-clock time commitment — it won't finish early. Self-audit gate blocks delivery until quality checks pass. Works with web_search, web_fetch, and browser-use for JS-heavy sites.

安装 / 下载方式

TotalClaw CLI推荐

totalclaw install clawskills:clawskills~vanya1210-long-research

cURL直接下载，无需登录

curl -fsSL https://skills.taituai.com/api/skills/clawskills%3Aclawskills~vanya1210-long-research/file -o vanya1210-long-research.md

# Long Research

> ⚠️ **BETA** — This skill works as described but is under active development. Review the notes below before installing.

## Dependencies

- **web_search** — any search provider (Perplexity Sonar, Brave, etc.)
- **web_fetch** — built into most agent frameworks
- **browser-use** (REQUIRED) — install via `pip install browser-use && browser-use install`. Used for JS-heavy sites, login-gated forums, and retailer pricing. The skill will not function fully without it.

## Privacy & Trust Notes

**browser-use remote mode:**
The browser-use cascade tries 3 modes in order: `chromium` (local, free) → `real` (local, free) → `remote` (cloud-hosted, burns API credits). Remote mode sends page content to browser-use.com's cloud infrastructure. If you care about privacy, you can:
- Remove `remote` from the cascade and only use local modes
- Run in Interactive mode so the agent asks before escalating to remote
- The skill works fine with local-only browser modes for most sites

**Login-gated forums and browser profiles:**
The skill includes patterns for extracting content from login-gated forums using `--profile` flags. Profiles persist cookies locally on your machine in browser-use's default profile directory. The agent does NOT attempt automated login — it uses browser-use's cookie persistence from previous manual sessions. If you haven't logged into a forum manually via browser-use before, the agent won't have access. No credentials are stored or transmitted by this skill.

**Filesystem writes:**
Research output is written to `research/[topic]-[date].md` in your agent's working directory. Progressive writes happen every 3-5 tool calls as crash recovery. Files stay on your local machine — nothing is uploaded. If running in a shared environment, configure your agent's working directory to a safe location.

**Sub-agent prompt injection surface:**
The skill mandates pasting full instructions into sub-agent task prompts. This means the entire SKILL.md (including your research query) is sent to whatever model provider your sub-agent uses. If you use external/remote model providers, be aware that your research queries and the full skill text are transmitted to those services. This is standard for any agent skill that delegates to sub-agents — but worth noting if your research topics are sensitive.

**Recommended for first-time users:**
- Run in **Interactive** mode (not Autonomous) until you're comfortable with the skill's behavior
- Start with a short duration (10 min) to see how it works
- Review the research file output before trusting findings

## Activation

When the user invokes this skill (e.g. "do long research", "research X", "pull up long research"), **immediately start the pre-flight checklist**. Do NOT ask "what topic?" separately — go straight into the pre-flight questions. If the user already provided a topic in their message, pre-fill it and ask the remaining questions.

## Pre-Flight Checklist (MANDATORY — NO EXCEPTIONS)

Before starting ANY research, confirm all of these with the user:

1. **Topic** — What exactly to research. Get specific. If not provided yet, ask now.
2. **Duration** — How long to spend. Minimum 10 minutes, can be hours.
3. **Autonomy** — Choose one. Default: Autonomous.
- **Autonomous** — No questions, log everything, report at end.
- **Interactive** — May ask clarifying questions during research. Pauses for answers.
- **Interactive-continue** — May ask clarifying questions, but if user doesn't reply within ~2 minutes, continue research with best judgment. Never block on silence.
4. **Tools** — List in priority order (highest priority first). The plan block MUST reflect this ordering. Default priority:
1. **web_search** — discovery, overviews, forum consensus
2. **web_fetch** — reading specific review/forum/article pages
3. **browser-use** (REQUIRED dependency) — retailer pricing (Amazon, etc.), JS-heavy sites, anything web_fetch can't reach. See `references/browser-use-patterns.md`

User may reorder (e.g., browser-use first for pricing-heavy research). Always include all three in the plan block — never omit browser-use.
5. **Scheduling** — Now or delayed? If delayed, note time + timezone.
6. **Output** — Where to deliver (this chat, specific topic, file only) and format (summary, full report, comparison table).
7. **Clarifying questions** — Ask anything about the user's situation BEFORE starting. Don't assume hardware, location, budget, preferences, etc.

⛔ **GATE: You MUST post the plan block below and receive explicit user approval before making ANY research tool calls.** Do not start research based on implied approval, partial answers, or "just get going" energy. The plan block IS the gate.

Post this and wait for approval:
```
📋 Research Plan
• Topic: [what]
• Duration: [X] minutes/hours
• Mode: Autonomous / Interactive / Interactive-continue
• Tools: [list in priority order, e.g. web_search → web_fetch → browser-use]
• Start: Now / [scheduled time]
• Output: [where and format]
• Questions: [anything unclear about user's situation]

Proceed? (yes to start)
```

If the user answers clarifying questions but doesn't say "proceed/yes/go", re-post the updated plan and ask again.

---

## Spawning (CRITICAL — read this before delegating to a sub-agent)

⛔ **The sub-agent MUST have the full skill instructions in its task prompt.** The root cause of most research failures is the orchestrator paraphrasing the skill instead of injecting it. A sub-agent that doesn't read this file will ignore every gate, every ratio, every enforcement rule.

### How to spawn a research sub-agent:

1. **Read this entire SKILL.md** into context (you're already doing this if you're reading this).
2. **Construct the task prompt** with these MANDATORY sections in this order:

```
TASK PROMPT TEMPLATE (paste in this order — critical rules at TOP and BOTTOM):
─────────────────────────────────────────────────────────────────────────────
## ⛔ CRITICAL RULES (read first, enforce always)
1. READ > SEARCH. After Seed, you cannot search unless you've read something since your last search.
2. NEVER start synthesis while time remains. Run `date +%s` and check END_TS.
3. browser-use cascade: chromium → real → remote. Try ALL 3 before giving up on any source.
4. web_search returns URLS, not answers. Ignore Perplexity's synthesis text. Extract only the citation URLs.
5. "Not found" IS a valid finding. If you can't find what was asked, say so honestly. Do NOT answer an adjacent question.

## Research Task
[The user's EXACT question, quoted verbatim. Do not paraphrase, soften, or broaden it.]

## Task Anchor (re-read every 5 tool calls)
Your job is to answer THIS question: "[exact question again]"
Every 5 tool calls, ask yourself: "Does my last action directly serve THIS question?"
If no → STOP current branch → re-orient.

## User Context
[Any relevant context — location, timeline, constraints. Keep separate from the task.]

## Research Rules
[Paste the FULL Execution section from SKILL.md — the complete Read-Driven Loop,
browser-use Cascade, Forum-Cracking Playbook, Progressive Writing, Time Enforcement,
Negative Results, Self-Audit, etc.]

## ⛔ REMINDER (read last, enforce always)
- You MUST read more pages than you search. Track: reads vs searches.
- You MUST try all 3 browser modes before giving up.
- You MUST NOT start synthesis while time remains.
- You MUST deliver honest negative results, not drift to adjacent questions.
- You MUST run the self-audit before delivering.
─────────────────────────────────────────────────────────────────────────────
```

3. **Never summarize the rules.** Paste them. The sub-agent needs the actual enforcement language.
4. **Quote the user's question verbatim** in both Task and Task Anchor. Do not rephrase "find real life experience where X happened" into "research whether X is common."
5. **Put critical rules at both TOP and BOTTOM** of the prompt — primacy and recency effects in attention mean mi