orchata-rag
Knowledge management and RAG platform with tree-based document indexing. Use this skill to search, browse, and manage Orchata knowledge bases via MCP tools.
Installation / Download
TotalClaw CLI (recommended):
`totalclaw install github:LeoYeAI~openclaw-master-skills~orchata`
Direct download with cURL (no login required):
`curl -fsSL https://skills.taituai.com/api/skills/github%3ALeoYeAI~openclaw-master-skills~orchata/file -o orchata.md`
# Orchata Skills
This document describes how to effectively use Orchata, a RAG (Retrieval-Augmented Generation) platform with tree-based document indexing. Load this into your context to interact with Orchata knowledge bases.
## What is Orchata?
Orchata is a knowledge management platform that:
- **Organizes documents into Spaces** - Logical containers for related content
- **Uses tree-based indexing** - Documents are parsed into hierarchical structures with sections, summaries, and page ranges
- **Provides semantic search** - Find relevant content using natural language queries
- **Exposes MCP tools** - AI assistants can directly manage and query knowledge bases
## Core Concepts
### Spaces
A **Space** is a container for related documents. Think of it as a folder with semantic search capabilities.
- Each space has a `name`, `description`, and optional `icon`
- Descriptions are used by `smart_query` to recommend relevant spaces
- Spaces can be archived (soft-deleted)
### Documents
A **Document** is content within a space. Supported formats include:
- PDF (text-based and scanned with OCR)
- Word documents (.docx)
- Excel spreadsheets (.xlsx)
- PowerPoint presentations (.pptx)
- Markdown files (.md)
- Plain text files (.txt)
- Images (PNG, JPG, etc.)
**Document Status:**
| Status | Description |
| ------ | ----------- |
| `PENDING` | Uploaded, waiting for processing |
| `PROCESSING` | Being parsed and indexed |
| `COMPLETED` | Ready for queries |
| `FAILED` | Processing error occurred |
**Important:** Only query documents with `status: "COMPLETED"`. Documents in any other status return no results.
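As a minimal sketch of that rule, the snippet below filters a document listing down to the entries that are ready for queries. The record shape (dicts with `filename` and `status` keys) and the helper name `queryable` are assumptions for illustration, not the confirmed response format of `list_documents`.

```python
from typing import Iterable


def queryable(documents: Iterable[dict]) -> list[dict]:
    """Keep only documents whose status is COMPLETED (compared case-insensitively)."""
    return [d for d in documents if str(d.get("status", "")).upper() == "COMPLETED"]


# Hypothetical listing; real responses may carry more fields.
docs = [
    {"filename": "guide.md", "status": "COMPLETED"},
    {"filename": "scan.pdf", "status": "PROCESSING"},
    {"filename": "broken.docx", "status": "FAILED"},
]

ready = queryable(docs)
print([d["filename"] for d in ready])  # ['guide.md']
```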
### Document Trees
Documents are indexed into **hierarchical tree structures**:
- Each tree has nodes representing sections/chapters
- Nodes contain: `title`, `summary`, `startPage`, `endPage`, `textContent`
- Trees enable precise navigation of large documents
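To make the tree idea concrete, here is a small sketch of how such a node could be modeled and navigated by page number. The field names `title`, `summary`, `startPage`, `endPage`, and `textContent` come from the list above; the `children` field and the `find_sections` helper are assumptions added for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TreeNode:
    title: str
    summary: str
    startPage: int
    endPage: int
    textContent: str = ""
    # Assumed field name: the document does not say how child nodes are exposed.
    children: list["TreeNode"] = field(default_factory=list)


def find_sections(node: TreeNode, page: int) -> list[str]:
    """Return the titles from the root down to the deepest node covering `page`."""
    if not (node.startPage <= page <= node.endPage):
        return []
    path = [node.title]
    for child in node.children:
        sub = find_sections(child, page)
        if sub:
            path.extend(sub)
            break
    return path


root = TreeNode("Manual", "Whole manual", 1, 40, children=[
    TreeNode("Install", "Setup steps", 1, 10),
    TreeNode("API", "API reference", 11, 40, children=[
        TreeNode("Auth", "Authentication", 11, 15),
    ]),
])

print(find_sections(root, 12))  # ['Manual', 'API', 'Auth']
```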
### Queries
Two types of queries are available:
1. **`query_spaces`** - Search document content using tree-based reasoning
2. **`smart_query`** - Discover which spaces are relevant for a query
---
## MCP Tools Reference
### Space Management
#### list_spaces
List all knowledge spaces in the organization.
```text
list_spaces
list_spaces with status="active"
list_spaces with page=1 pageSize=20
```
**Parameters:**
- `page` (number, optional): Page number (default: 1)
- `pageSize` (number, optional): Items per page (default: 10)
- `status` (string, optional): Filter by `active`, `archived`, or `all`
---
#### manage_space
Create, get, update, or delete a space.
```text
manage_space with action="create" name="Product Docs" description="Technical documentation"
manage_space with action="create" name="Legal" description="Case files" icon="briefcase"
manage_space with action="get" id="space_abc123"
manage_space with action="update" id="space_abc123" description="Updated description"
manage_space with action="delete" id="space_abc123"
```
**Parameters:**
- `action` (string, required): `create`, `get`, `update`, or `delete`
- `id` (string): Space ID (required for get/update/delete)
- `name` (string): Space name (required for create)
- `description` (string, optional): Space description
- `icon` (string, optional): Icon name. Defaults to "folder"
- `slug` (string, optional): URL-friendly identifier
- `isArchived` (boolean, optional): Archive status (for update)
**Valid Icons:**
`folder`, `book`, `file-text`, `database`, `package`, `archive`, `briefcase`, `inbox`, `layers`, `box`
If an invalid icon is provided, the tool returns an error with the list of valid options.
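If you prefer to catch a bad icon before calling the tool, a client-side check along these lines may help. The icon list is copied from above; the helper name `create_space_args` and the returned dict layout are illustrative assumptions about how you assemble tool arguments.

```python
VALID_ICONS = (
    "folder", "book", "file-text", "database", "package",
    "archive", "briefcase", "inbox", "layers", "box",
)


def create_space_args(name: str, description: str = "", icon: str = "folder") -> dict:
    """Build arguments for manage_space action="create", rejecting unknown icons."""
    if icon not in VALID_ICONS:
        raise ValueError(f"invalid icon {icon!r}; valid options: {', '.join(VALID_ICONS)}")
    return {"action": "create", "name": name, "description": description, "icon": icon}


print(create_space_args("Legal", "Case files", "briefcase"))
```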
---
### Document Management
#### list_documents
List documents in a space.
```text
list_documents with spaceId="space_abc123"
list_documents with spaceId="space_abc123" status="completed"
list_documents with spaceId="space_abc123" status="all"
```
**Parameters:**
- `spaceId` (string, required): Space ID
- `page` (number, optional): Page number
- `pageSize` (number, optional): Items per page (max: 100)
- `status` (string, optional): Filter by status. Values: `pending`, `processing`, `completed`, `failed`, or `all`. If omitted, all documents are returned.
**Note:** Status values are case-insensitive (`completed` and `COMPLETED` both work).
---
#### save_document
Upload or upsert documents (single or batch).
**Single document:**
```text
save_document with spaceId="space_abc123" filename="guide.md" content="# Guide\n\nContent here..."
```
**Batch upload:**
```text
save_document with spaceId="space_abc123" documents=[{"filename": "doc1.md", "content": "..."}, {"filename": "doc2.md", "content": "..."}]
```
**Parameters:**
- `spaceId` (string, required): Space ID
- `filename` (string): Filename (required for single)
- `content` (string): Content (required for single)
- `documents` (array, optional): Array of `{filename, content, metadata}` for batch
- `metadata` (object, optional): Custom key-value pairs
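The single and batch forms differ only in where `filename` and `content` live. The sketch below picks the right form from a list of documents. The helper name `save_document_args` is hypothetical, and the argument layout mirrors the examples above rather than a confirmed schema.

```python
def save_document_args(space_id: str, documents: list[dict]) -> dict:
    """Build save_document arguments: top-level fields for one doc, `documents` for many."""
    if not documents:
        raise ValueError("at least one document is required")
    for doc in documents:
        if not doc.get("filename") or doc.get("content") is None:
            raise ValueError("each document needs a filename and content")
    if len(documents) == 1:
        # Single form: filename, content (and metadata, if present) sit at the top level.
        return {"spaceId": space_id, **documents[0]}
    # Batch form: everything goes under the `documents` array.
    return {"spaceId": space_id, "documents": documents}


single = save_document_args("space_abc123", [{"filename": "guide.md", "content": "# Guide"}])
batch = save_document_args(
    "space_abc123",
    [{"filename": "doc1.md", "content": "..."}, {"filename": "doc2.md", "content": "..."}],
)
print(sorted(single))  # ['content', 'filename', 'spaceId']
print(sorted(batch))   # ['documents', 'spaceId']
```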
---
#### get_document
Get document content by ID or filename. Returns processed markdown text.
```text
get_document with spaceId="space_abc123" id="doc_xyz789"
get_document with spaceId="space_abc123" filename="guide.md"
get_document with spaceId="*" filename="guide.md"
```
**Parameters:**
- `spaceId` (string, required): Space ID, or `*` to search all spaces (requires filename)
- `id` (string, optional): Document ID
- `filename` (string, optional): Filename
**Notes:**
- Either `id` or `filename` is required
- Use `spaceId="*"` to search all spaces when you know the filename but not the space
- For completed documents, returns the extracted markdown text (not raw PDF binary)
- When using `*`, the response includes the `spaceId` where the document was found
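The two constraints in these notes (an `id` or `filename` is required, and `*` requires a `filename`) can be checked before the call. This is a sketch under those stated rules; `get_document_args` is a hypothetical helper name.

```python
from typing import Optional


def get_document_args(
    space_id: str, doc_id: Optional[str] = None, filename: Optional[str] = None
) -> dict:
    """Validate and build get_document arguments according to the notes above."""
    if doc_id is None and filename is None:
        raise ValueError("either id or filename is required")
    if space_id == "*" and filename is None:
        raise ValueError('spaceId="*" requires a filename')
    args = {"spaceId": space_id}
    if doc_id is not None:
        args["id"] = doc_id
    if filename is not None:
        args["filename"] = filename
    return args


print(get_document_args("*", filename="guide.md"))
# {'spaceId': '*', 'filename': 'guide.md'}
```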
---
#### update_document
Update document content or metadata.
```text
update_document with spaceId="space_abc123" id="doc_xyz789" content="New content..."
update_document with spaceId="space_abc123" id="doc_xyz789" append=true content="Additional content"
```
**Parameters:**
- `spaceId` (string, required): Space ID
- `id` (string, required): Document ID
- `content` (string, optional): New content
- `metadata` (object, optional): New metadata
- `append` (boolean, optional): Append instead of replace
- `separator` (string, optional): Separator for append mode
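To reason about what `append` and `separator` do to a document, it can help to model the update locally. The function below is only a mental model, not the server's implementation. In particular, the default separator of two newlines is an assumption; the tool's actual default is not stated here.

```python
def apply_update(
    existing: str, content: str, append: bool = False, separator: str = "\n\n"
) -> str:
    """Model of update_document: replace by default, or append with a separator."""
    if not append:
        return content
    # Assumed behavior: existing text, then the separator, then the new content.
    return existing + separator + content


print(repr(apply_update("Intro", "New content...")))
# 'New content...'
print(repr(apply_update("Intro", "Additional content", append=True, separator="\n---\n")))
# 'Intro\n---\nAdditional content'
```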
---
#### delete_document
Permanently delete a document.
```text
delete_document with spaceId="space_abc123" id="doc_xyz789"
```
**Parameters:**
- `spaceId` (string, required): Space ID
- `id` (string, required): Document ID
---
### Query Tools
#### query_spaces
Search documents across one or more spaces using tree-based reasoning.
```text
query_spaces with query="How do I authenticate API requests?"
query_spaces with query="installation guide" spaceIds="space_abc123"
query_spaces with query="error handling" spaceIds=["space_abc", "space_def"] topK=10
```
**Parameters:**
- `query` (string, required): Natural language search query
- `spaceIds` (string or array, optional): Space ID(s) to search. Omit or use `*` for all spaces
- `topK` (number, optional): Maximum results (default: 10)
- `compact` (boolean, optional): Use compact format (default: false). See **When to Use Compact** below.
**When to Use Compact:**
| Mode | When to use | What you get |
| ---- | ----------- | ------------ |
| `compact=false` (default) | **Most queries.** Any time you need actual data, facts, numbers, dates, or details from documents. | Full results with document metadata, tree context, page ranges, and complete content. |
| `compact=true` | Broad discovery queries where you only need to know *which* documents are relevant, not their content. | Minimal results: just content snippet, source filename, and score. |
**Rule of thumb:** Default to `compact=false`. Only use `compact=true` when you're browsing/surveying and don't need the actual content yet.
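One way to encode this rule of thumb is to decide `compact` from a single question: do you need the content itself? The helper below is a sketch. `query_spaces_args` and its `need_content` flag are hypothetical names, while `query`, `spaceIds`, `topK`, and `compact` are the documented parameters.

```python
from typing import Optional, Union


def query_spaces_args(
    query: str,
    space_ids: Optional[Union[str, list[str]]] = None,
    top_k: int = 10,
    need_content: bool = True,
) -> dict:
    """Build query_spaces arguments; compact mode is used only when content is not needed."""
    args: dict = {"query": query, "topK": top_k, "compact": not need_content}
    if space_ids is not None:
        # Omitting spaceIds searches all spaces, so it is only set when given.
        args["spaceIds"] = space_ids
    return args


# Fact lookup: keep full results.
print(query_spaces_args("error handling", ["space_abc", "space_def"]))
# Survey of relevant documents: compact results are enough.
print(query_spaces_args("anything about onboarding?", need_content=False))
```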
**Response (compact=true format):**
```json
{
"results": [
{
"content": "Relevant text content...",
"source": "filename.pdf",
"score": 0.95
}