canonical-data-map
The single source of truth for all paths, naming conventions, and data formats in the OpenClaw Greek Accounting system. Reference document.
Installation / Download
TotalClaw CLI (recommended):
`totalclaw install totalclaw:totalclaw~satoshistackalotto-canonical-data-map`
cURL direct download, no login required:
`curl -fsSL https://skills.taituai.com/api/skills/totalclaw%3Atotalclaw~satoshistackalotto-canonical-data-map/file -o satoshistackalotto-canonical-data-map.md`
# Canonical Data Directory Map
## OpenClaw Greek Accounting System — v1.1
## Setup
This skill is a reference document — it defines the directory structure and naming conventions used by all other Greek accounting skills. No binaries or credentials required.
```bash
# Set the data directory (all skills read this)
export OPENCLAW_DATA_DIR="/data"
# Initialize the full directory structure
mkdir -p "$OPENCLAW_DATA_DIR"/{incoming/{invoices,receipts,statements,government},processing,clients,compliance/{vat,efka,mydata,e1,e3},banking/{imports/{alpha,nbg,eurobank,piraeus},processing,reconciliation},ocr/{incoming,output},efka,reports,exports,imports,dashboard,auth,backups,gdpr-exports,memory,system/{logs,process-locks}}
```
This document defines the complete file system architecture for the OpenClaw Greek Accounting system. It is the authoritative reference for all path decisions. No skill may introduce a new top-level directory or deviate from the naming conventions defined here without a version update to this document.
**v1.1 change:** Added `/data/memory/` — agent episodic memory, failure logs, pattern store, GitHub proposal queue, and rate-limit state. Owner: `memory-feedback` (Skill 19, Phase 4). All Phase 3B+ skills must include episode and failure log hooks that write into this tree.
---
## Root Structure
```
/data/
├── incoming/      # All raw input: documents arriving into the system
├── processing/    # Temporary working space: files mid-pipeline
├── clients/       # Canonical client records: the source of truth
├── compliance/    # Government filings and submissions
├── banking/       # Bank statement processing pipeline
├── ocr/           # OCR processing pipeline
├── efka/          # EFKA/social security processing pipeline
├── reports/       # Generated reports for human consumption
├── exports/       # Data exports leaving the system
├── imports/       # Bulk data imports entering the system
├── dashboard/     # Dashboard state, config, cache, history
├── auth/          # Authentication and access control
├── backups/       # Encrypted system backups
├── gdpr-exports/  # GDPR subject access request exports
├── memory/        # Agent episodic memory, failure logs, learning patterns, proposals
└── system/        # System-level files: logs, schema versions, locks
```
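A quick way to confirm that a data directory matches the map above is to check for each top-level entry. The sketch below is illustrative only: the function name `missing_top_level_dirs` is hypothetical and not part of any OpenClaw skill, and it checks only the sixteen top-level directories listed here, not their subtrees.

```bash
# Hypothetical helper: print every top-level directory from the map
# that is missing under the given root.
# The root defaults to $OPENCLAW_DATA_DIR, then to /data.
missing_top_level_dirs() {
  root="${1:-${OPENCLAW_DATA_DIR:-/data}}"
  for dir in incoming processing clients compliance banking ocr efka \
             reports exports imports dashboard auth backups gdpr-exports \
             memory system; do
    # -d is true only for an existing directory
    [ -d "$root/$dir" ] || printf '%s\n' "$dir"
  done
}
```

Calling `missing_top_level_dirs` with no argument checks `$OPENCLAW_DATA_DIR`; it prints nothing when every top-level directory is present.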
---
## 1. `/data/incoming/` — Raw Input
All documents entering the system land here first, regardless of source (email attachment, manual drop, scanner, bank download). Nothing in `/data/incoming/` is processed yet.
```
/data/incoming/
├── invoices/       # Supplier invoices (PDF, image)
├── receipts/       # Receipts (PDF, image, phone photo)
├── statements/     # Bank statements (PDF, CSV, OFX)
├── government/     # AADE/EFKA notifications and documents
├── payroll/        # Hour sheets, employee documents
├── tax-documents/  # Tax certificates, employer statements (βεβαιώσεις)
├── contracts/      # Contracts and legal documents
└── other/          # Uncategorised; routed after classification
```
**Naming convention for incoming files:**
Files dropped here may arrive with any name. The system must NOT rename them on arrival — the original filename is preserved for audit purposes. The system assigns a canonical name only when moving to `/data/processing/`.
---
## 2. `/data/processing/` — In-Flight Pipeline
Temporary working space. Files here are mid-pipeline and may be incomplete. No other skill should read from `/data/processing/` as a final source — always read from `/data/clients/` or `/data/compliance/` for canonical data.
```
/data/processing/
├── ocr/             # OCR in progress
│   ├── queued/      # Waiting for OCR
│   ├── enhanced/    # Image pre-processing complete
│   ├── extracted/   # Text extracted, not yet validated
│   └── validated/   # OCR output validated, ready to route
├── classification/  # Document type identification in progress
├── reconciliation/  # Bank reconciliation working files
│   ├── matching/    # Transaction matching in progress
│   └── flagged/     # Items needing human review
├── compliance/      # Filing preparation working files
│   ├── vat/         # VAT return preparation
│   ├── efka/        # EFKA declaration preparation
│   └── mydata/      # myDATA submission preparation
└── imports/         # Bulk import validation in progress
```
**Cleanup policy:** Files in `/data/processing/` are deleted or archived after the pipeline completes successfully. They are never the canonical record.
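The read rule for this tree can be enforced with a small path guard. This is a sketch under assumptions: `is_canonical_source` is a hypothetical name, and it treats only `/data/clients/` and `/data/compliance/` as canonical read locations because those are the two this section names.

```bash
# Hypothetical guard: succeed only for paths under the two canonical trees
# named in this section; anything under processing/ is rejected.
# Plain prefix match: paths are not normalised, so ".." is not resolved.
is_canonical_source() {
  root="${OPENCLAW_DATA_DIR:-/data}"
  case "$1" in
    "$root"/clients/*|"$root"/compliance/*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A skill could call `is_canonical_source "$path" || exit 1` before reading, so that a working file under `/data/processing/compliance/` is never mistaken for a filed artefact.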
---
## 3. `/data/clients/` — Client Master Records
The single source of truth for all client data. Every other skill that needs client information reads from here. Only the `client-data-management` skill writes to this tree.
```
/data/clients/
├── _index.json            # Global client index (name, AFM, status, assignee)
├── _audit-log.json        # All access and change events across all clients
├── _schema-version.json   # Current schema version for migration tracking
└── {AFM}/                 # One directory per client, keyed by AFM (e.g. EL123456789)
    ├── profile.json       # Master client record
    ├── identifiers.json   # AFM, GEMI, EFKA employer ID, IBANs
    ├── contacts.json      # Contact persons
    ├── notes.json         # Relationship notes and meeting logs
    ├── compliance/
    │   ├── filings.json      # All completed filings (VAT, EFKA, E1, etc.)
    │   ├── obligations.json  # Recurring obligation schedule
    │   └── gaps.json         # Missing/overdue filing log
    ├── documents/
    │   ├── registry.json       # Metadata index of all documents for this client
    │   ├── pending.json        # Documents awaiting processing or review
    │   └── archive-index.json  # References to archived documents
    ├── correspondence/
    │   └── {YYYYMMDD}_{type}_{draft-id}_sent.json  # Immutable sent communication records
    ├── comms-preferences.json  # Client-specific salutation, contact, language overrides
    ├── payroll/
    │   └── {YYYY-MM}/          # One folder per pay period
    │       ├── hours-input.csv              # Raw hours data
    │       ├── calculations.json            # Computed payroll data
    │       └── {employee-slug}_payslip.pdf  # Generated payslips
    ├── financial-statements/
    │   ├── index.json                         # All generated statements, versions, periods, status
    │   ├── {YYYY-MM}_pl_v{N}.json             # P&L machine-readable
    │   ├── {YYYY-MM}_balance-sheet_v{N}.json  # Balance sheet machine-readable
    │   ├── {YYYY-MM}_cash-flow_v{N}.json      # Cash flow machine-readable
    │   └── {YYYY-MM}_vat-summary_v{N}.json    # VAT summary machine-readable
    └── gdpr/
        ├── consent.json           # Consent records
        ├── retention-policy.json  # Retention schedule for this client
        └── deletion-log.json      # Record of any deletions performed
```
**AFM format:** Always `EL` + 9 digits, uppercase. Example: `EL123456789`. Never store without the `EL` prefix. Never use the 9-digit-only form as a directory name.
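The AFM rule lends itself to a format check before any path is built. The following is a minimal sketch, not part of the system itself: `is_valid_afm` and `client_dir` are hypothetical names, and the check covers only the `EL` + 9 digits shape described here, not whether the number is a real registered AFM.

```bash
# Hypothetical helper: succeed only if the argument is exactly "EL"
# followed by nine digits, e.g. EL123456789. Lowercase "el" is rejected.
is_valid_afm() {
  case "$1" in
    EL[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: print the client directory for a valid AFM,
# or fail with a message on stderr without printing a path.
client_dir() {
  is_valid_afm "$1" || { printf 'invalid AFM: %s\n' "$1" >&2; return 1; }
  printf '%s/clients/%s\n' "${OPENCLAW_DATA_DIR:-/data}" "$1"
}
```

Rejecting the bare 9-digit form outright, rather than silently adding the prefix, keeps a malformed identifier from ever becoming a directory name.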
---
## 4. `/data/compliance/` — Government Filings
Stores the actual submission files (XML, PDF) generated for government platforms. The filing *record* lives in `/data/clients/{AFM}/compliance/filings.json` — this directory holds the *file artefacts* themselves.
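To make the split concrete, the sketch below prints both locations for one VAT return: the artefact path under `/data/compliance/vat/`, following the `{AFM}_{YYYY}{MM}_vat_return.xml` pattern listed in this section, and the client file that holds the filing record. The function name `vat_return_paths` is hypothetical, and the zero-padded two-digit month is an assumption about `{MM}`.

```bash
# Hypothetical helper: print the artefact path and the record path
# for a VAT return, given AFM, year, and month.
vat_return_paths() {
  root="${OPENCLAW_DATA_DIR:-/data}"
  afm="$1"
  year="$2"
  # Strip one leading zero so "08" is not parsed as an invalid octal number,
  # then pad back to two digits (assumption: {MM} is zero-padded).
  month="$(printf '%02d' "${3#0}")"
  printf 'artefact: %s/compliance/vat/%s_%s%s_vat_return.xml\n' "$root" "$afm" "$year" "$month"
  printf 'record:   %s/clients/%s/compliance/filings.json\n' "$root" "$afm"
}
```

With `OPENCLAW_DATA_DIR=/data`, `vat_return_paths EL123456789 2025 3` reports the artefact at `/data/compliance/vat/EL123456789_202503_vat_return.xml` and the record in `/data/clients/EL123456789/compliance/filings.json`.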
```
/data/compliance/
├── vat/
│   └── {AFM}_{YYYY}{MM}_vat_return.xml   # VAT return XML for TAXIS
├── mydata/
│   └── {AFM}_{YYYY}{MM}_{invoice-number}_mydata.xml
├── efka/
│   └── {AFM}_{YYYY}{MM}_efka_declaration.xml
├── e1/
│   └── {AFM}_{YYYY}_e1_form.xml          # Individual tax returns
├── e3/
│   └── {AFM}_{YYYY}_e3_form.xml          # Business activity statements
├── corporate-tax/
│   └── {AFM}_{YYYY}_corporate_tax.xml
└── submissions/
    └── {AFM}