canonical-data-map

ClawSkills author: openclaw-greek-accounting, v1.1.0

Single source of truth for all paths, naming conventions, and data formats across the OpenClaw Greek Accounting system. Reference document.

Install / download

TotalClaw CLI (recommended)
totalclaw install clawskills:clawskills~satoshistackalotto-canonical-data-map
cURL (direct download, no login required)
curl -fsSL https://skills.taituai.com/api/skills/clawskills%3Aclawskills~satoshistackalotto-canonical-data-map/file -o satoshistackalotto-canonical-data-map.md
# Canonical Data Directory Map
## OpenClaw Greek Accounting System — v1.1

## Setup

This skill is a reference document — it defines the directory structure and naming conventions used by all other Greek accounting skills. No binaries or credentials required.

```bash
# Set the data directory (all skills read this)
export OPENCLAW_DATA_DIR="/data"

# Initialize the full directory structure
mkdir -p "$OPENCLAW_DATA_DIR"/{incoming/{invoices,receipts,statements,government},processing,clients,compliance/{vat,efka,mydata,e1,e3},banking/{imports/{alpha,nbg,eurobank,piraeus},processing,reconciliation},ocr/{incoming,output},efka,reports,exports,imports,dashboard,auth,gdpr-exports,memory,system/{logs,process-locks},backups}
```

This document defines the complete file system architecture for the OpenClaw Greek Accounting system. It is the authoritative reference for all path decisions. No skill may introduce a new top-level directory or deviate from the naming conventions defined here without a version update to this document.

**v1.1 change:** Added `/data/memory/` — agent episodic memory, failure logs, pattern store, GitHub proposal queue, and rate-limit state. Owner: `memory-feedback` (Skill 19, Phase 4). All Phase 3B+ skills must include episode and failure log hooks that write into this tree.

---

## Root Structure

```
/data/
├── incoming/          # All raw input: documents arriving into the system
├── processing/        # Temporary working space: files mid-pipeline
├── clients/           # Canonical client records: the source of truth
├── compliance/        # Government filings and submissions
├── banking/           # Bank statement processing pipeline
├── ocr/               # OCR processing pipeline
├── efka/              # EFKA/social security processing pipeline
├── reports/           # Generated reports for human consumption
├── exports/           # Data exports leaving the system
├── imports/           # Bulk data imports entering the system
├── dashboard/         # Dashboard state, config, cache, history
├── auth/              # Authentication and access control
├── backups/           # Encrypted system backups
├── gdpr-exports/      # GDPR subject access request exports
├── memory/            # Agent episodic memory, failure logs, learning patterns, proposals
└── system/            # System-level files: logs, schema versions, locks
```
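
The rule that no skill may introduce a new top-level directory can be checked mechanically. The following is a minimal sketch, not part of any OpenClaw skill: the function name `is_canonical_path` and the variable `CANONICAL_TOP_LEVEL` are hypothetical, and the directory names are copied from the listing above.

```bash
# Hypothetical helper; names below are illustrative, not an OpenClaw API.
# Top-level directories copied from the Root Structure listing (v1.1).
CANONICAL_TOP_LEVEL="incoming processing clients compliance banking ocr efka reports exports imports dashboard auth backups gdpr-exports memory system"

# Succeeds (exit 0) when the first path component under the data root
# is one of the canonical top-level directories.
is_canonical_path() {
  root="${OPENCLAW_DATA_DIR:-/data}"
  case "$1" in
    "$root"/*) rel=${1#"$root"/} ;;   # strip the data root prefix
    *) return 1 ;;                    # not under the data root at all
  esac
  top=${rel%%/*}                      # keep only the first component
  for name in $CANONICAL_TOP_LEVEL; do
    if [ "$name" = "$top" ]; then
      return 0
    fi
  done
  return 1
}
```

A skill could call `is_canonical_path "$target" || exit 1` before creating a file, so that a path such as `/data/scratch/file.txt` is rejected while `/data/memory/...` is accepted.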

---

## 1. `/data/incoming/` — Raw Input

All documents entering the system land here first, regardless of source (email attachment, manual drop, scanner, bank download). Nothing in `/data/incoming/` is processed yet.

```
/data/incoming/
├── invoices/          # Supplier invoices (PDF, image)
├── receipts/          # Receipts (PDF, image, phone photo)
├── statements/        # Bank statements (PDF, CSV, OFX)
├── government/        # AADE/EFKA notifications and documents
├── payroll/           # Hour sheets, employee documents
├── tax-documents/     # Tax certificates, employer statements (βεβαιώσεις)
├── contracts/         # Contracts and legal documents
└── other/             # Uncategorised; routed after classification
```

**Naming convention for incoming files:**
Files dropped here may arrive with any name. The system must NOT rename them on arrival — the original filename is preserved for audit purposes. The system assigns a canonical name only when moving to `/data/processing/`.
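
As an illustration of the no-rename rule, the sketch below places a file into an incoming category under its original filename. The function `drop_incoming` is hypothetical. Copying rather than moving, and treating a name collision as an error, are assumptions, since the document does not specify either.

```bash
# Hypothetical intake helper; category names come from the listing above.
INCOMING_CATEGORIES="invoices receipts statements government payroll tax-documents contracts other"

# Copy a file into <data root>/incoming/<category>/ under its original name.
# Assumption: an existing file with the same name is an error, because the
# document does not say how collisions are handled.
drop_incoming() {
  src="$1"
  category="$2"
  root="${OPENCLAW_DATA_DIR:-/data}"

  known=no
  for c in $INCOMING_CATEGORIES; do
    if [ "$c" = "$category" ]; then known=yes; fi
  done
  if [ "$known" != yes ]; then
    echo "unknown category: $category" >&2
    return 2
  fi

  # The original filename is kept verbatim (spaces included) for audit.
  dest="$root/incoming/$category/$(basename -- "$src")"
  if [ -e "$dest" ]; then
    echo "refusing to overwrite: $dest" >&2
    return 3
  fi

  mkdir -p "$root/incoming/$category" && cp -- "$src" "$dest"
}
```

For example, `drop_incoming "./IMG 0042.jpg" receipts` would leave the file at `incoming/receipts/IMG 0042.jpg` under the data root, with its name untouched.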

---

## 2. `/data/processing/` — In-Flight Pipeline

Temporary working space. Files here are mid-pipeline and may be incomplete. No other skill should read from `/data/processing/` as a final source — always read from `/data/clients/` or `/data/compliance/` for canonical data.

```
/data/processing/
├── ocr/               # OCR in progress
│   ├── queued/        # Waiting for OCR
│   ├── enhanced/      # Image pre-processing complete
│   ├── extracted/     # Text extracted, not yet validated
│   └── validated/     # OCR output validated, ready to route
├── classification/    # Document type identification in progress
├── reconciliation/    # Bank reconciliation working files
│   ├── matching/      # Transaction matching in progress
│   └── flagged/       # Items needing human review
├── compliance/        # Filing preparation working files
│   ├── vat/           # VAT return preparation
│   ├── efka/          # EFKA declaration preparation
│   └── mydata/        # myDATA submission preparation
└── imports/           # Bulk import validation in progress
```

**Cleanup policy:** Files in `/data/processing/` are deleted or archived after the pipeline completes successfully. They are never the canonical record.
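
Because working files are never the canonical record, a consuming skill can guard its reads. The sketch below is one possible check, assuming the hypothetical function name `is_final_source`. It accepts only the two trees this section names as canonical sources.

```bash
# Hypothetical guard: succeeds only for paths under the trees this section
# names as canonical sources (clients/ and compliance/ under the data root).
is_final_source() {
  root="${OPENCLAW_DATA_DIR:-/data}"
  case "$1" in
    "$root"/processing/*) return 1 ;;                       # in-flight, never canonical
    "$root"/clients/*|"$root"/compliance/*) return 0 ;;     # canonical trees
    *) return 1 ;;
  esac
}
```

Note that `/data/processing/compliance/vat/...` is rejected even though it contains the word `compliance`, since only the prefix directly under the data root is considered.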

---

## 3. `/data/clients/` — Client Master Records

The single source of truth for all client data. Every other skill that needs client information reads from here. Only the `client-data-management` skill writes to this tree.

```
/data/clients/
├── _index.json                    # Global client index (name, AFM, status, assignee)
├── _audit-log.json                # All access and change events across all clients
├── _schema-version.json           # Current schema version for migration tracking
└── {AFM}/                         # One directory per client, keyed by AFM (e.g. EL123456789)
    ├── profile.json               # Master client record
    ├── identifiers.json           # AFM, GEMI, EFKA employer ID, IBANs
    ├── contacts.json              # Contact persons
    ├── notes.json                 # Relationship notes and meeting logs
    ├── compliance/
    │   ├── filings.json           # All completed filings (VAT, EFKA, E1, etc.)
    │   ├── obligations.json       # Recurring obligation schedule
    │   └── gaps.json              # Missing/overdue filing log
    ├── documents/
    │   ├── registry.json          # Metadata index of all documents for this client
    │   ├── pending.json           # Documents awaiting processing or review
    │   └── archive-index.json     # References to archived documents
    ├── correspondence/
    │   └── {YYYYMMDD}_{type}_{draft-id}_sent.json  # Immutable sent communication records
    ├── comms-preferences.json     # Client-specific salutation, contact, language overrides
    ├── payroll/
    │   └── {YYYY-MM}/             # One folder per pay period
    │       ├── hours-input.csv    # Raw hours data
    │       ├── calculations.json  # Computed payroll data
    │       └── {employee-slug}_payslip.pdf   # Generated payslips
    ├── financial-statements/
    │   ├── index.json             # All generated statements, versions, periods, status
    │   ├── {YYYY-MM}_pl_v{N}.json               # P&L machine-readable
    │   ├── {YYYY-MM}_balance-sheet_v{N}.json     # Balance sheet machine-readable
    │   ├── {YYYY-MM}_cash-flow_v{N}.json         # Cash flow machine-readable
    │   └── {YYYY-MM}_vat-summary_v{N}.json       # VAT summary machine-readable
    └── gdpr/
        ├── consent.json           # Consent records
        ├── retention-policy.json  # Retention schedule for this client
        └── deletion-log.json      # Record of any deletions performed
```

**AFM format:** Always `EL` + 9 digits, uppercase. Example: `EL123456789`. Never store without the `EL` prefix. Never use the 9-digit-only form as a directory name.
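
The AFM rule is easy to enforce before any directory is created. The sketch below checks the format only (the `EL` prefix, uppercase, exactly nine digits) and does not validate the number itself. The function names `is_valid_afm` and `client_dir` are hypothetical.

```bash
# Hypothetical helpers; not part of any OpenClaw skill.

# Succeeds when the argument is uppercase EL followed by exactly nine digits.
is_valid_afm() {
  case "$1" in
    EL[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the client directory for an AFM, or fail without printing a path.
client_dir() {
  if is_valid_afm "$1"; then
    printf '%s/clients/%s\n' "${OPENCLAW_DATA_DIR:-/data}" "$1"
  else
    echo "invalid AFM: $1" >&2
    return 1
  fi
}
```

With `OPENCLAW_DATA_DIR=/data`, `client_dir EL123456789` prints `/data/clients/EL123456789`, while `client_dir 123456789` and `client_dir el123456789` both fail, which keeps the 9-digit-only and lowercase forms out of the tree.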

---

## 4. `/data/compliance/` — Government Filings

Stores the actual submission files (XML, PDF) generated for government platforms. The filing *record* lives in `/data/clients/{AFM}/compliance/filings.json` — this directory holds the *file artefacts* themselves.
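
Artefact names in this tree are built from the client's AFM and the filing period, as the listing that follows shows. As a rough sketch of that convention, the hypothetical function `vat_return_path` assembles the path for a VAT return. Treating `{MM}` as a two-digit month from 01 to 12 is an assumption.

```bash
# Hypothetical helper: print the path of a VAT return artefact, following
# the {AFM}_{YYYY}{MM}_vat_return.xml pattern.
# Assumption: {MM} is a two-digit month in the range 01-12.
vat_return_path() {
  afm="$1"
  year="$2"
  month="$3"
  case "$afm" in
    EL[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;   # EL + 9 digits
    *) return 1 ;;
  esac
  case "$year" in
    [0-9][0-9][0-9][0-9]) ;;                              # four-digit year
    *) return 1 ;;
  esac
  case "$month" in
    0[1-9]|1[0-2]) ;;                                     # zero-padded month
    *) return 1 ;;
  esac
  printf '%s/compliance/vat/%s_%s%s_vat_return.xml\n' \
    "${OPENCLAW_DATA_DIR:-/data}" "$afm" "$year" "$month"
}
```

With `OPENCLAW_DATA_DIR=/data`, `vat_return_path EL123456789 2025 03` prints `/data/compliance/vat/EL123456789_202503_vat_return.xml`. An unpadded month such as `3` is rejected so that filenames sort consistently.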

```
/data/compliance/
├── vat/
│   └── {AFM}_{YYYY}{MM}_vat_return.xml      # VAT return XML for TAXIS
├── mydata/
│   └── {AFM}_{YYYY}{MM}_{invoice-number}_mydata.xml
├── efka/
│   └── {AFM}_{YYYY}{MM}_efka_declaration.xml
├── e1/
│   └── {AFM}_{YYYY}_e1_form.xml             # Individual tax returns
├── e3/
│   └── {AFM}_{YYYY}_e3_form.xml             # Business activity statements
├── corporate-tax/
│   └── {AFM}_{YYYY}_corporate_tax.xml
└── submissions/
    └── {AFM}_{YYYY}{MM}_{type}_submission-receipt.json   # Government confirmation recei