Grand Diomande Research · Full HTML Reader

Obsidian Vault Integration — Architecture & Operations Guide


Version: 0.2.0
Created: 2026-03-01
Updated: 2026-03-01
Status: Phases 1-4 complete, Phase 5 deferred

---

1. Problem Statement

Knowledge generated by the agent ecosystem is fragmented across 6+ disconnected stores:

| Store | Type | Weakness |
|-------|------|----------|
| `memory/*.md` files | Flat Markdown | No linking, manual curation, linear |
| Kimi SQLite DB (`kimi_memory.db`) | Structured tables | Queryable but invisible, no graph |
| Supabase | Cloud relational | API-only access, no browsing |
| Orbit | Semantic memory | Black-box embeddings, no human navigation |
| Discord threads | Chat messages | Ephemeral, unsearchable after scroll |
| Plan files (`.claude/plans/`) | Task-scoped Markdown | Die when plans complete |

None of these stores link ideas together. A synthesis about "N'Ko voice keyboards" doesn't know it's related to a Pulse session that built the "Speak" project, which connects to the "voice recognition" entity in the knowledge graph.

Obsidian's `[[bidirectional links]]` address this directly. Every `[[entity]]` reference creates a backlink on the target note, so over time the graph view reveals clusters, orphans, and unexpected connections between ideas. The `obsidian-headless` CLI (v0.0.3) makes the vault machine-writable without the desktop app.

---

2. Architecture Overview

                          ┌─────────────────────────────┐
                          │     Discord Message          │
                          └──────────┬──────────────────┘
                                     │
                          ┌──────────▼──────────────────┐
                          │  synthesis-preprocessor.js   │
                          │  (Clawdbot hook)             │
                          │                              │
                          │  callSynthesizer()           │
                          │      ↓ Kimi-K2 via Together  │
                          │  Returns structured JSON     │
                          │      ↓                       │
                          │  writeToVault() ◄── NEW      │
                          │  fire-and-forget spawn       │
                          └──────────┬──────────────────┘
                                     │ stdin JSON pipe
                          ┌──────────▼──────────────────┐
                          │  obsidian_vault_writer       │
                          │  writer.py --mode synthesis  │
                          │                              │
                          │  • YAML frontmatter          │
                          │  • [[wikilinks]] extraction  │
                          │  • Entity/Project stubs      │
                          └──────────┬──────────────────┘
                                     │ writes .md files
                          ┌──────────▼──────────────────┐
                          │  [home-path]           │
                          │  (cloud-vm filesystem)       │
                          └──────────┬──────────────────┘
                                     │
                          ┌──────────▼──────────────────┐
                          │  obsidian-sync.service       │
                          │  ob sync --continuous        │
                          │  (systemd, always running)   │
                          └──────────┬──────────────────┘
                                     │ Obsidian Sync (E2E encrypted)
                    ┌────────┬───────┼───────┬────────┐
                    ▼        ▼       ▼       ▼        ▼
                 iPhone    Mac1    Mac3    Mac4     Mac5
                 Obsidian  Obsidian        Obsidian  Obsidian
                 App       Desktop         Desktop   Desktop

Catch-up Path (Prefect)

  ┌──────────────────────────────────┐
  │  vault_sync.py (every 6h)       │
  │  Prefect flow on cloud-vm       │
  │                                  │
  │  1. Query kimi_memory.db for     │
  │     synthesis rows > last_sync   │
  │  2. Write missing vault notes    │
  │  3. Generate Daily/ summary      │
  │  4. Detect orphan notes          │
  └──────────────────────────────────┘

Multi-Agent Session Path

                    ┌──────────────────────────────────────┐
                    │         Agent Sessions                 │
                    │                                        │
      ┌─────────┐  │  ┌───────────┐  ┌────────┐  ┌──────┐ │
      │ Claude   │──┤  │ Codex     │  │Gemini  │  │ CAP  │ │
      │ Code     │  │  │ (OpenAI)  │  │CLI     │  │agents│ │
      └─────┬───┘  │  └─────┬─────┘  └───┬────┘  └──┬───┘ │
            │      │        │             │          │      │
            ▼      │        ▼             ▼          ▼      │
    SessionEnd     │   stdin JSON     stdin JSON  stdin JSON │
    hook fires     │      pipe           pipe       pipe    │
            │      │        │             │          │      │
            ▼      └────────┼─────────────┼──────────┼──────┘
    write_vault_            │             │          │
    session()               │             │          │
            │               │             │          │
            └───────────────┼─────────────┼──────────┘
                            ▼
                ┌──────────────────────────────┐
                │  obsidian_vault_writer        │
                │  writer.py --mode agent-session│
                │                              │
                │  Universal agent session JSON │
                │  ↓                            │
                │  Sessions/{date}-{type}-{id}.md│
                └──────────────────────────────┘

Prompt-Log Scanner (Prefect catch-up)

  [home-path]
      │
      ├── <session-id>/
      │   ├── metadata.json    ← scope, started_at, model
      │   └── prompts.jsonl    ← all prompts with tool calls
      │
      ▼
  vault_sync.py → scan_agent_sessions task (every 6h)
      │
      ├── Check each session dir modified since last_sync
      ├── Skip if vault note already exists (Sessions/*-claudecode-{id}.md)
      ├── Extract: goal, stats, files, project refs
      └── VaultWriter.write_agent_session() → Sessions/{date}-claudecode-{id}.md

Backfill Path (one-time)

  kimi_memory.db ──► backfill.py ──► [home-path]
      │                                  │
      ├─ synthesis_results (4 rows)      ├─ Inbox/ (4 notes)
      ├─ context_memory (11 rows)        ├─ Knowledge/ (3 index files)
      └─ knowledge_graph (5,562 triples) └─ Entities/ (53 notes, filtered)
                                             + 22 stubs from synthesis links

---

3. Component Details

3.1 Vault Writer Module

Location: `[home-path]`
Runtime: Python 3.14, standalone venv (`.venv/`), single dependency (`pyyaml`)

| File | Lines | Purpose |
|------|-------|---------|
| `__init__.py` | 7 | Package marker, version string |
| `config.py` | 50 | Constants: vault path, directory names, filter lists |
| `slugify.py` | 32 | Unicode-aware title → filesystem-safe slug |
| `templates.py` | 334 | f-string templates for all 8 note types (incl. agent session) |
| `linker.py` | 72 | Extracts `[[wikilinks]]` from synthesis JSON fields |
| `writer.py` | 451 | `VaultWriter` class + CLI entry point (4 modes) |
| `backfill.py` | 200 | One-time kimi_memory.db → vault migration |
| `requirements.txt` | 1 | `pyyaml>=6.0` |
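
For orientation, a Unicode-aware slug helper of the kind `slugify.py` is described as providing could look roughly like the sketch below. The actual module is not reproduced in this guide, so the function name `slugify_title`, the 80-character cap, and the `untitled` fallback are all assumptions made for illustration.

```python
import re
import unicodedata


def slugify_title(title: str, max_length: int = 80) -> str:
    """Turn a note title into a filesystem-safe slug (hypothetical helper)."""
    # NFKC folds compatibility forms so visually identical titles match.
    text = unicodedata.normalize("NFKC", title).strip().lower()
    # \w is Unicode-aware for str patterns, so letters and digits in any
    # script are kept; punctuation such as apostrophes is dropped.
    text = re.sub(r"[^\w\s-]", "", text)
    # Collapse runs of whitespace, underscores, and hyphens into one hyphen.
    text = re.sub(r"[\s_-]+", "-", text).strip("-")
    # Assumed fallback for titles that contain no usable characters.
    return text[:max_length].rstrip("-") or "untitled"
```

A slug like this suits the `{HHMMSS}-{slug}.md` Inbox filenames; entity and project notes would presumably keep their display names so that `[[wikilinks]]` resolve by filename.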

VaultWriter API

python
from pathlib import Path

from obsidian_vault_writer.writer import VaultWriter

writer = VaultWriter(vault_path=Path("[home-path]"))

# Write a synthesis result as an Inbox note
path = writer.write_synthesis(synthesis_dict, message_str, channel_str)

# Write a Pulse session summary
path = writer.write_session_summary(session_dict)

# Write a universal agent session note (Claude Code, Codex, Gemini, etc.)
path = writer.write_agent_session(session_dict)

# Write a daily rollup
path = writer.write_daily_summary(date_str, stats_dict)

# Create entity/project/concept stubs (idempotent, no-op if exists)
path = writer.ensure_entity("N'Ko keyboard")
path = writer.ensure_project("comp-core")
path = writer.ensure_concept("Voice Input", essence="...", tags=["NKo"])

CLI Modes

bash
# Synthesis (reads JSON from stdin)
echo '{"synthesis": {...}, "message": "..."}' | \
  python3 -m obsidian_vault_writer.writer --mode synthesis --channel discord

# Pulse session
python3 -m obsidian_vault_writer.writer --mode session --session-id abc12345

# Universal agent session (reads JSON from stdin)
echo '{"agent_type":"claude-code","session_id":"...","stats":{...}}' | \
  python3 -m obsidian_vault_writer.writer --mode agent-session

# Daily summary
python3 -m obsidian_vault_writer.writer --mode daily --date 2026-03-01

Link Extraction Logic (`linker.py`)

The linker scans three synthesis fields to generate `[[wikilinks]]`:

| Source Field | What it Extracts | Link Target |
|--------------|------------------|-------------|
| `project_refs` | Project names | `Projects/{ref}.md` |
| `knowledge_connections` | Subject and object of each triple | `Entities/{name}.md` |
| `dream_seeds` | Seed titles and tags | `Concepts/{title}.md`, `Entities/{tag}.md` |

Deduplication: entities that match a project_ref (case-insensitive) are excluded from the "Related" line to avoid redundancy.
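
The extraction and deduplication rules above can be sketched as follows. This is not the code in `linker.py`; the field shapes (triples as dicts with `subject`/`object` keys, seeds as dicts with `title` and `tags`) and the function names `extract_links` and `links_section` are assumptions chosen to match the sample notes in this guide.

```python
def extract_links(synthesis: dict) -> dict:
    """Collect wikilink targets from the three synthesis fields."""
    projects = list(dict.fromkeys(synthesis.get("project_refs", [])))
    entities, concepts = [], []
    for triple in synthesis.get("knowledge_connections", []):
        entities += [triple.get("subject"), triple.get("object")]
    for seed in synthesis.get("dream_seeds", []):
        concepts.append(seed.get("title"))
        entities += seed.get("tags", [])

    def unique(names, exclude=()):
        # Order-preserving, case-insensitive de-duplication.
        seen, kept = set(exclude), []
        for name in names:
            if name and name.casefold() not in seen:
                seen.add(name.casefold())
                kept.append(name)
        return kept

    project_keys = {p.casefold() for p in projects}
    return {
        "projects": projects,
        "concepts": unique(concepts),
        # Entities that merely repeat a project_ref are left out of "Related".
        "related": unique(entities, exclude=project_keys),
    }


def links_section(links: dict) -> str:
    """Render the lines that appear under '## Links' in a note."""
    lines = [f"- Project: [[{p}]]" for p in links["projects"]]
    if links["related"]:
        lines.append("- Related: " + ", ".join(f"[[{e}]]" for e in links["related"]))
    return "\n".join(lines)
```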

3.2 Vault Directory Structure

Location: `[home-path]` on cloud-vm (synced to all devices)

[home-path]
├── Inbox/                  ← Real-time synthesis notes, organized by date
│   └── YYYY-MM-DD/
│       └── {HHMMSS}-{slug}.md
├── Projects/               ← One note per project_ref (evergreen index)
│   └── {project-name}.md
├── Entities/               ← One note per knowledge graph entity
│   └── {entity-name}.md      (accumulates backlinks over time)
├── Concepts/               ← Dream seed ideas that grow
│   └── {concept-title}.md
├── Sessions/               ← Agent session summaries (Pulse, Claude Code, other agents)
│   ├── {date}-pulse-{short-id}.md
│   └── {date}-{agenttype}-{short_id}.md
├── Daily/                  ← Prefect-generated daily rollups
│   └── YYYY-MM-DD.md
├── Knowledge/              ← Curated index notes
│   ├── Facts.md
│   ├── Preferences.md
│   └── Patterns.md
└── Templates/              ← Obsidian templates for manual note creation
    ├── Synthesis.md
    ├── Session.md
    ├── Entity.md
    ├── Project.md
    └── Daily.md

3.3 Note Format

Every note has YAML frontmatter for Obsidian metadata queries and typed categorization.

Synthesis Note (Inbox)

markdown
---
type: synthesis
intent: idea
confidence: 0.85
route: direct
channel: discord
project_refs: [nko-linguistics]
tags: [NKo, voice recognition, adaptive technology]
created: 2026-03-01T10:30:00Z
---

# Voice-Controlled NKo Keyboard

## Enriched Prompt
Context-rich version of the original message...

## Dream Seeds
- **Voice-Controlled NKo Keyboard** (energy: 0.7)
  Tags: NKo, voice recognition, adaptive technology

## Skill Chain
lin:nko → thk:quantum

## Knowledge Connections
- [[NKo keyboard]] --related to--> [[voice recognition]]
- [[adaptive technology]] --applies to--> [[keyboard design]]

## Learnings
### Facts
- **nko_voice_input**: Voice input for NKo requires tone-aware recognition

## Links
- Project: [[nko-linguistics]]
- Related: [[NKo keyboard]], [[voice recognition]], [[keyboard design]]

Entity Note (auto-generated stub)

markdown
---
type: entity
created: 2026-03-01T10:30:00Z
auto_generated: true
---

# voice recognition

> Auto-generated entity stub. Backlinks will accumulate as more notes
> reference this entity.

Over time, backlinks from multiple synthesis notes, sessions, and other entities converge on entity pages, creating emergent clusters in the graph view.
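
A minimal sketch of how an idempotent stub writer in the spirit of `ensure_entity()` might behave is shown below. It is an illustration rather than the `VaultWriter` implementation: the function name `ensure_entity_stub` is hypothetical, and it assumes the entity name is used verbatim as the filename so that `[[voice recognition]]` resolves to `Entities/voice recognition.md`.

```python
from datetime import datetime, timezone
from pathlib import Path

STUB_BODY = (
    "> Auto-generated entity stub. Backlinks will accumulate as more notes\n"
    "> reference this entity.\n"
)


def ensure_entity_stub(vault: Path, name: str) -> Path:
    """Create Entities/{name}.md if missing; leave an existing note untouched."""
    path = vault / "Entities" / f"{name}.md"
    if path.exists():
        return path  # idempotent: never overwrite a note that already exists
    path.parent.mkdir(parents=True, exist_ok=True)
    created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    path.write_text(
        f"---\ntype: entity\ncreated: {created}\nauto_generated: true\n---\n\n"
        f"# {name}\n\n{STUB_BODY}",
        encoding="utf-8",
    )
    return path
```

Because the function returns early when the file exists, repeated synthesis notes that mention the same entity cannot clobber manual edits made to its page.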

Entity Note (backfill, with relations)

markdown
---
type: entity
relations: 74
created: 2026-03-01T17:13:42Z
source: backfill
---

# Buf Barista

## Relations
- --works_on--> [[project template]]
- --created--> [[Spore]]
- --works_on--> [[Graph Kernel]]
- --works_on--> [[Prompt Synthesizer]]
...

3.4 Discord → Vault Pipeline

Modified file: `[home-path]`

The synthesis hook already intercepts all Discord messages, runs them through Kimi-K2 synthesis, and attaches structured metadata. Two additions were made:

Constants (top of file, after existing config)

javascript
const VAULT_WRITER_PATH = path.join(os.homedir(),
  'projects/obsidian_vault_writer/writer.py');
const VAULT_PYTHON_PATH = path.join(os.homedir(),
  'projects/obsidian_vault_writer/.venv/bin/python3');

`writeToVault()` function

javascript
function writeToVault(synthesis, messageContent, channel) {
  try {
    const proc = spawn(VAULT_PYTHON_PATH, [
      '-m', 'obsidian_vault_writer.writer',
      '--mode', 'synthesis',
      '--channel', channel || 'unknown',
    ], {
      cwd: path.join(os.homedir(), 'projects'),
      env: {
        ...process.env,
        VAULT_PATH: path.join(os.homedir(), 'obsidian-vault'),
      },
      timeout: 10000,
      stdio: ['pipe', 'ignore', 'ignore'],
    });
    proc.stdin.write(JSON.stringify({ synthesis, message: messageContent }));
    proc.stdin.end();
    proc.unref();
  } catch (e) { /* silent fail — vault write is non-critical */ }
}

Design constraints enforced:

| Constraint | How |
|------------|-----|
| Never block message flow | `spawn()` + `proc.unref()`: Node doesn't wait for the child |
| No stdout/stderr noise | `stdio: ['pipe', 'ignore', 'ignore']`: only stdin is open for JSON |
| Timeout protection | `timeout: 10000`: kills writer after 10s if stuck |
| Silent failure | `try/catch` with empty catch: vault write is non-critical |
| Data passing | JSON piped to stdin: avoids shell escaping issues with template literals |

Call site

Inserted after the "Synthesis complete" log (line ~207 in modified file), before the return statement that attaches synthesis to message metadata:

javascript
// Fire-and-forget: write synthesis to Obsidian vault
writeToVault(synthesis, context.content, context.channel);

3.5 Backfill System

File: `[home-path]`

One-time migration from `kimi_memory.db` (located at `[home-path]`).

Source Tables

| Table | Rows | Target | Notes |
|-------|------|--------|-------|
| `synthesis_results` | 4 | `Inbox/` notes | Joins with `messages` for original content |
| `context_memory` | 11 | `Knowledge/` index files | Grouped by category: facts (4), preferences (4), patterns (3) |
| `knowledge_graph` | 5,562 | `Entities/` notes | Heavy filtering required (see below) |

Knowledge Graph Filtering

The knowledge graph contains significant noise from filesystem scanning and ephemeral context. Four filter layers clean the data:

Layer 1 — Predicate filter (`SKIP_PREDICATES`):
Removes filesystem-artifact predicates: `has_file`, `has_path`, `contains_file`, `located_at`, `has_directory`, `has_extension`, `lives_in`

Layer 2 — Exact subject filter (`SKIP_SUBJECTS_EXACT`):
Removes generic subjects that accumulate thousands of meaningless relations: `user` (2,154 relations), `context` (842), `system` (45), `project` (5), `app` (2), `node`, `assistant`, `server`, `client`, `service`, `module`

Layer 3 — Prefix subject filter (`SKIP_SUBJECT_PREFIXES`):
Removes subjects starting with: `user `, `context `, `system `, `assistant `, `/`, `~`, `.`, `http`

Layer 4 — Length filter:
Removes subjects shorter than 2 chars or longer than 100 chars.

Result: the filters removed 5,367 of the 5,562 triples, leaving 195 meaningful triples spread across 53 entity notes.
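
The four layers can be expressed as a single predicate, sketched below. The constant names and their contents come from the lists above; the function name `keep_triple` and the choice to match subjects case-insensitively are assumptions, since the guide does not show how `backfill.py` applies the filters.

```python
SKIP_PREDICATES = {"has_file", "has_path", "contains_file", "located_at",
                   "has_directory", "has_extension", "lives_in"}
SKIP_SUBJECTS_EXACT = {"user", "context", "system", "project", "app", "node",
                       "assistant", "server", "client", "service", "module"}
SKIP_SUBJECT_PREFIXES = ("user ", "context ", "system ", "assistant ",
                         "/", "~", ".", "http")


def keep_triple(subject: str, predicate: str) -> bool:
    """Return True if a triple survives all four filter layers."""
    s = subject.strip()
    lowered = s.lower()  # assumption: subject matching ignores case
    if predicate in SKIP_PREDICATES:               # Layer 1: predicate filter
        return False
    if lowered in SKIP_SUBJECTS_EXACT:             # Layer 2: exact subject
        return False
    if lowered.startswith(SKIP_SUBJECT_PREFIXES):  # Layer 3: subject prefix
        return False
    return 2 <= len(s) <= 100                      # Layer 4: length bounds
```

Ordering the checks from cheapest and most selective (set lookups) to the length test keeps the pass over several thousand triples trivial.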

Case-Insensitive Deduplication

Subjects like `Clawdbot` and `clawdbot` are merged under a single entity note. The display name is chosen by preferring the variant with more uppercase characters (e.g., `Clawdbot` wins over `clawdbot`).
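
The display-name rule can be illustrated with a short helper. This is a sketch, not the backfill code: `merge_case_variants` is a hypothetical name, and breaking ties alphabetically is an assumption the guide does not specify.

```python
from collections import defaultdict


def merge_case_variants(subjects):
    """Map each case-folded key to the spelling with the most uppercase letters."""
    variants = defaultdict(set)
    for subject in subjects:
        variants[subject.casefold()].add(subject)
    return {
        # sorted() makes the result deterministic; max() keeps the first
        # spelling among those tied for the highest uppercase count.
        key: max(sorted(names), key=lambda n: sum(ch.isupper() for ch in n))
        for key, names in variants.items()
    }
```

Relations from every variant would then be attached to the single note named after the chosen spelling.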

Backfill Results (2026-03-01)

Synthesis notes: 4
Context entries: 11 (across 3 Knowledge index files)
Entity notes:    53 (from knowledge graph)
                +22 (stubs from synthesis link extraction)
                ────
Total:           89 notes + 5 templates = 94 files

3.6 Prefect Vault Sync Flow

File: `[home-path]`
Schedule: Every 6 hours (`0 */6 * * *`)
Deployment: Registered in `[home-path]`

Tasks

| Task | Purpose | Retry |
|------|---------|-------|
| `catch-up-vault-writes` | Query `synthesis_results` newer than `last_sync`, write missing Inbox notes | 1 retry, 30s delay |
| `scan-agent-sessions` | Scan `[home-path]` for Claude Code sessions without vault notes | 1 retry, 30s delay |
| `generate-daily-summary` | Create `Daily/{yesterday}.md` with stats and note links | None |
| `detect-orphan-notes` | Scan vault for notes with zero incoming `[[links]]` | None |

State File

`[home-path]`:

json
{
  "last_sync": "2026-03-01T17:13:42.000000",
  "last_daily": "2026-02-28"
}

Agent Session Scanner (Task 2)

| Task | Purpose | Retry |
|------|---------|-------|
| `scan-agent-sessions` | Scan `[home-path]` for sessions without vault notes | 1 retry, 30s delay |

Scans each session directory modified since `last_sync`. For each session:
1. Check if vault note already exists (`Sessions/*-claudecode-{short_id}.md`)
2. Read `metadata.json` for scope, model, timestamps
3. Read `prompts.jsonl` for prompt text, tool calls, affected targets
4. Extract goal from first prompt, calculate stats (duration, tool calls, files)
5. Determine project_refs from scope (subproject_name or repo_name)
6. Call `VaultWriter.write_agent_session()` with universal agent session dict
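
Steps 2 through 5 might be sketched as below. The on-disk field names are not documented in this guide, so `text` and `tool_calls` in `prompts.jsonl`, a `scope` object in `metadata.json`, and the function name `summarize_session` are all assumptions; the 100-character goal cutoff is borrowed from the SessionEnd hook's description.

```python
import json
from pathlib import Path


def summarize_session(session_dir: Path) -> dict:
    """Derive goal, stats, and project refs from one prompt-log session directory."""
    meta = json.loads((session_dir / "metadata.json").read_text(encoding="utf-8"))
    raw = (session_dir / "prompts.jsonl").read_text(encoding="utf-8")
    prompts = [json.loads(line) for line in raw.splitlines() if line.strip()]

    # Hypothetical field names: "text" and "tool_calls" per prompt record.
    first_text = prompts[0].get("text", "") if prompts else ""
    scope = meta.get("scope") or {}
    project = scope.get("subproject_name") or scope.get("repo_name")
    return {
        "agent_type": "claude-code",
        "session_id": session_dir.name,
        "model": meta.get("model"),
        "goal": first_text[:100],
        "stats": {
            "prompt_count": len(prompts),
            "tool_call_count": sum(len(p.get("tool_calls", [])) for p in prompts),
        },
        "project_refs": [project] if project else [],
        "source": "prompt_scan",
    }
```

The returned dict follows the universal agent session schema, so it can be handed to `VaultWriter.write_agent_session()` as in step 6.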

Orphan Detection Algorithm

1. Collect all `.md` file stems across vault (excluding `Daily/` and `Templates/`)
2. Scan all notes for `[[...]]` regex matches → build set of referenced names
3. Orphans = stems not in the referenced set
4. Capped at 30 results, appended to daily summary
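
The four steps above can be sketched in a few lines. This is an illustration under stated assumptions rather than the `detect-orphan-notes` task itself: the function name `find_orphans` is hypothetical, link targets are compared to file stems with exact case, and aliases (`[[note|label]]`) and heading anchors (`[[note#section]]`) are trimmed before comparison.

```python
import re
from pathlib import Path

# Capture the link target up to an alias "|", heading "#", or closing bracket.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")
EXCLUDED_DIRS = {"Daily", "Templates"}


def find_orphans(vault: Path, limit: int = 30) -> list[str]:
    """Return up to `limit` note stems that no note in the vault links to."""
    stems, referenced = set(), set()
    for note in vault.rglob("*.md"):
        text = note.read_text(encoding="utf-8", errors="ignore")
        # Step 2: every note, including Daily/ and Templates/, counts as a link source.
        referenced.update(target.strip() for target in WIKILINK.findall(text))
        # Step 1: only notes outside the excluded folders are orphan candidates.
        if note.relative_to(vault).parts[0] not in EXCLUDED_DIRS:
            stems.add(note.stem)
    # Steps 3 and 4: set difference, sorted for stable output, then capped.
    return sorted(stems - referenced)[:limit]
```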

3.7 Multi-Agent Session Integration

The vault captures sessions from any AI agent type, not just Discord synthesis. Each agent feeds into the same `write_agent_session()` method via a universal JSON schema.

Universal Agent Session Schema

json
{
  "agent_type": "claude-code",     // claude-code | codex | gemini | clawdbot | human | custom
  "session_id": "abc123...",
  "provider": "anthropic",          // anthropic | openai | google | together | ...
  "model": "claude-opus-4-6",
  "goal": "Implement vault writer",
  "outcome": "complete",            // complete | failed | cancelled | interrupted
  "date": "2026-03-01",
  "stats": {
    "duration_minutes": 45.3,
    "prompt_count": 12,
    "tool_call_count": 87,
    "files_modified": ["writer.py", "templates.py"],
    "files_read": ["config.py", "session_end_hook.py"],
    "estimated_tokens": 15000
  },
  "project_refs": ["obsidian-vault-writer"],
  "key_topics": ["cc-graph-kernel", "Comp-Core"],
  "source": "session_end_hook"     // session_end_hook | prompt_scan | cap | daemon | manual
}

Input Paths

| Source | Agent Type | Trigger | Mechanism |
|--------|------------|---------|-----------|
| Claude Code SessionEnd hook | `claude-code` | Session ends | `write_vault_session()` in `session_end_hook.py` spawns vault writer as subprocess |
| Prompt-log scanner | `claude-code` | Prefect flow every 6h | `scan_agent_sessions()` task in `vault_sync.py` retroactively finds missed sessions |
| AAO Reputation Collector | any | Task quality assessed | `write_task_to_vault()` in `aao_reputation_collector.py`; fire-and-forget after each assessment |
| Codex CLI | `codex` | Manual or daemon | Pipe JSON to `writer.py --mode agent-session` via stdin |
| Gemini CLI | `gemini` | Manual or daemon | Pipe JSON to `writer.py --mode agent-session` via stdin |
| CAP (Clarity Agent Protocol) | any | Agent dispatch complete | Future: CAP dispatcher calls vault writer after agent completes |
| cc-agent-daemon | any | Daemon execution complete | Future: Daemon reports result + vault write |

Claude Code SessionEnd Hook Integration

Modified file: `[home-path]`

The existing SessionEnd hook already computes comprehensive session stats (duration, prompt count, tool calls, files modified/read, scope). A new `write_vault_session()` function was added:

python
def write_vault_session(session_id, stats, prompts):
    """Write session to Obsidian vault. Fire-and-forget subprocess."""
    # 1. Extract goal from first prompt text (first 100 chars)
    # 2. Build project_refs from scope (subproject_name or repo_name)
    # 3. Extract key_topics from affected targets (cc-* and Comp-* dirs)
    # 4. Build universal agent session dict
    # 5. Spawn: python3 -m obsidian_vault_writer.writer --mode agent-session
    #    with JSON on stdin, stdout/stderr devnull

Call site: Step 8 of `process_session_end()`, after archive and before cleanup:

python
# 8. Write session to Obsidian vault (fire-and-forget)
try:
    write_vault_session(session_id, stats, prompts)
except Exception as e:
    log_debug(f"Vault write failed (non-critical): {e}")

Design: Fire-and-forget via `subprocess.Popen()` — the hook writes JSON to stdin and closes it without waiting for the process to exit. The hook never blocks session teardown.
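
A minimal sketch of this fire-and-forget pattern follows. It is not the code in `session_end_hook.py`; the function name `spawn_vault_writer` and the overridable `argv` parameter are assumptions added so the snippet can be exercised without the real writer installed.

```python
import json
import subprocess
import sys


def spawn_vault_writer(session: dict, argv=None) -> subprocess.Popen:
    """Pipe session JSON to a writer process without waiting for it to exit."""
    argv = argv or [sys.executable, "-m", "obsidian_vault_writer.writer",
                    "--mode", "agent-session"]
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX: detach the child from the hook's session
    )
    proc.stdin.write(json.dumps(session).encode("utf-8"))
    proc.stdin.close()  # EOF tells the writer the payload is complete
    return proc         # the hook never calls wait() on this handle
```

One caveat of this design: a payload larger than the OS pipe buffer can make the `write()` call block until the child starts reading, so it fits best when session dicts stay small, as in the schema above.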

Vault Note Output (Agent Session)

Path: `Sessions/{date}-{agenttype}-{short_id}.md`

Examples:
- `Sessions/2026-03-01-claudecode-abc123def456.md`
- `Sessions/2026-03-01-codex-xyz789.md`
- `Sessions/2026-03-01-gemini-qrs456.md`
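
The naming pattern can be reproduced with a small helper. This is a sketch inferred from the examples above: the function name `session_note_path` is hypothetical, and the 12-character short ID is an assumption based on `abc123def456`.

```python
def session_note_path(date: str, agent_type: str, session_id: str,
                      short_len: int = 12) -> str:
    """Build Sessions/{date}-{agenttype}-{short_id}.md for an agent session."""
    # Hyphens are dropped from the agent type: claude-code -> claudecode.
    agent_slug = agent_type.replace("-", "").lower()
    # Assumed truncation length; shorter IDs are used unchanged.
    short_id = session_id[:short_len]
    return f"Sessions/{date}-{agent_slug}-{short_id}.md"
```

Stripping the hyphen keeps the filename splittable on `-` into date parts, agent type, and ID, which is what lets the scanner match `Sessions/*-claudecode-{short_id}.md` with a simple glob.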

markdown
---
type: agent-session
agent_type: claude-code
provider: anthropic
model: claude-opus-4-6
session_id: abc123def456...
date: 2026-03-01
outcome: complete
source: session_end_hook
duration_minutes: 45.3
prompt_count: 12
tool_calls: 87
project_refs: [obsidian-vault-writer]
tags: [cc-graph-kernel, Comp-Core]
created: 2026-03-01T18:30:00Z
---

# Claude Code Session: Implement vault writer multi-agent integration

| | |
|---|---|
| **Agent** | Claude Code (claude-opus-4-6) |
| **Duration** | 45m |
| **Prompts** | 12 |
| **Tool Calls** | 87 |
| **Outcome** | complete |

## Goal
Implement vault writer multi-agent integration

## Files Modified (5)
- `writer.py`
- `templates.py`
- `session_end_hook.py`
- `vault_sync.py`
- `ARCHITECTURE.md`

## Files Read (8)
- `config.py`
- `session_end_hook.py`
- ...

## Links
- Project: [[obsidian-vault-writer]]
- Related: [[cc-graph-kernel]], [[Comp-Core]]

Relationship to CAP (Clarity Agent Protocol)

The existing CAP protocol (`Desktop/clarity-agent-protocol/`) defines `AgentCapability` interfaces for dispatching to multiple agent types. The vault writer's universal session format aligns with CAP's agent types:

| CAP Agent Type | Vault `agent_type` | Status |
|----------------|--------------------|--------|
| `claude-code` | `claude-code` | Active: SessionEnd hook + prompt scanner |
| `codex` | `codex` | Ready: stdin JSON pipe |
| `gemini` | `gemini` | Ready: stdin JSON pipe |
| `clawdbot` | `clawdbot` | Ready: stdin JSON pipe |
| `human` | `human` | Ready: manual entry |
| `custom` | `custom` | Ready: fallback |

Future: The CAP dispatcher (`dispatcher.ts`) can call the vault writer after each agent dispatch completes, passing the session result through the universal JSON schema.

AAO Reputation Collector → Vault Integration

Modified file: `[home-path]`

The AAO (Admissible Agent Orchestration) reputation collector runs every 10 minutes and assesses quality of completed tasks from Supabase `mac_tasks`. After each quality assessment, it now writes a vault note.

Data source: Supabase `mac_tasks` table with expanded `select`:

id, task_content, output, exit_code, duration_ms, error_log,
claimed_by, completed_at, started_at, task_type, model_used,
project_path, source, session_id, pool_mode, team_role

`write_task_to_vault()` function:

1. Agent type detection: Maps `claimed_by` device → agent type (mac1-5 → claude-code), then overrides from `model_used` (gpt/o1/o3 → codex, gemini → gemini)
2. Provider detection: Inferred from model string (claude → anthropic, gpt → openai, gemini → google)
3. Duration: Prefers `duration_ms`, falls back to `started_at`↔`completed_at` diff
4. Project refs: Extracted from `project_path` — looks for `cc-` or `Comp-` directories, falls back to last meaningful path component
5. Goal: First 120 chars of `task_content`
6. Outcome: Maps quality → outcome (`high`/`medium` → complete, `low` → degraded, `failed` → failed)
7. Key topics: Auto-tagged from `task_type`, `pool_mode`, `team_role`, and quality level
8. AAO metadata: Passes `aao_quality`, `aao_confidence`, `aao_device_score` into the session note
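
Steps 1, 2, 3, 5, and 6 of the list above can be sketched as one mapping function. This is an illustration, not the code in `aao_reputation_collector.py`: the name `describe_task`, the `custom` and `unknown` fallbacks, and the rounding to one decimal are assumptions.

```python
from datetime import datetime

OUTCOME_BY_QUALITY = {"high": "complete", "medium": "complete",
                      "low": "degraded", "failed": "failed"}
PROVIDER_BY_MARKER = {"claude": "anthropic", "gpt": "openai", "gemini": "google"}


def describe_task(task: dict, quality: str) -> dict:
    """Map a mac_tasks row plus its quality label onto session-schema fields."""
    model = (task.get("model_used") or "").lower()
    device = (task.get("claimed_by") or "").lower()

    # The device gives the default agent type; the model string can override it.
    agent_type = "claude-code" if device.startswith("mac") else "custom"
    if model.startswith(("gpt", "o1", "o3")):
        agent_type = "codex"
    elif "gemini" in model:
        agent_type = "gemini"

    # Provider is inferred from the first marker found in the model string.
    provider = next((name for marker, name in PROVIDER_BY_MARKER.items()
                     if marker in model), "unknown")

    # Prefer the recorded duration; otherwise diff the two timestamps.
    if task.get("duration_ms") is not None:
        minutes = task["duration_ms"] / 60_000
    else:
        started = datetime.fromisoformat(task["started_at"])
        completed = datetime.fromisoformat(task["completed_at"])
        minutes = (completed - started).total_seconds() / 60

    return {
        "agent_type": agent_type,
        "provider": provider,
        "goal": (task.get("task_content") or "")[:120],
        "outcome": OUTCOME_BY_QUALITY[quality],
        "stats": {"duration_minutes": round(minutes, 1)},
    }
```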

AAO-specific fields in vault notes:

Frontmatter includes:

yaml
aao_quality: high
aao_confidence: 0.85
aao_device_score: 0.912

Stats table includes:

| **AAO Quality** | high (85% confidence) |
| **Device Score** | 0.91 |

These fields enable Obsidian Dataview queries like:
- `TABLE aao_quality, aao_device_score FROM "Sessions" WHERE contains(source, "aao")`
- Find all degraded tasks: `WHERE aao_quality = "low" OR aao_quality = "failed"`
- Track device reputation trends across sessions

Call site: In the main flow loop, after `assess_and_attest()` returns for non-skipped tasks:

python
for t in tasks:
    result = assess_and_attest(t, reputation)
    q = result["quality"]  # assumed result shape; adapt to the actual return value
    if q != "skipped":
        write_task_to_vault(t, quality=q, confidence=..., device_score=...)

Design: Fire-and-forget via `subprocess.Popen()`. Never blocks the reputation collector. Failures are caught and logged at debug level.

Dedup note: AAO tasks use Supabase task UUIDs as session IDs, while Claude Code SessionEnd hooks use Claude's internal session IDs. These are completely different ID spaces, so there's no collision risk between the two paths.

3.8 Obsidian Headless Sync Service

Package: `obsidian-headless` v0.0.3 (npm global)
Binary: `/usr/bin/ob`
Requires: Node.js 22+ (upgraded from 20 → 22.22.0 on cloud-vm)

Systemd Service

File: `/etc/systemd/system/obsidian-sync.service`

ini
[Unit]
Description=Obsidian Headless Sync (continuous)
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=mohameddiomande
ExecStart=/usr/bin/ob sync --path /home/mohameddiomande/obsidian-vault --continuous
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal
SyslogIdentifier=obsidian-sync

[Install]
WantedBy=multi-user.target

Status: Enabled, running, auto-restarts.

Remote Vault

| Property | Value |
|----------|-------|
| Vault ID | `6489767d30bd6868e28965dbab9a53f6` |
| Vault Name | Agent Vault |
| Region | North America |
| Encryption | Managed |
| Device Name | `cloud-vm` |
| Conflict Strategy | Merge (most recent wins) |

---

4. Data Flow Diagrams

4.1 Real-time Path (Discord → Vault → All Devices)

User sends Discord message
        │
        ▼
Clawdbot gateway receives message
        │
        ▼
synthesis-preprocessor.js hook fires
        │
        ├──► callSynthesizer() ──► Kimi-K2 (Together API)
        │         │
        │         ▼
        │    Synthesis JSON returned
        │         │
        │         ├──► Attach to message metadata (existing behavior)
        │         │
        │         └──► writeToVault() [NEW, fire-and-forget]
        │                   │
        │                   ▼
        │              spawn python3 -m obsidian_vault_writer.writer
        │                   │ stdin: {"synthesis": {...}, "message": "..."}
        │                   │
        │                   ▼
        │              VaultWriter.write_synthesis()
        │                   │
        │                   ├──► Write Inbox/{date}/{time}-{slug}.md
        │                   ├──► ensure_entity() for each knowledge connection
        │                   ├──► ensure_project() for each project_ref
        │                   └──► ensure_concept() for each dream_seed
        │
        ▼
Message continues through Clawdbot pipeline (unblocked)

        ... meanwhile on cloud-vm ...

obsidian-sync.service detects new .md files
        │
        ▼
ob sync --continuous uploads to Obsidian cloud
        │
        ▼
Obsidian apps on iPhone/Mac1/Mac4/Mac5 pull new notes

4.2 Agent Session Path (Claude Code → Vault)

Claude Code session ends
        │
        ▼
SessionEnd hook fires (session_end_hook.py)
        │
        ├──► Steps 1-7: Stats, Orbit notify, archive (existing)
        │
        ├──► Step 8: write_vault_session() [NEW]
        │         │
        │         ├──► Extract goal from first prompt
        │         ├──► Build project_refs from scope
        │         ├──► Extract key_topics from affected targets
        │         ├──► Build universal agent session JSON
        │         │
        │         └──► subprocess.Popen (fire-and-forget)
        │                   │ stdin: agent session JSON
        │                   ▼
        │              python3 -m obsidian_vault_writer.writer --mode agent-session
        │                   │
        │                   ▼
        │              VaultWriter.write_agent_session()
        │                   │
        │                   └──► Sessions/{date}-claudecode-{short_id}.md
        │
        ├──► Steps 9-10: Cleanup, summary (existing)
        │
        ▼
Session ends normally (vault write never blocks)

4.3 AAO Reputation → Vault Path (every 10min)

Prefect scheduler triggers aao-reputation-collector flow
        │
        ▼
fetch_unassessed_tasks task
        │
        ├──► Query Supabase mac_tasks: status=complete, completed_at > last_assessed_at
        ├──► Returns: task_content, output, claimed_by, model_used,
        │              project_path, duration_ms, task_type, pool_mode, team_role
        │
        ▼
For each task: assess_and_attest()
        │
        ├──► Compute quality score (output length, exit code, errors, duration)
        ├──► Sign HMAC attestation triple → Graph Kernel
        ├──► Update device reputation score
        │
        └──► write_task_to_vault() [NEW, fire-and-forget]
                  │
                  ├──► Map claimed_by → agent_type (mac* → claude-code, etc.)
                  ├──► Infer provider from model_used
                  ├──► Extract project_refs from project_path
                  ├──► Map quality → outcome (high → complete, low → degraded)
                  ├──► Include AAO metadata: quality, confidence, device_score
                  │
                  └──► subprocess.Popen: python3 -m obsidian_vault_writer.writer
                            │                    --mode agent-session
                            ▼
                       Sessions/{date}-{agenttype}-{task_uuid}.md
                            │
                            │ Includes in frontmatter:
                            │   aao_quality: high
                            │   aao_confidence: 0.85
                            │   aao_device_score: 0.912
                            │
                            ▼
                       obsidian-sync picks up → all devices

4.4 Catch-up Path (Prefect, every 6h)

Prefect scheduler triggers vault_sync flow
        │
        ▼
catch_up_writes task (Discord synthesis)
        │
        ├──► Read .vault_sync_state.json → last_sync timestamp
        ├──► Query kimi_memory.db: synthesis_results WHERE timestamp > last_sync
        ├──► For each missing result: VaultWriter.write_synthesis()
        └──► Update last_sync in state file
        │
        ▼
scan_agent_sessions task (Claude Code) [NEW]
        │
        ├──► Scan [home-path]
        ├──► For each session dir modified since last_sync:
        │    ├──► Skip if vault note already exists
        │    ├──► Read metadata.json + prompts.jsonl
        │    ├──► Extract goal, stats, project refs
        │    └──► VaultWriter.write_agent_session()
        └──► Log count of retroactively written notes
        │
        ▼
generate_daily_summary task
        │
        ├──► Check if yesterday's Daily/ note exists
        ├──► Count: Inbox notes, entities, projects, sessions, concepts
        └──► Write Daily/{yesterday}.md with stats + [[links]]
        │
        ▼
detect_orphans task
        │
        ├──► Scan all vault notes for [[wikilink]] references
        ├──► Compare against all note stems
        └──► Log orphans (notes with zero incoming links)

---

5. Infrastructure

5.1 Cloud-VM Services

| Service | Port | Systemd Unit | Role |
|---------|------|--------------|------|
| Obsidian Sync | | `obsidian-sync.service` | Continuous vault sync to Obsidian cloud |
| Perception Mesh | 8093 | `perception-mesh.service` | Device orchestration (existing) |
| Prefect Server | 4200 | Docker | Flow orchestration (existing) |
| Syncthing | 8384 | | File sync Mac1↔cloud-vm (existing) |

5.2 File Sync Topology

Mac1 ([home-path])
    │
    │ Syncthing (home folder, 60s scan)
    ▼
cloud-vm ([home-path])
    │
    │ obsidian-headless (continuous WebSocket)
    ▼
Obsidian Cloud (Agent Vault, North America)
    │
    │ Obsidian Sync protocol (E2E)
    ▼
iPhone, Mac3, Mac4, Mac5 (Obsidian app)

Note: Syncthing provides Mac1↔cloud-vm sync. All other devices receive notes via Obsidian Sync. Mac1 gets notes from both paths (Syncthing for locally-written notes, Obsidian Sync for notes written directly on cloud-vm by the synthesis hook).

5.3 Python Environment

| Host | Python | Venv Path | Used By |
|------|--------|-----------|---------|
| Mac1 | 3.14.3 | `[home-path]` | Local testing, backfill |
| cloud-vm | 3.10+ | `[home-path]` | Synthesis hook, Prefect flow |

5.4 Node.js

| Host | Version | Path | Used By |
|------|---------|------|---------|
| cloud-vm | 22.22.0 | `/usr/bin/node` | `ob` CLI (obsidian-headless) |

Upgraded from v20.20.0 → v22.22.0 on 2026-03-01 because obsidian-headless requires Node 22+.

---

6. Operations

6.1 Monitoring

```bash
# Check sync service status
ssh cloud-vm "systemctl status obsidian-sync"

# Watch sync logs
ssh cloud-vm "journalctl -u obsidian-sync -f"

# Count vault notes per top-level directory
ssh cloud-vm 'for d in [home-path]; do
  echo "$(basename "$d")/: $(find "$d" -name "*.md" | wc -l) files"
done'

# Check Prefect vault_sync flow
# Via Prefect UI at http://cloud-vm:4200 or:
ssh cloud-vm "python3 [home-path]"  # manual run
```

6.2 Troubleshooting

| Symptom | Cause | Fix |
|---------|-------|-----|
| Notes not appearing on phone | Sync service down | `sudo systemctl restart obsidian-sync` |
| `ob sync` auth error | Token expired | Run `ob login` again on cloud-vm |
| Synthesis notes not created | Hook spawn failing | Check `VAULT_PATH` env, Python venv exists |
| Duplicate entity notes | Case mismatch | Backfill handles this; real-time stubs are idempotent |
| `203/EXEC` in systemd | Wrong binary path | Verify `which ob` matches `ExecStart` path |
| Syncthing not syncing vault | `.stignore` exclusion | Check both sides: `grep obsidian [home-path]` |

6.3 Manual Vault Operations

```bash
# Re-run backfill (idempotent for entities/projects, adds new synthesis notes)
cd [home-path] && obsidian_vault_writer/.venv/bin/python3 -m obsidian_vault_writer.backfill

# Dry-run backfill (preview without writing)
cd [home-path] && obsidian_vault_writer/.venv/bin/python3 -m obsidian_vault_writer.backfill --dry-run

# Write a manual synthesis note
echo '{"synthesis": {...}, "message": "..."}' | \
  python3 -m obsidian_vault_writer.writer --mode synthesis --channel manual

# Write a Pulse session summary
python3 -m obsidian_vault_writer.writer --mode session --session-id <ID>

# Write a manual agent session note (any agent type)
echo '{"agent_type":"codex","session_id":"xyz","goal":"Fix bug","outcome":"complete","date":"2026-03-01","stats":{"duration_minutes":15,"prompt_count":3,"tool_call_count":10,"files_modified":["app.py"],"files_read":["config.py"]},"project_refs":["my-project"],"key_topics":[],"source":"manual"}' | \
  python3 -m obsidian_vault_writer.writer --mode agent-session

# Generate today's daily summary
python3 -m obsidian_vault_writer.writer --mode daily --date 2026-03-01

# Force Obsidian re-sync
ssh cloud-vm "ob sync --path [home-path]"
```

---

7. Current Vault Statistics (2026-03-01)

| Directory | Count | Source |
|-----------|-------|--------|
| Inbox/ | 4 | Backfilled synthesis results |
| Entities/ | 75 | 53 from knowledge graph + 22 stubs from synthesis links |
| Projects/ | 4 | Stubs: comp-core, nko-linguistics, litRPG, milkmen |
| Concepts/ | 3 | Dream seeds: Voice-Controlled NKo Keyboard, AI Oat Milk Delivery Optimizer, litRPG Project Revival |
| Knowledge/ | 3 | Facts (4 items), Preferences (4 items), Patterns (3 items) |
| Sessions/ | 0 | No Pulse sessions written yet |
| Daily/ | 0 | First daily summary generates on next vault_sync run |
| Templates/ | 5 | Synthesis, Session, Entity, Project, Daily |
| Total | 94 | |

---

8. Future Phases

Phase 4: Multi-Agent Session Integration (Complete)

All agent session types are supported via the universal agent session schema:
- Claude Code: Automated via SessionEnd hook (`session_end_hook.py`) + Prefect prompt-log scanner
- Pulse sessions: `--mode session --session-id <ID>` reads from `[home-path]`
- All other agents (Codex, Gemini, Clawdbot, CAP): `--mode agent-session` reads universal JSON from stdin

Output: `Sessions/{date}-{agenttype}-{short_id}.md`

Phase 5: MCP Integration + Dataview (Deferred)

Once the vault has 500+ notes with links:

  • Vault MCP server: Allow Claude Code to search/read vault notes directly via MCP protocol
  • Dataview plugin: Programmatic queries inside Obsidian (e.g., list all ideas by project, show orphan notes, weekly synthesis count)
  • Graph View: The visual map of the knowledge space — clusters around projects, entities, and concepts become navigable

---

9. Key Design Decisions

Why fire-and-forget (not async/await)?

The synthesis hook runs in the Clawdbot message pipeline. Any delay blocks message delivery. The vault write is non-critical — if it fails, the catch-up Prefect flow will find and write the missing note within 6 hours. The `spawn()` + `unref()` pattern ensures zero impact on message latency.
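The safety net this decision relies on, the catch-up query against `kimi_memory.db`, might look roughly like the sketch below. The table name `synthesis_results`, the `timestamp` column, and the `last_sync` key come from the flow description in section 4.4; the `id` column, the ISO-8601 text timestamps, and both function names are assumptions made for illustration.

```python
import json
import sqlite3
from pathlib import Path


def find_missed_results(db: sqlite3.Connection, state_file: Path) -> list[tuple]:
    """Return (id, timestamp) rows newer than the last recorded sync."""
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    # Assumption: timestamps are ISO-8601 strings, so "" sorts before all of them.
    last_sync = state.get("last_sync", "")
    return db.execute(
        "SELECT id, timestamp FROM synthesis_results "
        "WHERE timestamp > ? ORDER BY timestamp",
        (last_sync,),
    ).fetchall()


def mark_synced(state_file: Path, rows: list[tuple]) -> None:
    """Advance last_sync to the newest row, but only if something was written."""
    if rows:
        state_file.write_text(json.dumps({"last_sync": rows[-1][1]}))
```

Calling `mark_synced` only after the notes have been written keeps a failed run from skipping results: the next run sees the same `last_sync` and retries them.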

Why Python (not Node.js)?

The synthesizer (`dream-weaver-engine`) is Python. The Prefect flows are Python. The kimi_memory.db queries use Python's `sqlite3`. Writing the vault module in Python means it shares the ecosystem and can be imported directly by backfill and Prefect flows. The hook bridges via stdin JSON pipe — language-agnostic.

Why filesystem notes (not a database)?

Obsidian operates on a directory of `.md` files. This is the format. It means:
- Notes are human-readable without any tool
- Git-versionable (future: commit vault changes)
- Syncthing-compatible (no lock contention)
- Obsidian Sync handles conflict resolution (merge, most-recent-wins)

Why filter the knowledge graph so aggressively?

5,562 triples → 195 meaningful ones. Without filtering, `user` alone would have 2,154 relations, making its entity note useless. The filters target three noise categories:
1. Filesystem artifacts: Predicates like `has_file`, `has_path` from codebase scanning
2. Generic subjects: `user`, `context`, `system` — too broad to be meaningful entities
3. Ephemeral context: Subjects starting with paths (`/`, `~`, `.`) or URLs (`http`)
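The three filters above can be expressed as a single predicate. The sketch below is illustrative only: the predicate set contains just the two examples named in the text (the real list is presumably longer), the case-insensitive match on generic subjects is an assumption, and the names `is_meaningful` and `filter_triples` are hypothetical.

```python
from typing import Iterable

Triple = tuple[str, str, str]  # (subject, predicate, object)

NOISE_PREDICATES = {"has_file", "has_path"}      # filesystem artifacts (partial list)
GENERIC_SUBJECTS = {"user", "context", "system"}  # too broad to be useful entities
EPHEMERAL_PREFIXES = ("/", "~", ".", "http")      # paths and URLs


def is_meaningful(subject: str, predicate: str) -> bool:
    """Return True if a triple survives all three noise filters."""
    s = subject.strip()
    if predicate in NOISE_PREDICATES:
        return False
    if s.lower() in GENERIC_SUBJECTS:
        return False
    if s.startswith(EPHEMERAL_PREFIXES):  # str.startswith accepts a tuple of prefixes
        return False
    return True


def filter_triples(triples: Iterable[Triple]) -> list[Triple]:
    """Keep only the triples worth turning into entity-note relations."""
    return [t for t in triples if is_meaningful(t[0], t[1])]
```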

Why a universal agent session format (not per-agent types)?

Multiple agents (Claude Code, Codex, Gemini, Clawdbot) all produce session-like output — a goal, duration, files modified, outcome. Rather than writing separate adapters for each, the universal JSON schema captures the common denominator. Agent-specific metadata (like Claude Code's compaction count or Codex's reasoning traces) can be added to the `stats` dict without breaking the schema. This aligns with CAP's `AgentCapability` interface that already normalizes across agent types.
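One way to picture the universal schema is as a small dataclass whose fields mirror the JSON example in section 6.3. This is a sketch under assumptions: the class name `AgentSession`, the choice of which fields have defaults, and the rule that `short_id` is the first eight characters of `session_id` are all hypothetical; only the field names and the output pattern `Sessions/{date}-{agenttype}-{short_id}.md` come from the document.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class AgentSession:
    agent_type: str
    session_id: str
    goal: str
    outcome: str
    date: str
    # Free-form dict: agent-specific metrics can be added without a schema change.
    stats: dict[str, Any] = field(default_factory=dict)
    project_refs: list[str] = field(default_factory=list)
    key_topics: list[str] = field(default_factory=list)
    source: str = "manual"

    def note_path(self) -> str:
        short_id = self.session_id[:8]  # assumption: how short_id is derived is not specified
        return f"Sessions/{self.date}-{self.agent_type}-{short_id}.md"


def parse_session(raw: str) -> AgentSession:
    """Parse universal session JSON (e.g. read from stdin); unknown top-level keys raise TypeError."""
    return AgentSession(**json.loads(raw))
```

Because `stats` is an untyped dict, a Claude Code session could carry an extra key such as a compaction count while a Codex session omits it, and both would parse with the same class.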

Why fire-and-forget for the SessionEnd hook?

The SessionEnd hook runs during session teardown. Blocking it to wait for the vault writer would delay the user's terminal returning. The vault write is non-critical — if it fails, the Prefect prompt-log scanner catches missed sessions within 6 hours. `subprocess.Popen()` with no `wait()` ensures the hook returns immediately.
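A minimal sketch of that pattern follows. It is not the contents of `session_end_hook.py`: the function name `spawn_vault_writer` and the choice to pass the command as a parameter are assumptions, and it presumes the payload is small enough to fit in the pipe buffer so the write itself does not block.

```python
import json
import subprocess


def spawn_vault_writer(payload: dict, cmd: list[str]) -> subprocess.Popen:
    """Start the writer, hand it the JSON payload on stdin, and return without waiting."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.DEVNULL,   # nothing is read back, so discard output
        stderr=subprocess.DEVNULL,
        start_new_session=True,      # POSIX: detach the child from the hook's session
    )
    proc.stdin.write(json.dumps(payload).encode("utf-8"))
    proc.stdin.close()               # EOF tells the child the payload is complete
    return proc                      # deliberately no proc.wait()
```

In the hook, `cmd` would be something like `["python3", "-m", "obsidian_vault_writer.writer", "--mode", "agent-session"]`, matching the manual invocation in section 6.3.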

Why both a hook AND a scanner?

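Belt and suspenders. The SessionEnd hook catches 99% of sessions in real time; the Prefect prompt-log scanner picks up any session the hook missed within 6 hours.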

Why case-insensitive dedup?

The knowledge graph stores `Clawdbot` and `clawdbot` as separate subjects. In Obsidian, `[[Clawdbot]]` and `[[clawdbot]]` resolve to the same note. Merging at backfill time prevents duplicate entity files and consolidates all relations under one note.
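The merge can be sketched as grouping relations by a case-folded subject. The function name `merge_case_variants` and the rule that the first spelling seen names the note are assumptions for illustration; the backfill may pick the canonical spelling differently.

```python
from collections import defaultdict
from typing import Iterable

Triple = tuple[str, str, str]  # (subject, predicate, object)


def merge_case_variants(relations: Iterable[Triple]) -> dict[str, list[tuple[str, str]]]:
    """Group relations under one subject per case-insensitive name."""
    grouped: dict[str, list[tuple[str, str]]] = defaultdict(list)
    display_name: dict[str, str] = {}
    for subject, predicate, obj in relations:
        key = subject.casefold()
        # Assumption: the first spelling encountered becomes the note title.
        display_name.setdefault(key, subject)
        grouped[key].append((predicate, obj))
    return {display_name[key]: rels for key, rels in grouped.items()}
```

Each key of the returned dict corresponds to one entity file, so `Clawdbot` and `clawdbot` end up as a single note carrying the relations of both.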

Why Obsidian Sync over just Syncthing?

Syncthing only covers Mac1↔cloud-vm. Obsidian Sync covers all devices including iPhone (where Syncthing doesn't run well). Obsidian Sync also handles merge conflicts natively and works with the Obsidian mobile app's expectations for vault state.

Source Anchor

projects/obsidian_vault_writer/ARCHITECTURE.md
