Obsidian Vault Integration — Architecture & Operations Guide
Version: 0.2.0
Created: 2026-03-01
Updated: 2026-03-01
Status: Phase 1-4 complete, Phase 5 deferred
---
1. Problem Statement
Knowledge generated by the agent ecosystem is fragmented across 6+ disconnected stores:
| Store | Type | Weakness |
|---|---|---|
| `memory/*.md` files | Flat Markdown | No linking, manual curation, linear |
| Kimi SQLite DB (`kimi_memory.db`) | Structured tables | Queryable but invisible, no graph |
| Supabase | Cloud relational | API-only access, no browsing |
| Orbit | Semantic memory | Black-box embeddings, no human navigation |
| Discord threads | Chat messages | Ephemeral, unsearchable after scroll |
| Plan files (`.claude/plans/`) | Task-scoped Markdown | Die when plans complete |
None of these stores link ideas together. A synthesis about "N'Ko voice keyboards" doesn't know it's related to a Pulse session that built the "Speak" project, which connects to the "voice recognition" entity in the knowledge graph.
Obsidian's `[[bidirectional links]]` solve this organically. Every `[[entity]]` reference creates a backlink, and over time the graph view reveals clusters, orphans, and unexpected connections between ideas. The `obsidian-headless` CLI (v0.0.3) makes the vault machine-writable without the desktop app.
---
2. Architecture Overview
┌─────────────────────────────┐
│ Discord Message │
└──────────┬──────────────────┘
│
┌──────────▼──────────────────┐
│ synthesis-preprocessor.js │
│ (Clawdbot hook) │
│ │
│ callSynthesizer() │
│ ↓ Kimi-K2 via Together │
│ Returns structured JSON │
│ ↓ │
│ writeToVault() ◄── NEW │
│ fire-and-forget spawn │
└──────────┬──────────────────┘
│ stdin JSON pipe
┌──────────▼──────────────────┐
│ obsidian_vault_writer │
│ writer.py --mode synthesis │
│ │
│ • YAML frontmatter │
│ • [[wikilinks]] extraction │
│ • Entity/Project stubs │
└──────────┬──────────────────┘
│ writes .md files
┌──────────▼──────────────────┐
│ [home-path] │
│ (cloud-vm filesystem) │
└──────────┬──────────────────┘
│
┌──────────▼──────────────────┐
│ obsidian-sync.service │
│ ob sync --continuous │
│ (systemd, always running) │
└──────────┬──────────────────┘
│ Obsidian Sync (E2E encrypted)
┌────────┬───────┼───────┬────────┐
▼ ▼ ▼ ▼ ▼
iPhone Mac1 Mac3 Mac4 Mac5
Obsidian Obsidian Obsidian Obsidian
App Desktop Desktop Desktop
Catch-up Path (Prefect)
┌──────────────────────────────────┐
│ vault_sync.py (every 6h) │
│ Prefect flow on cloud-vm │
│ │
│ 1. Query kimi_memory.db for │
│ synthesis rows > last_sync │
│ 2. Write missing vault notes │
│ 3. Generate Daily/ summary │
│ 4. Detect orphan notes │
└──────────────────────────────────┘
Multi-Agent Session Path
┌──────────────────────────────────────┐
│ Agent Sessions │
│ │
┌─────────┐ │ ┌───────────┐ ┌────────┐ ┌──────┐ │
│ Claude │──┤ │ Codex │ │Gemini │ │ CAP │ │
│ Code │ │ │ (OpenAI) │ │CLI │ │agents│ │
└─────┬───┘ │ └─────┬─────┘ └───┬────┘ └──┬───┘ │
│ │ │ │ │ │
▼ │ ▼ ▼ ▼ │
SessionEnd │ stdin JSON stdin JSON stdin JSON │
hook fires │ pipe pipe pipe │
│ │ │ │ │ │
▼ └────────┼─────────────┼──────────┼──────┘
write_vault_ │ │ │
session() │ │ │
│ │ │ │
└───────────────┼─────────────┼──────────┘
▼
┌──────────────────────────────┐
│ obsidian_vault_writer │
│ writer.py --mode agent-session│
│ │
│ Universal agent session JSON │
│ ↓ │
│ Sessions/{date}-{type}-{id}.md│
└──────────────────────────────┘
Prompt-Log Scanner (Prefect catch-up)
[home-path]
│
├── <session-id>/
│ ├── metadata.json ← scope, started_at, model
│ └── prompts.jsonl ← all prompts with tool calls
│
▼
vault_sync.py → scan_agent_sessions task (every 6h)
│
├── Check each session dir modified since last_sync
├── Skip if vault note already exists (Sessions/*-claudecode-{id}.md)
├── Extract: goal, stats, files, project refs
└── VaultWriter.write_agent_session() → Sessions/{date}-claudecode-{id}.md
Backfill Path (one-time)
kimi_memory.db ──► backfill.py ──► [home-path]
│ │
├─ synthesis_results (4 rows) ├─ Inbox/ (4 notes)
├─ context_memory (11 rows) ├─ Knowledge/ (3 index files)
└─ knowledge_graph (5,562 triples) └─ Entities/ (53 notes, filtered)
+ 22 stubs from synthesis links
---
3. Component Details
3.1 Vault Writer Module
Location: `[home-path]`
Runtime: Python 3.14, standalone venv (`.venv/`), single dependency (`pyyaml`)
| File | Lines | Purpose |
|---|---|---|
| `__init__.py` | 7 | Package marker, version string |
| `config.py` | 50 | Constants: vault path, directory names, filter lists |
| `slugify.py` | 32 | Unicode-aware title → filesystem-safe slug |
| `templates.py` | 334 | f-string templates for all 8 note types (incl. agent session) |
| `linker.py` | 72 | Extracts `[[wikilinks]]` from synthesis JSON fields |
| `writer.py` | 451 | `VaultWriter` class + CLI entry point (4 modes) |
| `backfill.py` | 200 | One-time kimi_memory.db → vault migration |
| `requirements.txt` | 1 | `pyyaml>=6.0` |
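As an illustration of what a Unicode-aware slug helper like `slugify.py` might do, here is a minimal sketch. It is not the module's actual code: the function signature, the lowercasing, the `max_length` default, and the `untitled` fallback are all assumptions made for the example.

```python
import re
import unicodedata


def slugify(title: str, max_length: int = 80) -> str:
    """Turn a note title into a filesystem-safe slug, keeping non-ASCII letters."""
    # NFKC folds compatibility forms so visually identical titles share one slug.
    text = unicodedata.normalize("NFKC", title).strip().lower()
    # Drop everything except word characters (Unicode-aware), whitespace, hyphens.
    text = re.sub(r"[^\w\s-]", "", text)
    # Collapse runs of whitespace, underscores, and hyphens into a single hyphen.
    text = re.sub(r"[\s_-]+", "-", text).strip("-")
    # Hypothetical fallback so an all-punctuation title still yields a filename.
    return text[:max_length].rstrip("-") or "untitled"


slug = slugify("Voice-Controlled NKo Keyboard")
```

Because `\w` is Unicode-aware for `str` patterns in Python 3, letters outside ASCII survive the cleanup instead of being stripped.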
VaultWriter API
from pathlib import Path
from obsidian_vault_writer.writer import VaultWriter
writer = VaultWriter(vault_path=Path("[home-path]"))
# Write a synthesis result as an Inbox note
path = writer.write_synthesis(synthesis_dict, message_str, channel_str)
# Write a Pulse session summary
path = writer.write_session_summary(session_dict)
# Write a universal agent session note (Claude Code, Codex, Gemini, etc.)
path = writer.write_agent_session(session_dict)
# Write a daily rollup
path = writer.write_daily_summary(date_str, stats_dict)
# Create entity/project/concept stubs (idempotent, no-op if exists)
path = writer.ensure_entity("N'Ko keyboard")
path = writer.ensure_project("comp-core")
path = writer.ensure_concept("Voice Input", essence="...", tags=["NKo"])
CLI Modes
# Synthesis (reads JSON from stdin)
echo '{"synthesis": {...}, "message": "..."}' | \
python3 -m obsidian_vault_writer.writer --mode synthesis --channel discord
# Pulse session
python3 -m obsidian_vault_writer.writer --mode session --session-id abc12345
# Universal agent session (reads JSON from stdin)
echo '{"agent_type":"claude-code","session_id":"...","stats":{...}}' | \
python3 -m obsidian_vault_writer.writer --mode agent-session
# Daily summary
python3 -m obsidian_vault_writer.writer --mode daily --date 2026-03-01
Link Extraction Logic (`linker.py`)
The linker scans three synthesis fields to generate `[[wikilinks]]`:
| Source Field | What it Extracts | Link Target |
|---|---|---|
| `project_refs` | Project names | `Projects/{ref}.md` |
| `knowledge_connections` | Subject and object of each triple | `Entities/{name}.md` |
| `dream_seeds` | Seed titles and tags | `Concepts/{title}.md`, `Entities/{tag}.md` |
Deduplication: entities that match a project_ref (case-insensitive) are excluded from the "Related" line to avoid redundancy.
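The extraction and deduplication rules above can be sketched roughly as follows. This is a hypothetical reconstruction, not the contents of `linker.py`: the key names `subject`, `object`, `title`, and `tags` inside the synthesis JSON are assumptions, as are the function names.

```python
def extract_links(synthesis: dict) -> dict:
    """Collect project, entity, and concept link targets from a synthesis dict."""
    projects = list(dict.fromkeys(synthesis.get("project_refs", [])))
    project_keys = {p.lower() for p in projects}

    entities, seen = [], set()

    def add_entity(name: str) -> None:
        # Case-insensitive de-duplication that keeps the first spelling seen.
        key = name.strip().lower()
        if key and key not in seen:
            seen.add(key)
            entities.append(name.strip())

    # Assumed triple shape: {"subject": ..., "predicate": ..., "object": ...}
    for triple in synthesis.get("knowledge_connections", []):
        add_entity(triple.get("subject", ""))
        add_entity(triple.get("object", ""))

    concepts = []
    # Assumed seed shape: {"title": ..., "tags": [...]}
    for seed in synthesis.get("dream_seeds", []):
        if seed.get("title"):
            concepts.append(seed["title"])
        for tag in seed.get("tags", []):
            add_entity(tag)

    # Entities that duplicate a project_ref are left out of the "Related" line.
    related = [e for e in entities if e.lower() not in project_keys]
    return {"projects": projects, "entities": entities,
            "concepts": concepts, "related": related}


def wikilinks(names: list) -> str:
    """Render names as a comma-separated list of [[wikilinks]]."""
    return ", ".join(f"[[{n}]]" for n in names)


sample = {
    "project_refs": ["nko-linguistics"],
    "knowledge_connections": [
        {"subject": "NKo keyboard", "predicate": "related to", "object": "voice recognition"},
        {"subject": "Nko-Linguistics", "predicate": "uses", "object": "NKo keyboard"},
    ],
    "dream_seeds": [{"title": "Voice-Controlled NKo Keyboard", "tags": ["NKo", "voice recognition"]}],
}
links = extract_links(sample)
```

In the sample, `Nko-Linguistics` is collected as an entity but dropped from `related` because it matches the project ref case-insensitively.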
3.2 Vault Directory Structure
Location: `[home-path]` on cloud-vm (synced to all devices)
[home-path]
├── Inbox/ ← Real-time synthesis notes, organized by date
│ └── YYYY-MM-DD/
│ └── {HHMMSS}-{slug}.md
├── Projects/ ← One note per project_ref (evergreen index)
│ └── {project-name}.md
├── Entities/ ← One note per knowledge graph entity
│ └── {entity-name}.md (accumulates backlinks over time)
├── Concepts/ ← Dream seed ideas that grow
│ └── {concept-title}.md
├── Sessions/ ← Agent session summaries (Pulse, Claude Code)
│ └── {date}-pulse-{short-id}.md
├── Daily/ ← Prefect-generated daily rollups
│ └── YYYY-MM-DD.md
├── Knowledge/ ← Curated index notes
│ ├── Facts.md
│ ├── Preferences.md
│ └── Patterns.md
└── Templates/ ← Obsidian templates for manual note creation
├── Synthesis.md
├── Session.md
├── Entity.md
├── Project.md
└── Daily.md
3.3 Note Format
Every note has YAML frontmatter for Obsidian metadata queries and typed categorization.
Synthesis Note (Inbox)
---
type: synthesis
intent: idea
confidence: 0.85
route: direct
channel: discord
project_refs: [nko-linguistics]
tags: [NKo, voice recognition, adaptive technology]
created: 2026-03-01T10:30:00Z
---
# Voice-Controlled NKo Keyboard
## Enriched Prompt
Context-rich version of the original message...
## Dream Seeds
- **Voice-Controlled NKo Keyboard** (energy: 0.7)
Tags: NKo, voice recognition, adaptive technology
## Skill Chain
lin:nko → thk:quantum
## Knowledge Connections
- [[NKo keyboard]] --related to--> [[voice recognition]]
- [[adaptive technology]] --applies to--> [[keyboard design]]
## Learnings
### Facts
- **nko_voice_input**: Voice input for NKo requires tone-aware recognition
## Links
- Project: [[nko-linguistics]]
- Related: [[NKo keyboard]], [[voice recognition]], [[keyboard design]]
Entity Note (auto-generated stub)
---
type: entity
created: 2026-03-01T10:30:00Z
auto_generated: true
---
# voice recognition
> Auto-generated entity stub. Backlinks will accumulate as more notes
> reference this entity.

Over time, backlinks from multiple synthesis notes, sessions, and other entities converge on entity pages, creating emergent clusters in the graph view.
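Stub creation has to be idempotent so that repeated references never overwrite a note that has since gained content. A minimal sketch of that behavior, assuming a standalone `ensure_entity(vault, name)` function (the real method lives on `VaultWriter` and its internals are not shown in this guide):

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path

STUB = """---
type: entity
created: {created}
auto_generated: true
---
# {name}

> Auto-generated entity stub. Backlinks will accumulate as more notes
> reference this entity.
"""


def ensure_entity(vault: Path, name: str) -> Path:
    """Create Entities/{name}.md if missing; leave any existing note untouched."""
    path = vault / "Entities" / f"{name}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    try:
        # Mode "x" raises FileExistsError instead of truncating an existing file.
        with path.open("x", encoding="utf-8") as fh:
            fh.write(STUB.format(created=created, name=name))
    except FileExistsError:
        pass  # no-op: the note already exists
    return path


# Throwaway vault: create a stub, edit it, then "ensure" it again.
vault = Path(tempfile.mkdtemp())
first = ensure_entity(vault, "voice recognition")
first.write_text(first.read_text(encoding="utf-8") + "\nmanual edit\n", encoding="utf-8")
second = ensure_entity(vault, "voice recognition")
```

Opening with exclusive-create mode avoids a separate existence check, so two writers racing on the same entity cannot clobber each other. The sketch does not sanitize names containing path separators.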
Entity Note (backfill, with relations)
---
type: entity
relations: 74
created: 2026-03-01T17:13:42Z
source: backfill
---
# Buf Barista
## Relations
- --works_on--> [[project template]]
- --created--> [[Spore]]
- --works_on--> [[Graph Kernel]]
- --works_on--> [[Prompt Synthesizer]]
...
3.4 Discord → Vault Pipeline
Modified file: `[home-path]`
The synthesis hook already intercepts all Discord messages, runs them through Kimi-K2 synthesis, and attaches structured metadata. Two additions were made:
Constants (top of file, after existing config)
const VAULT_WRITER_PATH = path.join(os.homedir(),
'projects/obsidian_vault_writer/writer.py');
const VAULT_PYTHON_PATH = path.join(os.homedir(),
'projects/obsidian_vault_writer/.venv/bin/python3');
`writeToVault()` function
function writeToVault(synthesis, messageContent, channel) {
try {
const proc = spawn(VAULT_PYTHON_PATH, [
'-m', 'obsidian_vault_writer.writer',
'--mode', 'synthesis',
'--channel', channel || 'unknown',
], {
cwd: path.join(os.homedir(), 'projects'),
env: {
...process.env,
VAULT_PATH: path.join(os.homedir(), 'obsidian-vault'),
},
timeout: 10000,
stdio: ['pipe', 'ignore', 'ignore'],
});
proc.stdin.write(JSON.stringify({ synthesis, message: messageContent }));
proc.stdin.end();
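// Spawn failures (e.g. ENOENT) surface as async 'error' events that try/catch cannot see
proc.on('error', () => {});
// Ignore EPIPE if the writer exits before reading stdin
proc.stdin.on('error', () => {});
proc.unref();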
} catch (e) { /* silent fail — vault write is non-critical */ }
}
Design constraints enforced:
| Constraint | How |
|---|---|
| Never block message flow | `spawn()` + `proc.unref()` — Node doesn't wait for the child |
| No stdout/stderr noise | `stdio: ['pipe', 'ignore', 'ignore']` — only stdin is open for JSON |
| Timeout protection | `timeout: 10000` — kills writer after 10s if stuck |
| Silent failure | `try/catch` with empty catch — vault write is non-critical |
| Data passing | JSON piped to stdin — avoids shell escaping issues with template literals |
Call site
Inserted after the "Synthesis complete" log (line ~207 in modified file), before the return statement that attaches synthesis to message metadata:
// Fire-and-forget: write synthesis to Obsidian vault
writeToVault(synthesis, context.content, context.channel);
3.5 Backfill System
File: `[home-path]`
One-time migration from `kimi_memory.db` (located at `[home-path]`).
Source Tables
| Table | Rows | Target | Notes |
|---|---|---|---|
| `synthesis_results` | 4 | `Inbox/` notes | Joins with `messages` for original content |
| `context_memory` | 11 | `Knowledge/` index files | Grouped by category: facts (4), preferences (4), patterns (3) |
| `knowledge_graph` | 5,562 | `Entities/` notes | Heavy filtering required (see below) |
Knowledge Graph Filtering
The knowledge graph contains significant noise from filesystem scanning and ephemeral context. Four filter layers clean the data:
Layer 1 — Predicate filter (`SKIP_PREDICATES`):
Removes filesystem-artifact predicates: `has_file`, `has_path`, `contains_file`, `located_at`, `has_directory`, `has_extension`, `lives_in`
Layer 2 — Exact subject filter (`SKIP_SUBJECTS_EXACT`):
Removes generic subjects that accumulate thousands of meaningless relations: `user` (2,154 relations), `context` (842), `system` (45), `project` (5), `app` (2), `node`, `assistant`, `server`, `client`, `service`, `module`
Layer 3 — Prefix subject filter (`SKIP_SUBJECT_PREFIXES`):
Removes subjects starting with: `user `, `context `, `system `, `assistant `, `/`, `~`, `.`, `http`
Layer 4 — Length filter:
Removes subjects shorter than 2 chars or longer than 100 chars.
Result: 5,367 triples filtered → 195 meaningful triples → 53 entity notes.
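The four layers compose into a single predicate over each triple. The sketch below uses the constant names and values listed above; the function name `keep_triple` and the choice to compare subjects case-insensitively are assumptions, not confirmed details of `backfill.py`.

```python
SKIP_PREDICATES = {
    "has_file", "has_path", "contains_file", "located_at",
    "has_directory", "has_extension", "lives_in",
}
SKIP_SUBJECTS_EXACT = {
    "user", "context", "system", "project", "app", "node",
    "assistant", "server", "client", "service", "module",
}
SKIP_SUBJECT_PREFIXES = ("user ", "context ", "system ", "assistant ", "/", "~", ".", "http")


def keep_triple(subject: str, predicate: str, obj: str) -> bool:
    """Return True if a knowledge-graph triple survives all four filter layers."""
    name = subject.strip()
    lowered = name.lower()  # assumption: subject matching ignores case
    if predicate in SKIP_PREDICATES:               # Layer 1: filesystem predicates
        return False
    if lowered in SKIP_SUBJECTS_EXACT:             # Layer 2: generic subjects
        return False
    if lowered.startswith(SKIP_SUBJECT_PREFIXES):  # Layer 3: paths, URLs, ephemeral context
        return False
    return 2 <= len(name) <= 100                   # Layer 4: length bounds


triples = [
    ("Buf Barista", "works_on", "Graph Kernel"),
    ("user", "likes", "coffee"),
    ("/tmp/build", "related_to", "cache"),
    ("Clawdbot", "has_file", "hook.js"),
]
kept = [t for t in triples if keep_triple(*t)]
```

Ordering the cheap set lookups first keeps the filter fast over the full 5,562-row table.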
Case-Insensitive Deduplication
Subjects like `Clawdbot` and `clawdbot` are merged under a single entity note. The display name is chosen by preferring the variant with more uppercase characters (e.g., `Clawdbot` wins over `clawdbot`).
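A compact sketch of the display-name rule: group subjects by their lowercase form and pick the variant with the most uppercase characters. The function name is hypothetical, and resolving ties in favor of the first variant seen is an assumption.

```python
from collections import defaultdict


def choose_display_names(subjects: list) -> dict:
    """Map each lowercase key to the variant with the most uppercase characters."""
    groups = defaultdict(list)
    for subject in subjects:
        groups[subject.lower()].append(subject)
    # max() returns the first maximal element, so ties keep the earliest variant.
    return {
        key: max(variants, key=lambda v: sum(c.isupper() for c in v))
        for key, variants in groups.items()
    }


names = choose_display_names(["clawdbot", "Clawdbot", "Spore"])
```

Relations from every variant would then be attached to the single note named by the chosen display name.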
Backfill Results (2026-03-01)
Synthesis notes: 4
Context entries: 11 (across 3 Knowledge index files)
Entity notes: 53 (from knowledge graph)
+22 (stubs from synthesis link extraction)
────
Total: 89 notes + 5 templates = 94 files
3.6 Prefect Vault Sync Flow
File: `[home-path]`
Schedule: Every 6 hours (`0 */6 * * *`)
Deployment: Registered in `[home-path]`
Tasks
| Task | Purpose | Retry |
|---|---|---|
| `catch-up-vault-writes` | Query `synthesis_results` newer than `last_sync`, write missing Inbox notes | 1 retry, 30s delay |
| `scan-agent-sessions` | Scan `[home-path]` for Claude Code sessions without vault notes | 1 retry, 30s delay |
| `generate-daily-summary` | Create `Daily/{yesterday}.md` with stats and note links | None |
| `detect-orphan-notes` | Scan vault for notes with zero incoming `[[links]]` | None |
State File
`[home-path]`:
{
"last_sync": "2026-03-01T17:13:42.000000",
"last_daily": "2026-02-28"
}
Agent Session Scanner (Task 2)
| Task | Purpose | Retry |
|---|---|---|
| `scan-agent-sessions` | Scan `[home-path]` for sessions without vault notes | 1 retry, 30s delay |
Scans each session directory modified since `last_sync`. For each session:
1. Check if vault note already exists (`Sessions/*-claudecode-{short_id}.md`)
2. Read `metadata.json` for scope, model, timestamps
3. Read `prompts.jsonl` for prompt text, tool calls, affected targets
4. Extract goal from first prompt, calculate stats (duration, tool calls, files)
5. Determine project_refs from scope (subproject_name or repo_name)
6. Call `VaultWriter.write_agent_session()` with universal agent session dict
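Steps 1 and 4 can be sketched as two small helpers. This is illustrative only: the layout of `prompts.jsonl` is not documented here, so the `prompt` key is an assumption, as are the function names and the 100-character goal limit (borrowed from the SessionEnd hook description).

```python
import json
import tempfile
from pathlib import Path


def vault_note_exists(vault: Path, short_id: str) -> bool:
    """Step 1: has a Sessions/*-claudecode-{short_id}.md note already been written?"""
    sessions = vault / "Sessions"
    return sessions.is_dir() and any(sessions.glob(f"*-claudecode-{short_id}.md"))


def extract_goal(prompts_path: Path, max_chars: int = 100) -> str:
    """Step 4: use the first non-empty prompt text as the session goal."""
    with prompts_path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            text = str(record.get("prompt", "")).strip()  # "prompt" key is assumed
            if text:
                return text[:max_chars]
    return ""


# Throwaway fixtures standing in for a vault and one session directory.
root = Path(tempfile.mkdtemp())
(root / "Sessions").mkdir()
(root / "Sessions" / "2026-03-01-claudecode-abc123def456.md").write_text("x", encoding="utf-8")
prompts = root / "prompts.jsonl"
prompts.write_text(
    json.dumps({"prompt": "Implement vault writer"}) + "\n"
    + json.dumps({"prompt": "Now add tests"}) + "\n",
    encoding="utf-8",
)

already_written = vault_note_exists(root, "abc123def456")
goal = extract_goal(prompts)
```

Checking for the note before parsing the session keeps repeated scans cheap, since most directories will already have been written by the hook.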
Orphan Detection Algorithm
1. Collect all `.md` file stems across vault (excluding `Daily/` and `Templates/`)
2. Scan all notes for `[[...]]` regex matches → build set of referenced names
3. Orphans = stems not in the referenced set
4. Capped at 30 results, appended to daily summary
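The four steps above can be sketched as a single function. This is a hedged illustration rather than the code in `vault_sync.py`: the regex, the function name, and the case-insensitive comparison are assumptions, and step 2 is read literally as scanning every note, including `Daily/` and `Templates/`.

```python
import re
import tempfile
from pathlib import Path

# Captures the link target, stopping before an alias (|) or heading anchor (#).
WIKILINK = re.compile(r"\[\[([^\]|#]+)")
EXCLUDED_DIRS = {"Daily", "Templates"}


def find_orphans(vault: Path, limit: int = 30) -> list:
    """Return up to `limit` note stems that no [[wikilink]] in the vault points at."""
    all_notes = list(vault.rglob("*.md"))
    # Step 1: candidate stems, skipping Daily/ and Templates/.
    stems = {
        p.stem for p in all_notes
        if p.relative_to(vault).parts[0] not in EXCLUDED_DIRS
    }
    # Step 2: every link target referenced anywhere in the vault.
    referenced = set()
    for note in all_notes:
        text = note.read_text(encoding="utf-8", errors="ignore")
        referenced.update(m.strip().lower() for m in WIKILINK.findall(text))
    # Steps 3-4: unreferenced stems (compared case-insensitively), capped.
    return sorted(s for s in stems if s.lower() not in referenced)[:limit]


# Tiny throwaway vault to exercise the function.
vault = Path(tempfile.mkdtemp())
for rel, body in {
    "Inbox/2026-03-01/103000-voice-keyboard.md": "See [[voice recognition]].",
    "Entities/voice recognition.md": "# voice recognition",
    "Entities/lonely.md": "# lonely",
    "Daily/2026-03-01.md": "# rollup",
}.items():
    path = vault / rel
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(body, encoding="utf-8")

orphans = find_orphans(vault)
```

Inbox notes show up as orphans in this sketch because nothing links to them, which matches the definition of zero incoming links.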
3.7 Multi-Agent Session Integration
The vault captures sessions from any AI agent type, not just Discord synthesis. Each agent feeds into the same `write_agent_session()` method via a universal JSON schema.
Universal Agent Session Schema
{
"agent_type": "claude-code", // claude-code | codex | gemini | clawdbot | human | custom
"session_id": "abc123...",
"provider": "anthropic", // anthropic | openai | google | together | ...
"model": "claude-opus-4-6",
"goal": "Implement vault writer",
"outcome": "complete", // complete | failed | cancelled | interrupted
"date": "2026-03-01",
"stats": {
"duration_minutes": 45.3,
"prompt_count": 12,
"tool_call_count": 87,
"files_modified": ["writer.py", "templates.py"],
"files_read": ["config.py", "session_end_hook.py"],
"estimated_tokens": 15000
},
"project_refs": ["obsidian-vault-writer"],
"key_topics": ["cc-graph-kernel", "Comp-Core"],
"source": "session_end_hook" // session_end_hook | prompt_scan | cap | daemon | manual
}
Input Paths
| Source | Agent Type | Trigger | Mechanism |
|---|---|---|---|
| Claude Code SessionEnd hook | `claude-code` | Session ends | `write_vault_session()` in `session_end_hook.py` spawns vault writer as subprocess |
| Prompt-log scanner | `claude-code` | Prefect flow every 6h | `scan_agent_sessions()` task in `vault_sync.py` retroactively finds missed sessions |
| AAO Reputation Collector | any | Task quality assessed | `write_task_to_vault()` in `aao_reputation_collector.py` — fire-and-forget after each assessment |
| Codex CLI | `codex` | Manual or daemon | Pipe JSON to `writer.py --mode agent-session` via stdin |
| Gemini CLI | `gemini` | Manual or daemon | Pipe JSON to `writer.py --mode agent-session` via stdin |
| CAP (Clarity Agent Protocol) | any | Agent dispatch complete | Future: CAP dispatcher calls vault writer after agent completes |
| cc-agent-daemon | any | Daemon execution complete | Future: Daemon reports result + vault write |
Claude Code SessionEnd Hook Integration
Modified file: `[home-path]`
The existing SessionEnd hook already computes comprehensive session stats (duration, prompt count, tool calls, files modified/read, scope). A new `write_vault_session()` function was added:
def write_vault_session(session_id, stats, prompts):
"""Write session to Obsidian vault. Fire-and-forget subprocess."""
# 1. Extract goal from first prompt text (first 100 chars)
# 2. Build project_refs from scope (subproject_name or repo_name)
# 3. Extract key_topics from affected targets (cc-* and Comp-* dirs)
# 4. Build universal agent session dict
# 5. Spawn: python3 -m obsidian_vault_writer.writer --mode agent-session
# with JSON on stdin, stdout/stderr devnull
Call site: Step 8 of `process_session_end()`, after archive and before cleanup:
# 8. Write session to Obsidian vault (fire-and-forget)
try:
write_vault_session(session_id, stats, prompts)
except Exception as e:
log_debug(f"Vault write failed (non-critical): {e}")
Design: Fire-and-forget via `subprocess.Popen()`. The hook writes JSON to stdin and closes it without waiting for the process to exit, so it never blocks session teardown.
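The fire-and-forget pattern can be sketched as below. This is not the hook's actual code: the helper name is hypothetical, and a trivial stand-in child replaces `python3 -m obsidian_vault_writer.writer --mode agent-session` so the example runs anywhere.

```python
import json
import subprocess
import sys


def spawn_fire_and_forget(cmd: list, payload: dict) -> subprocess.Popen:
    """Start `cmd`, pipe `payload` as JSON to its stdin, and return without waiting."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.DEVNULL,  # discard child output
        stderr=subprocess.DEVNULL,
    )
    try:
        proc.stdin.write(json.dumps(payload).encode("utf-8"))
        proc.stdin.close()  # EOF tells the child the JSON document is complete
    except BrokenPipeError:
        pass  # child exited early; the vault write is non-critical
    return proc  # returned for inspection; callers normally ignore it


# Stand-in child that only parses stdin as JSON.
child = [sys.executable, "-c", "import json, sys; json.load(sys.stdin)"]
proc = spawn_fire_and_forget(child, {"agent_type": "claude-code", "session_id": "abc123"})
```

One caveat of this approach: a payload larger than the OS pipe buffer can block the `write()` call until the child starts reading, so it suits small session summaries rather than full transcripts.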
Vault Note Output (Agent Session)
Path: `Sessions/{date}-{agenttype}-{short_id}.md`
Examples:
- `Sessions/2026-03-01-claudecode-abc123def456.md`
- `Sessions/2026-03-01-codex-xyz789.md`
- `Sessions/2026-03-01-gemini-qrs456.md`
---
type: agent-session
agent_type: claude-code
provider: anthropic
model: claude-opus-4-6
session_id: abc123def456...
date: 2026-03-01
outcome: complete
source: session_end_hook
duration_minutes: 45.3
prompt_count: 12
tool_calls: 87
project_refs: [obsidian-vault-writer]
tags: [cc-graph-kernel, Comp-Core]
created: 2026-03-01T18:30:00Z
---
# Claude Code Session: Implement vault writer multi-agent integration
| | |
|---|---|
| **Agent** | Claude Code (claude-opus-4-6) |
| **Duration** | 45m |
| **Prompts** | 12 |
| **Tool Calls** | 87 |
| **Outcome** | complete |
## Goal
Implement vault writer multi-agent integration
## Files Modified (5)
- `writer.py`
- `templates.py`
- `session_end_hook.py`
- `vault_sync.py`
- `ARCHITECTURE.md`
## Files Read (8)
- `config.py`
- `session_end_hook.py`
- ...
## Links
- Project: [[obsidian-vault-writer]]
- Related: [[cc-graph-kernel]], [[Comp-Core]]
Relationship to CAP (Clarity Agent Protocol)
The existing CAP protocol (`Desktop/clarity-agent-protocol/`) defines `AgentCapability` interfaces for dispatching to multiple agent types. The vault writer's universal session format aligns with CAP's agent types:
| CAP Agent Type | Vault `agent_type` | Status |
|---|---|---|
| `claude-code` | `claude-code` | Active — SessionEnd hook + prompt scanner |
| `codex` | `codex` | Ready — stdin JSON pipe |
| `gemini` | `gemini` | Ready — stdin JSON pipe |
| `clawdbot` | `clawdbot` | Ready — stdin JSON pipe |
| `human` | `human` | Ready — manual entry |
| `custom` | `custom` | Ready — fallback |
Future: The CAP dispatcher (`dispatcher.ts`) can call the vault writer after each agent dispatch completes, passing the session result through the universal JSON schema.
AAO Reputation Collector → Vault Integration
Modified file: `[home-path]`
The AAO (Admissible Agent Orchestration) reputation collector runs every 10 minutes and assesses quality of completed tasks from Supabase `mac_tasks`. After each quality assessment, it now writes a vault note.
Data source: Supabase `mac_tasks` table with expanded `select`:
id, task_content, output, exit_code, duration_ms, error_log,
claimed_by, completed_at, started_at, task_type, model_used,
project_path, source, session_id, pool_mode, team_role
`write_task_to_vault()` function:
1. Agent type detection: Maps `claimed_by` device → agent type (mac1-5 → claude-code), then overrides from `model_used` (gpt/o1/o3 → codex, gemini → gemini)
2. Provider detection: Inferred from model string (claude → anthropic, gpt → openai, gemini → google)
3. Duration: Prefers `duration_ms`, falls back to `started_at`↔`completed_at` diff
4. Project refs: Extracted from `project_path` — looks for `cc-` or `Comp-` directories, falls back to last meaningful path component
5. Goal: First 120 chars of `task_content`
6. Outcome: Maps quality → outcome (`high`/`medium` → complete, `low` → degraded, `failed` → failed)
7. Key topics: Auto-tagged from `task_type`, `pool_mode`, `team_role`, and quality level
8. AAO metadata: Passes `aao_quality`, `aao_confidence`, `aao_device_score` into the session note
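Steps 1, 2, 3, and 6 are simple mappings and could look roughly like this. The helper names, the `custom` and `unknown` fallbacks, and treating `o1`/`o3` models as OpenAI are assumptions for illustration; only the mappings listed above come from the collector's description.

```python
from datetime import datetime
from typing import Optional

QUALITY_TO_OUTCOME = {"high": "complete", "medium": "complete",
                      "low": "degraded", "failed": "failed"}
OPENAI_PREFIXES = ("gpt", "o1", "o3")


def detect_agent_type(claimed_by: str, model_used: str) -> str:
    """Device-based default, overridden by the model family when recognizable."""
    agent = "claude-code" if claimed_by.lower().startswith("mac") else "custom"
    model = (model_used or "").lower()
    if model.startswith(OPENAI_PREFIXES):
        agent = "codex"
    elif "gemini" in model:
        agent = "gemini"
    return agent


def detect_provider(model_used: str) -> str:
    """Infer the provider from the model string."""
    model = (model_used or "").lower()
    if "claude" in model:
        return "anthropic"
    if model.startswith(OPENAI_PREFIXES):
        return "openai"
    if "gemini" in model:
        return "google"
    return "unknown"  # hypothetical fallback


def duration_minutes(task: dict) -> Optional[float]:
    """Prefer duration_ms; otherwise diff the started_at/completed_at timestamps."""
    if task.get("duration_ms") is not None:
        return task["duration_ms"] / 60000
    started, completed = task.get("started_at"), task.get("completed_at")
    if started and completed:
        # Normalize a trailing "Z" so fromisoformat works on older Pythons too.
        parse = lambda ts: datetime.fromisoformat(ts.replace("Z", "+00:00"))
        return (parse(completed) - parse(started)).total_seconds() / 60
    return None


task = {"claimed_by": "mac3", "model_used": "claude-opus-4-6",
        "started_at": "2026-03-01T10:00:00Z", "completed_at": "2026-03-01T10:45:00Z"}
summary = (detect_agent_type(task["claimed_by"], task["model_used"]),
           detect_provider(task["model_used"]),
           duration_minutes(task))
```

Keeping each mapping in its own function makes the precedence rules (model overrides device, `duration_ms` overrides timestamps) easy to test in isolation.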
AAO-specific fields in vault notes:
Frontmatter includes:
aao_quality: high
aao_confidence: 0.85
aao_device_score: 0.912
Stats table includes:
| **AAO Quality** | high (85% confidence) |
| **Device Score** | 0.91 |

These fields enable Obsidian Dataview queries like:
- `TABLE aao_quality, aao_device_score FROM "Sessions" WHERE contains(source, "aao")`
- Find all degraded tasks: `WHERE aao_quality = "low" OR aao_quality = "failed"`
- Track device reputation trends across sessions
Call site: In the main flow loop, after `assess_and_attest()` returns for non-skipped tasks:
for t in tasks:
result = assess_and_attest(t, reputation)
q = result["quality"]  # assumption: the assessment result exposes its quality label under this key
if q != "skipped":
write_task_to_vault(t, quality=q, confidence=..., device_score=...)
Design: Fire-and-forget via `subprocess.Popen()`. Never blocks the reputation collector. Failures are caught and logged at debug level.
Dedup note: AAO tasks use Supabase task UUIDs as session IDs, while Claude Code SessionEnd hooks use Claude's internal session IDs. These are completely different ID spaces, so there's no collision risk between the two paths.
3.8 Obsidian Headless Sync Service
Package: `obsidian-headless` v0.0.3 (npm global)
Binary: `/usr/bin/ob`
Requires: Node.js 22+ (upgraded from 20 → 22.22.0 on cloud-vm)
Systemd Service
File: `/etc/systemd/system/obsidian-sync.service`
[Unit]
Description=Obsidian Headless Sync (continuous)
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=mohameddiomande
ExecStart=/usr/bin/ob sync --path /home/mohameddiomande/obsidian-vault --continuous
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal
SyslogIdentifier=obsidian-sync
[Install]
WantedBy=multi-user.target
Status: Enabled, running, auto-restarts.
Remote Vault
| Property | Value |
|---|---|
| Vault ID | `6489767d30bd6868e28965dbab9a53f6` |
| Vault Name | Agent Vault |
| Region | North America |
| Encryption | Managed |
| Device Name | `cloud-vm` |
| Conflict Strategy | Merge (most recent wins) |
---
4. Data Flow Diagrams
4.1 Real-time Path (Discord → Vault → All Devices)
User sends Discord message
│
▼
Clawdbot gateway receives message
│
▼
synthesis-preprocessor.js hook fires
│
├──► callSynthesizer() ──► Kimi-K2 (Together API)
│ │
│ ▼
│ Synthesis JSON returned
│ │
│ ├──► Attach to message metadata (existing behavior)
│ │
│ └──► writeToVault() [NEW, fire-and-forget]
│ │
│ ▼
│ spawn python3 -m obsidian_vault_writer.writer
│ │ stdin: {"synthesis": {...}, "message": "..."}
│ │
│ ▼
│ VaultWriter.write_synthesis()
│ │
│ ├──► Write Inbox/{date}/{time}-{slug}.md
│ ├──► ensure_entity() for each knowledge connection
│ ├──► ensure_project() for each project_ref
│ └──► ensure_concept() for each dream_seed
│
▼
Message continues through Clawdbot pipeline (unblocked)
... meanwhile on cloud-vm ...
obsidian-sync.service detects new .md files
│
▼
ob sync --continuous uploads to Obsidian cloud
│
▼
Obsidian apps on iPhone/Mac1/Mac3/Mac4/Mac5 pull new notes
4.2 Agent Session Path (Claude Code → Vault)
Claude Code session ends
│
▼
SessionEnd hook fires (session_end_hook.py)
│
├──► Steps 1-7: Stats, Orbit notify, archive (existing)
│
├──► Step 8: write_vault_session() [NEW]
│ │
│ ├──► Extract goal from first prompt
│ ├──► Build project_refs from scope
│ ├──► Extract key_topics from affected targets
│ ├──► Build universal agent session JSON
│ │
│ └──► subprocess.Popen (fire-and-forget)
│ │ stdin: agent session JSON
│ ▼
│ python3 -m obsidian_vault_writer.writer --mode agent-session
│ │
│ ▼
│ VaultWriter.write_agent_session()
│ │
│ └──► Sessions/{date}-claudecode-{short_id}.md
│
├──► Steps 9-10: Cleanup, summary (existing)
│
▼
Session ends normally (vault write never blocks)
4.3 AAO Reputation → Vault Path (every 10min)
Prefect scheduler triggers aao-reputation-collector flow
│
▼
fetch_unassessed_tasks task
│
├──► Query Supabase mac_tasks: status=complete, completed_at > last_assessed_at
├──► Returns: task_content, output, claimed_by, model_used,
│ project_path, duration_ms, task_type, pool_mode, team_role
│
▼
For each task: assess_and_attest()
│
├──► Compute quality score (output length, exit code, errors, duration)
├──► Sign HMAC attestation triple → Graph Kernel
├──► Update device reputation score
│
└──► write_task_to_vault() [NEW, fire-and-forget]
│
├──► Map claimed_by → agent_type (mac* → claude-code, etc.)
├──► Infer provider from model_used
├──► Extract project_refs from project_path
├──► Map quality → outcome (high → complete, low → degraded)
├──► Include AAO metadata: quality, confidence, device_score
│
└──► subprocess.Popen: python3 -m obsidian_vault_writer.writer
│ --mode agent-session
▼
Sessions/{date}-{agenttype}-{task_uuid}.md
│
│ Includes in frontmatter:
│ aao_quality: high
│ aao_confidence: 0.85
│ aao_device_score: 0.912
│
▼
obsidian-sync picks up → all devices
4.4 Catch-up Path (Prefect, every 6h)
Prefect scheduler triggers vault_sync flow
│
▼
catch_up_writes task (Discord synthesis)
│
├──► Read .vault_sync_state.json → last_sync timestamp
├──► Query kimi_memory.db: synthesis_results WHERE timestamp > last_sync
├──► For each missing result: VaultWriter.write_synthesis()
└──► Update last_sync in state file
│
▼
scan_agent_sessions task (Claude Code) [NEW]
│
├──► Scan [home-path]
├──► For each session dir modified since last_sync:
│ ├──► Skip if vault note already exists
│ ├──► Read metadata.json + prompts.jsonl
│ ├──► Extract goal, stats, project refs
│ └──► VaultWriter.write_agent_session()
└──► Log count of retroactively written notes
│
▼
generate_daily_summary task
│
├──► Check if yesterday's Daily/ note exists
├──► Count: Inbox notes, entities, projects, sessions, concepts
└──► Write Daily/{yesterday}.md with stats + [[links]]
│
▼
detect_orphans task
│
├──► Scan all vault notes for [[wikilink]] references
├──► Compare against all note stems
└──► Log orphans (notes with zero incoming links)
---
5. Infrastructure
5.1 Cloud-VM Services
| Service | Port | Systemd Unit | Role |
|---|---|---|---|
| Obsidian Sync | — | `obsidian-sync.service` | Continuous vault sync to Obsidian cloud |
| Perception Mesh | 8093 | `perception-mesh.service` | Device orchestration (existing) |
| Prefect Server | 4200 | Docker | Flow orchestration (existing) |
| Syncthing | 8384 | — | File sync Mac1↔cloud-vm (existing) |
5.2 File Sync Topology
Mac1 ([home-path])
│
│ Syncthing (home folder, 60s scan)
▼
cloud-vm ([home-path])
│
│ obsidian-headless (continuous WebSocket)
▼
Obsidian Cloud (Agent Vault, North America)
│
│ Obsidian Sync protocol (E2E)
▼
iPhone, Mac3, Mac4, Mac5 (Obsidian app)
Note: Syncthing provides Mac1↔cloud-vm sync. All other devices receive notes via Obsidian Sync. Mac1 gets notes from both paths (Syncthing for locally-written notes, Obsidian Sync for notes written directly on cloud-vm by the synthesis hook).
5.3 Python Environment
| Host | Python | Venv Path | Used By |
|---|---|---|---|
| Mac1 | 3.14.3 | `[home-path]` | Local testing, backfill |
| cloud-vm | 3.10+ | `[home-path]` | Synthesis hook, Prefect flow |
5.4 Node.js
| Host | Version | Path | Used By |
|---|---|---|---|
| cloud-vm | 22.22.0 | `/usr/bin/node` | `ob` CLI (obsidian-headless) |
Upgraded from v20.20.0 → v22.22.0 on 2026-03-01 because obsidian-headless requires Node 22+.
---
6. Operations
6.1 Monitoring
# Check sync service status
ssh cloud-vm "systemctl status obsidian-sync"
# Watch sync logs
ssh cloud-vm "journalctl -u obsidian-sync -f"
# Count vault notes
ssh cloud-vm 'for d in [home-path]; do
echo "$(basename "$d")/: $(find "$d" -name "*.md" | wc -l) files"
done'
# Check Prefect vault_sync flow
# Via Prefect UI at http://cloud-vm:4200 or:
ssh cloud-vm "python3 [home-path]"  # manual run
6.2 Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| Notes not appearing on phone | Sync service down | `sudo systemctl restart obsidian-sync` |
| `ob sync` auth error | Token expired | `ob login` again on cloud-vm |
| Synthesis notes not created | Hook spawn failing | Check `VAULT_PATH` env, Python venv exists |
| Duplicate entity notes | Case mismatch | Backfill handles this; real-time stubs are idempotent |
| `203/EXEC` in systemd | Wrong binary path | Verify `which ob` matches `ExecStart` path |
| Syncthing not syncing vault | `.stignore` exclusion | Check both sides: `grep obsidian [home-path]` |
6.3 Manual Vault Operations
# Re-run backfill (idempotent for entities/projects, adds new synthesis notes)
cd [home-path] && obsidian_vault_writer/.venv/bin/python3 -m obsidian_vault_writer.backfill
# Dry-run backfill (preview without writing)
cd [home-path] && obsidian_vault_writer/.venv/bin/python3 -m obsidian_vault_writer.backfill --dry-run
# Write a manual synthesis note
echo '{"synthesis": {...}, "message": "..."}' | \
python3 -m obsidian_vault_writer.writer --mode synthesis --channel manual
# Write a Pulse session summary
python3 -m obsidian_vault_writer.writer --mode session --session-id <ID>
# Write a manual agent session note (any agent type)
echo '{"agent_type":"codex","session_id":"xyz","goal":"Fix bug","outcome":"complete","date":"2026-03-01","stats":{"duration_minutes":15,"prompt_count":3,"tool_call_count":10,"files_modified":["app.py"],"files_read":["config.py"]},"project_refs":["my-project"],"key_topics":[],"source":"manual"}' | \
python3 -m obsidian_vault_writer.writer --mode agent-session
# Generate today's daily summary
python3 -m obsidian_vault_writer.writer --mode daily --date 2026-03-01
# Force Obsidian re-sync
ssh cloud-vm "ob sync --path [home-path]"
---
7. Current Vault Statistics (2026-03-01)
| Directory | Count | Source |
|---|---|---|
| Inbox/ | 4 | Backfilled synthesis results |
| Entities/ | 75 | 53 from knowledge graph + 22 stubs from synthesis links |
| Projects/ | 4 | Stubs: comp-core, nko-linguistics, litRPG, milkmen |
| Concepts/ | 3 | Dream seeds: Voice-Controlled NKo Keyboard, AI Oat Milk Delivery Optimizer, litRPG Project Revival |
| Knowledge/ | 3 | Facts (4 items), Preferences (4 items), Patterns (3 items) |
| Sessions/ | 0 | No Pulse sessions written yet |
| Daily/ | 0 | First daily summary generates on next vault_sync run |
| Templates/ | 5 | Synthesis, Session, Entity, Project, Daily |
| Total | 94 | |
---
8. Future Phases
Phase 4: Multi-Agent Session Integration (Complete)
All agent session types are supported via the universal agent session schema:
- Claude Code: Automated via SessionEnd hook (`session_end_hook.py`) + Prefect prompt-log scanner
- Pulse sessions: `--mode session --session-id <ID>` reads from `[home-path]`
- All other agents (Codex, Gemini, Clawdbot, CAP): `--mode agent-session` reads universal JSON from stdin
Output: `Sessions/{date}-{agenttype}-{short_id}.md`
Phase 5: MCP Integration + Dataview (Deferred)
Once the vault has 500+ notes with links:
- Vault MCP server: Allow Claude Code to search/read vault notes directly via MCP protocol
- Dataview plugin: Programmatic queries inside Obsidian (e.g., list all ideas by project, show orphan notes, weekly synthesis count)
- Graph View: The visual map of the knowledge space — clusters around projects, entities, and concepts become navigable
---
9. Key Design Decisions
Why fire-and-forget (not async/await)?
The synthesis hook runs in the Clawdbot message pipeline. Any delay blocks message delivery. The vault write is non-critical — if it fails, the catch-up Prefect flow will find and write the missing note within 6 hours. The `spawn()` + `unref()` pattern ensures zero impact on message latency.
Why Python (not Node.js)?
The synthesizer (`dream-weaver-engine`) is Python. The Prefect flows are Python. The kimi_memory.db queries use Python's `sqlite3`. Writing the vault module in Python means it shares the ecosystem and can be imported directly by backfill and Prefect flows. The hook bridges via stdin JSON pipe — language-agnostic.
Why filesystem notes (not a database)?
Obsidian operates on a directory of `.md` files, so plain files are the vault's native format. This means:
- Notes are human-readable without any tool
- Git-versionable (future: commit vault changes)
- Syncthing-compatible (no lock contention)
- Obsidian Sync handles conflict resolution (merge, most-recent-wins)
Why filter the knowledge graph so aggressively?
5,562 triples → 195 meaningful ones. Without filtering, `user` alone would have 2,154 relations, making its entity note useless. The filters target three noise categories:
1. Filesystem artifacts: Predicates like `has_file`, `has_path` from codebase scanning
2. Generic subjects: `user`, `context`, `system` — too broad to be meaningful entities
3. Ephemeral context: Subjects starting with paths (`/`, `~`, `.`) or URLs (`http`)
Why a universal agent session format (not per-agent types)?
Multiple agents (Claude Code, Codex, Gemini, Clawdbot) all produce session-like output — a goal, duration, files modified, outcome. Rather than writing separate adapters for each, the universal JSON schema captures the common denominator. Agent-specific metadata (like Claude Code's compaction count or Codex's reasoning traces) can be added to the `stats` dict without breaking the schema. This aligns with CAP's `AgentCapability` interface that already normalizes across agent types.
Why fire-and-forget for the SessionEnd hook?
The SessionEnd hook runs during session teardown. Blocking it to wait for the vault writer would delay the user's terminal returning. The vault write is non-critical — if it fails, the Prefect prompt-log scanner catches missed sessions within 6 hours. `subprocess.Popen()` with no `wait()` ensures the hook returns immediately.
Why both a hook AND a scanner?
Belt and suspenders. The SessionEnd hook catches 99
Why case-insensitive dedup?
The knowledge graph stores `Clawdbot` and `clawdbot` as separate subjects. In Obsidian, `[[Clawdbot]]` and `[[clawdbot]]` resolve to the same note. Merging at backfill time prevents duplicate entity files and consolidates all relations under one note.
Why Obsidian Sync over just Syncthing?
Syncthing only covers Mac1↔cloud-vm. Obsidian Sync covers all devices including iPhone (where Syncthing doesn't run well). Obsidian Sync also handles merge conflicts natively and works with the Obsidian mobile app's expectations for vault state.
Source Anchor
projects/obsidian_vault_writer/ARCHITECTURE.md