Grand Diomande Research · Full HTML Reader

Stage 3: Expand + Master Plan -- AgentOS Cognitive Synchronization

---

3a. Risk Audit

CRITICAL RISKS

#### R1: Interrupt File Race Condition
Failure scenario: PostToolUse hook and SessionEnd hook fire within milliseconds of each other for the same pane. Both try to write `/tmp/pane_interrupts/ttys007.json`. One overwrites the other. The SessionEnd signal (high priority) is lost because PostToolUse (low priority) clobbered it.

Probability: Medium (hooks fire frequently, timing overlap is plausible)
Impact: High (missed SIG_SESSION_END means the orchestrator thinks the pane is still working for up to 5 minutes)

Mitigation: Use append-mode interrupt directory, not single file per pane:

```
/tmp/pane_interrupts/ttys007/
    1710100000_0x01_tool_active.json
    1710100001_0x03_session_end.json
```

Each interrupt is a separate file. The orchestrator reads ALL files in the directory, sorts by timestamp, takes the highest-priority one, and deletes consumed files.

Validation: Write a test that fires 100 concurrent interrupts to the same TTY directory and verifies zero data loss.
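The one-file-per-interrupt scheme can be sketched as follows. This is a minimal illustration, not the hook's actual code: `write_interrupt` and `consume_interrupts` are hypothetical names, the payload is reduced to three fields, and it assumes that a larger vector number means higher priority. Writing to a temp file and renaming it keeps the orchestrator from ever reading a half-written interrupt.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def write_interrupt(pane_dir: Path, ts: int, vector: int, name: str) -> Path:
    """Write one interrupt as its own file, so hooks never overwrite each other."""
    pane_dir.mkdir(parents=True, exist_ok=True)
    final = pane_dir / f"{ts}_0x{vector:02x}_{name}.json"
    # Write to a temp file, then rename: readers only ever see complete files.
    fd, tmp = tempfile.mkstemp(dir=pane_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        json.dump({"vector": vector, "signal": name, "timestamp": ts}, fh)
    os.replace(tmp, final)
    return final


def consume_interrupts(pane_dir: Path) -> Optional[dict]:
    """Read and delete all pending interrupts; return the highest-priority one."""
    pending = []
    # Equal-width timestamp prefixes make the filename sort chronological.
    for path in sorted(pane_dir.glob("*.json")):
        pending.append(json.loads(path.read_text()))
        path.unlink()
    if not pending:
        return None
    # Assumption: larger vector = higher priority; ties go to the newest event.
    return max(pending, key=lambda i: (i["vector"], i["timestamp"]))
```

With the two files from the listing above, `consume_interrupts` returns the session_end interrupt and leaves the directory empty, which is the behavior the single-file design could not guarantee.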

---

#### R2: Drift Meter Threshold Produces False Positives at High Pane Count
Failure scenario: With 40 panes, each emitting SIG_TOOL_ACTIVE (drift += 0.02), the aggregate drift noise triggers sync pulses continuously. The system enters a "sync storm" where every cycle fires a pulse, defeating the purpose of event-driven sync.

Probability: Medium-High (at 40 panes with a 30s cycle, each pane emits ~2 tool events per cycle, so 80 events/30s x 0.02 = 1.6 drift added per cycle across the fleet)
Impact: High (sync storm degrades to worse than temporal polling)

Mitigation: Drift meters are PER-PANE, not global. A sync pulse fires only when a SINGLE pane crosses threshold. SIG_TOOL_ACTIVE should NOT increment drift at all -- it is a "heartbeat" signal, not a drift signal. Revise weights:

```python
weights = {
    0x01: 0.00,    # tool_active -- NO drift, just "I'm alive"
    0x02: 0.40,    # context_full
    0x03: 0.60,    # session_end
    0x04: 0.80,    # error_stuck
}
```

Only high-value interrupts (context_full, session_end, error_stuck) contribute to drift. Tool activity is tracked separately for idle detection but does NOT feed the drift meter.

Validation: Simulate 40 panes with realistic interrupt patterns (1 tool_use every 15s, 1 context_full per hour, 2 session_ends per hour). Verify sync pulse frequency stays below 1 per 5 minutes on average.

---

#### R3: Content Fingerprint File Path Extraction Misses Dynamic Paths
Failure scenario: Claude Code constructs file paths via variables (`${HOME}/projects/...`) or reads them from JSON/YAML. The regex `(?:/[a-zA-Z0-9._-]+)+\.[a-zA-Z]{1,10}` misses these. Two panes working on the same project are not flagged as duplicates.

Probability: High (variable-based paths are common in terminal output)
Impact: Medium (duplicate detection is a nice-to-have, not a safety-critical feature)

Mitigation: Multi-strategy extraction:
1. Regex for literal paths (current)
2. Match project names from pane registry (from pane_registry.json project field)
3. Match known project directories from AGENTS.md or ARCHITECTURE.md
4. Fall back to keyword overlap on non-path tokens

Validation: Run fingerprint extraction on 20 saved terminal output snapshots from /tmp/pane_context/ and manually verify file path coverage.

---

MEDIUM RISKS

#### R4: NUMU Bus Schema Migration Requires Downtime
Failure scenario: Adding `sync.*` message types to numu-bus/src/schema.ts requires TypeScript compilation and daemon restart. During restart (10-30s), all NUMU consumers lose their WebSocket connection. Events emitted during this window are lost.

Probability: Certain (schema change requires recompile)
Impact: Low-Medium (30s of lost events is recoverable; the fallback timer catches it)

Mitigation:
- The sync.* messages are ADDITIONAL types, not modifications to existing ones. Existing consumers ignore unknown message types.
- Restart during low-activity window (US nighttime / Mohamed sleeping)
- NUMU bus clients already reconnect on disconnect (built into the WS client)

Validation: Send a burst of 100 events spanning the restart window. Verify the reconnected client receives events after restart.

---

#### R5: Haiku Skip Rate Causes Missed Context Exhaustion
Failure scenario: The "auto-compact" string detection in PostToolUse output fails because Claude changes its compaction message format. No SIG_CONTEXT_FULL interrupt fires. The orchestrator relies on Haiku to detect it, but Haiku was skipped because recent tool activity was detected. The context-exhausted pane sits unnoticed.

Probability: Low-Medium (Claude's compaction message has been stable but is not contractual)
Impact: Medium (pane wastes time in degraded context until the next Haiku check or the 5-min fallback)

Mitigation:
- Maintain a LIST of compaction indicators, not a single string: "auto-compact", "compressed prior", "context window", "messages compacted"
- Add a "time since last interrupt" counter. If a pane has been emitting SIG_TOOL_ACTIVE for >15 minutes continuously, force a Haiku check regardless (suspiciously long tool activity = possible context degradation)
- KARL trajectories that end with context exhaustion can retroactively identify the undetected pattern

Validation: Grep the last 30 days of terminal logs for compaction messages and verify the indicator list catches all variants.
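The "time since last interrupt" counter from the second mitigation could look roughly like this. It is a sketch under assumptions: `ActivityStreak` and its methods are hypothetical names, and the 60-second `max_gap_sec` used to decide what counts as "continuous" activity is an illustrative value that the plan does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActivityStreak:
    """Tracks how long a pane has been emitting SIG_TOOL_ACTIVE without a break."""

    started_at: Optional[float] = None
    last_seen: Optional[float] = None

    def on_tool_active(self, now: float, max_gap_sec: float = 60.0) -> None:
        # A long silence ends the streak; the next event starts a new one.
        if self.last_seen is None or now - self.last_seen > max_gap_sec:
            self.started_at = now
        self.last_seen = now

    def needs_forced_check(self, now: float, limit_sec: float = 15 * 60) -> bool:
        # True once the current streak has lasted longer than the limit.
        return self.started_at is not None and now - self.started_at > limit_sec

    def on_haiku_check(self) -> None:
        # A completed Haiku check restarts the clock.
        self.started_at = None
        self.last_seen = None
```

The orchestrator would call `needs_forced_check` before applying the interrupt-skip logic, so a pane with suspiciously long tool activity is classified by Haiku even though recent interrupts exist.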

---

#### R6: Priority Affinity Matching Creates Project Starvation
Failure scenario: OpenClawHub has 10 pending tasks and 5 panes with OpenClawHub affinity. feed-hub has 3 pending tasks but 0 panes with feed-hub affinity. All idle panes get assigned OpenClawHub tasks. feed-hub tasks starve indefinitely.

Probability: Medium (project clustering is a real pattern in the fleet)
Impact: Medium (some projects get delayed, but nothing breaks)

Mitigation: Add a starvation detector. If a task has been pending for longer than MAX_TASK_WAIT (30 minutes), boost its priority by 0.30, overriding affinity. This guarantees every task eventually gets scheduled regardless of pane affinity.

```python
if task.get("created_at"):
    age_min = (time.time() - task["created_at"]) / 60
    if age_min > 30:
        source_weight += 0.30  # starvation boost
```

Validation: Simulate a workload where one project has 80

---

#### R7: Compass Review Prompt Generates Hallucinated Recommendations
Failure scenario: Haiku misinterprets the fleet snapshot and recommends reprioritizing a task that does not exist, or alerting about a non-existent duplicate. If auto-execution were enabled, this would cause harm.

Probability: Medium (Haiku 4.5 can hallucinate on structured data)
Impact: Low (Compass is advisory-only -- logged but not executed)

Mitigation:
- Compass responses are LOGGED ONLY, never auto-executed
- Include a "valid_task_ids" field in the prompt so Haiku can only reference real tasks
- Parse Compass output and validate referenced TTYs/task_ids against actual fleet state before logging

Validation: Run 50 Compass reviews on synthetic fleet snapshots and measure hallucination rate on task_id and TTY references.
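The third mitigation (validating references before logging) can be sketched as a simple filter. The recommendation shape used here (dicts with optional `tty` and `task_id` keys) is an assumption for illustration; the real Compass output format is defined elsewhere.

```python
def validate_compass_output(recommendations, valid_ttys, valid_task_ids):
    """Split Compass recommendations into grounded and ungrounded lists.

    A recommendation is grounded when every TTY or task_id it references
    exists in the current fleet state. Recommendations that reference
    nothing (e.g. a bare no_op) count as grounded.
    """
    grounded, ungrounded = [], []
    for rec in recommendations:
        tty = rec.get("tty")
        task_id = rec.get("task_id")
        tty_ok = tty is None or tty in valid_ttys
        task_ok = task_id is None or task_id in valid_task_ids
        if tty_ok and task_ok:
            grounded.append(rec)
        else:
            ungrounded.append(rec)
    return grounded, ungrounded
```

The ratio `len(ungrounded) / len(recommendations)` gives the hallucination rate that the validation step above measures across the 50 synthetic reviews.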

---

LOW RISKS

#### R8: SyncMetrics Counter Overflow
Impact: Negligible. Counters are Python ints (arbitrary precision). Reset on daemon restart.

#### R9: File Path Regex Matches Non-File Strings
Impact: Low. False positive file paths in content fingerprint cause slightly noisier duplicate detection. No behavioral consequence.

#### R10: Interrupt File Accumulation on Disk
Impact: Low. Each interrupt file is <1KB. At 100 files/minute, 1 day = 144K files = ~144MB. Orchestrator consumes (deletes) files on each cycle. Risk only if orchestrator is down for extended period.

Mitigation: A cron job that purges interrupt files older than 1 hour.
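A purge pass of this kind might look like the sketch below. `purge_stale_interrupts` is a hypothetical helper, and the layout it assumes (one subdirectory per TTY under the interrupt root) follows the R1 mitigation; the `now` parameter exists only to make the function testable.

```python
import time
from pathlib import Path
from typing import Optional


def purge_stale_interrupts(root: Path, max_age_sec: float = 3600.0,
                           now: Optional[float] = None) -> int:
    """Delete interrupt files older than max_age_sec; return how many were removed."""
    now = time.time() if now is None else now
    removed = 0
    for path in root.glob("*/*.json"):  # <root>/<tty>/<interrupt>.json
        try:
            if now - path.stat().st_mtime > max_age_sec:
                path.unlink()
                removed += 1
        except FileNotFoundError:
            # The orchestrator consumed the file between glob and stat/unlink.
            continue
    return removed
```

Tolerating `FileNotFoundError` matters because the purge job and the orchestrator delete from the same directories concurrently.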

---

3b. Expanded Specifications

Spec 1: Interrupt Signal Emitter (post_tool_hook.py modification)

Location: `[home-path]`
New function: `_emit_interrupt(tty, vector, signal_name, metadata)`
Interrupt directory: `/tmp/pane_interrupts/{tty_safe}/`
File naming: `{unix_ts_ms}_{vector_hex}_{signal_name}.json`

Interrupt JSON schema:

```json
{
    "vector": 3,
    "signal": "SIG_SESSION_END",
    "tty": "/dev/ttys007",
    "machine_id": "mac1",
    "project": "OpenClawHub",
    "timestamp": 1710100001.234,
    "metadata": {
        "duration_sec": 1234,
        "tool_count": 47,
        "exit_reason": "natural"
    }
}
```

Error detection state (per-pane, persisted in `/tmp/pane_error_state/{tty_safe}.json`):

```json
{
    "consecutive_failures": 3,
    "last_failure_ts": 1710100000.0,
    "last_success_ts": 1710099900.0,
    "failure_tools": ["Bash", "Bash", "Bash"]
}
```

Threshold: 3 consecutive Bash failures within 120s triggers SIG_ERROR_STUCK.
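One way to apply this threshold to the error-state JSON is sketched below. `record_tool_result` is a hypothetical helper, and it makes one interpretive assumption: "within 120s" is read as "no gap longer than 120s between consecutive failures", since the schema stores only the last failure timestamp.

```python
def record_tool_result(state: dict, tool: str, ok: bool, now: float,
                       threshold: int = 3, window_sec: float = 120.0) -> bool:
    """Update per-pane error state in place; return True if SIG_ERROR_STUCK should fire."""
    if ok:
        # Any success ends the failure streak.
        state["consecutive_failures"] = 0
        state["failure_tools"] = []
        state["last_success_ts"] = now
        return False

    # Assumption: a failure arriving after the window restarts the streak.
    last = state.get("last_failure_ts")
    if last is not None and now - last > window_sec:
        state["consecutive_failures"] = 0
        state["failure_tools"] = []

    state["consecutive_failures"] = state.get("consecutive_failures", 0) + 1
    state.setdefault("failure_tools", []).append(tool)
    state["last_failure_ts"] = now

    # Fire only when the most recent failures are all Bash calls.
    recent = state["failure_tools"][-threshold:]
    return state["consecutive_failures"] >= threshold and recent == ["Bash"] * threshold
```

The hook would load the state file, call this function once per tool result, write the state back, and emit the interrupt when it returns True.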

Context exhaustion indicators:

```python
COMPACTION_INDICATORS = [
    "auto-compact",
    "compressed prior messages",
    "context window",
    "messages compacted",
    "conversation too long",
    "token limit",
]
```

Match is case-insensitive substring search on tool output.
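The matching rule amounts to a few lines. `detect_compaction` is a hypothetical name; returning the matched indicator (rather than a bare boolean) is a choice made here so the interrupt metadata can record which phrase triggered it.

```python
from typing import Optional

COMPACTION_INDICATORS = [
    "auto-compact",
    "compressed prior messages",
    "context window",
    "messages compacted",
    "conversation too long",
    "token limit",
]


def detect_compaction(tool_output: str) -> Optional[str]:
    """Return the first indicator found in the output (case-insensitive), else None."""
    lowered = tool_output.lower()
    for indicator in COMPACTION_INDICATORS:
        if indicator in lowered:
            return indicator
    return None
```

Broad phrases such as "context window" and "token limit" can also appear in ordinary tool output (for example, when a pane is editing LLM-related code), so the 30-day log grep in R5 is a good place to check for false positives as well as misses.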

Lines of code: ~60 new lines in post_tool_hook.py, ~30 new lines in session_end_hook.py

---

Spec 2: Drift Meter Engine

Location: New file `[home-path]`
Class: `DriftMeter` (one instance per tracked pane)
Manager class: `DriftFleet` (manages all meters, checks thresholds)

```python
import time

# DriftMeter (the per-pane meter) lives alongside this class in drift_meter.py.


class DriftFleet:
    """Manages per-pane drift meters and fires sync pulses."""

    def __init__(self, decay_lambda: float = 0.005, threshold: float = 0.8):
        self.meters: dict[str, DriftMeter] = {}
        self.threshold = threshold
        self.decay_lambda = decay_lambda

    def ingest_interrupt(self, tty: str, vector: int):
        """Process an interrupt from a pane."""
        if tty not in self.meters:
            self.meters[tty] = DriftMeter(decay_lambda=self.decay_lambda)
        self.meters[tty].on_interrupt(vector)

    def check_pulses(self) -> list[str]:
        """Return list of TTYs that have crossed the drift threshold."""
        pulsing = []
        for tty, meter in self.meters.items():
            if meter.should_pulse(self.threshold):
                pulsing.append(tty)
                meter.reset()  # Clear drift after pulse fires
        return pulsing

    def prune_stale(self, max_age_sec: float = 3600):
        """Remove meters that have received no interrupt for max_age_sec."""
        now = time.time()
        stale = [tty for tty, m in self.meters.items()
                 if now - m.last_update > max_age_sec]
        for tty in stale:
            del self.meters[tty]
```

Drift weights (revised from R2 mitigation):

```python
DRIFT_WEIGHTS = {
    0x01: 0.00,    # SIG_TOOL_ACTIVE: no drift, just alive signal
    0x02: 0.40,    # SIG_CONTEXT_FULL: needs attention
    0x03: 0.60,    # SIG_SESSION_END: pane is now free
    0x04: 0.80,    # SIG_ERROR_STUCK: critical, needs intervention
}
```

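Decay constant: lambda = 0.005 per second (half-life = ln 2 / 0.005, ~139 seconds). This means:
- A SIG_SESSION_END (0.60) decays to 0.30 in ~139s
- No single SIG_CONTEXT_FULL or SIG_SESSION_END reaches the threshold (0.8) on its own. It is reached by SIG_ERROR_STUCK alone (0.80, assuming the comparison is inclusive) or by two events close together, e.g. SIG_SESSION_END followed by SIG_CONTEXT_FULL within ~81s, or SIG_CONTEXT_FULL followed by SIG_SESSION_END within ~139s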
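A per-pane meter consistent with the `DriftFleet` interface (`on_interrupt`, `should_pulse`, `reset`, `last_update`) could be sketched as follows. This is an illustration of exponential decay under assumptions, not the final drift_meter.py: the injectable `clock`, the `current` helper, and the inclusive `>=` comparison are choices made here.

```python
import math
import time

DRIFT_WEIGHTS = {0x01: 0.00, 0x02: 0.40, 0x03: 0.60, 0x04: 0.80}


class DriftMeter:
    """Accumulates weighted interrupts and decays them exponentially over time."""

    def __init__(self, decay_lambda: float = 0.005, clock=time.time):
        self.decay_lambda = decay_lambda
        self.clock = clock            # injectable so tests can control time
        self.drift = 0.0              # drift value as of last_update
        self.last_update = clock()    # time of the most recent interrupt

    def current(self, now=None) -> float:
        """Drift decayed to `now`, without modifying the stored state."""
        now = self.clock() if now is None else now
        elapsed = max(0.0, now - self.last_update)
        return self.drift * math.exp(-self.decay_lambda * elapsed)

    def on_interrupt(self, vector: int) -> None:
        now = self.clock()
        # Decay the old value first, then add the new event's weight.
        self.drift = self.current(now) + DRIFT_WEIGHTS.get(vector, 0.0)
        self.last_update = now

    def should_pulse(self, threshold: float) -> bool:
        # Assumption: inclusive comparison, so a lone SIG_ERROR_STUCK (0.80) pulses.
        return self.current() >= threshold

    def reset(self) -> None:
        self.drift = 0.0
```

`should_pulse` deliberately leaves `last_update` untouched, so `DriftFleet.prune_stale` still measures time since the last interrupt rather than time since the last threshold check.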

Lines of code: ~80 lines (drift_meter.py)

---

Spec 3: Content Fingerprinter

Location: New function in `cortex_orchestrator.py` (not a separate file -- too small)
Function: `_compute_content_fingerprint(terminal_output: str) -> dict`

Multi-strategy extraction:

```python
import hashlib
import re


def _compute_content_fingerprint(terminal_output: str) -> dict:
    """Extract content fingerprint from terminal output."""
    lines = terminal_output.strip().split("\n")

    # Strategy 1: Literal file paths (at least two path segments plus an extension)
    file_re = re.compile(r'(?:/[a-zA-Z0-9._-]+){2,}\.[a-zA-Z]{1,10}')
    files = set(file_re.findall(terminal_output))

    # Strategy 2: Project directory names
    proj_re = re.compile(
        r'(?:Desktop|projects|flows|monitoring|\.claude)/([a-zA-Z0-9_-]+)'
    )
    projects = set(proj_re.findall(terminal_output))

    # Strategy 3: Content hash of the last 20 lines (blank lines dropped)
    tail = "\n".join(line.strip() for line in lines[-20:] if line.strip())
    content_hash = hashlib.sha256(tail.encode()).hexdigest()[:16]

    return {
        "files": sorted(files)[:10],
        "projects": sorted(projects)[:5],
        "content_hash": content_hash,
    }
```

Duplicate detection (integrated into sense phase):

```python
def _detect_duplicates(self, fingerprints: dict[str, dict]) -> list[dict]:
    """Detect panes working on same files. O(n) via hash grouping."""
    file_panes: dict[str, list[str]] = {}
    for tty, fp in fingerprints.items():
        for f in fp.get("files", []):
            file_panes.setdefault(f, []).append(tty)

    return [
        {"file": f, "panes": panes}
        for f, panes in file_panes.items()
        if len(panes) > 1
    ]
```

Lines of code: ~50 lines (added to cortex_orchestrator.py)

---

Spec 4: NUMU Sync Domain Schema

Location: `[home-path]`
New types: SyncDriftReport, SyncPulse, SyncDuplicate (3 types)
New domain number: 7 (after health domain 4, thread domain 3... existing numbering)

Full TypeBox definitions as specified in compound Step 5. Added to the existing Validators map:

```typescript
export const Validators = {
    // ... existing validators ...
    "sync.drift_report": TypeCompiler.Compile(SyncDriftReport),
    "sync.pulse": TypeCompiler.Compile(SyncPulse),
    "sync.duplicate_detected": TypeCompiler.Compile(SyncDuplicate),
};
```

Lines of code: ~60 lines in schema.ts

---

Spec 5: cortex_orchestrator.py Integration

Modified phases in the heartbeat cycle:

1. PRE-SENSE (new): Read interrupt directory, feed DriftFleet, check for sync pulses
2. SENSE: Existing pane discovery + NEW content fingerprinting on sync pulse or every 5th cycle
3. CLASSIFY: Existing Haiku classification WITH interrupt-skip logic (Step 2 from compound)
4. MATCH (modified): Priority affinity matching replaces longest-idle-first (Step 7 from compound)
5. INJECT: Existing injection logic (unchanged)
6. CHECK: Existing invariant checks + NEW duplicate detection logging
7. ADAPT: Existing interval adaptation + NEW SyncMetrics recording
8. COMPASS (new, periodic): Every 10 minutes, generate fleet snapshot and send to Haiku for strategic review

New instance variables on CortexOrchestrator class:

```python
self.drift_fleet = DriftFleet()
self.sync_metrics = SyncMetrics()
self.fingerprints: dict[str, dict] = {}
self.recent_duplicates: list[dict] = []
self.compass_last_run: float = 0
```

Estimated change: ~200 lines modified/added in cortex_orchestrator.py

---

3c. Master Execution Checklist

Wave 0: Foundation (Day 1-2)

| # | Task | Owner | Input | Output | Validation | Status |
|---|------|-------|-------|--------|------------|--------|
| 0.1 | Create `/tmp/pane_interrupts/` directory structure and cleanup cron | Claude | Compound Step 2 | Directory + launchd cron (hourly purge of files >1h old) | `ls /tmp/pane_interrupts/` exists, cron fires | TODO |
| 0.2 | Create `/tmp/pane_error_state/` directory for consecutive failure tracking | Claude | Spec 1 | Directory + JSON schema | Directory exists | TODO |
| 0.3 | Write `drift_meter.py` (DriftMeter + DriftFleet classes) | Claude | Spec 2 | `[home-path]` (~80 lines) | Unit test: 3 interrupts at known timestamps produce expected drift values | TODO |
| 0.4 | Unit test drift_meter.py | Claude | Task 0.3 | Test file `[home-path]` | All tests pass, including decay math verification | TODO |

Wave 0 Gate: drift_meter.py passes all unit tests. Interrupt directories exist.

---

Wave 1: Interrupt Emitter (Day 2-3)

| # | Task | Owner | Input | Output | Validation | Status |
|---|------|-------|-------|--------|------------|--------|
| 1.1 | Add `_emit_interrupt()` function to `post_tool_hook.py` | Claude | Spec 1 | Modified hook (~60 new lines) | Hook still passes existing tests; interrupt files appear in `/tmp/pane_interrupts/` | TODO |
| 1.2 | Add SIG_CONTEXT_FULL detection to `post_tool_hook.py` | Claude | Spec 1, compaction indicators list | Substring matching on tool output | Manually trigger auto-compact and verify interrupt file created | TODO |
| 1.3 | Add SIG_ERROR_STUCK detection (3+ consecutive Bash failures) | Claude | Spec 1, error state tracking | Error state JSON + interrupt emission | Run 3 failing `Bash` calls and verify SIG_ERROR_STUCK fires | TODO |
| 1.4 | Add SIG_SESSION_END interrupt to `session_end_hook.py` | Claude | Spec 1 | Modified hook (~30 new lines) | End a Claude session and verify interrupt file created | TODO |
| 1.5 | Integration test: run a Claude session, verify interrupt file stream | Mohamed | Tasks 1.1-1.4 | Manual test report | Interrupt files appear for tool_active, error_stuck, session_end | TODO |

Wave 1 Gate: All 4 interrupt types fire correctly. Existing hook functionality (logging, KARL, cortex) unbroken.

---

Wave 2: Orchestrator Integration (Day 3-5)

| # | Task | Owner | Input | Output | Validation | Status |
|---|------|-------|-------|--------|------------|--------|
| 2.1 | Add PRE-SENSE phase to cortex_orchestrator.py (read interrupts, feed DriftFleet) | Claude | Spec 5, drift_meter.py | Modified orchestrator (~50 new lines) | Orchestrator log shows "PRE-SENSE: read N interrupts, M drift pulses" | TODO |
| 2.2 | Add interrupt-skip logic to CLASSIFY phase (skip Haiku when interrupt provides classification) | Claude | Compound Step 2 | Modified classification (~40 new lines) | Haiku skip rate logged in output; verify >50% | |
| 2.3 | Add content fingerprinting to SENSE phase | Claude | Spec 3 | Modified sense phase (~50 new lines) | Fingerprints logged with file paths and project names | TODO |
| 2.4 | Add duplicate detection to CHECK phase | Claude | Spec 3 | Modified check phase (~30 new lines) | When two panes edit the same file, "DUPLICATE" log line appears | TODO |
| 2.5 | Add SyncMetrics counters and Prometheus output | Claude | Spec 5, compound Step 6 | Modified /metrics endpoint | `curl localhost:9131/metrics` shows new sync metrics | TODO |
| 2.6 | Integration test: run full orchestrator with interrupt-driven classification | Mohamed | Tasks 2.1-2.5 | Manual test: run 3 panes for 10 minutes | Haiku skip rate >50% | |

Wave 2 Gate: cortex_orchestrator.py runs stably for 1 hour with interrupt-driven classification. Haiku skip rate >50%.

---

Wave 3: Priority Matching + NUMU (Day 5-7)

| # | Task | Owner | Input | Output | Validation | Status |
|---|------|-------|-------|--------|------------|--------|
| 3.1 | Replace longest-idle-first matching with priority affinity matching | Claude | Compound Step 7 | Modified _match_task_to_pane (~60 lines) | Backlog task with project="OpenClawHub" is assigned to OpenClawHub pane, not random pane | TODO |
| 3.2 | Add starvation detector (30-min task age boost) | Claude | R6 mitigation | Added to matching logic (~10 lines) | Task pending >30 min gets priority boost in log output | TODO |
| 3.3 | Add sync.* message types to numu-bus schema.ts | Claude | Spec 4 | Modified schema.ts (~60 lines) | `bun build` succeeds; TypeBox validators compile | TODO |
| 3.4 | Add NUMU sync event emission to cortex_orchestrator.py | Claude | Compound Step 5 | Modified orchestrator (~40 lines) | sync.pulse events appear on NUMU bus (verify via numu-bus log) | TODO |
| 3.5 | Restart NUMU daemon with new schema | Mohamed | Task 3.3 | Restarted daemon | NUMU bus accepts and validates sync.* messages | TODO |
| 3.6 | Integration test: full sync flow from interrupt to NUMU event | Mohamed | Tasks 3.1-3.5 | Manual test | End a session, see interrupt -> drift -> sync.pulse -> NUMU event in logs | TODO |

Wave 3 Gate: Priority matching assigns tasks by affinity. sync.* events flow through NUMU. Full interrupt-to-sync pipeline works end-to-end.

---

Wave 4: Compass + Polish (Day 7-9)

| # | Task | Owner | Input | Output | Validation | Status |
|---|------|-------|-------|--------|------------|--------|
| 4.1 | Implement Compass periodic review (10-min Haiku fleet assessment) | Claude | Compound Step 8 | New _compass_review method (~60 lines) | Compass log output appears every 10 minutes with fleet assessment | TODO |
| 4.2 | Add Compass output validation (verify referenced TTYs/task_ids exist) | Claude | R7 mitigation | Validation logic (~20 lines) | Compass references to non-existent TTYs are flagged in log | TODO |
| 4.3 | Add Grafana dashboard panel for cognitive sync metrics | Claude | SyncMetrics Prometheus output | Grafana dashboard JSON | Dashboard shows haiku_skip_rate, sync_pulses, duplicates, drift_max | TODO |
| 4.4 | Add interrupt file cleanup cron (LaunchAgent, hourly) | Claude | R10 mitigation | LaunchAgent plist | `launchctl list \| grep interrupt-cleanup` returns loaded | |
| 4.5 | Update Nexus Portal /services page with sync metrics | Claude | SyncMetrics | Modified services page | Nexus shows cognitive sync status card | TODO |

Wave 4 Gate: Compass fires every 10 minutes with useful output. Grafana dashboard shows sync metrics. Cleanup cron is running.

---

Wave 5: Observation + Calibration (Day 9-14)

| # | Task | Owner | Input | Output | Validation | Status |
|---|------|-------|-------|--------|------------|--------|
| 5.1 | Run system for 5 days, collect SyncMetrics data | Mohamed | Running system | 5 days of Prometheus time series data | Data exists in Grafana for all sync metrics | TODO |
| 5.2 | Analyze Haiku skip rate and validate >60% | | | | | |
| 5.3 | Analyze drift threshold: is 0.8 correct? | Claude | Prometheus data + sync pulse frequency | Threshold recommendation | Sync pulse frequency is 1-3 per hour (not too frequent, not too rare) | TODO |
| 5.4 | Analyze duplicate detection false positive/negative rate | Claude | Duplicate logs + manual review | Quality report | False positive rate <20% | |
| 5.5 | Analyze Compass review quality (hallucination rate, actionability) | Claude | Compass logs | Quality report | Hallucination rate <10% | |
| 5.6 | Calibrate drift weights based on observed data | Claude | Analysis from 5.2-5.5 | Updated drift_meter.py constants | Recalibrated system runs for 2 more days with improved metrics | TODO |
| 5.7 | Decision gate: enable Compass auto-execution for "no_op" and "alert" actions? | Mohamed | Quality reports | Decision document | Explicit go/no-go for Phase 2 auto-execution | TODO |

Wave 5 Gate: 5 days of clean operation. Metrics validated. Drift threshold calibrated. Compass quality assessed.

---

Kill Criteria

1. If Haiku skip rate < 30%

2. If sync pulse frequency > 10/hour after Wave 3: Drift threshold is too low or drift weights are too high. Recalibrate before proceeding.

3. If cortex_orchestrator.py cycle time increases by >2x after Wave 2: The interrupt reading and fingerprinting overhead is too high. Profile and optimize or remove the fingerprinting (it is the most expensive new operation).

---

Summary Statistics

| Metric | Value |
|--------|-------|
| Total tasks | 33 |
| Waves | 6 (0-5) |
| New files | 2 (drift_meter.py, test_drift_meter.py) |
| Modified files | 4 (post_tool_hook.py, session_end_hook.py, cortex_orchestrator.py, numu-bus schema.ts) |
| New lines (estimated) | ~550 |
| Modified lines (estimated) | ~200 |
| New Supabase tables | 0 (all data is ephemeral in /tmp or existing Prometheus) |
| New NUMU message types | 3 (sync.drift_report, sync.pulse, sync.duplicate_detected) |
| New Prometheus metrics | 6 |
| Estimated implementation time | 9-14 days |
| Risk items audited | 10 (3 critical, 4 medium, 3 low) |

---

Architecture Diagram (Final State)

                           ┌───────────────────────┐
                           │    NUMU Bus (:7890)    │
                           │   + sync.* domain      │
                           └───────────┬───────────┘
                                       │
                    ┌──────────────────┼──────────────────┐
                    │                  │                   │
             ┌──────▼──────┐   ┌──────▼──────┐   ┌──────▼──────┐
             │ Mac1 Cortex │   │ Mac2 Cortex │   │ Mac3 Cortex │
             │ Orchestrator│   │ Orchestrator│   │ Orchestrator│
             └──────┬──────┘   └──────┬──────┘   └──────┬──────┘
                    │                  │                   │
         ┌──────────┼──────┐          │                   │
         │          │      │          │                   │
    ┌────▼──┐ ┌────▼──┐ ┌─▼───┐ ┌───▼───┐          ┌───▼───┐
    │Pane A │ │Pane B │ │... N│ │Pane M │          │Pane P │
    │       │ │       │ │     │ │       │          │       │
    │ Hooks │ │ Hooks │ │Hooks│ │ Hooks │          │ Hooks │
    └───┬───┘ └───┬───┘ └──┬──┘ └───┬───┘          └───┬───┘
        │         │        │        │                   │
        ▼         ▼        ▼        ▼                   ▼
    /tmp/pane_interrupts/{tty}/ (per-pane interrupt directory)
        │
        ▼
    ┌───────────────┐     ┌──────────────┐     ┌──────────────┐
    │ DriftFleet    │────>│ Content      │────>│ Priority     │
    │ (per-pane     │     │ Fingerprint  │     │ Affinity     │
    │  drift meters)│     │ (file paths) │     │ Matching     │
    └───────────────┘     └──────────────┘     └──────────────┘
        │                                           │
        │ threshold exceeded                        │ best (task, pane) pair
        ▼                                           ▼
    ┌───────────────┐                          ┌──────────────┐
    │ Sync Pulse    │                          │ Inject        │
    │ (event-driven)│                          │ (clipboard)   │
    └───────┬───────┘                          └──────────────┘
            │
            ▼
    ┌───────────────┐     ┌──────────────┐
    │ SyncMetrics   │────>│ Prometheus   │──> Grafana
    │ (counters)    │     │ :9131        │
    └───────────────┘     └──────────────┘
            │
            │ every 10 min
            ▼
    ┌───────────────┐
    │ Compass Review│
    │ (Haiku 4.5,   │
    │  advisory)    │
    └───────────────┘

---

Key Insight from This Evolution

The AgentOS paper's most valuable contribution is NOT the S-MMU (which requires internal state access we do not have) but the drift-triggered sync model. Replacing temporal polling with event-driven sync, where the events come from the agent's own hooks, bridges the gap between the theoretical AgentOS and our practical multi-process architecture. The interrupt-first classification (Step 2) is the foundation everything else depends on -- it makes the system both faster and cheaper simultaneously, which is rare. The content fingerprinting and duplicate detection (Step 4) are the highest-ROI semantic additions because they solve the concrete "two panes editing the same file" problem without requiring expensive embedding models.

Promotion Decision

Attach run IDs, datasets, metrics, and reproduction commands.

Source Anchor

evo-cube-output/agentos-cognitive-sync/stage3-expand-master-plan.md
