Intelligent Pane Orchestrator -- Stage 0: Research
# Intelligent Pane Orchestrator -- Stage 0: Research
Run: intelligent-pane-orchestrator
Generated: 2026-04-04
Method: Evolution Cube -- four-stage recursive evoflow (research-grounded)
Run Directory: Desktop/evo-cube-output/intelligent-pane-orchestrator/
Lineage: Succeeds intelligent-orchestration-v2 (2026-03-08). V2 built the 2-layer hook+daemon architecture and 7 patches. This evolution focuses on making the orchestrator predictive and self-healing rather than reactive.
---
## What Exists Today
### Pane Orchestrator Core (4 Python modules plus a thin facade, ~2,900 lines total)
| File | Lines | Purpose |
|---|---|---|
| `pane_orchestrator_core.py` | ~945 | Config, state, notifications (Telegram/iMessage), Cortex authority check, ProjectContextCache, run_cycle, run_daemon, main loop |
| `pane_orchestrator_invariants.py` | ~234 | State change detection, plateau detection, review gates, evolution candidacy, context exhaustion signals |
| `pane_orchestrator_drift.py` | ~689 | Output evaluation, Gemini Flash prompt composition, 5-pattern fallback (question/steps/recommendation/completion/generic), injection memory |
| `pane_orchestrator_sense.py` | ~927 | Context readers (CLAUDE.md, GSD, Pulse, Evolution, git, vault notes), deep context builder, 14 project-specific prompt templates, compose_prompt_for_pane |
| `pane_orchestrator_loop.py` | ~100 | Thin facade re-exporting all modules + CLI (daemon/once/status/self-inject/notify) |
### 5-Phase Heartbeat Cycle
1. SENSE: AppleScript pane discovery + `/tmp/pane_signals/` + `/tmp/pane_context/` + terminal content analysis (last 500 chars for shell prompts, Claude exit messages)
2. SELECT: Longest-idle pane + highest-priority pending backlog task
3. MUTATE: Inject via clipboard paste (pbcopy + keystroke "v" with command down)
4. CHECK: Bounded divergence invariant (max M injections per window)
5. ADAPT: Metabolism-based interval: `new_interval = MIN(30) + (MAX(300) - MIN(30)) * (working/total)`
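
The ADAPT formula above can be sketched as a small function. This is a minimal illustration, not the code in `pane_orchestrator_core.py`: the names `next_interval`, `MIN_INTERVAL_SEC`, and `MAX_INTERVAL_SEC` are hypothetical, and the behavior when no panes are found is an assumption.

```python
MIN_INTERVAL_SEC = 30   # MIN(30) from the formula
MAX_INTERVAL_SEC = 300  # MAX(300) from the formula


def next_interval(working: int, total: int) -> float:
    """Scale the heartbeat interval linearly with the fraction of working panes."""
    if total <= 0:
        # Assumption: with no panes discovered, poll at the fastest rate.
        return float(MIN_INTERVAL_SEC)
    # Clamp the ratio so a miscounted pane cannot push the interval out of range.
    ratio = min(max(working / total, 0.0), 1.0)
    return MIN_INTERVAL_SEC + (MAX_INTERVAL_SEC - MIN_INTERVAL_SEC) * ratio
```

With 2 of 4 panes working, the sketch yields 165 seconds: the more panes are busy, the less often the orchestrator needs to wake up.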
### 4 Non-Halting Invariants
1. Min Entropy Production: KL divergence with Laplace smoothing. Every cycle must produce novelty > epsilon.
2. Bounded Divergence: Injection rate capped per window. `MAX_INJECTIONS_PER_CYCLE = 2`.
3. Cross-Layer Forcing: If all panes idle for N cycles, escalate (force action).
4. No Absorbing States: Every state must have 2+ viable transitions.
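
As an illustration of the first invariant, the sketch below computes a KL divergence between two cycles' outputs using Laplace (add-one) smoothing. It assumes each cycle's output has already been reduced to a list of tokens. The function names, the tokenization, and the `epsilon` default are hypothetical choices, not values taken from the orchestrator.

```python
import math
from collections import Counter


def smoothed_distribution(tokens, vocabulary, alpha=1.0):
    """Laplace-smoothed probabilities over a shared vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocabulary)
    return {term: (counts[term] + alpha) / total for term in vocabulary}


def kl_divergence(current_tokens, previous_tokens, alpha=1.0):
    """KL(current || previous); smoothing keeps every probability non-zero."""
    vocabulary = set(current_tokens) | set(previous_tokens)
    if not vocabulary:
        return 0.0
    p = smoothed_distribution(current_tokens, vocabulary, alpha)
    q = smoothed_distribution(previous_tokens, vocabulary, alpha)
    return sum(p[t] * math.log(p[t] / q[t]) for t in vocabulary)


def produces_novelty(current_tokens, previous_tokens, epsilon=0.01):
    """True when the cycle's output differs enough from the previous one."""
    # epsilon is an assumed placeholder; the real threshold is not stated here.
    return kl_divergence(current_tokens, previous_tokens) > epsilon
```

Smoothing matters because a token seen in only one cycle would otherwise give a zero probability and an undefined logarithm.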
### Signal Detection (6 Categories)
| Signal | Threshold | Action |
|---|---|---|
| Idle | > 15 min no activity | Evaluate + compose + inject |
| Stuck | > 30 min no activity | Evaluate + inject + notify (iMessage) |
| Plateau | Same output hash for 6 cycles | Force approach pivot |
| Review gate | 5+ commits, 0 with "review" | Inject /meta-review |
| Evolution candidate | 30+ min stuck + no output change in 3 cycles | Pivot or skip task |
| Context exhaustion | 4+ hours runtime | Handoff + fresh session |
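
The thresholds in the table can be read as a single classification pass over a pane snapshot. The sketch below is one possible reading, not the logic in `pane_orchestrator_invariants.py`. `PaneSnapshot`, its fields, and the signal names are hypothetical, and treating "stuck" as superseding "idle" is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class PaneSnapshot:
    idle_minutes: float
    runtime_hours: float
    output_hashes: list = field(default_factory=list)    # one per cycle, newest last
    commit_messages: list = field(default_factory=list)  # recent commits


def _unchanged_for(hashes, cycles):
    """True when the last `cycles` output hashes exist and are identical."""
    recent = hashes[-cycles:]
    return len(recent) == cycles and len(set(recent)) == 1


def detect_signals(pane):
    signals = []
    if pane.idle_minutes > 30:
        signals.append("stuck")  # assumption: stuck replaces idle
    elif pane.idle_minutes > 15:
        signals.append("idle")
    if _unchanged_for(pane.output_hashes, 6):
        signals.append("plateau")
    commits = pane.commit_messages
    if len(commits) >= 5 and not any("review" in m.lower() for m in commits):
        signals.append("review_gate")
    if pane.idle_minutes >= 30 and _unchanged_for(pane.output_hashes, 3):
        signals.append("evolution_candidate")
    if pane.runtime_hours >= 4:
        signals.append("context_exhaustion")
    return signals
```

Returning a list rather than a single label lets one pane raise several signals in the same cycle, for example both a plateau and a review gate.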
### Prompt Composition (2-Tier)
Tier 1 -- Gemini Flash AI Composition: Sends last 40 lines of terminal output + project context + recent injection history to Gemini 2.0 Flash. Gets back a targeted 2-4 sentence prompt. Temperature 0.3, max 300 tokens, 8s timeout.
Tier 2 -- Pattern Matching Fallback: Five regex-based patterns, checked in order: (1) direct questions at the tail of the output, (2) numbered or bulleted next steps, (3) recommendations, (4) completion signals, (5) a generic prompt when nothing else matches.
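
A minimal sketch of the Tier 2 fallback follows. The regular expressions are illustrative guesses at what each category might match, not the patterns in `pane_orchestrator_drift.py`. `FALLBACK_PATTERNS` and `classify_tail` are hypothetical names.

```python
import re

# Checked in order; the first match wins. All patterns are assumptions.
FALLBACK_PATTERNS = [
    ("question", re.compile(r"\?\s*$")),
    ("steps", re.compile(r"^\s*(?:\d+[.)]|[-*])\s+\S", re.MULTILINE)),
    ("recommendation", re.compile(r"\b(?:recommend|suggest|should consider)", re.IGNORECASE)),
    ("completion", re.compile(r"\b(?:done|completed?|finished|all tests pass(?:ed)?)\b", re.IGNORECASE)),
]


def classify_tail(terminal_tail: str) -> str:
    """Return the first matching category, or 'generic' when none match."""
    text = terminal_tail.strip()
    for name, pattern in FALLBACK_PATTERNS:
        if pattern.search(text):
            return name
    return "generic"
```

For example, `classify_tail("Next steps:\n1. Add tests\n2. Deploy")` returns `"steps"`, while a bare shell prompt falls through to `"generic"`.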
### Response Evaluation Engine (`evaluate_response`)
Recognizes 20+ signal categories, including build success/failure, test pass/fail, commit/push, deployment, awaiting input, plan mode, context limit, runtime error, task complete, pulse iteration, permission denied, context exhaustion, AskUserQuestion, and tool activity. Each signal maps to a status, a suggested_action, and a context_for_next string.
### Infrastructure Dependencies
| System | Port/Path | Role in Orchestration |
|---|---|---|
| meshd | :9451 (Mac1-5) | HTTP inject/stream/read for agent slots. Orchestrator could use POST /inject/{slot} |
| NATS | :4222 (Mac1) | JetStream for mesh events. 5 streams. Leaf nodes :7422 for Mac2-5 |
| OPA | :8181 (Mac1) | Policy engine for violation/memory guards. <5ms eval |
| Supabase | aaqbofotpchgpyuohmmz | agent_slots table, app_evolution_states, mesh_events |
| ewd (Rust) | :8300 | Evolution World daemon. 5-phase heartbeat, meshd dispatch |
| Cortex | state file | Behavioral intelligence. Cortex authority TTL (180s) suppresses orchestrator injections when Cortex is active |
| NUMU | :7890 | WebSocket event bus (legacy, being replaced by NATS) |
| Telegram gateway | GCP Cloud Run | Cycle notifications |
| iMessage | AppleScript | Urgent/important notifications |
### LaunchAgent
- Label: `com.openclaw.pane-orchestrator`
- KeepAlive: true, ThrottleInterval: 30s
- Logs: `/tmp/pane-orchestrator.log`, `/tmp/pane-orchestrator-error.log`
- State: `/tmp/orchestrator_state.json`
---
## What Works Well
1. Cortex authority handoff. When Cortex is actively managing panes (heartbeat < 180s), the orchestrator suppresses its own injections. Clean authority delegation with no coordination protocol needed beyond a state file timestamp.
2. Injection memory prevents loops. `injection_memory.json` stores the last 200 injections per pane. The 10 most recent are passed to Gemini Flash so that it avoids repeating the same prompt, and the pattern-matching fallback also checks against this history.
3. Project context caching. `ProjectContextCache` with 5-minute TTL aggregates CLAUDE.md, GSD state, Pulse anchor, evolution state, git status, active tasks, changed files, known blockers. Avoids redundant filesystem reads across cycle evaluations.
4. Multi-signal idle detection. Four sources: signal files, context timestamps, terminal content analysis, and terminal busy flag. Each is unreliable alone but the union covers most cases.
5. Pane protection. Protection is both list-based (`[home-path]`) and lock-file-based (`[home-path]`), so users can shield specific panes from auto-injection.
6. Per-pane cooldown. 600s (10 min) between injections per pane. Prevents rapid-fire injection storms.
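
Several of the safeguards above (pane protection, Cortex authority, per-pane cooldown, and the per-cycle injection cap) combine naturally into a single gate. The sketch below is an assumed composition, not the orchestrator's actual control flow. `may_inject` and `PANE_COOLDOWN_SEC` are hypothetical names, and the inbox-wake exception to Cortex authority is deliberately left out.

```python
CORTEX_AUTHORITY_TTL_SEC = 180  # Cortex heartbeat freshness window
PANE_COOLDOWN_SEC = 600         # 10 minutes between injections per pane
MAX_INJECTIONS_PER_CYCLE = 2


def may_inject(now, cortex_heartbeat_ts, last_injection_ts,
               injections_this_cycle, protected=False):
    """Return (allowed, reason) for one pane. All timestamps are epoch seconds."""
    if protected:
        return False, "pane is protected"
    if cortex_heartbeat_ts is not None and now - cortex_heartbeat_ts < CORTEX_AUTHORITY_TTL_SEC:
        return False, "cortex authority active"
    if last_injection_ts is not None and now - last_injection_ts < PANE_COOLDOWN_SEC:
        return False, "pane cooling down"
    if injections_this_cycle >= MAX_INJECTIONS_PER_CYCLE:
        return False, "cycle injection cap reached"
    return True, "ok"
```

Returning the reason alongside the decision makes suppressed injections easy to log, which helps when diagnosing why a pane stayed idle.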
---
## What's Broken or Missing
### Critical Gaps
1. No work prediction. The orchestrator is purely reactive. It waits for panes to go idle, then figures out what to inject. It never anticipates what work should go where before a pane finishes. There is no model of "what will be needed next."
2. No domain-aware routing. All idle panes are treated identically. Mac2 (TD/motion domain) gets the same prompt composition as Mac4 (Adobe domain). The orchestrator has no concept that certain work should go to certain machines based on installed tools, capabilities, or running services.
3. No crash recovery. If a pane crashes (Claude segfault, SSH timeout, Terminal.app hang), the orchestrator just sees it as "idle" after the threshold. It doesn't detect the crash, doesn't restart the session, doesn't recover the work context. The pane sits dead until someone notices.
4. No cross-machine orchestration. The orchestrator runs on Mac1 only. Mac2-5 panes are visible via `agent_slots` in Supabase, but the orchestrator only manages local Terminal.app panes via AppleScript. SSH+tmux injection is separate (`remote_injector.py`), and it's not integrated into the cycle.
5. No learning from outcomes. The injection memory records what was injected but never whether the injection succeeded (did the pane actually start working? did the work produce good output?). There is no feedback loop that improves injection quality over time.
### Architectural Friction
6. Gemini Flash as single prompt composer. If Gemini is down or the API key expires, prompt composition falls through to regex patterns that are generic. There is no local fallback that understands project context.
7. State file as single truth. `/tmp/orchestrator_state.json` is the only state persistence. If the file is corrupted or deleted (macOS clears `/tmp/` on reboot), all history is lost. Supabase sync was part of the V2 plan but was never wired up.
8. Hardcoded thresholds. Idle=15min, Stuck=30min, Cooldown=600s, MaxInjections=2. These are constants. Different projects have different cadences. A fast iteration loop (15-second Pulse cycles) needs different thresholds than a slow evolution (30-minute Evo3 stages).
9. **~30
10. No priority queue. Idle panes are processed purely in order of idle duration. There is no concept of urgent vs. routine work: a pane that just crashed while running a critical build gets the same priority as a pane that finished a low-priority docs task.
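
To make the missing priority queue concrete, here is one possible shape built on the standard-library `heapq`. It is a sketch under assumptions: the `URGENCY` ranking, the `PaneQueue` class, and the ordering (pane state first, then task priority, then idle time) are all hypothetical design choices, not part of the current orchestrator.

```python
import heapq
import itertools

# Lower number = handled first. The ranking itself is an assumption.
URGENCY = {"crashed": 0, "stuck": 1, "idle": 2}


class PaneQueue:
    """Orders panes by state urgency, then task priority, then longest idle."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def push(self, pane_id, state, task_priority, idle_seconds):
        key = (
            URGENCY.get(state, len(URGENCY)),  # unknown states sort last
            task_priority,                     # lower value = more important
            -idle_seconds,                     # longer idle sorts earlier
            next(self._counter),
        )
        heapq.heappush(self._heap, (key, pane_id))

    def pop(self):
        _, pane_id = heapq.heappop(self._heap)
        return pane_id

    def __len__(self):
        return len(self._heap)
```

Under this ordering, a pane that crashed 30 seconds ago during a critical build is served before a pane that has been idle for an hour after a docs task.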
---
## Constraints and Invariants
| Constraint | Source | Impact |
|---|---|---|
| AppleScript injection is unreliable | macOS + Terminal.app | Must have retry mechanism or alternate injection path |
| Cortex authority is time-based | CORTEX_AUTHORITY_TTL_SEC=180s | Cannot inject while Cortex is fresh, except inbox wakes |
| 4 panes per machine (meshd slots) | meshd.toml | Bounded concurrency, no dynamic scaling |
| NATS leaf node latency ~20-50ms | Tailscale networking | Cross-machine events are near-real-time but not instant |
| Supabase round-trip ~100-300ms | Cloud database | Cannot be in the hot path of injection decisions |
| Gemini Flash timeout 8s | API call | Must have sub-8s fallback for prompt composition |
| Terminal.app busy flag unreliable | macOS Terminal | Cannot trust `busy` as sole idle indicator |
| OPA policy eval <5ms | Local HTTP | Can be in hot path for injection validation |
| macOS Security (SIP) | System Integrity | Cannot modify Terminal.app internals, must work via AppleScript or accessibility |
---
## Infrastructure Available But Not Used by Orchestrator
1. meshd HTTP inject (:9451 POST /inject/{slot}). Direct HTTP injection to agent slots. Bypasses AppleScript entirely. More reliable than clipboard paste. Orchestrator currently doesn't use it.
2. NATS JetStream events. 5 durable streams with replay. Orchestrator currently doesn't publish cycle results or subscribe to mesh events via NATS. Still uses Telegram for notifications and file-based signals.
3. OPA policy engine (:8181). Could validate injection decisions (should this task go to this pane? does this injection violate any policy?). Currently only used by violation-guard and memory-guardian hooks.
4. ewd Rust daemon (:8300). Evolution World's 5-phase heartbeat with meshd dispatch and rate-limit-aware slot rotation. Could coordinate with orchestrator for evolution-related work dispatch.
5. KARL trajectory intelligence. 121+ trajectories (72 of them skill-labeled) across 11 domains, with a 5-signal reward. Could be used to tune injection prompts based on what historically produced the best outcomes.
6. Supabase agent_slots. Real-time view of all 5 machines' pane states. Updated by meshd heartbeat every 60s. The orchestrator reads it but doesn't write back its own decisions.
7. bridged (:9462 OSC, :9463 WS, Supabase adapter). Unified bridge daemon with topic routing. Could carry orchestrator events alongside OSC/WS traffic.
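
As a rough illustration of the first item, the sketch below builds a `POST /inject/{slot}` request for meshd using only the standard library. The port and path come from the list above. The JSON payload shape (`{"text": ...}`), the host name in the usage note, and the function names are assumptions, since the meshd request format is not documented here.

```python
import json
import urllib.parse
import urllib.request

MESHD_PORT = 9451


def build_inject_request(host, slot, prompt):
    """Build (but do not send) a meshd injection request."""
    safe_slot = urllib.parse.quote(str(slot), safe="")
    url = f"http://{host}:{MESHD_PORT}/inject/{safe_slot}"
    # Assumed payload shape; adjust to whatever meshd actually expects.
    body = json.dumps({"text": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def send_inject(request, timeout=5.0):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A call such as `send_inject(build_inject_request("mac2.local", 3, "continue"))` would replace the clipboard-paste path for that slot. Keeping request construction separate from sending makes the request easy to inspect or validate (for example against OPA) before it goes out.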
---
## Prior Evolution History
### V1 (intelligent-orchestration, 2026-03-07)
- 5 divergent paths (A through E). Built the basic HEF daemon with 6-step decision engine.
- Key contribution: NUMU event integration, signal watcher, intent DAG, prompt composer.
### V2 (intelligent-orchestration-v2, 2026-03-08)
- 6 divergent paths (A through F). Compounded into 2-layer architecture (Hooks + Daemon).
- Key contribution: 3 new hooks (orchestration_detector, context_budget, handoff_writer), Evo3-to-GSD bridge, auto-advance, conductor unification.
- Status: Hooks built and deployed. Daemon patches partially applied. Cross-pane discovery and Supabase sync still pending.
### What This Evolution Must Do Differently
V1 and V2 focused on making the orchestrator work. This evolution focuses on making it intelligent:
- Predict what work is coming and pre-stage it
- Route work to the right machine/pane based on domain capabilities
- Detect and recover from crashes without human intervention
- Learn from injection outcomes to improve over time
- Operate across all 5 machines, not just Mac1
- Adapt thresholds dynamically based on project cadence
The gap between "functional orchestrator" and "intelligent orchestrator" is the gap between a cron job and a nervous system.
---
## Key Metrics to Track
| Metric | Current | Target |
|---|---|---|
| Injection success rate | ~70% | |
| Time-to-injection after idle | 15 min (threshold) | <5 min (predicted) |
| Cross-machine coverage | Mac1 only | All 5 machines |
| Crash detection time | Manual (hours) | <60s automated |
| Prompt relevance | Gemini-dependent | Multi-model with outcome feedback |
| Work routing accuracy | None (random idle pane) | >80% |
| State durability | /tmp/ (lost on reboot) | Supabase + local fallback |
Promotion Decision
Attach run IDs, datasets, metrics, and reproduction commands.
Source Anchor
evo-cube-output/intelligent-pane-orchestrator/stage0-research.md