# Intelligent Pane Orchestrator -- Stage 0: Research
Run: intelligent-pane-orchestrator
Generated: 2026-04-04
Method: Evolution Cube -- four-stage recursive evoflow (research-grounded)
Run Directory: Desktop/evo-cube-output/intelligent-pane-orchestrator/
Lineage: Succeeds intelligent-orchestration-v2 (2026-03-08). V2 built the 2-layer hook+daemon architecture and 7 patches. This evolution focuses on making the orchestrator predictive and self-healing rather than reactive.

---

## What Exists Today

### Pane Orchestrator Core (4 Python modules plus a thin facade, ~2,900 lines total)

| File | Lines | Purpose |
|------|-------|---------|
| `pane_orchestrator_core.py` | ~945 | Config, state, notifications (Telegram/iMessage), Cortex authority check, ProjectContextCache, run_cycle, run_daemon, main loop |
| `pane_orchestrator_invariants.py` | ~234 | State change detection, plateau detection, review gates, evolution candidacy, context exhaustion signals |
| `pane_orchestrator_drift.py` | ~689 | Output evaluation, Gemini Flash prompt composition, 5-pattern fallback (question/steps/recommendation/completion/generic), injection memory |
| `pane_orchestrator_sense.py` | ~927 | Context readers (CLAUDE.md, GSD, Pulse, Evolution, git, vault notes), deep context builder, 14 project-specific prompt templates, compose_prompt_for_pane |
| `pane_orchestrator_loop.py` | ~100 | Thin facade re-exporting all modules + CLI (daemon/once/status/self-inject/notify) |

### 5-Phase Heartbeat Cycle

1. SENSE: AppleScript pane discovery + `/tmp/pane_signals/` + `/tmp/pane_context/` + terminal content analysis (last 500 chars for shell prompts, Claude exit messages)
2. SELECT: Longest-idle pane + highest-priority pending backlog task
3. MUTATE: Inject via clipboard paste (pbcopy + keystroke "v" with command down)
4. CHECK: Bounded divergence invariant (max M injections per window)
5. ADAPT: Metabolism-based interval: `new_interval = MIN(30) + (MAX(300) - MIN(30)) * (working/total)`
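The ADAPT formula can be sketched as a small function. This is a minimal illustration, not the module's actual code: the function name, the clamping of the ratio, and the choice to fall back to the maximum interval when no panes are known are all assumptions.

```python
MIN_INTERVAL_SEC = 30   # from the ADAPT formula
MAX_INTERVAL_SEC = 300  # from the ADAPT formula


def next_interval(working: int, total: int) -> float:
    """Scale the heartbeat interval linearly with the share of working panes."""
    if total <= 0:
        # Assumption: with no known panes there is nothing to poll, so back off.
        return float(MAX_INTERVAL_SEC)
    # Clamp defensively in case the counts are momentarily inconsistent.
    ratio = min(max(working / total, 0.0), 1.0)
    return MIN_INTERVAL_SEC + (MAX_INTERVAL_SEC - MIN_INTERVAL_SEC) * ratio
```

With all panes idle the cycle runs every 30 s; with all panes working it relaxes to 300 s.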

### 4 Non-Halting Invariants

1. Min Entropy Production: KL divergence with Laplace smoothing. Every cycle must produce novelty > epsilon.
2. Bounded Divergence: Injection rate capped per window. `MAX_INJECTIONS_PER_CYCLE = 2`.
3. Cross-Layer Forcing: If all panes idle for N cycles, escalate (force action).
4. No Absorbing States: Every state must have 2+ viable transitions.
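Invariant 1 can be illustrated with a short sketch of KL divergence over Laplace-smoothed event distributions. The event labels, the `epsilon` default, and all function names here are assumptions made for illustration; the source does not say what the orchestrator actually measures entropy over.

```python
import math
from collections import Counter


def smoothed_distribution(events, vocabulary, alpha=1.0):
    """Laplace (add-alpha) smoothing so that no symbol has zero probability."""
    counts = Counter(events)
    total = len(events) + alpha * len(vocabulary)
    return {symbol: (counts[symbol] + alpha) / total for symbol in vocabulary}


def kl_divergence(p, q):
    """KL(p || q) in nats; both distributions must share the same support."""
    return sum(p[s] * math.log(p[s] / q[s]) for s in p)


def produces_novelty(prev_events, curr_events, epsilon=0.01):
    """True if this cycle's events diverge from the previous cycle's by more than epsilon."""
    vocabulary = sorted(set(prev_events) | set(curr_events))
    if not vocabulary:
        return False
    p = smoothed_distribution(curr_events, vocabulary)
    q = smoothed_distribution(prev_events, vocabulary)
    return kl_divergence(p, q) > epsilon
```

Smoothing matters because an event type that first appears in the current cycle would otherwise give the previous distribution a zero entry and make the divergence undefined.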

### Signal Detection (6 Categories)

| Signal | Threshold | Action |
|--------|-----------|--------|
| Idle | > 15 min no activity | Evaluate + compose + inject |
| Stuck | > 30 min no activity | Evaluate + inject + notify (iMessage) |
| Plateau | Same output hash for 6 cycles | Force approach pivot |
| Review gate | 5+ commits, 0 with "review" | Inject /meta-review |
| Evolution candidate | 30+ min stuck + no output change in 3 cycles | Pivot or skip task |
| Context exhaustion | 4+ hours runtime | Handoff + fresh session |
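The thresholds above can be expressed as one classification function. This is a sketch under assumptions: the `PaneObservation` fields, the signal names, and the rule that "stuck" supersedes "idle" are hypothetical and not taken from `pane_orchestrator_invariants.py`.

```python
from dataclasses import dataclass


@dataclass
class PaneObservation:
    idle_minutes: float = 0.0
    same_hash_cycles: int = 0   # consecutive cycles with an identical output hash
    recent_commits: int = 0
    review_commits: int = 0     # recent commits whose message mentions "review"
    runtime_hours: float = 0.0


def detect_signals(obs):
    """Return every signal from the table whose threshold is met."""
    signals = []
    if obs.idle_minutes > 30:
        signals.append("stuck")          # assumption: stuck supersedes idle
    elif obs.idle_minutes > 15:
        signals.append("idle")
    if obs.same_hash_cycles >= 6:
        signals.append("plateau")
    if obs.recent_commits >= 5 and obs.review_commits == 0:
        signals.append("review_gate")
    if obs.idle_minutes >= 30 and obs.same_hash_cycles >= 3:
        signals.append("evolution_candidate")
    if obs.runtime_hours >= 4:
        signals.append("context_exhaustion")
    return signals
```

Returning a list rather than a single label keeps overlapping conditions visible, for example a pane that is both stuck and plateaued.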

### Prompt Composition (2-Tier)

Tier 1 -- Gemini Flash AI Composition: Sends last 40 lines of terminal output + project context + recent injection history to Gemini 2.0 Flash. Gets back a targeted 2-4 sentence prompt. Temperature 0.3, max 300 tokens, 8s timeout.

Tier 2 -- Pattern Matching Fallback: 5 regex-based patterns: (1) direct questions at tail, (2) numbered/bulleted next steps, (3) recommendations, (4) completion signals, (5) no match.
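A rough sketch of the Tier 2 fallback follows. The regular expressions, the first-match-wins ordering, and the 40-line tail window are illustrative assumptions; the real patterns in `pane_orchestrator_drift.py` are not shown in this note.

```python
import re

# Hypothetical patterns, checked in the order the five categories are listed.
PATTERNS = [
    ("question", re.compile(r"\?\s*$")),  # question mark at the very end of the tail
    ("steps", re.compile(r"^\s*(?:\d+[.)]|[-*])\s+\S", re.MULTILINE)),
    ("recommendation", re.compile(r"\b(?:recommend|suggest)", re.IGNORECASE)),
    ("completion", re.compile(r"\b(?:done|completed?|finished)\b", re.IGNORECASE)),
]


def classify_tail(output: str, tail_lines: int = 40) -> str:
    """Return the first matching pattern name, or 'generic' when nothing matches."""
    tail = "\n".join(output.splitlines()[-tail_lines:]).rstrip()
    for name, pattern in PATTERNS:
        if pattern.search(tail):
            return name
    return "generic"
```

The "generic" return value corresponds to pattern (5), the no-match case.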

### Response Evaluation Engine (evaluate_response)

20+ signal categories: build success/failure, test pass/fail, commit/push, deployment, awaiting input, plan mode, context limit, runtime error, task complete, pulse iteration, permission denied, context exhaustion, AskUserQuestion, tool activity. Each signal maps to a status + suggested_action + context_for_next string.

### Infrastructure Dependencies

| System | Port/Path | Role in Orchestration |
|--------|-----------|-----------------------|
| meshd | :9451 (Mac1-5) | HTTP inject/stream/read for agent slots. Orchestrator could use POST /inject/{slot} |
| NATS | :4222 (Mac1) | JetStream for mesh events. 5 streams. Leaf nodes :7422 for Mac2-5 |
| OPA | :8181 (Mac1) | Policy engine for violation/memory guards. <5ms eval |
| Supabase | aaqbofotpchgpyuohmmz | agent_slots table, app_evolution_states, mesh_events |
| ewd (Rust) | :8300 | Evolution World daemon. 5-phase heartbeat, meshd dispatch |
| Cortex | state file | Behavioral intelligence. Cortex authority TTL (180s) suppresses orchestrator injections when Cortex is active |
| NUMU | :7890 | WebSocket event bus (legacy, being replaced by NATS) |
| Telegram gateway | GCP Cloud Run | Cycle notifications |
| iMessage | AppleScript | Urgent/important notifications |

### LaunchAgent

Label: `com.openclaw.pane-orchestrator`
KeepAlive: true, ThrottleInterval: 30s
Logs: `/tmp/pane-orchestrator.log`, `/tmp/pane-orchestrator-error.log`
State: `/tmp/orchestrator_state.json`
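For orientation, the settings above could be rendered into a launchd property list with Python's `plistlib`. This is a sketch only: the `ProgramArguments` value is hypothetical (the actual interpreter and script path are not given here), and mapping the two log files to `StandardOutPath` and `StandardErrorPath` is an assumption.

```python
import plistlib

agent = {
    "Label": "com.openclaw.pane-orchestrator",
    # Hypothetical command line; the real script location is not stated in this note.
    "ProgramArguments": ["/usr/bin/python3", "pane_orchestrator_loop.py", "daemon"],
    "KeepAlive": True,
    "ThrottleInterval": 30,  # seconds launchd waits before respawning
    # Assumption: stdout and stderr map to the two log files listed above.
    "StandardOutPath": "/tmp/pane-orchestrator.log",
    "StandardErrorPath": "/tmp/pane-orchestrator-error.log",
}

# Serialize to the XML plist format that launchd reads.
plist_xml = plistlib.dumps(agent).decode("utf-8")
```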

---

## What Works Well

1. Cortex authority handoff. When Cortex is actively managing panes (last heartbeat less than 180s old), the orchestrator suppresses its own injections. Clean authority delegation with no coordination protocol needed beyond a state file timestamp.

2. Injection memory prevents loops. `injection_memory.json` stores last 200 injections per pane. Recent 10 are passed to Gemini Flash to avoid repeating the same prompt. Pattern matching also checks against history.

3. Project context caching. `ProjectContextCache` with 5-minute TTL aggregates CLAUDE.md, GSD state, Pulse anchor, evolution state, git status, active tasks, changed files, known blockers. Avoids redundant filesystem reads across cycle evaluations.

4. Multi-signal idle detection. Four sources: signal files, context timestamps, terminal content analysis, and terminal busy flag. Each is unreliable alone but the union covers most cases.

5. Pane protection. Supported both via a list (`[home-path]`) and via lock files (`[home-path]`), so users can protect specific panes from auto-injection.

6. Per-pane cooldown. 600s (10 min) between injections per pane. Prevents rapid-fire injection storms.
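Items 1, 5, and 6 combine into a single pre-injection gate, sketched below. The function signature, the reason strings, and the check order are assumptions; the inbox-wake exception to Cortex authority is deliberately not modelled.

```python
CORTEX_AUTHORITY_TTL_SEC = 180  # Cortex heartbeat freshness window
PANE_COOLDOWN_SEC = 600         # minimum gap between injections per pane


def may_inject(pane_id, now, cortex_heartbeat_ts, last_injection_ts, protected_panes):
    """Return (allowed, reason) for a candidate injection.

    cortex_heartbeat_ts: timestamp from the Cortex state file, or None if absent.
    last_injection_ts:   dict mapping pane_id to the time of its last injection.
    protected_panes:     set of pane ids the user has excluded.
    """
    if cortex_heartbeat_ts is not None and now - cortex_heartbeat_ts < CORTEX_AUTHORITY_TTL_SEC:
        return False, "cortex_authority"
    if pane_id in protected_panes:
        return False, "protected"
    last = last_injection_ts.get(pane_id)
    if last is not None and now - last < PANE_COOLDOWN_SEC:
        return False, "cooldown"
    return True, "ok"
```

Passing `now` in explicitly, rather than calling `time.time()` inside, keeps the gate deterministic and easy to test.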

---

## What's Broken or Missing

### Critical Gaps

1. No work prediction. The orchestrator is purely reactive. It waits for panes to go idle, then figures out what to inject. It never anticipates what work should go where before a pane finishes. There is no model of "what will be needed next."

2. No domain-aware routing. All idle panes are treated identically. Mac2 (TD/motion domain) gets the same prompt composition as Mac4 (Adobe domain). The orchestrator has no concept that certain work should go to certain machines based on installed tools, capabilities, or running services.

3. No crash recovery. If a pane crashes (Claude segfault, SSH timeout, Terminal.app hang), the orchestrator just sees it as "idle" after the threshold. It doesn't detect the crash, doesn't restart the session, doesn't recover the work context. The pane sits dead until someone notices.

4. No cross-machine orchestration. The orchestrator runs on Mac1 only. Mac2-5 panes are visible via `agent_slots` in Supabase, but the orchestrator only manages local Terminal.app panes via AppleScript. SSH+tmux injection is separate (`remote_injector.py`), and it's not integrated into the cycle.

5. No learning from outcomes. The injection memory records what was injected but never whether the injection succeeded (did the pane actually start working? did the work produce good output?). There is no feedback loop that improves injection quality over time.
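Gap 5 suggests pairing every injection with an observed outcome. One possible shape for that record is sketched below; the field names, the "resumed"/"no_effect" labels, and the use of an output-hash change as the success criterion are all assumptions, not existing behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InjectionRecord:
    pane_id: str
    prompt: str
    injected_at: float
    outcome: Optional[str] = None  # filled in on a later cycle


def score_outcome(record, hash_before, hash_after):
    """Assumption: a changed output hash means the pane resumed work."""
    record.outcome = "resumed" if hash_after != hash_before else "no_effect"


def success_rate(records):
    """Share of scored injections that resumed work, or None if none are scored yet."""
    scored = [r for r in records if r.outcome is not None]
    if not scored:
        return None
    return sum(r.outcome == "resumed" for r in scored) / len(scored)
```

A hash change only shows that the pane produced output, not that the output was good, so a real feedback loop would need a richer signal than this.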

### Architectural Friction

6. Gemini Flash as single prompt composer. If Gemini is down or the API key expires, prompt composition falls through to regex patterns that are generic. There is no local fallback that understands project context.

7. State file as single truth. `/tmp/orchestrator_state.json` is the only state persistence. If the file is corrupted or deleted (and `/tmp/` is cleared on reboot), all history is lost. Supabase sync exists in v2's plan but was never wired.

8. Hardcoded thresholds. Idle=15min, Stuck=30min, Cooldown=600s, MaxInjections=2. These are constants. Different projects have different cadences. A fast iteration loop (15-second Pulse cycles) needs different thresholds than a slow evolution (30-minute Evo3 stages).

9. **~30

10. No priority queue. All idle panes are processed in order of idle duration. There is no concept of urgent vs. routine work. A pane that just crashed running a critical build gets the same priority as a pane that finished a low-priority docs task.
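To make item 10 concrete, here is a minimal priority queue built on `heapq` that orders panes by urgency first and idle duration second. The urgency levels and the class name are hypothetical; nothing like this exists in the current modules.

```python
import heapq
import itertools

# Hypothetical urgency ranking: lower number is served first.
URGENCY = {"crash": 0, "stuck": 1, "idle": 2}


class PaneQueue:
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker that preserves insertion order

    def push(self, pane_id, signal, idle_seconds):
        # Tuples sort by urgency, then longest idle (negated), then insertion order.
        entry = (URGENCY[signal], -idle_seconds, next(self._counter), pane_id)
        heapq.heappush(self._heap, entry)

    def pop(self):
        return heapq.heappop(self._heap)[-1]

    def __len__(self):
        return len(self._heap)
```

With this ordering a freshly crashed build pane is served before a docs pane that has been idle for an hour.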

---

## Constraints and Invariants

| Constraint | Source | Impact |
|------------|--------|--------|
| AppleScript injection is unreliable | macOS + Terminal.app | Must have retry mechanism or alternate injection path |
| Cortex authority is time-based | CORTEX_AUTHORITY_TTL_SEC=180s | Cannot inject while Cortex is fresh, except inbox wakes |
| 4 panes per machine (meshd slots) | meshd.toml | Bounded concurrency, no dynamic scaling |
| NATS leaf node latency ~20-50ms | Tailscale networking | Cross-machine events are near-real-time but not instant |
| Supabase round-trip ~100-300ms | Cloud database | Cannot be in the hot path of injection decisions |
| Gemini Flash timeout 8s | API call | Must have sub-8s fallback for prompt composition |
| Terminal.app busy flag unreliable | macOS Terminal | Cannot trust `busy` as sole idle indicator |
| OPA policy eval <5ms | Local HTTP | Can be in hot path for injection validation |
| macOS Security (SIP) | System Integrity | Cannot modify Terminal.app internals, must work via AppleScript or accessibility |
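The Gemini timeout constraint implies a bounded wait followed by a local fallback. A minimal sketch using `concurrent.futures` is shown below; `primary` and `fallback` are hypothetical callables standing in for the Tier 1 and Tier 2 composers, and the function name is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def compose_with_fallback(primary, fallback, terminal_tail, timeout_sec=8.0):
    """Try the primary composer within timeout_sec; otherwise use the fallback.

    Returns (prompt, tier) where tier is "tier1" or "tier2".
    """
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(primary, terminal_tail)
    try:
        return future.result(timeout=timeout_sec), "tier1"
    except Exception:
        # Covers both a timeout and any error raised by the primary composer.
        return fallback(terminal_tail), "tier2"
    finally:
        # Do not block on a slow primary call; let its thread finish on its own.
        executor.shutdown(wait=False)
```

The `shutdown(wait=False)` call is the important detail: leaving the executor through a `with` block would wait for the slow call to finish and defeat the timeout.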

---

## Infrastructure Available But Not Used by Orchestrator

1. meshd HTTP inject (:9451 POST /inject/{slot}). Direct HTTP injection to agent slots. Bypasses AppleScript entirely. More reliable than clipboard paste. Orchestrator currently doesn't use it.

2. NATS JetStream events. 5 durable streams with replay. Orchestrator currently doesn't publish cycle results or subscribe to mesh events via NATS. Still uses Telegram for notifications and file-based signals.

3. OPA policy engine (:8181). Could validate injection decisions (should this task go to this pane? does this injection violate any policy?). Currently only used by violation-guard and memory-guardian hooks.

4. ewd Rust daemon (:8300). Evolution World's 5-phase heartbeat with meshd dispatch and rate-limit-aware slot rotation. Could coordinate with orchestrator for evolution-related work dispatch.

5. KARL trajectory intelligence. 121+ trajectories (72 skill-labeled) across 11 domains, with a 5-signal reward. Could train injection prompts based on what historically produced the best outcomes.

6. Supabase agent_slots. Real-time view of all 5 machines' pane states. Updated by meshd heartbeat every 60s. The orchestrator reads it but doesn't write back its own decisions.

7. bridged (:9462 OSC, :9463 WS, Supabase adapter). Unified bridge daemon with topic routing. Could carry orchestrator events alongside OSC/WS traffic.
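As an illustration of item 1, a request against the meshd inject endpoint could be built with the standard library as follows. Only the port and the `POST /inject/{slot}` path come from this note; the host name, the JSON body shape (`{"text": ...}`), and the content type are assumptions about an API that is not documented here.

```python
import json
import urllib.request

MESHD_PORT = 9451  # from the infrastructure table


def build_inject_request(host, slot, prompt):
    """Build (but do not send) a POST /inject/{slot} request for meshd."""
    url = f"http://{host}:{MESHD_PORT}/inject/{slot}"
    # Assumption: meshd accepts a JSON body with a "text" field.
    body = json.dumps({"text": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

Sending would then be a single `urllib.request.urlopen(request, timeout=5)` call, which avoids the clipboard and AppleScript path entirely.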

---

## Prior Evolution History

### V1 (intelligent-orchestration, 2026-03-07)
- 5 divergent paths (A through E). Built the basic HEF daemon with 6-step decision engine.
- Key contribution: NUMU event integration, signal watcher, intent DAG, prompt composer.

### V2 (intelligent-orchestration-v2, 2026-03-08)
- 6 divergent paths (A through F). Compounded into 2-layer architecture (Hooks + Daemon).
- Key contribution: 3 new hooks (orchestration_detector, context_budget, handoff_writer), Evo3-to-GSD bridge, auto-advance, conductor unification.
- Status: Hooks built and deployed. Daemon patches partially applied. Cross-pane discovery and Supabase sync still pending.

### What This Evolution Must Do Differently

V1 and V2 focused on making the orchestrator work. This evolution focuses on making it intelligent:

- Predict what work is coming and pre-stage it
- Route work to the right machine/pane based on domain capabilities
- Detect and recover from crashes without human intervention
- Learn from injection outcomes to improve over time
- Operate across all 5 machines, not just Mac1
- Adapt thresholds dynamically based on project cadence

The gap between "functional orchestrator" and "intelligent orchestrator" is the gap between a cron job and a nervous system.

---

## Key Metrics to Track

| Metric | Current | Target |
|--------|---------|--------|
| Injection success rate | ~70 | |
| Time-to-injection after idle | 15 min (threshold) | <5 min (predicted) |
| Cross-machine coverage | Mac1 only | All 5 machines |
| Crash detection time | Manual (hours) | <60s automated |
| Prompt relevance | Gemini-dependent | Multi-model with outcome feedback |
| Work routing accuracy | None (random idle pane) | >80 |
| State durability | /tmp/ (lost on reboot) | Supabase + local fallback |

Source Anchor

evo-cube-output/intelligent-pane-orchestrator/stage0-research.md
