Grand Diomande Research · Full HTML Reader



# Stage 4: ARCHITECTURE — Fleet Evolution Engine v2
> Forged from: Track 2 (Python orchestrator spine) + Track 3 (Hydra bridge) + Entanglement seeds

---

## Core Formula

```
DISCOVER (KARL-ranked cohort selection)
  ↓
DISPATCH (Python orchestrator → AuraGateway skill command)
  ↓
PARSE (structured output → Supabase tasks)
  ↓
REVIEW (Codex adversarial → issue blocks → Supabase tasks)
  ↓
EVOLVE (Hydra cycles → Swift bridge polls quality gate)
  ↓
SHIP (quality gate passes → auto TestFlight trigger)
  ↓
LEARN (run as KARL trajectory → improve next cohort routing)
```

---

## System Architecture

```
┌─────────────────────────────────────────────────────────┐
│                    SUPABASE (source of truth)            │
│  app_evolution_states  app_evolution_tasks  mesh_health  │
└──────────────┬────────────────────┬─────────────────────┘
               │ reads/writes        │ reads
               ▼                    ▼
┌──────────────────────┐   ┌───────────────────────────┐
│  Python Orchestrator  │   │   MeshControl Swift App   │
│  fleet_evolution_     │   │   FleetEvolutionClient    │
│  orchestrator.py      │   │   (read-only display)     │
│  (cloud-vm Prefect)   │   │   + enqueue UI            │
│                       │   │   + manual override       │
│  Writes: stage        │   │   + Hydra bridge (polls)  │
│  Reads: mesh_health   │   │   Writes: task completion │
└──────────────────────┘   └───────────────────────────┘
        │                              │
        ▼                              ▼
┌───────────────────┐        ┌──────────────────────┐
│  AuraGateway      │        │  HydraClient HTTP     │
│  Mac1 (:8001)     │        │  Mac1 (:8095)         │
│  Skill dispatch   │        │  Phase + quality data │
└───────────────────┘        └──────────────────────┘
        │
        ├──→ Mac1: claude skill runs (creative:forge, divergent-rail)
        ├──→ Mac4: /meta:omega + Codex adversarial review
        └──→ Mac3: creative content generation
```

---

## Component Definitions

### 1. fleet_evolution_orchestrator.py (new, cloud-vm)

Responsibility: Own the pipeline. Advance stages. Parse output. Handle retries.

```python
import time

# `aura`, `supabase`, and `hydra_client` are module-level clients configured elsewhere.
HYDRA_TIMEOUT_S = 4 * 60 * 60  # hard stop for the Hydra stage (4 hours)
HYDRA_POLL_S = 5 * 60          # Python polls the quality gate every 5 minutes


class AppEvolution:
    def run(self, app_id: str, pipeline: str):
        # Stage 1: Skill. Dispatch the skill, wait for its output, store parsed tasks.
        task_id = aura.spawn(build_skill_cmd(app_id, pipeline))
        output = aura.poll_and_retrieve(task_id, timeout=3600)
        tasks = parse_skill_output(output, app_id)
        supabase.upsert_tasks(tasks)
        supabase.set_stage(app_id, 'skillComplete')

        # Stage 2: Adversarial. Codex on Mac4 reviews the skill result.
        review_task_id = aura.dispatch(build_review_cmd(app_id), target='mac4', model='codex')
        review = aura.poll_and_retrieve(review_task_id, timeout=1800)
        issues = parse_review_output(review, app_id)
        supabase.upsert_tasks(issues, source='adversarial')
        supabase.set_stage(app_id, 'adversarialComplete')

        # Stage 3: Hydra. The Swift bridge polls for display; the orchestrator
        # monitors the quality gate and owns the official stage transitions.
        aura.spawn(f"/meta:hydra {app_id}")
        supabase.set_stage(app_id, 'hydraBuild')
        deadline = time.monotonic() + HYDRA_TIMEOUT_S
        while time.monotonic() < deadline:
            quality = hydra_client.get_quality(app_id)
            if quality['isPassing']:
                supabase.set_stage(app_id, 'complete')
                trigger_testflight(app_id)  # Auto-upload; dispatched via AuraGateway to Mac1
                record_trajectory(app_id, outcome='success')
                break
            time.sleep(HYDRA_POLL_S)
        else:
            # The loop ran out of time without a passing gate (no `break` was hit).
            supabase.set_stage(app_id, 'failed', error='Hydra timeout 4h')
```

Prefect scheduling: Every 5 minutes, check queue and start up to `maxConcurrent` apps (read from Supabase config).
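
The slot-filling step of that 5-minute check can be sketched as a pure function. This is a minimal illustration, assuming `maxConcurrent` caps the total number of apps running at once; the function name and its arguments are hypothetical and not part of the orchestrator described here.

```python
from typing import Sequence


def select_apps_to_start(
    queued: Sequence[str],
    running: Sequence[str],
    max_concurrent: int,
) -> list[str]:
    """Return the queued apps that fit into the free concurrency slots."""
    # Assumption: max_concurrent limits apps running in total, not starts per tick.
    free_slots = max(0, max_concurrent - len(running))
    already_running = set(running)
    # Preserve queue order and skip anything that is already in flight.
    candidates = [app for app in queued if app not in already_running]
    return candidates[:free_slots]
```

Keeping the selection free of I/O means the scheduled flow only has to read the queue and config from Supabase, call this function, and start one run per returned app.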

---

### 2. parse_skill_output(output, app_id) → [Task]

Strategy: Two-tier parsing.

Tier 1 (structured block): Scan output for `<!-- EVOLUTION_TASKS_START -->` JSON block. If found, use it directly.

Tier 2 (LLM extraction): If no structured block, pass output to Gemini Flash with extraction prompt:

```
Extract all actionable tasks from this output. For each task:
- priority: 1-5 (1=critical blocker, 5=nice-to-have)
- description: single sentence
- source: "skill"
Return JSON array of tasks.
```

Quality floor: Minimum 3 tasks per app. If fewer than 3 tasks are parsed, flag the run as `error: insufficient_tasks` and retry the skill with a more explicit prompt.
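
The two tiers and the quality floor can be sketched together as follows. This is an illustration under stated assumptions, not the actual parser: the closing marker `<!-- EVOLUTION_TASKS_END -->` is assumed (only the start marker is specified above), the `llm_extract` callable stands in for the Gemini Flash call, and `InsufficientTasksError` is a hypothetical name for the `insufficient_tasks` flag.

```python
import json
from typing import Callable, Optional

START_MARKER = "<!-- EVOLUTION_TASKS_START -->"
END_MARKER = "<!-- EVOLUTION_TASKS_END -->"  # assumption: not specified in the design
MIN_TASKS = 3


class InsufficientTasksError(Exception):
    """Hypothetical signal for `error: insufficient_tasks`; the caller retries the skill."""


def extract_structured_block(output: str) -> Optional[list]:
    """Tier 1: return the JSON task list between the markers, or None if unusable."""
    start = output.find(START_MARKER)
    if start == -1:
        return None
    body_start = start + len(START_MARKER)
    end = output.find(END_MARKER, body_start)
    if end == -1:
        return None
    try:
        tasks = json.loads(output[body_start:end])
    except json.JSONDecodeError:
        return None
    return tasks if isinstance(tasks, list) else None


def parse_skill_output(
    output: str,
    app_id: str,
    llm_extract: Callable[[str], list],  # hypothetical wrapper around the Tier 2 prompt
) -> list[dict]:
    tasks = extract_structured_block(output)
    if tasks is None:
        # Tier 2: no usable structured block, so fall back to LLM extraction.
        tasks = llm_extract(output)
    if len(tasks) < MIN_TASKS:
        raise InsufficientTasksError(
            f"{app_id}: insufficient_tasks ({len(tasks)} < {MIN_TASKS})"
        )
    # Stamp every task with its app and source before it is upserted.
    return [{**task, "app_id": app_id, "source": "skill"} for task in tasks]
```

Tier 1 uses plain string search for the markers rather than pattern matching on the task text, so the structured path stays deterministic while free-form output is left to the LLM tier.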

---

### 3. parse_review_output(review, app_id) → [Task]

Codex review prompt format:

```
Review app '{app_id}'. Prior skill tasks: {tasks_json}.
For each issue found, output an ISSUE block:
ISSUE FILE: <path> SEVERITY: critical|high|medium|low FIX: <description>
Output JSON at end: {"issues": [{"file": "...", "severity": "...", "description": "..."}]}
```

Swift display: Issue tasks appear with a shield icon and `source="adversarial"`, colored red for critical and orange for high severity.
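
One way to turn the trailing JSON from that prompt into tasks is sketched below. It is an illustration only: the severity-to-priority mapping is an assumption (the design does not define one), and `extract_trailing_issues` is a hypothetical helper name.

```python
import json

# Assumption: map review severities onto the 1-5 task priority scale.
SEVERITY_TO_PRIORITY = {"critical": 1, "high": 2, "medium": 3, "low": 4}


def extract_trailing_issues(review: str) -> list:
    """Find the last JSON object in the review that carries an "issues" list."""
    decoder = json.JSONDecoder()
    pos = review.rfind("{")
    while pos != -1:
        try:
            obj, _ = decoder.raw_decode(review, pos)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict) and isinstance(obj.get("issues"), list):
            return obj["issues"]
        # Step back to the previous opening brace and try again.
        pos = review.rfind("{", 0, pos)
    return []


def parse_review_output(review: str, app_id: str) -> list[dict]:
    tasks = []
    for issue in extract_trailing_issues(review):
        if not isinstance(issue, dict):
            continue  # ignore malformed entries
        severity = str(issue.get("severity", "low")).lower()
        tasks.append({
            "app_id": app_id,
            "source": "adversarial",
            "severity": severity,
            "priority": SEVERITY_TO_PRIORITY.get(severity, 4),
            "file": issue.get("file"),
            "description": issue.get("description", ""),
        })
    return tasks
```

Scanning backwards from the last `{` tolerates braces that appear earlier in the review prose, since any candidate that fails to decode or lacks an `issues` list is skipped.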

---

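### 4. Hydra Bridge (Swift, FleetEvolutionClient extension)

Responsibility: Poll the Hydra quality gate every 10 seconds while the MeshControl app is in the foreground. The Python orchestrator polls independently every 5 minutes, so monitoring continues when MeshControl is backgrounded.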

```swift
func startHydraMonitor(appId: String) {
    Task {
        // Poll only while this app is in the Hydra phase
        while let state = states[appId], state.stage.phase == .hydra {
            // Wait 10s; stop polling if the task is cancelled during the sleep
            do { try await Task.sleep(nanoseconds: 10_000_000_000) } catch { break }
            let status = await HydraClient.shared.fetchStatus(for: appId)
            let mapped = EvolutionStage.fromHydraPhase(status.phase, cycle: status.cycle)
            if mapped != states[appId]?.stage {
                // Local display state only. The Python orchestrator writes `stage`
                // to Supabase; both sides derive the same value from the same
                // Hydra status, so the UI and Supabase converge.
                await updateStageLocally(appId: appId, stage: mapped)
            }
            if status.quality.isPassing {
                // Let Python handle the official markComplete + TestFlight,
                // but update the UI immediately
                await MainActor.run { states[appId]?.stage = .complete }
                break
            }
        }
    }
}
```

---

### 5. KARL Cohort Ranking

Before each orchestrator run:

```python
def rank_apps_by_trajectory(fleet: list[str]) -> list[str]:
    """Order apps by KARL trajectory score, highest first."""
    # Query KARL trajectory scores from Supabase
    trajectories = supabase.select('trajectories', {'app_id': {'in': fleet}})
    # If an app has several trajectory rows, the last one returned wins
    scores = {t['app_id']: t['reward_mean'] for t in trajectories}
    # Apps with no trajectory get a neutral 0.5 (medium priority)
    return sorted(fleet, key=lambda x: scores.get(x, 0.5), reverse=True)
```

First cohort: Top 10 by trajectory score. With no history every app ties at the 0.5 default, so the stable sort leaves the fleet in its original order.
Progressive expansion: After cohort 1 completes, review learned patterns, expand to 20, then 40.
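
The progressive expansion can be sketched as slicing the ranked fleet into consecutive cohorts. This is a minimal illustration, assuming 10, 20, and 40 are per-cohort sizes rather than cumulative totals, and that any remaining apps form a final cohort; `iter_cohorts` is a hypothetical helper.

```python
from typing import Iterator, Sequence


def iter_cohorts(
    ranked: Sequence[str],
    sizes: Sequence[int] = (10, 20, 40),
) -> Iterator[list[str]]:
    """Yield consecutive cohorts from a ranked fleet, growing in size."""
    start = 0
    for size in sizes:
        if start >= len(ranked):
            return
        yield list(ranked[start:start + size])
        start += size
    # Assumption: whatever is left after the planned sizes runs as one last cohort.
    if start < len(ranked):
        yield list(ranked[start:])
```

For a fleet of 89 apps this yields cohorts of 10, 20, 40, and 19. Because the function is a generator, the caller can re-rank or pause for review between cohorts before pulling the next one.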

---

### 6. Post-Completion Chain

When `quality.isPassing`:
1. `supabase.set_stage(app_id, 'complete')`
2. `trigger_testflight(app_id)` — calls `ops:ios` archive + upload
3. `record_trajectory(app_id)` — writes to `trajectories` table with pipeline, tasks, quality scores
4. KARL picks up new trajectory on next training run
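
A trajectory row for step 3 might be assembled as in the sketch below. The column names are hypothetical; the design only states that the row carries the pipeline, tasks, and quality scores. `reward_mean`, which the ranking step reads, is deliberately not computed here because the design does not say how it is derived.

```python
from datetime import datetime, timezone


def build_trajectory_row(
    app_id: str,
    pipeline: str,
    tasks: list,
    quality: dict,
    outcome: str = "success",
) -> dict:
    """Assemble one row for the `trajectories` table (column names are assumptions)."""
    return {
        "app_id": app_id,
        "pipeline": pipeline,
        "outcome": outcome,
        "task_count": len(tasks),
        "tasks": tasks,
        "quality": quality,
        # Timezone-aware UTC timestamp so runs can be ordered across machines.
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
```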

---

## Anti-Patterns

  • Don't run all 89 apps at once. Cohort model. Start with 10.
  • Don't let Python and Swift both write `stage`. Python writes. Swift reads + displays.
  • Don't skip adversarial. It's the quality gate for Hydra input. Bad skill output → Hydra loops forever.
  • Don't let Hydra run without a timeout. 4 hours max. Hard stop.
  • Don't parse free-form skill output with regex. Two-tier parsing (structured block first, LLM extraction as fallback) is more robust.
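
The single-writer rule for `stage` can be reinforced with a forward-only transition check on the Python side. The sketch below is illustrative: it uses the stage names that appear in the orchestrator code, assumes `None` means no stage has been written yet, and `is_valid_transition` is a hypothetical helper.

```python
from typing import Optional

# Stage names as written by the orchestrator, in pipeline order.
STAGE_ORDER = ["skillComplete", "adversarialComplete", "hydraBuild", "complete"]
TERMINAL_STAGES = {"complete", "failed"}


def is_valid_transition(current: Optional[str], new: str) -> bool:
    """Allow only the next stage in order, or a move to 'failed'."""
    if current in TERMINAL_STAGES:
        return False  # finished runs are never rewritten
    if new == "failed":
        return True   # any in-flight run may fail (e.g. the 4h Hydra timeout)
    if new not in STAGE_ORDER:
        return False
    if current is None:
        # Assumption: a run with no stage yet may only enter the first stage.
        return new == STAGE_ORDER[0]
    if current not in STAGE_ORDER:
        return False
    return STAGE_ORDER.index(new) == STAGE_ORDER.index(current) + 1
```

Checking transitions before each stage write would surface a second writer or a skipped adversarial stage as a rejected update rather than silent drift.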

---

## Execution Gate (post-Forge feasibility check)
- ✅ Python orchestrator on cloud-vm: existing infra
- ✅ AuraGateway task poll: verify or add `/tasks/{id}` endpoint
- ✅ Hydra bridge Swift: 40 lines, no new dependencies
- ✅ KARL ranking: existing trajectories table
- ⚠️ TestFlight auto-trigger: needs `altool` or `xcrun` access, which cloud-vm lacks (Mac1 has Xcode, cloud-vm does not)
  - Fix: dispatch the TestFlight trigger via AuraGateway to Mac1 instead of running it directly from cloud-vm
- ✅ 2+ viable execution paths: Python spine OR Swift manual override

Gate: PASS

Promotion Decision

Promote into a technical note or architecture paper with implementation anchors.

Source Anchor

omega-output/fleet-evolution-pipeline-20260325/04-architecture.md
