Grand Diomande Research · Full HTML Reader

# Symphony -- Stage 3 Expansion: Multi-Agent Architecture
## From Codex-Only to Claude Code + Gemini CLI + Codex as Interchangeable Backends

> Evolution: Symphony -- Multi-Agent Orchestrator for Linear Issue Automation
> Stage: 3 Expansion (Architectural Retrofit)
> Date: 2026-03-07
> Input: Stage 2 Compound (8-step synthesis) + Stage 3 Master Plan (44 tasks) + Multi-Agent Research (970 lines)
> Engine: Evo-Cubed Runner (Opus 4.6)

---

## 1. Updated Package Layout

The original monorepo has `packages/codex-client/` as the sole agent integration. This package is replaced by `packages/agent-adapters/`, which contains a unified interface, three adapter implementations, and an event normalization layer.

### 1.1 New Directory Structure

packages/
  agent-adapters/                   # REPLACES codex-client/
    package.json                    # deps: @anthropic-ai/claude-agent-sdk
    tsconfig.json
    src/
      index.ts                      # Public API barrel export
      interface.ts                  # AgentAdapter interface + event types
      factory.ts                    # createAdapter(kind) factory function
      event-normalizer.ts           # Maps agent-specific events to AgentEvent
      approval-router.ts            # Central approval policy engine
      session-registry.ts           # Maps symphonyRunId -> { agentKind, agentSessionId }

      adapters/
        claude.ts                   # Claude Code adapter via @anthropic-ai/claude-agent-sdk
        codex.ts                    # Codex adapter via app-server JSON-RPC
        gemini.ts                   # Gemini CLI adapter (subprocess MVP, ACP future)

      mappings/
        claude-events.ts            # Claude message types -> AgentEvent
        codex-events.ts             # Codex notification types -> AgentEvent
        gemini-events.ts            # Gemini NDJSON types -> AgentEvent
        approval-modes.ts           # ApprovalMode -> agent-specific flags

      types/
        claude-types.ts             # Types from claude-agent-sdk (re-exported for clarity)
        codex-types.ts              # Types from codex app-server generate-ts
        gemini-types.ts             # Manually defined types for Gemini stream-json

      __tests__/
        interface.test.ts           # Type-level tests (verify exhaustiveness)
        factory.test.ts             # Factory returns correct adapter
        claude-events.test.ts       # Event mapping correctness
        codex-events.test.ts
        gemini-events.test.ts
        claude.integration.test.ts  # Real Claude subprocess
        codex.integration.test.ts   # Real Codex app-server
        gemini.integration.test.ts  # Real Gemini subprocess

  linear-client/                    # UNCHANGED
  workspace-manager/                # UNCHANGED (minor: reconciler becomes agent-agnostic)
  state-machine/                    # UPDATED (IssueState stores agentKind)
  prompt-renderer/                  # UPDATED (WORKFLOW.md agent config parsing)
  metrics/                          # UPDATED (labels include agent kind)
  logger/                           # UNCHANGED

### 1.2 Dependency DAG (updated)

symphony-daemon
  |-- linear-client              (polls/receives issues)
  |-- state-machine              (owns per-issue lifecycle)
  |     |-- agent-adapters       (REPLACES codex-client)
  |     |     |-- claude.ts      (spawns Claude Code sessions)
  |     |     |-- codex.ts       (spawns Codex sessions)
  |     |     +-- gemini.ts      (spawns Gemini sessions)
  |     |-- workspace-manager    (isolates filesystem)
  |     |-- prompt-renderer      (renders WORKFLOW.md)
  |     +-- logger
  |-- metrics                    (instruments everything, labels by agent kind)
  +-- logger

The key constraint is preserved: `agent-adapters` is a package, not a service. It has no dependency on `linear-client`, `workspace-manager`, or `state-machine`. The orchestrator in `symphony-daemon` wires these together. This means each adapter is independently testable.

### 1.3 Package Dependencies

json
{
  "name": "@symphony/agent-adapters",
  "version": "0.1.0",
  "dependencies": {
    "@anthropic-ai/claude-agent-sdk": "^0.3.0"
  },
  "devDependencies": {
    "bun-types": "latest"
  }
}

Codex has no npm dependency -- it uses auto-generated types from `codex app-server generate-ts` committed directly into `types/codex-types.ts`. Gemini has no npm dependency -- it uses manually defined types for the NDJSON stream format.

---

## 2. Core Interface: `interface.ts`

The interface definition from the multi-agent research (Section 5.2) is adopted with three refinements grounded in Stage 3 audit findings.

typescript
// packages/agent-adapters/src/interface.ts

/** Agent identity */
export type AgentKind = "claude" | "gemini" | "codex";

/** Unified approval modes that map to each agent's native system */
export type ApprovalMode =
  | "ask"           // Every tool requires approval
  | "auto-edit"     // Auto-approve file edits, ask for commands
  | "auto-all"      // Auto-approve everything (Claude: bypassPermissions, Gemini: yolo, Codex: dangerFullAccess)
  | "read-only";    // No writes (Claude: plan, Gemini: plan, Codex: readOnly sandbox)

/** Session configuration passed to initialize() */
export interface SessionConfig {
  cwd: string;
  model?: string;
  approvalMode: ApprovalMode;
  systemPrompt?: string;
  mcpServers?: Record<string, McpServerConfig>;
  allowedTools?: string[];
  deniedTools?: string[];
  maxBudgetUsd?: number;           // Claude only -- ignored by others
  maxTurns?: number;
  env?: Record<string, string>;
  timeoutMs?: number;              // Per-session timeout (all agents)
}

/** Unified events emitted by all agent adapters */
export type AgentEvent =
  | { type: "session_start"; sessionId: string; agent: AgentKind }
  | { type: "text_delta"; text: string; sessionId: string }
  | { type: "text_complete"; text: string; sessionId: string }
  | { type: "tool_start"; toolName: string; toolId: string; input: unknown }
  | { type: "tool_result"; toolId: string; output: string; exitCode?: number }
  | { type: "approval_request"; requestId: string; toolName: string; input: unknown; reason?: string }
  | { type: "file_change"; filePath: string; diff: string }
  | { type: "error"; message: string; code?: string; recoverable: boolean }
  | { type: "turn_complete"; turnId: string; status: "completed" | "interrupted" | "failed" }
  | { type: "session_end"; sessionId: string; usage: UsageInfo }
  | { type: "reasoning"; summary: string }
  | { type: "plan_update"; steps: PlanStep[] };

export interface UsageInfo {
  inputTokens: number;
  outputTokens: number;
  totalCostUsd?: number;           // Claude provides this; Codex/Gemini may not
  durationMs: number;
}

export interface PlanStep {
  id: string;
  description: string;
  status: "pending" | "in_progress" | "completed" | "failed";
}

export type ApprovalResponse =
  | { action: "accept" }
  | { action: "accept_session" }
  | { action: "deny"; reason?: string }
  | { action: "cancel" };

export interface McpServerConfig {
  command: string;
  args?: string[];
  env?: Record<string, string>;
}

/** Resume capabilities vary by agent */
export interface ResumeCapability {
  /** Whether the agent supports session resume at all */
  canResume: boolean;
  /** Whether resume preserves full context (true) or requires context re-injection (false) */
  fullContextRestore: boolean;
  /** The native identifier type used for resume */
  identifierType: "uuid" | "thread_id" | "index" | "none";
}

/**
 * The core adapter interface.
 *
 * Each agent (Claude, Gemini, Codex) implements this interface.
 * The orchestrator never calls agent-specific methods -- it only
 * uses this interface. This is the single most important contract
 * in the multi-agent expansion.
 */
export interface AgentAdapter {
  readonly kind: AgentKind;
  readonly resumeCapability: ResumeCapability;

  initialize(config: SessionConfig): Promise<void>;

  startSession(opts: {
    resumeId?: string;
    forkFrom?: string;
    prompt?: string;
  }): Promise<string>;

  sendPrompt(
    sessionId: string,
    prompt: string,
  ): AsyncIterable<AgentEvent>;

  respondToApproval(
    sessionId: string,
    requestId: string,
    response: ApprovalResponse,
  ): Promise<void>;

  interrupt(sessionId: string): Promise<void>;

  endSession(sessionId: string): Promise<UsageInfo>;

  isAlive(): boolean;

  shutdown(): Promise<void>;
}
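
The adapters in Section 3 call `mapApprovalMode(kind, mode)` from `mappings/approval-modes.ts`, which this document references but does not list. A minimal sketch, assuming a plain lookup table, might look like this. The `auto-all` and `read-only` entries are taken from the comments on `ApprovalMode` above; the `ask` and `auto-edit` entries are assumptions and should be checked against each agent's current documentation before use.

```typescript
type AgentKind = "claude" | "gemini" | "codex";
type ApprovalMode = "ask" | "auto-edit" | "auto-all" | "read-only";

// Native mode names per agent.
// Entries marked ASSUMED are illustrative and do not come from the design text.
const APPROVAL_MODE_TABLE: Record<AgentKind, Record<ApprovalMode, string>> = {
  claude: {
    "ask": "default",               // ASSUMED
    "auto-edit": "acceptEdits",     // ASSUMED
    "auto-all": "bypassPermissions",
    "read-only": "plan",
  },
  gemini: {
    "ask": "default",               // ASSUMED
    "auto-edit": "auto_edit",       // ASSUMED
    "auto-all": "yolo",
    "read-only": "plan",
  },
  codex: {
    "ask": "untrusted",             // ASSUMED
    "auto-edit": "on-request",      // ASSUMED
    "auto-all": "dangerFullAccess",
    "read-only": "readOnly",
  },
};

// Total function: the nested Record types force every (kind, mode) pair to be present.
function mapApprovalMode(kind: AgentKind, mode: ApprovalMode): string {
  return APPROVAL_MODE_TABLE[kind][mode];
}
```

Because both levels of the table are typed as `Record`, adding a new `AgentKind` or `ApprovalMode` variant without extending the table is a compile-time error, which matches the exhaustiveness goal of the factory.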

### 2.1 Refinements vs. Research Draft

1. `ResumeCapability` added. The Stage 3 Audit 2 finding (Codex handshake compliance) exposed that resume behavior is fundamentally different per agent. The orchestrator needs to know at runtime whether an agent can resume, and whether resume restores full context. This prevents the crash recovery logic (Stage 2, Step 3) from assuming all agents have Codex-quality `thread/resume`.

2. `timeoutMs` added to `SessionConfig`. The Stage 3 Audit 6 stall detection requirement applies to all agents, not just Codex. The timeout is set per-session by the orchestrator.

3. `totalCostUsd` made optional in `UsageInfo`. Claude provides cost via `ResultMessage.total_cost_usd`. Codex provides token counts but not dollar costs. Gemini provides `inputTokens`/`outputTokens` in stats. Only Claude has a definitive cost figure.
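
To illustrate the first refinement, here is a hedged sketch of how crash recovery could branch on `ResumeCapability` instead of assuming native resume. `RecoveryPlan` and `planRecovery` are hypothetical names introduced for this example, and the layout of the re-injected prompt is an assumption.

```typescript
interface ResumeCapability {
  canResume: boolean;
  fullContextRestore: boolean;
  identifierType: "uuid" | "thread_id" | "index" | "none";
}

// Hypothetical result type: what the recovery logic would hand to startSession().
type RecoveryPlan =
  | { strategy: "native_resume"; resumeId: string }
  | { strategy: "reinject_context"; prompt: string };

function planRecovery(
  cap: ResumeCapability,
  agentSessionId: string,
  originalPrompt: string,
  priorOutput: string,
): RecoveryPlan {
  // Agents that restore full context can simply be resumed by their stored ID.
  if (cap.canResume && cap.fullContextRestore) {
    return { strategy: "native_resume", resumeId: agentSessionId };
  }
  // Otherwise start a fresh session and prepend whatever the previous session produced.
  const prompt = priorOutput.length > 0
    ? `Output from the interrupted session:\n${priorOutput}\n\n${originalPrompt}`
    : originalPrompt;
  return { strategy: "reinject_context", prompt };
}

const geminiCapability: ResumeCapability = {
  canResume: false,
  fullContextRestore: false,
  identifierType: "none",
};
const plan = planRecovery(geminiCapability, "gemini-1-0", "Fix the failing test", "Edited src/a.ts");
console.log(plan.strategy);
```

The orchestrator only inspects the capability flags, so a future adapter that resumes without restoring context would fall into the re-injection branch without any orchestrator change.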

---

## 3. Adapter Implementations

### 3.1 Claude Adapter (`adapters/claude.ts`)

The Claude adapter uses the official `@anthropic-ai/claude-agent-sdk` TypeScript package. This SDK spawns the `claude` CLI as a subprocess and communicates over stdin/stdout JSON-lines. The `ClaudeSDKClient` class provides persistent multi-turn sessions with interrupt support.

typescript
// packages/agent-adapters/src/adapters/claude.ts
import { ClaudeSDKClient, ClaudeAgentOptions } from "@anthropic-ai/claude-agent-sdk";
import type { AgentAdapter, SessionConfig, AgentEvent, ApprovalResponse, UsageInfo, ResumeCapability } from "../interface";
import { mapClaudeMessage } from "../mappings/claude-events";
import { mapApprovalMode } from "../mappings/approval-modes";

export class ClaudeAdapter implements AgentAdapter {
  readonly kind = "claude" as const;
  readonly resumeCapability: ResumeCapability = {
    canResume: true,
    fullContextRestore: true,
    identifierType: "uuid",
  };

  private client: ClaudeSDKClient | null = null;
  private config: SessionConfig | null = null;
  private currentSessionId: string | null = null;
  private alive = false;

  async initialize(config: SessionConfig): Promise<void> {
    this.config = config;

    const options: ClaudeAgentOptions = {
      permissionMode: mapApprovalMode("claude", config.approvalMode),
      cwd: config.cwd,
      model: config.model,
      allowedTools: config.allowedTools,
      disallowedTools: config.deniedTools,
      maxTurns: config.maxTurns,
      maxBudgetUsd: config.maxBudgetUsd,
      includePartialMessages: true,
      env: config.env,
    };

    // MCP servers configuration
    if (config.mcpServers) {
      options.mcpServers = Object.fromEntries(
        Object.entries(config.mcpServers).map(([name, cfg]) => [
          name,
          { type: "stdio" as const, command: cfg.command, args: cfg.args, env: cfg.env },
        ])
      );
    }

    // Custom approval handler that emits approval_request events
    // The orchestrator's ApprovalRouter resolves these
    options.canUseTool = async (toolName, inputData, context) => {
      // This callback is invoked by the SDK when a tool needs approval.
      // We emit an approval_request event, and the orchestrator responds
      // via respondToApproval(), which resolves the pending promise.
      // Implementation deferred to the approval routing section below.
      return this.approvalBridge.waitForDecision(toolName, inputData);
    };

    this.client = new ClaudeSDKClient(options);
    this.alive = true;
  }

  async startSession(opts: { resumeId?: string; prompt?: string }): Promise<string> {
    if (!this.client) throw new Error("Claude adapter not initialized");

    if (opts.resumeId) {
      // Claude Code resume: --resume <session_id>
      // The SDK supports this via session continuation
      await this.client.resume(opts.resumeId);
      this.currentSessionId = opts.resumeId;
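    } else if (opts.prompt) {
      // Start new session with initial prompt
      await this.client.query(opts.prompt);
      // Session ID is extracted from the init SystemMessage
      this.currentSessionId = this.client.sessionId;
    }

    // Fail loudly instead of returning null through a non-null assertion
    if (!this.currentSessionId) {
      throw new Error("Claude session requires a resumeId or an initial prompt");
    }
    return this.currentSessionId;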
  }

  async *sendPrompt(sessionId: string, prompt: string): AsyncIterable<AgentEvent> {
    if (!this.client) throw new Error("Claude adapter not initialized");

    await this.client.query(prompt);
    for await (const msg of this.client.receiveResponse()) {
      const event = mapClaudeMessage(msg, sessionId);
      if (event) yield event;
    }
  }

  async respondToApproval(sessionId: string, requestId: string, response: ApprovalResponse): Promise<void> {
    this.approvalBridge.resolve(requestId, response);
  }

  async interrupt(sessionId: string): Promise<void> {
    await this.client?.interrupt();
  }

  async endSession(sessionId: string): Promise<UsageInfo> {
    // Claude SDK provides usage in the final ResultMessage
    const result = this.client?.lastResult;
    return {
      inputTokens: result?.usage?.input_tokens ?? 0,
      outputTokens: result?.usage?.output_tokens ?? 0,
      totalCostUsd: result?.total_cost_usd,
      durationMs: result?.duration_ms ?? 0,
    };
  }

  isAlive(): boolean { return this.alive; }

  async shutdown(): Promise<void> {
    await this.client?.close();
    this.alive = false;
  }
}

Key Claude-specific decisions:

1. `ClaudeSDKClient` over `query()` function. The stateful client keeps one subprocess alive for the adapter's lifetime, whereas the stateless `query()` function spawns a new subprocess per call at a cost of roughly 2s of startup each time. Against Symphony's session lengths (5-45 minutes) that startup cost is negligible, so the deciding factor is capability rather than latency: the stateful client enables interrupt support and multi-turn continuation within a single session.

2. `canUseTool` as the approval bridge. Claude's approval model is callback-based -- the SDK calls `canUseTool` and blocks until it returns. Symphony's model is event-based -- it emits `approval_request` events and waits for `respondToApproval()`. The `approvalBridge` is a simple promise-map that converts between the two models: `canUseTool` creates a deferred promise keyed by requestId, emits the event, and awaits the promise. `respondToApproval` resolves the promise.

3. Resume via `--resume <session_id>`. Claude Code persists sessions to `[home-path]`. When the daemon crashes mid-session, the recovery logic (Stage 2 Step 3) reads the `sessionId` from SQLite and calls `startSession({ resumeId })`. Claude replays the full conversation history from disk.
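
The `approvalBridge` used by `canUseTool` and `respondToApproval()` is described in point 2 but not listed. A minimal sketch of such a promise map follows. The class name `ApprovalBridge`, the `emit` callback, and the request ID format are assumptions for illustration; translating the resolved `ApprovalResponse` into whatever shape the SDK callback expects is left out.

```typescript
type ApprovalResponse =
  | { action: "accept" }
  | { action: "accept_session" }
  | { action: "deny"; reason?: string }
  | { action: "cancel" };

interface ApprovalRequestEvent {
  type: "approval_request";
  requestId: string;
  toolName: string;
  input: unknown;
}

class ApprovalBridge {
  private pending = new Map<string, (response: ApprovalResponse) => void>();
  private counter = 0;
  private emit: (event: ApprovalRequestEvent) => void;

  constructor(emit: (event: ApprovalRequestEvent) => void) {
    this.emit = emit;
  }

  // Callback side: register a deferred promise, emit the event, return the promise.
  waitForDecision(toolName: string, input: unknown): Promise<ApprovalResponse> {
    const requestId = `approval-${++this.counter}`;
    const decision = new Promise<ApprovalResponse>((resolve) => {
      // The executor runs synchronously, so the entry exists before the event is emitted.
      this.pending.set(requestId, resolve);
    });
    this.emit({ type: "approval_request", requestId, toolName, input });
    return decision;
  }

  // Event side: settle the matching promise. Returns false for unknown or already-settled IDs.
  resolve(requestId: string, response: ApprovalResponse): boolean {
    const settle = this.pending.get(requestId);
    if (!settle) return false;
    this.pending.delete(requestId);
    settle(response);
    return true;
  }

  get pendingCount(): number {
    return this.pending.size;
  }
}
```

Returning a boolean from `resolve` lets the adapter distinguish a late or duplicate approval from a valid one without throwing inside the orchestrator's event loop.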

### 3.2 Codex Adapter (`adapters/codex.ts`)

The Codex adapter is a refactored and corrected version of the original `codex-client` package from Stage 2 Step 3. It implements the full 4-phase handshake (initialize -> initialized -> thread/start -> turn/start) identified as missing in Stage 3 Audit 2.

typescript
// packages/agent-adapters/src/adapters/codex.ts
import type { AgentAdapter, SessionConfig, AgentEvent, ApprovalResponse, UsageInfo, ResumeCapability } from "../interface";
import { mapCodexNotification } from "../mappings/codex-events";
import { mapApprovalMode } from "../mappings/approval-modes";

export class CodexAdapter implements AgentAdapter {
  readonly kind = "codex" as const;
  readonly resumeCapability: ResumeCapability = {
    canResume: true,
    fullContextRestore: true,     // thread/resume replays all RolloutItems
    identifierType: "thread_id",
  };

  private proc: import("bun").Subprocess | null = null;
  private abortController: AbortController | null = null;
  private requestCounter = 0;
  private pendingResponses = new Map<string, { resolve: Function; reject: Function; timer: Timer }>();
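  private config: SessionConfig | null = null;
  private initialized = false;
  // Read by endSession(); assumed to be updated from Codex usage notifications (wiring not shown)
  private accumulatedUsage = { inputTokens: 0, outputTokens: 0 };
  private sessionStartTime = Date.now();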

  async initialize(config: SessionConfig): Promise<void> {
    this.config = config;
    this.abortController = new AbortController();

    this.proc = Bun.spawn(["codex", "app-server"], {
      stdin: "pipe",
      stdout: "pipe",
      stderr: "pipe",
      signal: this.abortController.signal,
      env: { ...process.env, ...config.env },
    });

    // Phase 1: Send initialize request
    const initResult = await this.sendRequest("initialize", {
      clientInfo: { name: "symphony", title: "Symphony Orchestrator", version: "0.1.0" },
      capabilities: { experimentalApi: true },
    });

    // Phase 2: Send initialized notification (no response expected)
    this.sendNotification("initialized", {});
    this.initialized = true;
  }

  async startSession(opts: { resumeId?: string; prompt?: string }): Promise<string> {
    if (!this.initialized) throw new Error("Codex adapter not initialized");

    if (opts.resumeId) {
      // thread/resume: replays all RolloutItems from [home-path]
      const result = await this.sendRequest("thread/resume", { threadId: opts.resumeId });
      return opts.resumeId;
    }

    // thread/start creates a new thread
    const result = await this.sendRequest("thread/start", {});
    return result.thread.id;
  }

  async *sendPrompt(sessionId: string, prompt: string): AsyncIterable<AgentEvent> {
    // turn/start begins a turn within the thread
    await this.sendRequest("turn/start", {
      threadId: sessionId,
      input: [{ type: "text", text: prompt }],
      cwd: this.config!.cwd,
      model: this.config!.model ?? "o4-mini",
      approvalPolicy: mapApprovalMode("codex", this.config!.approvalMode),
      sandboxPolicy: {
        type: "workspaceWrite",
        writableRoots: [this.config!.cwd],
        networkAccess: true,
      },
    });

    // Consume notifications until turn/completed
    for await (const notification of this.readNotifications()) {
      const event = mapCodexNotification(notification, sessionId);
      if (event) yield event;
      if (notification.method === "turn/completed") break;
    }
  }

  async respondToApproval(sessionId: string, requestId: string, response: ApprovalResponse): Promise<void> {
    // Codex sends approval as a JSON-RPC request with an id.
    // We respond to that request with the approval decision.
    const pending = this.pendingResponses.get(requestId);
    if (!pending) throw new Error(`No pending approval for requestId: ${requestId}`);

    const codexResponse = response.action === "accept" || response.action === "accept_session"
      ? { decision: response.action === "accept_session" ? "acceptForSession" : "accept" }
      : { decision: "decline", reason: response.action === "deny" ? response.reason : undefined };

    this.pendingResponses.delete(requestId); // each approval is answered exactly once
    pending.resolve(codexResponse);
  }

  async interrupt(sessionId: string): Promise<void> {
    // Graceful: send turn/interrupt first
    try {
      await this.sendRequest("turn/interrupt", { threadId: sessionId });
    } catch {
      // If interrupt fails, hard kill via AbortController
      this.abortController?.abort();
    }
  }

  async endSession(sessionId: string): Promise<UsageInfo> {
    await this.sendNotification("thread/unsubscribe", { threadId: sessionId });
    return {
      inputTokens: this.accumulatedUsage.inputTokens,
      outputTokens: this.accumulatedUsage.outputTokens,
      durationMs: Date.now() - this.sessionStartTime,
    };
  }

  isAlive(): boolean {
    return this.proc !== null && !this.proc.killed;
  }

  async shutdown(): Promise<void> {
    this.abortController?.abort();
    for (const [, pending] of this.pendingResponses) {
      clearTimeout(pending.timer);
      pending.reject(new Error("Adapter shutdown"));
    }
    this.pendingResponses.clear();
  }

  // --- Private transport methods ---

  private async sendRequest(method: string, params: unknown): Promise<any> {
    const id = String(++this.requestCounter);
    const msg = JSON.stringify({ method, id, params }) + "\n";
    this.proc!.stdin.write(msg);
    return new Promise((resolve, reject) => {
      const timer = setTimeout(() => {
        this.pendingResponses.delete(id); // drop the stale entry so a late response is ignored
        reject(new Error(`RPC timeout: ${method}`));
      }, 30_000);
      this.pendingResponses.set(id, { resolve, reject, timer });
    });
  }

  private sendNotification(method: string, params: unknown): void {
    const msg = JSON.stringify({ method, params }) + "\n";
    this.proc!.stdin.write(msg);
  }

  private async *readNotifications(): AsyncGenerator<any> {
    const reader = this.proc!.stdout.getReader();
    const decoder = new TextDecoder();
    let buffer = "";

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      buffer += decoder.decode(value, { stream: true });

      let idx: number;
      while ((idx = buffer.indexOf("\n")) !== -1) {
        const line = buffer.slice(0, idx).trim();
        buffer = buffer.slice(idx + 1);
        if (!line) continue;

        const msg = JSON.parse(line);
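        // A response carries an id but no method; requiring the absence of "method" keeps a
        // server-initiated request from being mistaken for a reply to one of our requests
        if ("id" in msg && !("method" in msg) && this.pendingResponses.has(msg.id)) {
          const pending = this.pendingResponses.get(msg.id)!;
          clearTimeout(pending.timer);
          this.pendingResponses.delete(msg.id);
          if (msg.error) pending.reject(msg.error);
          else pending.resolve(msg.result);
        } else if ("method" in msg) {
          // Handle approval requests (these are server-initiated requests with an id)
          if (msg.method.endsWith("/requestApproval") && msg.id !== undefined) {
            const rpcId = msg.id;
            const requestId = `approval:${rpcId}`; // prefixed so it cannot collide with our own request ids
            this.pendingResponses.set(requestId, {
              // respondToApproval() resolves this entry, which writes the JSON-RPC response back to Codex
              resolve: (result: unknown) => this.proc!.stdin.write(JSON.stringify({ id: rpcId, result }) + "\n"),
              reject: () => {},
              timer: setTimeout(() => {}, 0), // placeholder: approvals carry no RPC timeout
            });
            yield { method: "approval_request", params: msg.params, _rpcId: requestId };
          } else {
            yield msg;
          }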
        }
      }
    }
  }
}

Key Codex-specific decisions:

1. Full 4-phase handshake implemented. Stage 3 Audit 2 found the original compound skipped `initialize`/`initialized`. This is fixed. The adapter sends `initialize` with client info, waits for the response, then sends the `initialized` notification before any thread operations.

2. `turn/interrupt` before hard kill. Stage 3 Audit 2 item 4. The `interrupt()` method first attempts `turn/interrupt` for graceful shutdown, then falls back to AbortController abort.

3. Approval bridging via pending response map. Codex sends approval as a JSON-RPC request with an `id`. The adapter holds the response promise in the same `pendingResponses` map used for request-response correlation. When the orchestrator calls `respondToApproval()`, the pending promise is resolved with the decision.
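
One detail the listing leaves open is who drains stdout. `sendRequest()` promises can only settle while a reader is consuming the stream, so a single long-lived read loop, started before the first request, is one way to let the handshake in `initialize()` complete. Below is a minimal sketch of the routing step such a loop would need. `JsonRpcLineRouter`, `classify`, and the `Routed` shape are hypothetical names, not part of the original design.

```typescript
// Loose view of an incoming line after JSON.parse; every field is optional.
interface RawMessage {
  id?: unknown;
  method?: unknown;
  params?: unknown;
  result?: unknown;
  error?: unknown;
}

type Routed =
  | { kind: "response"; id: string; result: unknown; error: unknown }
  | { kind: "server_request"; id: string; method: string; params: unknown }
  | { kind: "notification"; method: string; params: unknown };

function classify(msg: RawMessage): Routed | null {
  const hasId = msg.id !== undefined && msg.id !== null;
  const id = hasId ? String(msg.id) : "";
  // Check "method" first: a server-initiated request has both an id and a method.
  if (typeof msg.method === "string") {
    return hasId
      ? { kind: "server_request", id, method: msg.method, params: msg.params }
      : { kind: "notification", method: msg.method, params: msg.params };
  }
  if (hasId) return { kind: "response", id, result: msg.result, error: msg.error };
  return null;
}

class JsonRpcLineRouter {
  private buffer = "";

  // Feed one decoded stdout chunk; returns every complete message it finished.
  feed(chunk: string): Routed[] {
    this.buffer += chunk;
    const out: Routed[] = [];
    let idx = this.buffer.indexOf("\n");
    while (idx !== -1) {
      const line = this.buffer.slice(0, idx).trim();
      this.buffer = this.buffer.slice(idx + 1);
      idx = this.buffer.indexOf("\n");
      if (!line) continue;
      let parsed: unknown;
      try {
        parsed = JSON.parse(line);
      } catch {
        continue; // skip a malformed line rather than ending the read loop
      }
      if (typeof parsed !== "object" || parsed === null) continue;
      const routed = classify(parsed as RawMessage);
      if (routed) out.push(routed);
    }
    return out;
  }
}
```

A read loop built on this would resolve `pendingResponses` entries for `response` items, forward `server_request` items to the approval path, and push `notification` items onto a queue that `sendPrompt()` consumes.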

### 3.3 Gemini Adapter (`adapters/gemini.ts`)

The Gemini adapter has two operational modes: subprocess MVP (available now) and ACP mode (recommended when stable). For Symphony V1, the subprocess mode is implemented.

typescript
// packages/agent-adapters/src/adapters/gemini.ts
import type { AgentAdapter, SessionConfig, AgentEvent, ApprovalResponse, UsageInfo, ResumeCapability } from "../interface";
import { mapGeminiEvent } from "../mappings/gemini-events";
import { mapApprovalMode } from "../mappings/approval-modes";

export class GeminiAdapter implements AgentAdapter {
  readonly kind = "gemini" as const;
  readonly resumeCapability: ResumeCapability = {
    canResume: false,              // No reliable session persistence for programmatic use
    fullContextRestore: false,
    identifierType: "none",
  };

  private proc: import("bun").Subprocess | null = null;
  private abortController: AbortController | null = null;
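  private config: SessionConfig | null = null;
  private sessionCounter = 0;
  // Read by endSession(); assumed to be filled in from usage stats in the stream (wiring not shown)
  private accumulatedUsage: { inputTokens: number; outputTokens: number } | null = null;
  private sessionStartTime: number | null = null;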

  async initialize(config: SessionConfig): Promise<void> {
    this.config = config;
    // Gemini adapter does not spawn a persistent process.
    // Each sendPrompt() spawns a new subprocess.
    // This is the correct architecture because Gemini CLI
    // has no daemon mode and no reliable session persistence.
  }

  async startSession(opts: { resumeId?: string; prompt?: string }): Promise<string> {
    if (opts.resumeId) {
      // Gemini has no reliable programmatic resume.
      // The orchestrator must re-inject context via the prompt.
      // The session ID is synthetic -- it does not correspond to
      // any Gemini-internal identifier.
      return `gemini-resumed-${opts.resumeId}`;
    }
    return `gemini-${++this.sessionCounter}-${Date.now()}`;
  }

  async *sendPrompt(sessionId: string, prompt: string): AsyncIterable<AgentEvent> {
    this.abortController = new AbortController();

    const args: string[] = [
      "-p", prompt,
      "--output-format", "stream-json",
      "--approval-mode", mapApprovalMode("gemini", this.config!.approvalMode),
    ];

    if (this.config!.model) args.push("--model", this.config!.model);

    this.proc = Bun.spawn(["gemini", ...args], {
      cwd: this.config!.cwd,
      stdin: "pipe",
      stdout: "pipe",
      stderr: "pipe",
      signal: this.abortController.signal,
      env: { ...process.env, ...this.config!.env },
    });

    yield { type: "session_start", sessionId, agent: "gemini" };

    const reader = this.proc.stdout.getReader();
    const decoder = new TextDecoder();
    let buffer = "";

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      buffer += decoder.decode(value, { stream: true });

      let idx: number;
      while ((idx = buffer.indexOf("\n")) !== -1) {
        const line = buffer.slice(0, idx).trim();
        buffer = buffer.slice(idx + 1);
        if (!line) continue;

        try {
          const raw = JSON.parse(line);
          const event = mapGeminiEvent(raw, sessionId);
          if (event) yield event;
        } catch {
          // Malformed JSON line -- skip (known Gemini issue #9009)
        }
      }
    }

    // Check exit code for error classification; exactly one turn_complete is emitted per prompt
    const exitCode = await this.proc.exited;
    if (exitCode === 0) {
      yield { type: "turn_complete", turnId: sessionId, status: "completed" };
    } else if (exitCode === 53) {
      yield { type: "turn_complete", turnId: sessionId, status: "interrupted" }; // turn limit
    } else {
      yield { type: "error", message: `Gemini exited with code ${exitCode}`, recoverable: exitCode === 1 };
      yield { type: "turn_complete", turnId: sessionId, status: "failed" };
    }
  }

  async respondToApproval(sessionId: string, requestId: string, response: ApprovalResponse): Promise<void> {
    // In subprocess mode with --approval-mode yolo, no approvals are requested.
    // In future ACP mode, this would send session/request_permission response.
    throw new Error("Gemini subprocess mode does not support interactive approvals. Use auto-all approval mode.");
  }

  async interrupt(sessionId: string): Promise<void> {
    this.abortController?.abort();
  }

  async endSession(sessionId: string): Promise<UsageInfo> {
    return {
      inputTokens: this.accumulatedUsage?.inputTokens ?? 0,
      outputTokens: this.accumulatedUsage?.outputTokens ?? 0,
      durationMs: Date.now() - (this.sessionStartTime ?? Date.now()),
    };
  }

  isAlive(): boolean {
    return this.proc !== null && !this.proc.killed;
  }

  async shutdown(): Promise<void> {
    this.abortController?.abort();
    this.proc = null;
  }
}

Key Gemini-specific decisions:

1. No session persistence. Gemini CLI's `--resume` is index-based and project-local. It is not suitable for programmatic orchestration because the index is opaque and changes as new sessions are created. The adapter generates synthetic session IDs for Symphony's internal tracking.

2. Subprocess per prompt, not persistent process. Gemini CLI has no daemon mode. Each `sendPrompt()` spawns `gemini -p "..." --output-format stream-json`. This means there is a startup cost (~3-5s) per prompt, but it guarantees clean process isolation.

3. `auto-all` as the only viable approval mode for MVP. In subprocess mode (`gemini -p`), there is no stdin channel for interactive approval responses. The process runs to completion or gets killed. The adapter throws if `respondToApproval` is called. When ACP mode becomes stable, interactive approvals via `session/request_permission` will be supported.

4. Crash recovery requires context re-injection. When the daemon crashes during a Gemini session, there is no `thread/resume` equivalent. The orchestrator must reconstruct context by prepending the previous session's output to the new prompt. The `resumeCapability.fullContextRestore = false` flag signals this to the recovery logic.
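
Point 3 implies that an unsupported approval mode is only discovered when `respondToApproval()` throws mid-session. One option, sketched here as an assumption rather than part of the design, is to reject the combination when the configuration is loaded. `checkApprovalSupport` is a hypothetical helper.

```typescript
type AgentKind = "claude" | "gemini" | "codex";
type ApprovalMode = "ask" | "auto-edit" | "auto-all" | "read-only";

// Returns a human-readable problem, or null when the combination is usable.
// Follows the text above: the Gemini subprocess MVP accepts only "auto-all".
function checkApprovalSupport(kind: AgentKind, mode: ApprovalMode): string | null {
  if (kind === "gemini" && mode !== "auto-all") {
    return `Gemini subprocess mode cannot answer approval prompts: "${mode}" is unsupported, use "auto-all"`;
  }
  return null;
}

const problem = checkApprovalSupport("gemini", "auto-edit");
if (problem !== null) {
  console.log(problem);
}
```

The same check could be applied to every entry of a fallback chain, so that a fallback to Gemini is skipped early when the workflow requires interactive approvals.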

### 3.4 Factory (`factory.ts`)

typescript
// packages/agent-adapters/src/factory.ts
import type { AgentAdapter, AgentKind } from "./interface";
import { ClaudeAdapter } from "./adapters/claude";
import { CodexAdapter } from "./adapters/codex";
import { GeminiAdapter } from "./adapters/gemini";

export function createAdapter(kind: AgentKind): AgentAdapter {
  switch (kind) {
    case "claude": return new ClaudeAdapter();
    case "codex":  return new CodexAdapter();
    case "gemini": return new GeminiAdapter();
    default: {
      const _exhaustive: never = kind;
      throw new Error(`Unknown agent kind: ${kind}`);
    }
  }
}

Assigning `kind` to a variable of type `never` in the `default` branch enforces exhaustiveness at compile time: adding a new `AgentKind` variant without a matching `case` in the factory produces a type error.

---

## 4. WORKFLOW.md Updates

### 4.1 Updated Front Matter Schema

The YAML front matter in WORKFLOW.md gains an `agent` section that controls agent selection and per-agent overrides.

yaml
---
# Agent configuration
agent:
  kind: claude                   # "claude" | "codex" | "gemini"
  model: opus                    # Agent-specific model identifier
  approval_mode: auto-edit       # "ask" | "auto-edit" | "auto-all" | "read-only"
  max_turns: 20                  # Maximum turns within a session
  timeout_ms: 3600000            # 1 hour per-session timeout

  # Per-agent overrides (only the block matching agent.kind is applied)
  claude:
    allowed_tools: ["Read", "Edit", "Bash", "Grep", "Glob"]
    max_budget_usd: 10.0
    thinking: adaptive           # "enabled" | "disabled" | "adaptive"
    effort: high                 # "low" | "medium" | "high" | "max"

  codex:
    approval_policy: auto-approve
    sandbox: full-auto
    effort: medium

  gemini:
    sandbox_mode: relaxed
    extensions: []

# Execution configuration (unchanged from Stage 2)
max_retries: 3
max_retry_backoff_ms: 300000
hooks:
  pre-session: |
    npm install
    npm run build
  post-session: |
    npm run lint -- --fix
    npm run test

# Fallback chain (new)
fallback:
  enabled: true
  chain: [claude, codex]         # If primary fails, try next in chain
  max_fallback_attempts: 1       # Only try one fallback agent
---

You are working on issue {{ issue_id }}: {{ issue_title }}.
...
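
The `fallback` block in the front matter does not spell out how the next agent is chosen. One possible reading, sketched under the assumption that the chain is walked in order and agents already tried are skipped, is shown below. `nextFallbackAgent` and the `tried` list are hypothetical.

```typescript
type AgentKind = "claude" | "gemini" | "codex";

interface FallbackConfig {
  enabled: boolean;
  chain: AgentKind[];
  maxFallbackAttempts: number;
}

// `tried` lists the agents already used for this issue, primary first.
function nextFallbackAgent(fallback: FallbackConfig, tried: AgentKind[]): AgentKind | null {
  if (!fallback.enabled) return null;
  // The first entry in `tried` is the primary run, so it does not count as a fallback attempt.
  const fallbackAttempts = Math.max(0, tried.length - 1);
  if (fallbackAttempts >= fallback.maxFallbackAttempts) return null;
  for (const kind of fallback.chain) {
    if (tried.indexOf(kind) === -1) return kind;
  }
  return null;
}

const fallback: FallbackConfig = { enabled: true, chain: ["claude", "codex"], maxFallbackAttempts: 1 };
console.log(nextFallbackAgent(fallback, ["claude"]));
```

With the sample configuration, a failed Claude run falls back to Codex once, and a second failure ends the chain because `max_fallback_attempts` is 1.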

### 4.2 Front Matter Parsing

The `prompt-renderer` package parses the front matter, including the `agent` section, and produces a typed `WorkflowConfig` object:

typescript
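// packages/prompt-renderer/src/types.ts
// Type-only import; assumes the agent-adapters barrel export exposes both types
import type { AgentKind, ApprovalMode } from "@symphony/agent-adapters";

export interface WorkflowConfig {
  agent: {
    kind: AgentKind;
    model?: string;
    approvalMode: ApprovalMode;
    maxTurns: number;
    timeoutMs: number;
    // Partial: a WORKFLOW.md may define override blocks for only some agents
    overrides: Partial<Record<AgentKind, Record<string, unknown>>>;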
  };
  maxRetries: number;
  maxRetryBackoffMs: number;
  hooks: { preSession?: string; postSession?: string };
  fallback: { enabled: boolean; chain: AgentKind[]; maxFallbackAttempts: number };
}

The parser reads `agent.kind`, selects the corresponding override block (e.g., `agent.claude` when `kind: claude`), and merges it into the `SessionConfig` passed to the adapter's `initialize()`.
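
A minimal sketch of that selection and merge step follows. It assumes the override block is kept as raw snake_case YAML keys and that only `allowed_tools` and `max_budget_usd` are mapped; `toSessionConfig`, `AgentSection`, and `SessionConfigSketch` are hypothetical names, and the real `SessionConfig` has more fields.

```typescript
type AgentKind = "claude" | "gemini" | "codex";
type ApprovalMode = "ask" | "auto-edit" | "auto-all" | "read-only";

// Mirrors the `agent` part of WorkflowConfig; `overrides` holds the raw per-agent YAML blocks.
interface AgentSection {
  kind: AgentKind;
  model?: string;
  approvalMode: ApprovalMode;
  maxTurns: number;
  timeoutMs: number;
  overrides: Partial<Record<AgentKind, Record<string, unknown>>>;
}

// Trimmed copy of SessionConfig with only the fields this sketch fills in.
interface SessionConfigSketch {
  cwd: string;
  model?: string;
  approvalMode: ApprovalMode;
  maxTurns: number;
  timeoutMs: number;
  allowedTools?: string[];
  maxBudgetUsd?: number;
}

function toSessionConfig(agent: AgentSection, cwd: string): SessionConfigSketch {
  // Only the block matching agent.kind is consulted; other blocks are ignored.
  const override: Record<string, unknown> = agent.overrides[agent.kind] ?? {};
  const config: SessionConfigSketch = {
    cwd,
    approvalMode: agent.approvalMode,
    maxTurns: agent.maxTurns,
    timeoutMs: agent.timeoutMs,
  };
  if (agent.model !== undefined) config.model = agent.model;

  // Validate untyped YAML values before copying them into the typed config.
  const tools = override["allowed_tools"];
  if (Array.isArray(tools) && tools.every((t) => typeof t === "string")) {
    config.allowedTools = tools as string[];
  }
  const budget = override["max_budget_usd"];
  if (typeof budget === "number") config.maxBudgetUsd = budget;
  return config;
}

const section: AgentSection = {
  kind: "claude",
  model: "opus",
  approvalMode: "auto-edit",
  maxTurns: 20,
  timeoutMs: 3600000,
  overrides: {
    claude: { allowed_tools: ["Read", "Edit"], max_budget_usd: 10 },
    codex: { effort: "medium" },
  },
};
console.log(toSessionConfig(section, "/tmp/workspace"));
```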

4.3 Linear Label Override

The WORKFLOW.md `agent.kind` is the default. A Linear issue label can override it:

| Linear Label | Effect |
|---|---|
| `agent:claude` | Use Claude adapter regardless of WORKFLOW.md |
| `agent:codex` | Use Codex adapter |
| `agent:gemini` | Use Gemini adapter |
| (no label) | Use WORKFLOW.md `agent.kind` |

The `LinearClient.fetchNewIssues()` method extracts `agent:*` labels and passes the override to the orchestrator. This allows per-issue agent selection without modifying WORKFLOW.md.

---

5. State Machine Updates

5.1 IssueState Discriminated Union (Updated)

Every state variant now carries `agentKind`. The session-bearing states (`session_starting`, `session_active`, `reconciling`) store it alongside the session identifier, whose field name changes from `threadId` to `agentSessionId` to be agent-agnostic.

typescript
// packages/state-machine/src/types.ts
type IssueState =
  | { status: "queued"; enqueuedAt: number; agentKind: AgentKind }
  | { status: "retry_queued"; enqueuedAt: number; retryCount: number; agentKind: AgentKind }
  | { status: "cloning"; startedAt: number; agentKind: AgentKind }
  | { status: "workspace_ready"; workspacePath: string; agentKind: AgentKind }
  | { status: "session_starting"; workspacePath: string; agentSessionId: string; agentKind: AgentKind }
  | { status: "session_active"; workspacePath: string; agentSessionId: string; turnCount: number; pid: number; agentKind: AgentKind }
  | { status: "reconciling"; workspacePath: string; agentSessionId: string; output: SessionOutput; agentKind: AgentKind }
  | { status: "done"; workspacePath: string; prUrl: string | null; completedAt: number; agentKind: AgentKind }
  | { status: "failed"; reason: string; retryCount: number; lastFailedAt: number; agentKind: AgentKind }
  | { status: "released"; releasedAt: number; reason: string; agentKind: AgentKind };

Changes from Stage 2/3:
- `threadId` renamed to `agentSessionId` (covers UUID for Claude, threadId for Codex, synthetic for Gemini)
- `agentKind` added to every state variant
- `retry_queued` added as distinct state (Stage 3 Audit 1 action item)
- `released` added for non-failure exits (Stage 3 Audit 1 action item)

5.2 Transition Function (Agent-Aware)

The transition function itself remains agent-agnostic. It does not branch on `agentKind`. The `agentKind` field is carried through state transitions as data, not logic. This is a critical design decision: the FSM does not know or care which agent is running. It only knows the lifecycle states.

typescript
function transition(state: IssueState, event: TransitionEvent): IssueState {
  // agentKind is threaded through -- never branched on
  switch (state.status) {
    case "cloning":
      if (event.type === "clone_complete")
        return { status: "workspace_ready", workspacePath: event.workspacePath, agentKind: state.agentKind };
      // ...
      break; // without this, an unmatched event would fall through into the next case
    case "session_active":
      if (event.type === "session_complete")
        return { status: "reconciling", workspacePath: state.workspacePath, agentSessionId: state.agentSessionId, output: event.output, agentKind: state.agentKind };
      // ...
      break;
  }
  // Assumption: a (state, event) pair with no defined transition leaves the state unchanged.
  return state;
}
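
To make the "carried as data, not logic" point concrete, here is a reduced sketch with only two state variants. `MiniState`, `MiniEvent`, and `miniTransition` are illustrative names, not part of the `state-machine` package. The `never` assignment in the default branch is one common way to get the compile-time exhaustiveness that task 3A.1 asks for.

```typescript
type AgentKind = "claude" | "codex" | "gemini";

// Reduced to two variants for illustration; the full union is listed in Section 5.1.
type MiniState =
  | { status: "cloning"; startedAt: number; agentKind: AgentKind }
  | { status: "workspace_ready"; workspacePath: string; agentKind: AgentKind };

type MiniEvent = { type: "clone_complete"; workspacePath: string };

function miniTransition(state: MiniState, event: MiniEvent): MiniState {
  switch (state.status) {
    case "cloning":
      // agentKind is copied from the old state and never inspected.
      return { status: "workspace_ready", workspacePath: event.workspacePath, agentKind: state.agentKind };
    case "workspace_ready":
      return state; // this sketch defines no transition out of workspace_ready
    default: {
      // Fails to type-check if a new variant is added to MiniState but not handled above.
      const unreachable: never = state;
      return unreachable;
    }
  }
}

const kinds: AgentKind[] = ["claude", "codex", "gemini"];
const results = kinds.map((agentKind) =>
  miniTransition({ status: "cloning", startedAt: 0, agentKind }, { type: "clone_complete", workspacePath: "/tmp/ws" }),
);
```

All three agent kinds pass through the same code path and come out with their `agentKind` intact, which is the property the FSM design relies on.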

5.3 SQLite Schema (Updated)

sql
CREATE TABLE IF NOT EXISTS issue_runs (
  id TEXT PRIMARY KEY,
  issue_id TEXT NOT NULL,
  issue_identifier TEXT NOT NULL,
  agent_kind TEXT NOT NULL DEFAULT 'codex',   -- NEW: "claude" | "codex" | "gemini"
  agent_session_id TEXT,                       -- RENAMED from thread_id
  state_json TEXT NOT NULL,
  workspace_path TEXT,
  retry_count INTEGER DEFAULT 0,
  created_at INTEGER NOT NULL,
  updated_at INTEGER NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_issue_runs_status ON issue_runs(json_extract(state_json, '$.status'));
CREATE INDEX IF NOT EXISTS idx_issue_runs_agent ON issue_runs(agent_kind);
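
One caveat: `CREATE TABLE IF NOT EXISTS` leaves an already-existing Stage 2 table untouched, so a database created before this expansion needs explicit `ALTER TABLE` statements. The sketch below plans those statements from the current column names (which a caller could read with `PRAGMA table_info(issue_runs)`). `planIssueRunsMigration` is a hypothetical helper; executing the returned SQL is left to whichever SQLite driver the daemon uses.

```typescript
// Given the column names currently present in issue_runs, return the statements
// needed to reach the updated schema. Pure function: it does not touch the database.
function planIssueRunsMigration(existingColumns: readonly string[]): string[] {
  const cols = new Set(existingColumns);
  const statements: string[] = [];

  if (cols.has("thread_id") && !cols.has("agent_session_id")) {
    // RENAME COLUMN requires SQLite 3.25 or newer.
    statements.push("ALTER TABLE issue_runs RENAME COLUMN thread_id TO agent_session_id;");
  } else if (!cols.has("agent_session_id")) {
    statements.push("ALTER TABLE issue_runs ADD COLUMN agent_session_id TEXT;");
  }

  if (!cols.has("agent_kind")) {
    // Rows written before the expansion were all Codex runs, so the default backfills them.
    statements.push("ALTER TABLE issue_runs ADD COLUMN agent_kind TEXT NOT NULL DEFAULT 'codex';");
  }

  // Safe to repeat on every startup.
  statements.push("CREATE INDEX IF NOT EXISTS idx_issue_runs_agent ON issue_runs(agent_kind);");
  return statements;
}

const legacyPlan = planIssueRunsMigration(["id", "issue_id", "thread_id", "state_json"]);
const currentPlan = planIssueRunsMigration(["id", "issue_id", "agent_kind", "agent_session_id", "state_json"]);
```

Because the plan is derived from the observed columns, running it twice is harmless: the second run yields only the idempotent `CREATE INDEX IF NOT EXISTS`.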

5.4 Supabase Schema (Updated)

sql
ALTER TABLE symphony_runs ADD COLUMN agent_kind TEXT NOT NULL DEFAULT 'codex';
ALTER TABLE symphony_runs RENAME COLUMN thread_id TO agent_session_id;
CREATE INDEX idx_symphony_runs_agent ON symphony_runs(agent_kind);

---

6. Session Resume Patterns Per Agent

The crash recovery logic from Stage 2 Step 3 (`recovery.ts`) must branch by `agentKind` because resume semantics differ fundamentally.

typescript
// services/symphony-daemon/src/recovery.ts
async function recoverSessions(
  store: IssueStore,
  adapterFactory: typeof createAdapter,
): Promise<void> {
  const incomplete = store.recoverIncomplete();

  for (const run of incomplete) {
    const state = JSON.parse(run.state_json) as IssueState;

    if (state.status === "session_active" && state.agentSessionId) {
      const adapter = adapterFactory(state.agentKind);

      switch (state.agentKind) {
        case "codex": {
          // CODEX: thread/resume replays all RolloutItems from
          // [home-path]
          // Full context restored. Zero progress lost.
          logger.info({ runId: run.id, agentSessionId: state.agentSessionId },
            "Recovering Codex session via thread/resume");
          await adapter.initialize(buildConfigForRun(run));
          await adapter.startSession({ resumeId: state.agentSessionId });
          orchestrator.attachRecoveredSession(run.id, adapter);
          break;
        }

        case "claude": {
          // CLAUDE: --resume <session_id> loads from
          // [home-path]
          // Full context restored. Equivalent to Codex thread/resume.
          logger.info({ runId: run.id, agentSessionId: state.agentSessionId },
            "Recovering Claude session via --resume");
          await adapter.initialize(buildConfigForRun(run));
          await adapter.startSession({ resumeId: state.agentSessionId });
          orchestrator.attachRecoveredSession(run.id, adapter);
          break;
        }

        case "gemini": {
          // GEMINI: No reliable session resume.
          // Strategy: re-queue with context injection.
          // The prompt renderer will prepend the previous session's
          // output as "Context from previous attempt" to the new prompt.
          logger.warn({ runId: run.id },
            "Gemini session cannot be resumed -- re-queuing with context injection");
          const contextSummary = await readPreviousOutput(run.workspace_path);
          store.persist(run.id, {
            status: "queued",
            enqueuedAt: Date.now(),
            agentKind: "gemini",
            _recoveryContext: contextSummary,  // Stored in SQLite, injected by prompt renderer
          });
          break;
        }
      }
    } else if (state.status === "cloning" || state.status === "session_starting") {
      // Pre-session: re-queue from scratch for any agent
      store.persist(run.id, { status: "queued", enqueuedAt: Date.now(), agentKind: state.agentKind });
    } else if (state.status === "reconciling") {
      // Post-session: retry reconciliation (agent-agnostic)
      await orchestrator.reconcile(run.id, state);
    }
  }
}

Resume Capability Summary

| Agent | Resume Method | Context Preservation | Data Loss on Crash |
|---|---|---|---|
| Codex | `thread/resume` by threadId | Full replay of all RolloutItems | None |
| Claude | `--resume <session_id>` | Full session history from disk | None |
| Gemini | Re-queue with context injection | Partial (summary of previous output) | Turns completed before crash are lost but summarized |
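
For the Gemini path, the prompt renderer's context-injection step might look like the sketch below. `injectRecoveryContext` and the `maxChars` limit are assumptions made for illustration. The only detail taken from the recovery code is the "Context from previous attempt" heading.

```typescript
// Hypothetical helper: prepends the previous attempt's output to a rendered prompt.
function injectRecoveryContext(
  prompt: string,
  recoveryContext: string | undefined,
  maxChars = 4000, // assumed cap so a long transcript cannot crowd out the actual task
): string {
  if (recoveryContext === undefined || recoveryContext.trim() === "") return prompt;

  const trimmed = recoveryContext.trim();
  // Keep the tail, on the assumption that the most recent output is the most relevant.
  const clipped = trimmed.length > maxChars ? trimmed.slice(-maxChars) : trimmed;

  return `## Context from previous attempt\n\n${clipped}\n\n---\n\n${prompt}`;
}

const basePrompt = "You are working on issue ABC-1: Fix login redirect.";
const recovered = injectRecoveryContext(basePrompt, "Edited src/auth.ts; tests not yet run.");
const fresh = injectRecoveryContext(basePrompt, undefined);
```

A run that never crashed has no recovery context, so the prompt passes through unchanged and the same renderer serves both fresh and recovered Gemini runs.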

---

7. Orchestrator Updates

7.1 Agent Selection Logic

The orchestrator determines which agent to use through a priority chain:

typescript
function selectAgent(issue: LinearIssue, workflowConfig: WorkflowConfig): AgentKind {
  // Priority 1: Linear label override (highest priority)
  const agentLabel = issue.labels.find(l => l.startsWith("agent:"));
  if (agentLabel) {
    const kind = agentLabel.split(":")[1] as AgentKind;
    if (["claude", "codex", "gemini"].includes(kind)) return kind;
  }

  // Priority 2: WORKFLOW.md agent.kind (default)
  return workflowConfig.agent.kind;
}
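
A slightly stricter variant replaces the `as AgentKind` cast with a type guard, so the compiler sees the label value as an `AgentKind` only after it has been checked. This is a sketch under assumptions: labels are plain strings as in `selectAgent()`, `agentFromLabels` is a hypothetical helper, and lowercasing the label is an added convenience. Unlike `selectAgent()`, it keeps scanning past an invalid `agent:*` label instead of falling straight back to the default.

```typescript
type AgentKind = "claude" | "codex" | "gemini";
const AGENT_KINDS: readonly AgentKind[] = ["claude", "codex", "gemini"];

// Type guard: narrows an arbitrary string to AgentKind only when it is a known value.
function isAgentKind(value: string): value is AgentKind {
  return (AGENT_KINDS as readonly string[]).includes(value);
}

// Returns the first valid agent:* label, or undefined when none is valid.
function agentFromLabels(labels: readonly string[]): AgentKind | undefined {
  for (const label of labels) {
    if (!label.startsWith("agent:")) continue;
    const candidate = label.slice("agent:".length).trim().toLowerCase();
    if (isAgentKind(candidate)) return candidate;
  }
  return undefined;
}

// Priority chain: label override first, then the WORKFLOW.md default.
function pickAgent(labels: readonly string[], workflowDefault: AgentKind): AgentKind {
  return agentFromLabels(labels) ?? workflowDefault;
}
```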

7.2 Fallback Chain

When a session fails and retries are exhausted for the primary agent, the fallback chain activates:

typescript
function selectFallbackAgent(
  failedKind: AgentKind,
  workflowConfig: WorkflowConfig,
  fallbackAttemptCount: number,
): AgentKind | null {
  if (!workflowConfig.fallback.enabled) return null;
  if (fallbackAttemptCount >= workflowConfig.fallback.maxFallbackAttempts) return null;

  const chain = workflowConfig.fallback.chain;
  const currentIdx = chain.indexOf(failedKind);
  const nextIdx = currentIdx + 1;

  if (nextIdx >= chain.length) return null;
  return chain[nextIdx];
}

Fallback creates a new run with a different `agentKind` but the same issue. The previous run transitions to `released` with reason `"fallback_to_{nextAgent}"`. The new run starts from `queued` with the fallback agent.
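
The hand-off can be traced with a small worked example. The sketch below restates the chain lookup in self-contained form and adds a hypothetical `planFallback` helper that produces the `released` reason and the new `queued` state. Only the reason format and the state shape come from the text above.

```typescript
type AgentKind = "claude" | "codex" | "gemini";
interface FallbackConfig { enabled: boolean; chain: AgentKind[]; maxFallbackAttempts: number }

function nextInChain(failed: AgentKind, cfg: FallbackConfig, attempts: number): AgentKind | null {
  if (!cfg.enabled || attempts >= cfg.maxFallbackAttempts) return null;
  // indexOf returns -1 for an agent that is not in the chain, which selects chain[0].
  const next = cfg.chain[cfg.chain.indexOf(failed) + 1];
  return next ?? null;
}

// Hypothetical record of the hand-off: how the old run ends and how the new one starts.
interface FallbackPlan {
  releasedReason: string;
  newRun: { status: "queued"; enqueuedAt: number; agentKind: AgentKind };
}

function planFallback(failed: AgentKind, cfg: FallbackConfig, attempts: number, now: number): FallbackPlan | null {
  const next = nextInChain(failed, cfg, attempts);
  if (next === null) return null;
  return {
    releasedReason: `fallback_to_${next}`,
    newRun: { status: "queued", enqueuedAt: now, agentKind: next },
  };
}

const fallbackCfg: FallbackConfig = { enabled: true, chain: ["claude", "codex"], maxFallbackAttempts: 1 };
const firstHop = planFallback("claude", fallbackCfg, 0, 1_000); // claude failed, no fallback used yet
const secondHop = planFallback("codex", fallbackCfg, 1, 2_000); // the single fallback attempt is spent
```

With the WORKFLOW.md settings from Section 4.1, a failed Claude run hands off to Codex once, and a subsequent Codex failure ends the issue because `max_fallback_attempts` is 1. Note the edge case in the comment: an agent absent from the chain falls back to the first chain entry, which is also how `selectFallbackAgent()` behaves as written.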

7.3 Concurrency Limits Per Agent

Different agents have different resource profiles. The orchestrator tracks per-agent concurrency separately:

typescript
interface ConcurrencyLimits {
  global: number;        // Total across all agents (from symphony.toml max_concurrent_sessions)
  perAgent: Record<AgentKind, number>;
}

// Default limits (configurable in symphony.toml)
const DEFAULT_LIMITS: ConcurrencyLimits = {
  global: 5,
  perAgent: {
    claude: 3,           // Each Claude session is a subprocess (~150MB RSS)
    codex: 2,            // Codex app-server is heavier (~300MB RSS)
    gemini: 3,           // Gemini subprocess is lighter but disk-intensive
  },
};

The `drainQueue()` method checks both global and per-agent limits before spawning:

typescript
private drainQueue(): void {
  // Scan at most the number of items queued on entry, so a queue that holds only
  // at-capacity agent kinds cannot spin this loop forever.
  let remaining = this.queue.size;
  while (remaining-- > 0 && this.activeSessions.size < this.limits.global) {
    const next = this.queue.peek();
    if (!next || next.enqueuedAt > Date.now()) break;

    const agentCount = this.countActiveSessions(next.agentKind);
    if (agentCount >= this.limits.perAgent[next.agentKind]) {
      // This agent type is at capacity. Skip to next item in queue.
      // (Priority queue may have items for a different agent kind.)
      this.queue.skipHead();
      continue;
    }

    this.queue.dequeue();
    this.spawnSession(next);
  }
}
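
The two-level limit check is easier to reason about, and to unit test, as a pure function over a snapshot of the queue. The sketch below is one way to express it. `pickRunnable`, `QueuedRun`, and the array-based queue are illustrative stand-ins for the orchestrator's priority queue. Unlike `drainQueue()`, it skips an item still in backoff rather than stopping at it, which assumes the snapshot is not sorted by `enqueuedAt`.

```typescript
type AgentKind = "claude" | "codex" | "gemini";
interface QueuedRun { id: string; agentKind: AgentKind; enqueuedAt: number }
interface Limits { global: number; perAgent: Record<AgentKind, number> }

// Returns the runs to spawn now, in queue order, without mutating the inputs.
function pickRunnable(
  queue: readonly QueuedRun[],
  active: Record<AgentKind, number>,
  limits: Limits,
  now: number,
): QueuedRun[] {
  const counts = { ...active };
  let total = counts.claude + counts.codex + counts.gemini;
  const picked: QueuedRun[] = [];

  for (const run of queue) {
    if (total >= limits.global) break;   // global cap reached: nothing else can start
    if (run.enqueuedAt > now) continue;  // still in retry backoff
    if (counts[run.agentKind] >= limits.perAgent[run.agentKind]) continue; // this agent kind is full
    counts[run.agentKind] += 1;
    total += 1;
    picked.push(run);
  }
  return picked;
}

const limits: Limits = { global: 5, perAgent: { claude: 3, codex: 2, gemini: 3 } };
const snapshot: QueuedRun[] = [
  { id: "a", agentKind: "claude", enqueuedAt: 0 },
  { id: "b", agentKind: "codex", enqueuedAt: 0 },
  { id: "c", agentKind: "codex", enqueuedAt: 0 },
  { id: "d", agentKind: "codex", enqueuedAt: 0 },
  { id: "e", agentKind: "gemini", enqueuedAt: 0 },
];
// Three Claude sessions are already running, so Claude is at its per-agent limit.
const toSpawn = pickRunnable(snapshot, { claude: 3, codex: 0, gemini: 0 }, limits, 10);
```

In this example run `a` is skipped (Claude is full), `b` and `c` are picked, and the scan stops at `d` because the global limit of 5 has been reached.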

7.4 Adapter Pool

The orchestrator maintains a pool of adapter instances rather than creating one per session:

typescript
class AdapterPool {
  private pools: Record<AgentKind, AgentAdapter[]> = {
    claude: [],
    codex: [],
    gemini: [],
  };
  // Adapters currently serving a session; backs isInUse().
  private inUse = new Set<AgentAdapter>();

  private isInUse(adapter: AgentAdapter): boolean {
    return this.inUse.has(adapter);
  }

  async acquire(kind: AgentKind, config: SessionConfig): Promise<AgentAdapter> {
    // Check for an idle adapter of the correct kind
    const pool = this.pools[kind];
    const idle = pool.find(a => a.isAlive() && !this.isInUse(a));
    if (idle) {
      // Note: a reused adapter keeps the config it was initialized with.
      this.inUse.add(idle);
      return idle;
    }

    // Create new adapter
    const adapter = createAdapter(kind);
    await adapter.initialize(config);
    pool.push(adapter);
    this.inUse.add(adapter);
    return adapter;
  }

  async release(adapter: AgentAdapter): Promise<void> {
    // Mark as available for reuse (Codex) or shutdown (Claude/Gemini subprocess)
    this.inUse.delete(adapter);
    if (adapter.kind === "codex") {
      // Codex app-server persists -- reuse the process for the next session
      return;
    }
    // Claude/Gemini: subprocess-per-session, shutdown after use
    await adapter.shutdown();
    this.pools[adapter.kind] = this.pools[adapter.kind].filter(a => a !== adapter);
  }
}

Codex adapters are reused because the app-server process supports multiple threads. Claude and Gemini adapters are disposed after each session because they are subprocess-per-invocation.

7.5 Prometheus Metrics (Updated)

All existing metrics gain an `agent` label:

typescript
readonly sessionsActive = this.gauge("symphony_sessions_active", "In-flight sessions", ["agent"]);
readonly sessionsTotal = this.counter("symphony_sessions_total", "Total sessions by outcome", ["outcome", "agent"]);
readonly sessionDuration = this.histogram("symphony_session_duration_seconds", "Wall-clock per session",
  [30, 60, 120, 300, 600, 1200, 1800, 2700], ["agent"]);

// New multi-agent specific metrics
readonly fallbacksTotal = this.counter("symphony_fallbacks_total", "Fallback agent activations", ["from_agent", "to_agent"]);
readonly adapterInitDuration = this.histogram("symphony_adapter_init_ms", "Adapter initialization time", [100, 500, 1000, 3000, 5000], ["agent"]);
readonly resumeSuccess = this.counter("symphony_resume_total", "Session resume attempts", ["agent", "outcome"]);

---

8. Updated Master Checklist

The original Stage 3 master plan had 44 tasks across 12 phases. This expansion replaces Phase 3 (Codex Client Integration) with four sub-phases (3A-3D) and updates Phase 5 (Orchestrator) and Phase 10 (Testing).

PHASE 3A: Agent Adapter Interface (1 hour)

- [ ] 3A.1 Define AgentAdapter interface, AgentEvent union, and supporting types
  - Owner: Claude
  - Input: Section 2 of this document
  - Output: `packages/agent-adapters/src/interface.ts`
  - Validation: `bun typecheck` passes. `AgentKind` exhaustiveness enforced.
  - Depends on: 1.2 (monorepo scaffold)
  - Status: Not Started
- [ ] 3A.2 Implement approval mode mapper
  - Owner: Claude
  - Input: Section 3 approval mode mappings per agent
  - Output: `packages/agent-adapters/src/mappings/approval-modes.ts`
  - Validation: Unit test covers all 4 modes x 3 agents = 12 mappings
  - Depends on: 3A.1
  - Status: Not Started
- [ ] 3A.3 Implement factory function
  - Owner: Claude
  - Input: Section 3.4
  - Output: `packages/agent-adapters/src/factory.ts`
  - Validation: `createAdapter("claude")` returns ClaudeAdapter. Invalid kind throws.
  - Depends on: 3A.1
  - Status: Not Started
- [ ] 3A.4 Implement session registry
  - Owner: Claude
  - Input: Maps symphonyRunId -> { agentKind, agentSessionId }
  - Output: `packages/agent-adapters/src/session-registry.ts`
  - Validation: Register, lookup, deregister operations
  - Depends on: 3A.1
  - Status: Not Started
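
As an illustration of task 3A.4, a registry covering the three validated operations could be as small as the sketch below. The class and method names mirror the task description. Rejecting duplicate registrations is an assumption, since the plan does not say how a repeated `register` should behave.

```typescript
type AgentKind = "claude" | "codex" | "gemini";
interface SessionRef { agentKind: AgentKind; agentSessionId: string }

class SessionRegistry {
  private readonly byRunId = new Map<string, SessionRef>();

  register(symphonyRunId: string, ref: SessionRef): void {
    // Assumption: a run maps to exactly one agent session at a time.
    if (this.byRunId.has(symphonyRunId)) {
      throw new Error(`Run ${symphonyRunId} is already registered`);
    }
    this.byRunId.set(symphonyRunId, { ...ref }); // copy so later caller mutations do not leak in
  }

  lookup(symphonyRunId: string): SessionRef | undefined {
    return this.byRunId.get(symphonyRunId);
  }

  // Returns true when an entry existed and was removed.
  deregister(symphonyRunId: string): boolean {
    return this.byRunId.delete(symphonyRunId);
  }
}

const registry = new SessionRegistry();
registry.register("run-1", { agentKind: "claude", agentSessionId: "session-abc" });
```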

PHASE 3B: Claude Adapter (2 hours)

- [ ] 3B.1 Implement Claude event mapper
  - Owner: Claude
  - Input: Multi-agent research Section 5.4 event mapping table
  - Output: `packages/agent-adapters/src/mappings/claude-events.ts`
  - Validation: Unit test: map SystemMessage(init), StreamEvent(text_delta), ResultMessage to AgentEvent
  - Depends on: 3A.1
  - Status: Not Started
- [ ] 3B.2 Implement ClaudeAdapter class
  - Owner: Claude
  - Input: Section 3.1 of this document
  - Output: `packages/agent-adapters/src/adapters/claude.ts`
  - Validation: Integration test with real `claude` binary: initialize, start session, send prompt, receive text_delta events, end session
  - Depends on: 3B.1, 3A.2
  - Status: Not Started
- [ ] 3B.3 Implement approval bridge for Claude
  - Owner: Claude
  - Input: Claude's `canUseTool` callback -> Symphony's event-based approval model
  - Output: Approval bridge class within `adapters/claude.ts` (deferred promise map pattern)
  - Validation: Integration test: set approval mode to "ask", trigger a tool use, verify approval_request event is emitted, respond with accept, verify tool executes
  - Depends on: 3B.2
  - Status: Not Started
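
The "deferred promise map pattern" named in task 3B.3 can be sketched independently of the Claude SDK. In the example below, a callback-style caller awaits `requestApproval()`, while the orchestrator answers later through `respond()`. The class shape, the `Decision` type, and the event fields are assumptions for illustration. How the result is passed back through `canUseTool` depends on the SDK and is not shown.

```typescript
type Decision = "accept" | "reject";
interface ApprovalRequestEvent { type: "approval_request"; requestId: string; toolName: string }

class ApprovalBridge {
  // requestId -> resolver of the promise the agent-side callback is awaiting
  private readonly pending = new Map<string, (decision: Decision) => void>();
  private readonly emit: (event: ApprovalRequestEvent) => void;
  private nextId = 0;

  constructor(emit: (event: ApprovalRequestEvent) => void) {
    this.emit = emit;
  }

  // Agent side: emit an approval_request event and wait for the decision.
  requestApproval(toolName: string): Promise<Decision> {
    const requestId = `approval-${++this.nextId}`;
    return new Promise<Decision>((resolve) => {
      this.pending.set(requestId, resolve); // store the resolver before emitting
      this.emit({ type: "approval_request", requestId, toolName });
    });
  }

  // Orchestrator side: deliver a decision. Returns false for unknown or already-answered ids.
  respond(requestId: string, decision: Decision): boolean {
    const resolve = this.pending.get(requestId);
    if (!resolve) return false;
    this.pending.delete(requestId);
    resolve(decision);
    return true;
  }

  pendingCount(): number {
    return this.pending.size;
  }
}

const emitted: ApprovalRequestEvent[] = [];
const bridge = new ApprovalBridge((event) => { emitted.push(event); });
const decisionPromise = bridge.requestApproval("Bash"); // resolves once respond() is called
```

A production version would likely also reject or time out pending entries when the session ends, so that an unanswered request cannot hold the agent open indefinitely.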

PHASE 3C: Codex Adapter (2 hours)

- [ ] 3C.1 Generate Codex TypeScript types
  - Owner: Claude
  - Input: `codex app-server generate-ts --out packages/agent-adapters/src/types/`
  - Output: `packages/agent-adapters/src/types/codex-types.ts`
  - Validation: Types compile. Spot-check against protocol docs.
  - Depends on: 1.2
  - Status: Not Started
- [ ] 3C.2 Implement Codex event mapper
  - Owner: Claude
  - Input: Multi-agent research Section 5.4
  - Output: `packages/agent-adapters/src/mappings/codex-events.ts`
  - Validation: Unit test: map item/started, item/completed, turn/completed, requestApproval
  - Depends on: 3C.1, 3A.1
  - Status: Not Started
- [ ] 3C.3 Implement CodexAdapter class with full handshake
  - Owner: Claude
  - Input: Section 3.2 + Stage 3 Audit 2 corrections
  - Output: `packages/agent-adapters/src/adapters/codex.ts`
  - Validation: Integration test with real `codex app-server`: initialize handshake, thread/start, turn/start, consume events until turn/completed, interrupt
  - Depends on: 3C.2, 3A.2
  - Status: Not Started
- [ ] 3C.4 Validate Codex handshake empirically
  - Owner: Mohamed
  - Input: Run `codex app-server` manually, send initialize + initialized + thread/start + turn/start via stdin, capture full transcript
  - Output: Transcript file at `[home-path]`
  - Validation: Transcript shows complete lifecycle
  - Depends on: Nothing (can run in parallel)
  - Status: Not Started
  - NOTE: This is the highest-risk task (from Stage 3 Audit 2). Its failure could invalidate the entire Codex adapter design.

PHASE 3D: Gemini Adapter MVP (1.5 hours)

- [ ] 3D.1 Define Gemini NDJSON types manually
  - Owner: Claude
  - Input: Multi-agent research Section 2.2
  - Output: `packages/agent-adapters/src/types/gemini-types.ts`
  - Validation: Types cover init, message, tool_use, tool_result, error, result event types
  - Depends on: 3A.1
  - Status: Not Started
- [ ] 3D.2 Implement Gemini event mapper
  - Owner: Claude
  - Input: Multi-agent research Section 5.4
  - Output: `packages/agent-adapters/src/mappings/gemini-events.ts`
  - Validation: Unit test: map each Gemini NDJSON event type to AgentEvent
  - Depends on: 3D.1
  - Status: Not Started
- [ ] 3D.3 Implement GeminiAdapter class
  - Owner: Claude
  - Input: Section 3.3 of this document
  - Output: `packages/agent-adapters/src/adapters/gemini.ts`
  - Validation: Integration test with real `gemini` binary: spawn with `-p`, parse stream-json output, verify session_start and turn_complete events
  - Depends on: 3D.2, 3A.2
  - Status: Not Started
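
The NDJSON parsing step in task 3D.3 has one classic pitfall: a stdout chunk can end in the middle of a line. The sketch below buffers the trailing partial line until the next chunk arrives. The event type names come from task 3D.1. The `NdjsonParser` class and the untyped payload are assumptions, and non-JSON lines are silently skipped here, although a real adapter might log them instead.

```typescript
// Event type names taken from task 3D.1; payload fields are left untyped in this sketch.
type GeminiEventType = "init" | "message" | "tool_use" | "tool_result" | "error" | "result";
interface GeminiEvent { type: GeminiEventType; [key: string]: unknown }

class NdjsonParser {
  private buffer = "";

  // Feed a raw stdout chunk; returns every event completed by this chunk.
  push(chunk: string): GeminiEvent[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line, or "" when the chunk ended on a newline.
    this.buffer = lines.pop() ?? "";

    const events: GeminiEvent[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue;
      try {
        const parsed: unknown = JSON.parse(trimmed);
        // Shape check only: the type value is not validated against the union here.
        if (typeof parsed === "object" && parsed !== null && "type" in parsed) {
          events.push(parsed as GeminiEvent);
        }
      } catch {
        // Not JSON (for example a stray log line): skipped in this sketch.
      }
    }
    return events;
  }
}

const parser = new NdjsonParser();
const firstBatch = parser.push('{"type":"init"}\n{"type":"mess');  // second event is split mid-line
const secondBatch = parser.push('age","text":"hi"}\nnot json\n');  // completes it, then a junk line
```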

PHASE 5 UPDATE: Orchestrator (Updated for Multi-Agent)

The following tasks from the original Phase 5 are modified:

- [ ] 5.3 (UPDATED) Implement Orchestrator with agent selection
  - Owner: Claude
  - Input: Section 7 of this document (agent selection, fallback chain, concurrency limits, adapter pool)
  - Output: `services/symphony-daemon/src/orchestrator.ts` with `selectAgent()`, `selectFallbackAgent()`, `AdapterPool`, per-agent concurrency tracking. `drainQueue()` respects both global and per-agent limits.
  - Validation: Integration test: configure WORKFLOW.md with `agent.kind: claude`, create issue with `agent:codex` label, verify Codex is used (label overrides WORKFLOW.md)
  - Depends on: 3A-3D (all adapter phases), 4.1 (Linear client), 5.1, 5.2
  - Status: Not Started

PHASE 10 UPDATE: Multi-Agent Test Matrix

- [ ] 10.3 (REPLACED) Integration test: Claude session lifecycle
  - Owner: Claude + Mohamed
  - Input: Real `claude` binary
  - Output: `packages/agent-adapters/src/__tests__/claude.integration.test.ts`
  - Tests: initialize, startSession, sendPrompt (receive text_delta events), interrupt, endSession (verify usage), resume after kill
  - Status: Not Started
- [ ] 10.3B (NEW) Integration test: Codex session lifecycle
  - Owner: Claude + Mohamed
  - Input: Real `codex app-server`
  - Output: `packages/agent-adapters/src/__tests__/codex.integration.test.ts`
  - Tests: full 4-phase handshake, thread/start, turn/start, event streaming, turn/interrupt, thread/resume
  - Status: Not Started
- [ ] 10.3C (NEW) Integration test: Gemini session lifecycle
  - Owner: Claude + Mohamed
  - Input: Real `gemini` binary
  - Output: `packages/agent-adapters/src/__tests__/gemini.integration.test.ts`
  - Tests: subprocess spawn with stream-json, NDJSON parsing, exit code handling (0, 1, 42, 53)
  - Status: Not Started
- [ ] 10.5 (UPDATED) End-to-end test: full issue lifecycle with each agent
  - Owner: Mohamed (manual)
  - Input: 3 Linear issues, each with a different `agent:*` label, pointing to a test repo
  - Output: All 3 agents pick up their respective issues, run sessions, create PRs, post Linear comments
  - Validation: 3 PRs on GitHub, 3 Linear comments, 3 `symphony_runs` rows with correct `agent_kind`
  - Status: Not Started
- [ ] 10.7 (NEW) Fallback chain test
  - Owner: Mohamed (manual)
  - Input: Issue configured with `agent.kind: gemini` and `fallback.chain: [gemini, claude]`. Gemini intentionally fails (e.g., model not available).
  - Output: Gemini run transitions to `released`, new run with `agent_kind: claude` is created and completes
  - Validation: Two `symphony_runs` rows: one Gemini (released), one Claude (done)
  - Status: Not Started
- [ ] 10.8 (NEW) Crash recovery test per agent
  - Owner: Mohamed (manual)
  - Input: Start 3 sessions (one per agent), kill daemon mid-session
  - Output: After LaunchAgent restart: Codex resumes via thread/resume, Claude resumes via --resume, Gemini re-queues with context injection
  - Validation: Logs show correct recovery strategy per agent. All 3 sessions complete.
  - Status: Not Started

---

9. Practical Considerations

9.1 Cost Model Per Agent

| Agent | Pricing Model | Estimated Cost per Issue Session | Notes |
|---|---|---|---|
| Claude Code (Opus) | $15/M input + $75/M output | $2-$8 per session (20 turns) | `max_budget_usd` cap available |
| Claude Code (Sonnet) | $3/M input + $15/M output | $0.50-$2 per session | Better for simple issues |
| Codex (o4-mini) | Free with Pro/Team subscription | $0 marginal cost | Limited by rate limits |
| Codex (gpt-5.3) | API pricing TBD | $1-$5 per session (estimate) | Not yet available for all users |
| Gemini CLI (2.5 Pro) | Free tier: 25 RPM | $0 for low volume | 1M context window advantage |
| Gemini CLI (2.5 Flash) | Free tier: 100 RPM | $0 for low volume | Fastest for simple tasks |

Cost strategy: Use Codex (o4-mini) as the default for most issues (free with subscription). Escalate to Claude Opus for complex architectural issues. Use Gemini 2.5 Pro for issues requiring massive context (long files, large codebases) where the 1M context window is an advantage.
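
The per-session estimates follow directly from the per-million-token prices. As a worked example, the sketch below prices a hypothetical session that consumes 200K input tokens and 20K output tokens across all turns. Those token counts are illustrative assumptions, not measurements.

```typescript
interface TokenPricing { inputPerMillionUsd: number; outputPerMillionUsd: number }

// Prices from the cost table above.
const OPUS: TokenPricing = { inputPerMillionUsd: 15, outputPerMillionUsd: 75 };
const SONNET: TokenPricing = { inputPerMillionUsd: 3, outputPerMillionUsd: 15 };

function estimateSessionCostUsd(inputTokens: number, outputTokens: number, pricing: TokenPricing): number {
  // Multiply first and divide once, which keeps the arithmetic exact for whole-dollar prices.
  return (inputTokens * pricing.inputPerMillionUsd + outputTokens * pricing.outputPerMillionUsd) / 1_000_000;
}

// Hypothetical session: 200K input tokens and 20K output tokens in total.
const opusCost = estimateSessionCostUsd(200_000, 20_000, OPUS);     // 3.00 + 1.50 = 4.50
const sonnetCost = estimateSessionCostUsd(200_000, 20_000, SONNET); // 0.60 + 0.30 = 0.90
```

Both figures fall inside the ranges in the table ($2-$8 for Opus, $0.50-$2 for Sonnet), and the 5x gap between them is simply the price ratio.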

9.2 Model Selection Strategy

Issue Triage -> Agent Selection:

  Simple bug fix (< 100 LOC change)
    -> Codex o4-mini (free, fast) or Gemini 2.5 Flash (free, fastest)

  Moderate feature (100-500 LOC, multi-file)
    -> Claude Sonnet (best code quality per dollar) or Codex o4-mini

  Complex architecture (500+ LOC, cross-system)
    -> Claude Opus (highest reasoning quality)

  Large context (needs to read 10+ files, 50K+ tokens of context)
    -> Gemini 2.5 Pro (1M context window, free tier)

  Security-sensitive (auth, payments, encryption)
    -> Claude Opus with read-only mode first, then auto-edit

This strategy can be encoded as rules in WORKFLOW.md using Liquid conditionals:

liquid
{% if labels contains "security" %}
  {% comment %} Force Claude Opus for security issues {% endcomment %}
{% elsif labels contains "simple-fix" %}
  {% comment %} Use cheapest available agent {% endcomment %}
{% endif %}

9.3 MCP Server Compatibility

| MCP Feature | Claude Code | Codex | Gemini CLI |
|---|---|---|---|
| Stdio MCP servers | Native support | Native support | Native support (extensions) |
| SSE MCP servers | Supported | Supported | Not yet confirmed |
| HTTP MCP servers | Supported | Experimental | Not yet confirmed |
| In-process MCP | SDK only | Not supported | Not supported |

For cross-agent compatibility, Symphony should use only stdio MCP servers. The 15 registered MCP servers in Mohamed's setup (see MEMORY.md) all use stdio transport, so this restriction costs nothing in practice.

MCP server configuration flows through `SessionConfig.mcpServers` to the adapter, which translates to the agent's native MCP format:
- Claude: `mcpServers` option in `ClaudeAgentOptions`
- Codex: MCP server registration via app-server config
- Gemini: `--extensions` flag or `.gemini/settings.json`

9.4 Configuration Updates to `symphony.toml`

toml
# New sections for multi-agent support

[agents]
default = "codex"                    # Default agent if WORKFLOW.md does not specify

[agents.claude]
binary = "claude"
max_concurrent = 3
default_model = "sonnet"
max_budget_usd = 10.0

[agents.codex]
binary = "codex"
args = ["app-server"]
max_concurrent = 2
default_model = "o4-mini"

[agents.gemini]
binary = "gemini"
max_concurrent = 3
default_model = "gemini-2.5-pro"

[agents.fallback]
enabled = true
chain = ["codex", "claude"]
max_attempts = 1

The original `[codex]` section is replaced by `[agents.codex]`. The `[agents]` section provides global defaults that WORKFLOW.md can override.
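
The layering between the two files can be expressed as a small resolution function. This is a sketch under assumptions: `TomlAgents` and `WorkflowAgent` are simplified stand-ins for the parsed `symphony.toml` and WORKFLOW.md structures, and only the agent kind and model are resolved here.

```typescript
type AgentKind = "claude" | "codex" | "gemini";

// Simplified view of the [agents] tables in symphony.toml.
interface TomlAgents {
  default: AgentKind;
  perAgent: Record<AgentKind, { defaultModel: string; maxConcurrent: number }>;
}

// Simplified view of the WORKFLOW.md agent section; both fields may be absent.
interface WorkflowAgent { kind?: AgentKind; model?: string }

function resolveAgentAndModel(toml: TomlAgents, workflow: WorkflowAgent): { kind: AgentKind; model: string } {
  const kind = workflow.kind ?? toml.default;                        // WORKFLOW.md wins when present
  const model = workflow.model ?? toml.perAgent[kind].defaultModel;  // else the per-agent default
  return { kind, model };
}

// Values copied from the symphony.toml example above.
const tomlAgents: TomlAgents = {
  default: "codex",
  perAgent: {
    claude: { defaultModel: "sonnet", maxConcurrent: 3 },
    codex: { defaultModel: "o4-mini", maxConcurrent: 2 },
    gemini: { defaultModel: "gemini-2.5-pro", maxConcurrent: 3 },
  },
};

const noWorkflow = resolveAgentAndModel(tomlAgents, {});
const workflowOpus = resolveAgentAndModel(tomlAgents, { kind: "claude", model: "opus" });
```

The model default is looked up after the kind is resolved, so a WORKFLOW.md that sets only `kind: gemini` picks up `gemini-2.5-pro` rather than the default agent's model.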

---

10. Architecture Diagram

                          +-------------------+
                          |   Linear API      |
                          |  (poll / webhook) |
                          +--------+----------+
                                   |
                          +--------v----------+
                          |   LinearClient    |
                          |  (label parser,   |
                          |   agent override) |
                          +--------+----------+
                                   |
                                   v
     +-------------------------------------------------------------+
     |                        ORCHESTRATOR                         |
     |                                                             |
     |  selectAgent() --> agentKind                                |
     |  AdapterPool.acquire(kind) --> adapter                      |
     |  drainQueue() respects per-agent concurrency limits         |
     |  handleCompletion() --> reconcile or fallback               |
     |                                                             |
     |  +------------------+  +-----------+  +-------------------+ |
     |  | Promise.race()   |  | Priority  |  | Session           | |
     |  | poll|webhook|done|  | Queue     |  | Completions       | |
     |  +------------------+  +-----------+  +-------------------+ |
     +------+-----------------+-----------------+------------------+
            |                 |                 |
            v                 v                 v
     +-------------+   +-------------+   +-------------+
     |   Claude    |   |   Codex     |   |   Gemini    |
     |   Adapter   |   |   Adapter   |   |   Adapter   |
     |             |   |             |   |             |
     | SDK Client  |   | app-server  |   | subprocess  |
     | subprocess  |   | JSON-RPC    |   | stream-json |
     +-------------+   +-------------+   +-------------+
            |                 |                 |
            v                 v                 v
     +------+-----------------+-----------------+------+
     |                EVENT NORMALIZER                 |
     |  Claude msg -> AgentEvent                       |
     |  Codex notification -> AgentEvent               |
     |  Gemini NDJSON -> AgentEvent                    |
     +----+------------+------------+------------+-----+
          |            |            |            |
          v            v            v            v
     +----------+ +----------+ +----------+ +----------+
     |  SQLite  | | Supabase | |Prometheus| |  Logger  |
     |  (local  | | (remote  | |  :8601   | |  (JSON)  |
     |  truth)  | |  truth)  | |          | |          |
     +----------+ +----------+ +----------+ +----------+

---

11. Migration Path from Stage 2/3

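The multi-agent expansion is backward compatible: existing Codex-only runs keep working because `agent_kind` defaults to `codex`. The migration is additive apart from one column rename (`thread_id` to `agent_session_id`).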

### Phase 1: Package Rename (30 min)
1. `mv packages/codex-client packages/agent-adapters`
2. Move existing `CodexSession` class into `adapters/codex.ts`
3. Add `interface.ts`, `factory.ts` with Codex as the only implementation
4. Update imports in `state-machine` and `symphony-daemon`
5. All existing tests pass -- behavior is identical, only the package name changes

### Phase 2: Add Claude Adapter (2 hours)
1. Implement `adapters/claude.ts` and `mappings/claude-events.ts`
2. Add `agent_kind` column to SQLite and Supabase schemas
3. Update `IssueState` to carry `agentKind`
4. Update WORKFLOW.md parser to read `agent` section
5. Codex continues as default -- Claude is available but opt-in via label

### Phase 3: Add Gemini Adapter (1.5 hours)
1. Implement `adapters/gemini.ts` and `mappings/gemini-events.ts`
2. Update recovery logic for Gemini's no-resume behavior
3. All three agents available -- selection via WORKFLOW.md or label

### Phase 4: Orchestrator Multi-Agent Features (2 hours)
1. Implement `AdapterPool` with per-agent concurrency
2. Implement fallback chain
3. Add multi-agent Prometheus labels
4. Update Grafana dashboard with agent breakdown

Total estimated time for the multi-agent expansion: 6-7 hours of implementation on top of the existing Stage 2/3 architecture.

---

12. What This Document Does NOT Cover

1. ACP convergence. Both Claude Code and Gemini CLI support ACP (Agent Client Protocol). If Codex adds ACP support, Symphony could use a single ACP transport for all three agents, eliminating the per-agent adapter code. This is a future optimization, not a V1 requirement.

2. Remote agent execution. All three adapters currently assume the agent binary is available locally. Running agents on remote machines (e.g., Mac2 for Claude, Mac4 for Codex) would require SSH-based transport or a network adapter layer.

3. Cost optimization routing. Automatically selecting the cheapest agent for a given issue complexity requires a complexity estimator, which does not exist yet. The strategy in Section 9.2 is manual (label-based). Automated routing is a V2 feature.

4. Multi-agent collaboration. Running two agents on the same issue (e.g., Claude reviews code written by Codex) is architecturally possible (two runs for one issue, different `agentKind`) but not specified in this expansion.


Source Anchor

evo-cube-output/stage3-multi-agent-expansion.md
