Grand Diomande Research · Full HTML Reader

Agent Router -- Convergence Broker

The agent router decides which AI agent handles a given task. It scores complexity, checks rate limits, and falls back through a priority chain when the preferred agent is unavailable.

---

1. Routing Rules

Each agent has a domain where it performs best. The router matches task characteristics to agent strengths.

### Claude Opus
- Complex architecture decisions spanning multiple systems
- iOS SwiftUI development (TCA, navigation, async patterns)
- Multi-file refactors that require full codebase context
- System design, protocol definitions, cross-service integration
- Creative writing, naming, brand voice work
- Tasks requiring 100K+ context windows

### Claude Sonnet
- Standard feature implementation (single module, 2-5 files)
- Code review and quality analysis
- Bug investigation with moderate context needs
- Documentation generation from code
- Test suite creation for existing modules
- Prompt engineering and template refinement

### Codex (OpenAI)
- Focused single-file code changes with clear specs
- Test writing from function signatures
- Linting fixes, formatting, mechanical refactors
- Dependency version bumps with migration
- Regex construction, data transformation scripts
- Tasks with well-defined input/output contracts

### Gemini
- Media generation (images via Imagen, video via Veo)
- Vision analysis (screenshot review, UI audit, OCR)
- Large context research (200K+ token documents)
- Multi-modal tasks (image + text reasoning)
- Batch summarization of large corpora
- Audio transcription and analysis

### OpenCode (cloud-vm)
- Docker container management and debugging
- Infrastructure configuration (Caddy, Prometheus, systemd)
- Cloud-vm-local file operations and service restarts
- Database migrations on cloud-vm PostgreSQL
- Log analysis and monitoring setup
- Network debugging (iptables, Tailscale, port forwarding)

---

2. Complexity Scoring

Every incoming task gets a 0-10 complexity score. The score drives agent selection.

Scoring Dimensions

| Dimension | Weight | 0 (low) | 5 (mid) | 10 (high) |
|---|---|---|---|---|
| File scope | 0.25 | 1 file | 3-5 files | 10+ files or cross-repo |
| Context depth | 0.20 | Self-contained | Needs 1-2 related modules | Needs full system understanding |
| Ambiguity | 0.20 | Exact spec given | Some design decisions | Open-ended, creative |
| Risk | 0.15 | Reversible, no deploy | Needs testing | Production data, infra changes |
| Domain expertise | 0.20 | Generic coding | Framework-specific | Platform-specific (iOS, Rust, Stacks) |

Score to Agent Mapping

0-3:  Codex (fast, cheap, focused)
      Examples: add a test, fix a lint error, rename a variable,
      update a version string, write a regex

4-6:  Claude Sonnet (balanced cost/capability)
      Examples: implement a Prefect flow, add a SwiftUI view,
      refactor a service class, write integration tests,
      build a CLI command

7-10: Claude Opus (maximum capability)
      Examples: design a new subsystem, multi-service refactor,
      debug a cross-machine coordination issue, architect
      a new app from scratch, creative/brand work

Special: Gemini (media/vision tasks regardless of complexity)
         OpenCode (cloud-vm tasks regardless of complexity)
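The mapping above can be sketched as a small lookup. This is a minimal illustration, not the router's actual code: the agent identifiers (other than `claude-sonnet`), the `domain` tags `"media"` and `"infra"`, and the function name `select_agent` are hypothetical. The source lists integer bands (0-3, 4-6, 7-10) and does not say how fractional scores between bands are handled; the sketch assumes half-open bands, which agrees with the decision record example that routes a 6.2 score to Sonnet.

```python
from typing import Optional

# Hypothetical agent identifiers; only "claude-sonnet" appears in the source.
CODEX, SONNET, OPUS = "codex", "claude-sonnet", "claude-opus"
GEMINI, OPENCODE = "gemini", "opencode"


def select_agent(score: float, domain: Optional[str] = None) -> str:
    """Map a 0-10 complexity score to the preferred agent.

    `domain` is a hypothetical tag: "media" (media/vision work) and
    "infra" (cloud-vm work) are routed regardless of the score.
    """
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"score out of range: {score}")
    if domain == "media":
        return GEMINI
    if domain == "infra":
        return OPENCODE
    # Assumption: half-open bands, so 3.5 goes to Codex and 6.2 to Sonnet.
    if score < 4.0:
        return CODEX
    if score < 7.0:
        return SONNET
    return OPUS


if __name__ == "__main__":
    for s, d in [(2.0, None), (6.2, None), (8.5, None), (1.0, "media")]:
        print(s, d, "->", select_agent(s, d))
```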

Score Calculation

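```python
def score_task(task) -> float:
    """Return the weighted 0-10 complexity score for a task.

    Each attribute is a 0-10 rating for one scoring dimension.
    The weights sum to 1.0, so the result also lies in 0-10.
    """
    return (
        task.file_count_score * 0.25 +
        task.context_depth * 0.20 +
        task.ambiguity * 0.20 +
        task.risk * 0.15 +
        task.domain_expertise * 0.20
    )
```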
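As a worked example of the calculation, the sketch below scores one hypothetical task. The `Task` dataclass and the `WEIGHTS` table are assumptions made so the example is self-contained; only the attribute names and weights come from the function above.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical container; each field is a 0-10 dimension rating."""
    file_count_score: float
    context_depth: float
    ambiguity: float
    risk: float
    domain_expertise: float


# Weights from the scoring dimensions table; they sum to 1.0.
WEIGHTS = {
    "file_count_score": 0.25,
    "context_depth": 0.20,
    "ambiguity": 0.20,
    "risk": 0.15,
    "domain_expertise": 0.20,
}


def score_task(task: Task) -> float:
    """Weighted sum of the five dimension ratings."""
    return sum(getattr(task, name) * w for name, w in WEIGHTS.items())


if __name__ == "__main__":
    # A mid-sized SwiftUI change: mid ratings everywhere, platform-specific.
    task = Task(file_count_score=5, context_depth=5, ambiguity=5,
                risk=5, domain_expertise=10)
    print(round(score_task(task), 2))
```

For this task the terms are 1.25 + 1.0 + 1.0 + 0.75 + 2.0 = 6.0, which falls in the Claude Sonnet band.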

---

3. Rate Limit Awareness

The router must respect API rate limits and ACMP pool rotation.

Check Sequence

1. Pool status check: Query ACMP account pool (`acmp_autonomy_level` state) for current rotation position and remaining capacity per account.

2. Rate limit headers: Cache the most recent `x-ratelimit-remaining` and `retry-after` values per agent type. Source these from prompt-logger session data or direct API response headers.

3. Backoff state: Maintain a per-agent cooldown timer. If an agent returned HTTP 429 in the last N seconds (N being the cooldown window), mark it as throttled.
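The backoff-state step could look roughly like the following. This is a sketch under stated assumptions: the class name `CooldownTracker`, the 60-second default window, and the injectable `clock` are all hypothetical, and the source does not say whether a `retry-after` value should override the default window (the sketch assumes it does).

```python
import time
from typing import Callable, Dict, Optional


class CooldownTracker:
    """Per-agent cooldown timers driven by 429 responses (hypothetical)."""

    def __init__(self, default_cooldown_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._default = default_cooldown_s  # the "N seconds" window, assumed
        self._clock = clock                 # injectable for testing
        self._throttled_until: Dict[str, float] = {}

    def record_429(self, agent: str,
                   retry_after_s: Optional[float] = None) -> None:
        """Start (or restart) the cooldown for an agent that returned 429."""
        wait = self._default if retry_after_s is None else retry_after_s
        self._throttled_until[agent] = self._clock() + wait

    def is_throttled(self, agent: str) -> bool:
        """True while the agent's cooldown has not yet expired."""
        deadline = self._throttled_until.get(agent, float("-inf"))
        return self._clock() < deadline


if __name__ == "__main__":
    tracker = CooldownTracker(default_cooldown_s=30.0)
    tracker.record_429("codex")
    print(tracker.is_throttled("codex"), tracker.is_throttled("gemini"))
```

A monotonic clock is used so that wall-clock adjustments cannot shorten or extend a cooldown.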

Limits to Track

| Agent | Requests/min | Output tokens/min | Daily cap |
|---|---|---|---|
| Claude Opus | 50 | 80K | Budget-gated |
| Claude Sonnet | 100 | 160K | Budget-gated |
| Codex | 20 | 100K | Usage-tier |
| Gemini Flash | 60 | 200K | Generous free tier |
| Gemini Pro | 10 | 50K | Pay-per-use |

Queue Discipline

- Never queue more than 3 tasks for a single agent type. If the queue hits 3, overflow to the next agent in the fallback chain.
- Tasks already in flight do not count against the queue limit (they have been dispatched).
- Priority tasks (user-initiated, blocking) can bypass the queue limit by 1 (max 4).
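The queue rules reduce to a single admission check. The sketch below is illustrative; the function and constant names are hypothetical, and `queued` is assumed to count only waiting tasks, not tasks already in flight.

```python
MAX_QUEUED = 3       # per agent type, from the rule above
PRIORITY_EXTRA = 1   # priority tasks may exceed the limit by one (max 4)


def can_enqueue(queued: int, priority: bool = False) -> bool:
    """Return True if one more task may wait in this agent's queue.

    `queued` excludes in-flight tasks, which do not count against the limit.
    """
    if queued < 0:
        raise ValueError("queued must be non-negative")
    limit = MAX_QUEUED + (PRIORITY_EXTRA if priority else 0)
    return queued < limit


if __name__ == "__main__":
    for depth in range(5):
        print(depth, can_enqueue(depth), can_enqueue(depth, priority=True))
```

When `can_enqueue` returns False, the caller would overflow the task to the next agent in the fallback chain.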

---

4. Fallback Chain

When the preferred agent is unavailable (rate limited, down, queue full), the router walks a fallback chain.

Primary Chain

Claude Opus -> Claude Sonnet -> Codex -> Gemini -> queue

Domain-Specific Overrides

iOS/SwiftUI:     Opus -> Sonnet -> queue (no fallback to non-Claude)
Media/Vision:    Gemini -> Opus -> queue (Gemini is primary)
Infrastructure:  OpenCode -> Sonnet -> Codex -> queue
Simple fixes:    Codex -> Sonnet -> queue (skip Opus entirely)
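One way to express the primary chain and the overrides is as ordered lists that the router walks until an agent is available. This is a sketch only: the agent identifiers, the domain keys, and the `is_available` callback are hypothetical, and starting the walk at the preferred agent's position (so a task never moves up the chain) is an assumption rather than something the source states.

```python
from typing import Callable, Dict, List, Optional

QUEUE = "queue"  # sentinel meaning "no agent free, send to retry queue"

PRIMARY_CHAIN: List[str] = ["claude-opus", "claude-sonnet", "codex", "gemini"]

# Hypothetical domain keys mirroring the overrides listed above.
OVERRIDES: Dict[str, List[str]] = {
    "ios": ["claude-opus", "claude-sonnet"],
    "media": ["gemini", "claude-opus"],
    "infra": ["opencode", "claude-sonnet", "codex"],
    "simple": ["codex", "claude-sonnet"],
}


def resolve_agent(preferred: str, domain: Optional[str],
                  is_available: Callable[[str], bool]) -> str:
    """Return the first available agent at or after `preferred` in the chain."""
    chain = OVERRIDES.get(domain, PRIMARY_CHAIN) if domain else PRIMARY_CHAIN
    if preferred in chain:
        # Assumption: only walk downward from the preferred agent.
        chain = chain[chain.index(preferred):]
    for agent in chain:
        if is_available(agent):
            return agent
    return QUEUE


if __name__ == "__main__":
    opus_down = lambda agent: agent != "claude-opus"
    print(resolve_agent("claude-opus", None, opus_down))
    print(resolve_agent("claude-opus", "ios", lambda agent: False))
```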

Fallback Behavior

1. Downgrade gracefully: When falling back from Opus to Sonnet, split multi-file tasks into per-file subtasks if possible. Sonnet handles focused work better than sprawling context.

2. Never upgrade silently: The router does not promote a task from Codex-tier to Opus-tier. If Codex is unavailable and the task scores 0-3, fall through to Sonnet but flag it as "overqualified agent" in telemetry.

3. Queue as last resort: If all agents in the chain are exhausted, the task enters a retry queue with exponential backoff (30s, 60s, 120s, max 5min). The queue is drained on the next availability check.

4. Human escalation: If a task has been in the retry queue for >15 minutes, emit a notification (Telegram/Discord) with the task summary and current agent status.
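The retry schedule and the escalation threshold can be sketched as two small helpers. The names are hypothetical. The source gives the delays 30s, 60s, 120s and a 5-minute cap; the sketch assumes plain doubling continues until the cap, which implies a 240s step that the source does not list explicitly.

```python
BASE_DELAY_S = 30.0
MAX_DELAY_S = 300.0          # 5-minute cap from the rule above
ESCALATE_AFTER_S = 15 * 60   # notify a human after more than 15 minutes


def backoff_delay_s(attempt: int) -> float:
    """Delay before retry number `attempt` (0-based), doubling up to the cap."""
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    # Clamp the exponent so very large attempt counts cannot overflow.
    exponent = min(attempt, 16)
    return min(BASE_DELAY_S * (2 ** exponent), MAX_DELAY_S)


def should_escalate(seconds_in_queue: float) -> bool:
    """True once a task has waited longer than the escalation threshold."""
    return seconds_in_queue > ESCALATE_AFTER_S


if __name__ == "__main__":
    print([backoff_delay_s(i) for i in range(6)])
    print(should_escalate(16 * 60))
```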

---

5. Routing Decision Record

Every routing decision is logged for cost analysis and tuning.

```json
{
  "task_id": "t_abc123",
  "timestamp": "2026-03-18T14:30:00Z",
  "complexity_score": 6.2,
  "preferred_agent": "claude-sonnet",
  "actual_agent": "claude-sonnet",
  "fallback_used": false,
  "fallback_reason": null,
  "estimated_tokens": 45000,
  "estimated_cost_cents": 22.5,
  "queue_depth_at_dispatch": 1,
  "decision_latency_ms": 12
}
```
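A record with these fields could be built and serialized along the following lines. The `RoutingDecision` class and its `to_json_line` method are hypothetical; the field names are taken from the example above. Deriving `fallback_used` from the two agent fields, rather than storing it separately, is a design choice of this sketch that keeps the flag from drifting out of sync.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class RoutingDecision:
    """One routing decision, mirroring the fields in the example record."""
    task_id: str
    timestamp: str                 # ISO 8601, UTC
    complexity_score: float
    preferred_agent: str
    actual_agent: str
    fallback_reason: Optional[str]
    estimated_tokens: int
    estimated_cost_cents: float
    queue_depth_at_dispatch: int
    decision_latency_ms: int

    def to_json_line(self) -> str:
        """Serialize to a single JSON line suitable for an append-only log."""
        record = asdict(self)
        # Derived here so it always agrees with the two agent fields.
        record["fallback_used"] = self.actual_agent != self.preferred_agent
        return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    decision = RoutingDecision(
        task_id="t_abc123", timestamp="2026-03-18T14:30:00Z",
        complexity_score=6.2, preferred_agent="claude-sonnet",
        actual_agent="claude-sonnet", fallback_reason=None,
        estimated_tokens=45000, estimated_cost_cents=22.5,
        queue_depth_at_dispatch=1, decision_latency_ms=12,
    )
    print(decision.to_json_line())
```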

---

6. Integration Points

### Inputs
- Task descriptor: From NUMU bus, Prefect flow, or direct CLI dispatch
- ACMP pool state: `[home-path]` + Supabase `account_pool`
- Rate limit cache: `[home-path]` (updated by prompt-logger hooks)
- Cost tracker: `[home-path]` (from `creative_cost_tracker.py`)

### Outputs
- Dispatch event: NUMU bus message `agent.dispatch` with agent, task_id, estimated cost
- Routing log: Append to `[home-path]`
- Metrics: Prometheus counters via `push_counter("agent_dispatch", {agent, complexity_tier})`

### Dependencies
- `feedhub.config` for Supabase credentials
- `feedhub.webhooks` for Telegram/Discord notifications on escalation
- `creative_cost_tracker` for budget awareness (do not dispatch if daily spend exceeds threshold)

Source Anchor

projects/convergence-broker/agent-router.md
