# Agent Router -- Convergence Broker
The agent router decides which AI agent handles a given task. It scores complexity, checks rate limits, and falls back through a priority chain when the preferred agent is unavailable.
---
## 1. Routing Rules
Each agent has a domain where it performs best. The router matches task characteristics to agent strengths.
### Claude Opus
- Complex architecture decisions spanning multiple systems
- iOS SwiftUI development (TCA, navigation, async patterns)
- Multi-file refactors that require full codebase context
- System design, protocol definitions, cross-service integration
- Creative writing, naming, brand voice work
- Tasks requiring 100K+ context windows
### Claude Sonnet
- Standard feature implementation (single module, 2-5 files)
- Code review and quality analysis
- Bug investigation with moderate context needs
- Documentation generation from code
- Test suite creation for existing modules
- Prompt engineering and template refinement
### Codex (OpenAI)
- Focused single-file code changes with clear specs
- Test writing from function signatures
- Linting fixes, formatting, mechanical refactors
- Dependency version bumps with migration
- Regex construction, data transformation scripts
- Tasks with well-defined input/output contracts
### Gemini
- Media generation (images via Imagen, video via Veo)
- Vision analysis (screenshot review, UI audit, OCR)
- Large context research (200K+ token documents)
- Multi-modal tasks (image + text reasoning)
- Batch summarization of large corpora
- Audio transcription and analysis
### OpenCode (cloud-vm)
- Docker container management and debugging
- Infrastructure configuration (Caddy, Prometheus, systemd)
- Cloud-vm-local file operations and service restarts
- Database migrations on cloud-vm PostgreSQL
- Log analysis and monitoring setup
- Network debugging (iptables, Tailscale, port forwarding)
---
## 2. Complexity Scoring
Every incoming task gets a 0-10 complexity score. The score drives agent selection.
### Scoring Dimensions
| Dimension | Weight | 0 (low) | 5 (mid) | 10 (high) |
|---|---|---|---|---|
| File scope | 0.25 | 1 file | 3-5 files | 10+ files or cross-repo |
| Context depth | 0.20 | Self-contained | Needs 1-2 related modules | Needs full system understanding |
| Ambiguity | 0.20 | Exact spec given | Some design decisions | Open-ended, creative |
| Risk | 0.15 | Reversible, no deploy | Needs testing | Production data, infra changes |
| Domain expertise | 0.20 | Generic coding | Framework-specific | Platform-specific (iOS, Rust, Stacks) |
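As a worked example of how the weights combine, the sketch below scores one hypothetical task. The weights are copied from the table; the function name `weighted_score`, the dictionary keys, and the example task are illustrative assumptions, not part of the router.

```python
import math

# Weights copied from the Scoring Dimensions table.
WEIGHTS = {
    "file_scope": 0.25,
    "context_depth": 0.20,
    "ambiguity": 0.20,
    "risk": 0.15,
    "domain_expertise": 0.20,
}

# The weights sum to 1.0, so a weighted sum of 0-10 inputs stays within 0-10.
assert math.isclose(sum(WEIGHTS.values()), 1.0)


def weighted_score(dimension_scores: dict) -> float:
    """Combine per-dimension 0-10 scores into one 0-10 complexity score."""
    missing = WEIGHTS.keys() - dimension_scores.keys()
    if missing:
        raise KeyError(f"missing dimensions: {sorted(missing)}")
    for name in WEIGHTS:
        value = dimension_scores[name]
        if not 0 <= value <= 10:
            raise ValueError(f"{name} must be within 0-10, got {value}")
    return sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)


# Hypothetical task: 3-5 files, 1-2 related modules, some design decisions,
# needs testing, platform-specific (iOS).
example = {
    "file_scope": 5,
    "context_depth": 5,
    "ambiguity": 5,
    "risk": 5,
    "domain_expertise": 10,
}
print(round(weighted_score(example), 2))
```

For this task the terms are 1.25 + 1.0 + 1.0 + 0.75 + 2.0, giving a score of 6.0.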
### Score to Agent Mapping
```
0-3:  Codex (fast, cheap, focused)
      Examples: add a test, fix a lint error, rename a variable,
                update a version string, write a regex

4-6:  Claude Sonnet (balanced cost/capability)
      Examples: implement a Prefect flow, add a SwiftUI view,
                refactor a service class, write integration tests,
                build a CLI command

7-10: Claude Opus (maximum capability)
      Examples: design a new subsystem, multi-service refactor,
                debug a cross-machine coordination issue, architect
                a new app from scratch, creative/brand work

Special: Gemini (media/vision tasks regardless of complexity)
         OpenCode (cloud-vm tasks regardless of complexity)
```
### Score Calculation
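```python
def score_task(task) -> float:
    """Return the weighted 0-10 complexity score for a task.

    Each attribute is a 0-10 dimension score; the weights sum to 1.0.
    """
    return (
        task.file_count_score * 0.25    # file scope
        + task.context_depth * 0.20     # context depth
        + task.ambiguity * 0.20         # ambiguity
        + task.risk * 0.15              # risk
        + task.domain_expertise * 0.20  # domain expertise
    )
```
---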
## 3. Rate Limit Awareness
The router must respect API rate limits and ACMP pool rotation.
### Check Sequence
1. Pool status check: Query ACMP account pool (`acmp_autonomy_level` state) for current rotation position and remaining capacity per account.
2. Rate limit headers: Cache the most recent `x-ratelimit-remaining` and `retry-after` values per agent type. Source these from prompt-logger session data or direct API response headers.
3. Backoff state: Maintain a per-agent cooldown timer. If an agent returned 429 in the last N seconds, mark it as throttled.
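Steps 2 and 3 of the check sequence can be sketched as a small in-memory cache. This is a minimal illustration under stated assumptions: the class and function names are hypothetical, header keys are assumed to be lowercased already, `retry-after` is assumed to be in its seconds form, and the 60-second default stands in for the unspecified "N seconds".

```python
import math
import time
from dataclasses import dataclass, field
from typing import Callable, Mapping, Optional


def parse_retry_after(value: Optional[str], default_s: float) -> float:
    """Parse a seconds-form retry-after value; use the default if unusable."""
    try:
        seconds = float(value)
    except (TypeError, ValueError):
        return default_s  # missing header or HTTP-date form
    return seconds if math.isfinite(seconds) and seconds >= 0 else default_s


@dataclass
class AgentLimitState:
    remaining: Optional[int] = None  # last seen x-ratelimit-remaining
    throttled_until: float = 0.0     # clock value until which the agent is cooling down


@dataclass
class RateLimitCache:
    default_cooldown_s: float = 60.0                 # assumed value for "N seconds"
    clock: Callable[[], float] = time.monotonic      # injectable for testing
    states: dict = field(default_factory=dict)       # agent name -> AgentLimitState

    def record_response(self, agent: str, status: int, headers: Mapping[str, str]) -> None:
        """Update the cached headers and cooldown from one API response."""
        state = self.states.setdefault(agent, AgentLimitState())
        remaining = headers.get("x-ratelimit-remaining")
        if remaining is not None and remaining.isdigit():
            state.remaining = int(remaining)
        if status == 429:
            cooldown = parse_retry_after(headers.get("retry-after"), self.default_cooldown_s)
            state.throttled_until = self.clock() + cooldown

    def is_throttled(self, agent: str) -> bool:
        """True while the agent's cooldown timer has not expired."""
        state = self.states.get(agent)
        return state is not None and self.clock() < state.throttled_until

    def remaining(self, agent: str) -> Optional[int]:
        """Most recent x-ratelimit-remaining value, or None if never seen."""
        state = self.states.get(agent)
        return state.remaining if state else None
```

The cooldown alone decides throttling here. The cached `remaining` value is exposed for the caller to inspect, since treating a stale zero as a hard block could keep an agent from ever being retried.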
### Limits to Track
| Agent | Requests/min | Tokens/min | Daily cap |
|---|---|---|---|
| Claude Opus | 50 | 80K output | Budget-gated |
| Claude Sonnet | 100 | 160K output | Budget-gated |
| Codex | 20 | 100K output | Usage-tier |
| Gemini Flash | 60 | 200K output | Generous free tier |
| Gemini Pro | 10 | 50K output | Pay-per-use |
### Queue Discipline
- Never queue more than 3 tasks for a single agent type. If the queue hits 3, overflow to the next agent in the fallback chain.
- Tasks already in flight do not count against the queue limit, because they have already been dispatched.
- Priority tasks (user-initiated, blocking) may exceed the queue limit by 1, for a maximum queue depth of 4.
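The queue rules above can be sketched as an admission check. The class name `AgentQueues` and its methods are hypothetical; only the limit of 3 and the one-slot priority allowance come from the text.

```python
from collections import defaultdict, deque
from typing import Optional

QUEUE_LIMIT = 3     # maximum queued tasks per agent type
PRIORITY_EXTRA = 1  # priority tasks may exceed the limit by this much


class AgentQueues:
    """Per-agent FIFO queues holding tasks that are waiting, not in flight."""

    def __init__(self) -> None:
        self._queues = defaultdict(deque)

    def try_enqueue(self, agent: str, task_id: str, priority: bool = False) -> bool:
        """Queue the task if there is room; False means overflow to the next agent."""
        limit = QUEUE_LIMIT + (PRIORITY_EXTRA if priority else 0)
        queue = self._queues[agent]
        if len(queue) >= limit:
            return False
        queue.append(task_id)
        return True

    def dispatch_next(self, agent: str) -> Optional[str]:
        """Remove and return the oldest queued task; it no longer counts toward the limit."""
        queue = self._queues[agent]
        return queue.popleft() if queue else None

    def depth(self, agent: str) -> int:
        return len(self._queues[agent])
```

Because `dispatch_next` removes the task from the queue, in-flight work is excluded from the depth count without any extra bookkeeping.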
---
## 4. Fallback Chain
When the preferred agent is unavailable (rate-limited, down, or queue full), the router walks a fallback chain.
### Primary Chain
```
Claude Opus -> Claude Sonnet -> Codex -> Gemini -> queue
```
### Domain-Specific Overrides
```
iOS/SwiftUI:    Opus -> Sonnet -> queue          (no fallback to non-Claude)
Media/Vision:   Gemini -> Opus -> queue          (Gemini is primary)
Infrastructure: OpenCode -> Sonnet -> Codex -> queue
Simple fixes:   Codex -> Sonnet -> queue         (skip Opus entirely)
```
### Fallback Behavior
1. Downgrade gracefully: When falling back from Opus to Sonnet, split multi-file tasks into per-file subtasks if possible. Sonnet handles focused work better than sprawling context.
2. Never upgrade silently: The router does not promote a task from Codex-tier to Opus-tier. If Codex is unavailable and the task scores 0-3, fall through to Sonnet but flag it as "overqualified agent" in telemetry.
3. Queue as last resort: If all agents in the chain are exhausted, the task enters a retry queue with exponential backoff (30s, 60s, 120s, max 5min). The queue is drained on the next availability check.
4. Human escalation: If a task has been in the retry queue for >15 minutes, emit a notification (Telegram/Discord) with the task summary and current agent status.
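The chain walk and retry schedule can be sketched as follows. Several details are assumptions rather than documented behavior: the agent identifiers other than `claude-sonnet`, the domain keys, the `is_available` callback, half-open score bands ([0, 4), [4, 7), [7, 10]) to cover non-integer scores, and starting the primary chain at the preferred agent so that a task is never promoted. Subtask splitting on downgrade and the 15-minute escalation are not covered.

```python
from typing import Callable, Optional

PRIMARY_CHAIN = ["claude-opus", "claude-sonnet", "codex", "gemini"]

# Domain keys are hypothetical labels for the overrides listed in this section.
DOMAIN_CHAINS = {
    "ios": ["claude-opus", "claude-sonnet"],
    "media": ["gemini", "claude-opus"],
    "infrastructure": ["opencode", "claude-sonnet", "codex"],
    "simple": ["codex", "claude-sonnet"],
}


def preferred_agent(score: float) -> str:
    """Map a 0-10 score to a tier, assuming half-open bands."""
    if score < 4:
        return "codex"
    if score < 7:
        return "claude-sonnet"
    return "claude-opus"


def chain_for(score: float, domain: Optional[str] = None) -> list:
    """Pick the fallback chain for a task; domain overrides win over the score."""
    if domain in DOMAIN_CHAINS:
        return DOMAIN_CHAINS[domain]
    if score < 4:
        return DOMAIN_CHAINS["simple"]
    # Assumption: enter the primary chain at the preferred agent, never above it.
    return PRIMARY_CHAIN[PRIMARY_CHAIN.index(preferred_agent(score)):]


def route(score: float, domain: Optional[str], is_available: Callable[[str], bool]) -> dict:
    """Walk the chain and return the first available agent, or None for the retry queue."""
    chain = chain_for(score, domain)
    for agent in chain:
        if is_available(agent):
            return {
                "agent": agent,
                "fallback_used": agent != chain[0],
                # Telemetry flag for a Codex-tier task landing on Sonnet.
                "overqualified": preferred_agent(score) == "codex" and agent == "claude-sonnet",
            }
    return {"agent": None, "fallback_used": True, "overqualified": False}


def retry_delay_s(attempt: int) -> int:
    """Exponential backoff: 30s, 60s, 120s, ... capped at 5 minutes."""
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    return min(30 * 2 ** attempt, 300)
```

A caller would typically pass an `is_available` function that combines the rate-limit cooldown and queue-depth checks from the previous section.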
---
## 5. Routing Decision Record
Every routing decision is logged for cost analysis and tuning.
```json
{
  "task_id": "t_abc123",
  "timestamp": "2026-03-18T14:30:00Z",
  "complexity_score": 6.2,
  "preferred_agent": "claude-sonnet",
  "actual_agent": "claude-sonnet",
  "fallback_used": false,
  "fallback_reason": null,
  "estimated_tokens": 45000,
  "estimated_cost_cents": 22.5,
  "queue_depth_at_dispatch": 1,
  "decision_latency_ms": 12
}
```
---
## 6. Integration Points
### Inputs
- Task descriptor: From NUMU bus, Prefect flow, or direct CLI dispatch
- ACMP pool state: `[home-path]` + Supabase `account_pool`
- Rate limit cache: `[home-path]` (updated by prompt-logger hooks)
- Cost tracker: `[home-path]` (from `creative_cost_tracker.py`)
### Outputs
- Dispatch event: NUMU bus message `agent.dispatch` with agent, task_id, estimated cost
- Routing log: Append to `[home-path]`
- Metrics: Prometheus counters via `push_counter("agent_dispatch", {agent, complexity_tier})`
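The routing log output could be written as one JSON object per line. This is a sketch under assumptions: the JSON Lines layout and the function name `append_routing_record` are not confirmed by the source, and the log path is passed in by the caller because the actual location is elided above. The field names come from the routing decision record example.

```python
import json
from pathlib import Path

# Field names taken from the routing decision record example.
REQUIRED_FIELDS = (
    "task_id",
    "timestamp",
    "complexity_score",
    "preferred_agent",
    "actual_agent",
    "fallback_used",
    "fallback_reason",
    "estimated_tokens",
    "estimated_cost_cents",
    "queue_depth_at_dispatch",
    "decision_latency_ms",
)


def append_routing_record(log_path: Path, record: dict) -> None:
    """Validate a decision record and append it to the log as one JSON line."""
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"routing record is missing fields: {missing}")
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")
```

An append-only, line-delimited file keeps each write small and lets cost analysis read the log incrementally.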
### Dependencies
- `feedhub.config` for Supabase credentials
- `feedhub.webhooks` for Telegram/Discord notifications on escalation
- `creative_cost_tracker` for budget awareness (do not dispatch if daily spend exceeds threshold)
Source Anchor
projects/convergence-broker/agent-router.md