# Evolution³ — Stage 0: Research Brief (V4)
## NUMU FARE — Rust Core + Native Thread Intelligence + openclaw Library Integration
### Generated: 2026-03-05 | Method: 3 parallel research agents + 2 prior evo-cube runs (V2 substrate, V3 personal OS)
---
## Foundation: What V2 and V3 Established
**V2 (Technical Substrate)** decided:
- Centralized TypeScript daemon with hot standby on Mac4
- TOML declarative manifests for service/flow/skill definitions
- WebSocket pub/sub event bus for inter-service messaging
- 43-task master plan, 264 hours, 10-16 weeks
**V3 (Personal Integration)** decided:
- Name: NUMU FARE (ߒߎߡߎ ߝߊ߬) — "The Forge That Crosses"
- CLI: `numu`, Fleet: `numu fleet`, Intelligence: "The Weave"
- 147 skills with heat tracking + memory spines (40 active, 107 cold storage)
- Adaptive enrichment curriculum (MASTER/CRAFTSMAN/JOURNEYMAN depth)
- ADI personas as forge temperatures
- 7-wave implementation plan (~7 weeks, score target 7.6→9.2)
V4 introduces three new dimensions that reshape the architecture:
1. The openclaw/openclaw library patterns — proven at scale, directly adoptable
2. Rust as the daemon language — performance, safety, deployment characteristics
3. Native Discord thread management — threads as first-class workspaces, not afterthoughts
---
## Research Dimension 1: openclaw/openclaw Library Patterns
### What It Is
A TypeScript/Node.js personal AI assistant platform (263K+ GitHub stars). It solves many of the same problems NUMU FARE faces: multi-model orchestration, skill/tool management, context enrichment, memory search, session management.
### 9 Adoptable Patterns
#### Pattern 1: Typed WebSocket Protocol (TypeBox)
- Every WS message has a runtime-validated schema via TypeBox
- Client and server share type definitions — no parsing ambiguity
- Messages have `type`, `correlationId`, `payload` structure
- Relevance: V2 specified WS event bus but left message schema open. This closes that gap.
- Adoption: Use serde + JSON schema in Rust, or TypeBox if staying TypeScript
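The envelope described above can be sketched without any schema library. This is a minimal illustration, assuming a hand-written guard in place of TypeBox; the names `Envelope`, `isEnvelope`, and `parseFrame` are hypothetical and do not come from openclaw.

```typescript
// Hypothetical envelope for the NUMU event bus. Field names follow the
// `type` / `correlationId` / `payload` structure listed above.
interface Envelope<T = unknown> {
  type: string;
  correlationId: string;
  payload: T;
}

// Runtime guard: checks an untrusted value before it is treated as an Envelope.
function isEnvelope(value: unknown): value is Envelope {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.type === "string" &&
    v.type.length > 0 &&
    typeof v.correlationId === "string" &&
    v.correlationId.length > 0 &&
    "payload" in v
  );
}

// Parses a raw WebSocket frame. Returns null instead of throwing on bad input,
// so the bus can drop or log malformed frames in one place.
function parseFrame(raw: string): Envelope | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    return isEnvelope(parsed) ? parsed : null;
  } catch (_err) {
    return null;
  }
}
```

A schema library would replace `isEnvelope` with a generated validator and also check each `payload` against its per-`type` schema, which this sketch deliberately leaves out.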
#### Pattern 2: Subagent Registry with Announce Retry
- Each spawned agent gets a `runId` UUID
- Lifecycle: `pending → running → ended → announced`
- If announce fails, retry with exponential backoff (max 3 retries)
- 30-minute hard expiry — orphan detection kills stale agents
- `current_agents` array in parent tracks all children
- Relevance: NUMU currently has no subagent lifecycle tracking. `enriched_spawn.py` fires and forgets.
- Adoption: Direct port. The `SubagentRegistry` becomes a core NUMU module.
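The lifecycle, retry, and expiry rules above could look roughly like the following. It is a sketch, not openclaw's implementation: the class shape, method names, and the 1-second base backoff are assumptions, while the four states, the 3-retry cap, and the 30-minute expiry come from the list above. Time is passed in explicitly so the logic stays deterministic.

```typescript
type RunState = "pending" | "running" | "ended" | "announced";

interface RunRecord {
  runId: string;
  state: RunState;
  createdAtMs: number;
  announceAttempts: number;
}

const MAX_ANNOUNCE_RETRIES = 3;        // from the pattern description
const HARD_EXPIRY_MS = 30 * 60 * 1000; // 30-minute orphan cutoff
const BASE_BACKOFF_MS = 1000;          // assumption: the source gives no base delay

// Legal forward transitions of the lifecycle.
const NEXT: Record<RunState, RunState | null> = {
  pending: "running",
  running: "ended",
  ended: "announced",
  announced: null,
};

class SubagentRegistry {
  private runs = new Map<string, RunRecord>();

  register(runId: string, nowMs: number): RunRecord {
    const rec: RunRecord = { runId, state: "pending", createdAtMs: nowMs, announceAttempts: 0 };
    this.runs.set(runId, rec);
    return rec;
  }

  // Moves a run one step forward; rejects skipped or backward transitions.
  advance(runId: string, to: RunState): void {
    const rec = this.runs.get(runId);
    if (!rec) throw new Error("unknown run " + runId);
    if (NEXT[rec.state] !== to) throw new Error("illegal transition " + rec.state + " -> " + to);
    rec.state = to;
  }

  // Records a failed announce. Returns the delay before the next retry,
  // or null once all retries are used up.
  announceFailed(runId: string): number | null {
    const rec = this.runs.get(runId);
    if (!rec) throw new Error("unknown run " + runId);
    rec.announceAttempts += 1;
    if (rec.announceAttempts > MAX_ANNOUNCE_RETRIES) return null;
    return BASE_BACKOFF_MS * Math.pow(2, rec.announceAttempts - 1);
  }

  // Removes and returns runs past the hard expiry that never reached `announced`.
  reapOrphans(nowMs: number): string[] {
    const reaped: string[] = [];
    this.runs.forEach((rec, id) => {
      if (rec.state !== "announced" && nowMs - rec.createdAtMs >= HARD_EXPIRY_MS) {
        this.runs.delete(id);
        reaped.push(id);
      }
    });
    return reaped;
  }
}
```

A caller would schedule the actual retry with the returned delay and run `reapOrphans` on a timer; both are left out to keep the sketch free of side effects.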
#### Pattern 3: Auth Profile Rotation with Cooldowns
- Ordered list of auth profiles (accounts)
- On rate limit or error: mark profile "bad" with cooldown timer
- On success: mark "good" with timestamp
- Exponential backoff on repeated failures
- Runtime snapshot persistence (survives restart)
- Relevance: NUMU has 2 Claude Max accounts with ACMP rotation. This formalizes it.
- Adoption: Direct port. Replace daemon_v3.js ACMP logic with proper rotation module.
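One way to express the rotation rules above is shown below. The `AuthRotation` class, its method names, and the cooldown constants (1 minute base, 1 hour cap) are illustrative assumptions; only the ordered list, good/bad marking, exponential backoff, and snapshot persistence are taken from the pattern.

```typescript
interface AuthProfile {
  id: string;
  consecutiveFailures: number;
  cooldownUntilMs: number; // 0 means available
  lastGoodMs: number | null;
}

const BASE_COOLDOWN_MS = 60 * 1000;     // assumption: not specified in the source
const MAX_COOLDOWN_MS = 60 * 60 * 1000; // assumption: cap to avoid unbounded cooldowns

class AuthRotation {
  private profiles: AuthProfile[];

  constructor(ids: string[]) {
    this.profiles = ids.map((id): AuthProfile => ({
      id,
      consecutiveFailures: 0,
      cooldownUntilMs: 0,
      lastGoodMs: null,
    }));
  }

  private find(id: string): AuthProfile {
    for (const p of this.profiles) {
      if (p.id === id) return p;
    }
    throw new Error("unknown profile " + id);
  }

  // First profile, in declared order, whose cooldown has elapsed; null if all are cooling down.
  pick(nowMs: number): string | null {
    for (const p of this.profiles) {
      if (p.cooldownUntilMs <= nowMs) return p.id;
    }
    return null;
  }

  // Rate limit or error: the cooldown doubles with each consecutive failure.
  markBad(id: string, nowMs: number): void {
    const p = this.find(id);
    p.consecutiveFailures += 1;
    const cooldown = Math.min(BASE_COOLDOWN_MS * Math.pow(2, p.consecutiveFailures - 1), MAX_COOLDOWN_MS);
    p.cooldownUntilMs = nowMs + cooldown;
  }

  // Success: clear the failure streak and record the timestamp.
  markGood(id: string, nowMs: number): void {
    const p = this.find(id);
    p.consecutiveFailures = 0;
    p.cooldownUntilMs = 0;
    p.lastGoodMs = nowMs;
  }

  // Serializable snapshot so the state survives a daemon restart.
  snapshot(): string {
    return JSON.stringify(this.profiles);
  }

  static restore(json: string): AuthRotation {
    const rotation = new AuthRotation([]);
    rotation.profiles = JSON.parse(json) as AuthProfile[];
    return rotation;
  }
}
```

With two accounts, `pick` keeps returning the first one until it is marked bad, then falls through to the second, which matches an ordered-preference rotation rather than round-robin.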
#### Pattern 4: Skills as SKILL.md Modules
- Each skill is a directory with a `SKILL.md` file
- YAML frontmatter declares: name, description, requirements, model preferences
- LLM selects skills at runtime based on task analysis
- Skill files are loaded dynamically, not hardcoded
- Relevance: NUMU already has 147 `[home-path]` files. The pattern matches exactly.
- Adoption: Already adopted. V4 adds the formalized metadata parsing that openclaw uses.
#### Pattern 5: Lightweight Context Mode
- Cron/heartbeat runs skip loading AGENTS.md, SOUL.md, heavy context
- Task metadata flags: `lightweight: true` or context tier
- Saves 15-20K tokens per lightweight dispatch
- Relevance: Maps directly to V3's JOURNEYMAN/CRAFTSMAN/MASTER enrichment tiers.
- Adoption: Extend the V3 curriculum system with a `CRON` tier below JOURNEYMAN that loads almost nothing.
#### Pattern 6: Session-Level Model Override
- Any session can specify a model override that applies for its duration
- Does not affect other concurrent sessions
- Cascades to subagents unless they have their own override
- Relevance: NUMU's Discord already has model prefix overrides (gemini:, claude:, codex:). This makes them session-scoped.
- Adoption: Add session-level model persistence in the dispatch context.
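The cascade rule (a subagent's own override wins, otherwise the nearest ancestor's override applies, otherwise the default) can be sketched as a walk up the session tree. `SessionNode` and `resolveModel` are hypothetical names, and the model strings simply reuse the prefixes mentioned above.

```typescript
interface SessionNode {
  id: string;
  parentId: string | null;      // null for a top-level session
  modelOverride: string | null; // null means "inherit"
}

// Walks from the session up through its ancestors and returns the first
// override found, falling back to the default model.
function resolveModel(
  sessions: Map<string, SessionNode>,
  sessionId: string,
  defaultModel: string
): string {
  const seen = new Set<string>(); // guards against accidental parent cycles
  let current = sessions.get(sessionId);
  while (current && !seen.has(current.id)) {
    if (current.modelOverride !== null) return current.modelOverride;
    seen.add(current.id);
    current = current.parentId !== null ? sessions.get(current.parentId) : undefined;
  }
  return defaultModel;
}
```

Because each lookup starts from the session's own node, an override set in one session never leaks into a concurrent sibling session.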
#### Pattern 7: Hybrid Memory Search (BM25 + Vector + Temporal Decay + MMR)
- BM25 for keyword matching (fast, good for exact terms)
- Vector similarity for semantic matching
- Temporal decay weights recent memories higher
- MMR (Maximal Marginal Relevance) ensures diversity in results
- Relevance: RAG++ currently uses vector-only. Adding BM25 and MMR would improve retrieval quality.
- Adoption: Enhance RAG++ :8000 with BM25 fallback and MMR re-ranking.
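To make the four ingredients concrete, the sketch below fuses a keyword score and a vector score, applies an exponential temporal decay, and re-ranks with MMR. It assumes the BM25 and vector scores are already computed and normalized to [0, 1]; the weights, the 30-day half-life, and all names are illustrative choices, not values from openclaw or RAG++.

```typescript
interface Candidate {
  id: string;
  bm25: number;        // keyword score, assumed normalized to [0, 1]
  vector: number;      // similarity to the query, assumed in [0, 1]
  ageDays: number;     // age of the memory
  embedding: number[]; // used only to measure redundancy between results
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Weighted fusion of keyword and vector scores, halved every `halfLifeDays`.
function relevance(c: Candidate, halfLifeDays: number, keywordWeight: number): number {
  const fused = keywordWeight * c.bm25 + (1 - keywordWeight) * c.vector;
  return fused * Math.pow(0.5, c.ageDays / halfLifeDays);
}

// Greedy MMR: each step picks the candidate maximizing
// lambda * relevance - (1 - lambda) * (max similarity to already selected results).
function mmrRerank(
  candidates: Candidate[],
  k: number,
  lambda: number,
  halfLifeDays: number = 30,
  keywordWeight: number = 0.5
): Candidate[] {
  const remaining = candidates.slice();
  const selected: Candidate[] = [];
  while (selected.length < k && remaining.length > 0) {
    let bestIdx = 0;
    let bestScore = -Infinity;
    for (let i = 0; i < remaining.length; i++) {
      let maxSim = 0;
      for (const s of selected) {
        maxSim = Math.max(maxSim, cosine(remaining[i].embedding, s.embedding));
      }
      const score = lambda * relevance(remaining[i], halfLifeDays, keywordWeight) - (1 - lambda) * maxSim;
      if (score > bestScore) {
        bestScore = score;
        bestIdx = i;
      }
    }
    selected.push(remaining.splice(bestIdx, 1)[0]);
  }
  return selected;
}
```

With `lambda = 1` the function reduces to plain relevance ranking; lower values trade relevance for diversity, so a near-duplicate of the top hit gets pushed below a less similar result.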
#### Pattern 8: Session Compaction with Task Preservation
- When context exceeds 40
- Chunk → summarize → merge strategy
- Active tasks, todos, and code state are preserved through compaction
- Structural markers guide what to keep vs. compress
- Relevance: Claude Code already compacts, but NUMU's enriched context doesn't survive compaction well. This ensures critical state persists.
- Adoption: Add compaction-aware markers to enriched context injection.
#### Pattern 9: Cron Stagger
- Multiple scheduled tasks don't all fire at :00
- Stagger offsets prevent burst-loading the system
- Health checks verify pre-conditions before cron execution
- Relevance: NUMU's 50 Prefect flows sometimes cluster. This prevents burst overload.
- Adoption: Add stagger offsets to Prefect flow schedules.
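A simple way to derive stagger offsets is to spread the flows evenly across a window. The sketch below is one possible scheme, assuming flows are identified by unique names; `staggerOffsets` is a hypothetical helper and does not call any Prefect API.

```typescript
// Assigns each flow an evenly spaced offset (in seconds) inside the window.
// Sorting by name makes the assignment stable across restarts as long as
// the set of flows does not change.
function staggerOffsets(flowNames: string[], windowSeconds: number): Record<string, number> {
  const sorted = flowNames.slice().sort();
  const offsets: Record<string, number> = {};
  for (let i = 0; i < sorted.length; i++) {
    offsets[sorted[i]] = Math.floor((i * windowSeconds) / sorted.length);
  }
  return offsets;
}
```

For example, three flows over a 300-second window get offsets of 0, 100, and 200 seconds. The trade-off is that adding or removing a flow shifts the other offsets; hashing each name into the window would keep offsets stable at the cost of occasional collisions.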
---
## Research Dimension 2: Rust Implementation Assessment
### The Case FOR Rust
| Factor | Rust | TypeScript (Node.js) | TypeScript (Bun) |
|---|---|---|---|
| Memory | ~10-15 MB daemon | ~80-120 MB | ~40-60 MB |
| Startup | <100ms | 1-2s | 200-500ms |
| CPU (routing) | ~0.5ms per decision | ~5ms | ~3ms |
| Binary | Single static binary | node_modules + runtime | bun + node_modules |
| Crash safety | No null, no data races | Runtime exceptions | Runtime exceptions |
| Concurrency | Tokio (zero-cost async) | Event loop (single-threaded) | Event loop |
| Dependencies | `Cargo.lock` reproducible | npm (dependency hell) | npm (same) |
Crate Ecosystem (verified):
- `tokio` — Async runtime (the standard)
- `serenity` — Discord API library (feature-complete, threads, forums, voice)
- `reqwest` — HTTP client
- `serde` / `serde_json` / `toml` — Serialization
- `axum` — Web framework (from tokio team)
- `sqlx` — Async Postgres/SQLite
- `tracing` — Structured logging
- `clap` — CLI argument parsing
- `notify` — File watcher (for config hot-reload)
- `tera` / `minijinja` — Template engines
- `jsonschema` — JSON schema validation
- `tungstenite` / `tokio-tungstenite` — WebSocket
### The Case AGAINST Rust
| Factor | Impact |
|---|---|
| Learning curve | 2-4 months for borrow checker fluency. Mo is a solo operator. |
| Iteration speed | Compile times (30s-2min for full build) vs. instant reload in TS |
| Discord library maturity | `serenity` is good but community is smaller than discord.js |
| MCP integration | MCP SDKs are most mature in TS/Python. The official Rust SDK (`rmcp`) is newer, so a Rust daemon carries more integration risk |
| Rapid prototyping | Rust's type system slows experimentation |
| Existing codebase | 4,671 lines of working JS/TS orchestration code. Rewrite cost is real. |
### Recommendation: Hybrid Approach
Keep TypeScript (Bun) for the daemon. The NUMU daemon is an orchestration layer — it reads config, dispatches tasks, manages lifecycle. It does NOT do heavy compute. The 80MB→40MB reduction from Node→Bun is sufficient. The iteration speed advantage of TypeScript matters for a solo operator who changes the daemon weekly.
Use Rust for performance-critical modules:
1. The Skill Matcher — 147 skills × embedding similarity on every dispatch. Rust + SIMD = sub-millisecond.
2. The Memory Indexer — BM25 + vector hybrid search. Rust outperforms Python/TS by 10-50x.
3. The Session Compactor — Token counting + chunk boundary detection. CPU-bound, benefits from Rust.
4. The Sentinel — Health check daemon that must be ultra-lightweight and never crash. Rust's small memory footprint and compile-time memory safety fit that requirement.
Communication: Rust modules expose HTTP APIs (axum) or MCP tool interfaces. The TypeScript daemon calls them as needed. This is the same pattern as RAG++ (Python) being called by enriched_spawn.py (Python) from the daemon (TypeScript).
Migration Path:
- Month 1-3: Build in TypeScript/Bun. Ship fast. Prove architecture.
- Month 4-6: Profile hotspots. Rewrite hottest paths in Rust.
- Month 7+: Evaluate full Rust daemon if TypeScript becomes a bottleneck.
### If Building a Full Rust Daemon (Alternative Path)
```
numu-daemon/
├── Cargo.toml                    # Workspace root
├── crates/
│   ├── numu-core/                # Event bus, config, lifecycle
│   │   ├── src/
│   │   │   ├── config.rs         # TOML config parsing + hot-reload
│   │   │   ├── bus.rs            # WebSocket pub/sub event bus
│   │   │   ├── dispatch.rs       # Task routing + SmartRouter logic
│   │   │   └── registry.rs       # Subagent registry (from openclaw pattern)
│   ├── numu-discord/             # Discord gateway + thread intelligence
│   │   ├── src/
│   │   │   ├── gateway.rs        # Serenity-based bot
│   │   │   ├── threads.rs        # Thread lifecycle manager
│   │   │   ├── forums.rs         # Forum channel task boards
│   │   │   └── reactions.rs      # Reaction reflex handlers
│   ├── numu-skills/              # Skill manifest, heat tracking, matching
│   │   ├── src/
│   │   │   ├── manifest.rs       # 147-skill flat manifest
│   │   │   ├── heat.rs           # Heat score calculation
│   │   │   ├── matcher.rs        # SEA embedding matcher
│   │   │   └── spine.rs          # Memory spine per skill
│   ├── numu-intel/               # Intelligence pipeline
│   │   ├── src/
│   │   │   ├── rag.rs            # RAG++ client
│   │   │   ├── memory.rs         # Hybrid BM25+vector search
│   │   │   ├── enrichment.rs     # Adaptive curriculum builder
│   │   │   └── compaction.rs     # Session compaction
│   ├── numu-sentinel/            # Health monitoring + self-healing
│   │   ├── src/
│   │   │   ├── health.rs         # Fleet health checks
│   │   │   ├── immune.rs         # Adaptive immunity
│   │   │   └── launcher.rs       # Service lifecycle management
│   └── numu-cli/                 # CLI entry point
│       └── src/main.rs           # clap-based CLI
```
---
## Research Dimension 3: Native Discord Thread Intelligence
### Thread Types (Discord API)
| Type | ID | Description | Use in NUMU |
|---|---|---|---|
| PUBLIC_THREAD | 11 | Visible to channel members | Task workspaces |
| PRIVATE_THREAD | 12 | Invite-only | Sensitive ops (credentials, billing) |
| ANNOUNCEMENT_THREAD | 10 | In announcement channels | Release notes, status broadcasts |
| GUILD_FORUM | 15 | Forum channel with tags | Task boards, project tracking |
### Thread-as-Workspace Pattern
Current state: Clawdbot's `threadMode: 'per-task'` creates threads but treats them as flat message containers. No thread context awareness. No thread lifecycle. No cross-thread intelligence.
V4 Design — Thread Intelligence:
1. Thread Context Injection
Before responding in a thread, read the last 50 messages to understand context:
```
buildThreadContext(threadId) → {
  summary: AI-generated summary of thread so far,
  participants: who has contributed,
  artifacts: pinned messages, code blocks, links,
  status: inferred from last messages (active/blocked/waiting/done),
  parentChannel: which channel spawned this thread,
  relatedThreads: threads with similar topics (vector similarity)
}
```
This context is injected into the enriched dispatch, making the agent "aware" of the thread's full history.
2. Forum Channels as Task Boards
Convert `#task-dispatch` from a flat channel to a `GUILD_FORUM` (type 15):
```
Forum: #forge-tasks
├── Tags: [pending, in-progress, blocked, done, review]
├── Thread: "Deploy Spore v1.3 to TestFlight"
│   ├── Tag: in-progress
│   ├── Pinned: ExportOptions.plist config
│   └── Messages: build log, error resolution, success confirmation
├── Thread: "Graph Kernel reconnection"
│   ├── Tag: blocked
│   ├── Pinned: Error 65 Supabase trace
│   └── Messages: diagnosis attempts, workaround proposals
└── Thread: "NUMU daemon scaffold"
    ├── Tag: pending
    └── Messages: architecture decisions, code snippets
```
Available tags per forum: Up to 20 tags. Map to: priority (P0-P3), status (pending/active/blocked/done/review), domain (ios/infra/intel/discord/content), temperature (phoenix/sengoku/reddog/monkey). That mapping uses 18 of the 20 slots. Discord allows at most 5 applied tags per thread, so one tag per dimension (4 in total) fits.
3. Thread Lifecycle Management
```
CREATE  →  ACTIVE  →  [BLOCKED]  →  RESOLVED  →  ARCHIVED
   ↓          ↓           ↓            ↓
Name it    Work in    Flag with    Pin summary
Tag it     it         tag change   Archive thread
                                   Memory pipeline
```
- Auto-archive: Threads with no activity for 24h get tagged `stale`. After 72h, auto-archive with AI summary.
- Thread-to-Memory: On archive, generate summary → write to Obsidian vault → index in RAG++.
- Thread Resurrection: Reference an archived thread in a new message → auto-unarchive with context.
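The auto-archive rule above reduces to a small decision function over idle time. The sketch below uses the 24h and 72h thresholds from the list; the `ThreadSnapshot` shape and the action names are hypothetical, and the Discord calls that would apply each action are deliberately omitted.

```typescript
const HOUR_MS = 60 * 60 * 1000;
const STALE_AFTER_MS = 24 * HOUR_MS;   // tag `stale` after 24h of inactivity
const ARCHIVE_AFTER_MS = 72 * HOUR_MS; // archive with AI summary after 72h

type LifecycleAction = "none" | "tag-stale" | "archive-with-summary";

interface ThreadSnapshot {
  lastActivityMs: number;
  archived: boolean;
  hasStaleTag: boolean;
}

// Decides what the lifecycle manager should do with a thread at time `nowMs`.
function nextLifecycleAction(thread: ThreadSnapshot, nowMs: number): LifecycleAction {
  if (thread.archived) return "none";
  const idleMs = nowMs - thread.lastActivityMs;
  if (idleMs >= ARCHIVE_AFTER_MS) return "archive-with-summary";
  if (idleMs >= STALE_AFTER_MS && !thread.hasStaleTag) return "tag-stale";
  return "none";
}
```

Keeping the decision pure makes it easy to run on a schedule over all open threads and to test the thresholds without touching the Discord API.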
4. Cross-Thread Awareness
When dispatching a new task, check existing threads:
- BM25 keyword match on thread names + pinned messages
- Vector similarity on thread summaries
- If match score > 0.7: reply in existing thread instead of creating new one
- If match score 0.4-0.7: create new thread but link to related thread
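The threshold rule above can be written as a routing function over pre-computed match scores. This is a sketch under two assumptions: the BM25 and vector signals have already been combined into one score per thread, and scores below 0.4 (a case the list does not cover) lead to a new, unlinked thread. All type and function names are hypothetical.

```typescript
interface ThreadMatch {
  threadId: string;
  score: number; // combined match score in [0, 1]
}

type Routing =
  | { kind: "reply-existing"; threadId: string }
  | { kind: "create-linked"; relatedThreadId: string }
  | { kind: "create-new" };

// Picks the best-matching thread and applies the 0.7 / 0.4 thresholds.
function routeTask(matches: ThreadMatch[]): Routing {
  let best: ThreadMatch | null = null;
  for (const m of matches) {
    if (best === null || m.score > best.score) best = m;
  }
  if (best !== null && best.score > 0.7) {
    return { kind: "reply-existing", threadId: best.threadId };
  }
  if (best !== null && best.score >= 0.4) {
    return { kind: "create-linked", relatedThreadId: best.threadId };
  }
  return { kind: "create-new" }; // assumption: no link below 0.4
}
```

A score of exactly 0.7 falls into the "create linked" band here, since the rule for replying in place is stated as strictly greater than 0.7.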
5. Thread Reply Intelligence
The bot should understand thread hierarchy:
- Direct reply to a specific message in a thread → response should reference that message
- Thread starter message → response addresses the original task
- Reaction to a message in a thread → triggers the reaction reflex (approve/reject/gold/priority)
- Bot can pin its own messages (artifacts, summaries, final results)
6. Thread-Based Team Decomposition
Current: `/team <task>` creates subtask threads. V4 enhancement:
```
/team "Deploy all Wave 3 apps"
  → Forum thread: "Team: Deploy all Wave 3 apps" [tag: in-progress]
    → Sub-thread 1: "Subtask: Build archives" [tag: pending]
    → Sub-thread 2: "Subtask: Upload to ASC" [tag: blocked-by:1]
    → Sub-thread 3: "Subtask: Submit for review" [tag: blocked-by:2]
    → Summary post: Progress table, updated automatically
```
7. Webhook Threading
Discord webhooks support a `thread_id` query parameter, which sends messages to specific threads:
```
POST /webhooks/{id}/{token}?thread_id={thread_id}
```
This means Prefect flows, Sentinel alerts, and build pipelines can post directly to their relevant thread instead of flooding `#heartbeat`.
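As an illustration of the webhook call, the helper below assembles the request without sending it. The function name and the `WebhookRequest` shape are hypothetical; the URL layout and the `thread_id` query parameter follow the endpoint shown above, and `content` is the standard message text field of a webhook payload.

```typescript
interface WebhookRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request for posting `content` into a specific thread via a webhook.
// The caller passes the result to whatever HTTP client the daemon uses.
function buildThreadWebhookRequest(
  webhookId: string,
  webhookToken: string,
  threadId: string,
  content: string
): WebhookRequest {
  const url =
    "https://discord.com/api/webhooks/" +
    encodeURIComponent(webhookId) + "/" +
    encodeURIComponent(webhookToken) +
    "?thread_id=" + encodeURIComponent(threadId);
  return {
    url,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ content }),
  };
}
```

Separating request construction from sending keeps the webhook token out of logs more easily and lets each producer (flows, alerts, builds) reuse the same helper with its own thread ID.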
### Rate Limits
- Thread creation: 10 per channel per 10 minutes
- Message send: 5 per second per channel
- Thread list fetch: 50 per request (paginated)
- Archive/unarchive: Standard API rate (50/sec global)
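A client-side guard can keep the daemon under limits like these before Discord rejects a request. The sketch below is a generic sliding-window limiter, assuming the figures listed above (for example 10 thread creations per channel per 10 minutes) are the budgets to enforce; they are taken from this brief and not re-verified here, and the class name is hypothetical.

```typescript
// Allows at most `limit` events within any rolling window of `windowMs`.
class SlidingWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private stamps: number[] = [];

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the event if there is budget left at `nowMs`.
  tryAcquire(nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    this.stamps = this.stamps.filter((t) => t > cutoff); // drop expired events
    if (this.stamps.length >= this.limit) return false;
    this.stamps.push(nowMs);
    return true;
  }
}
```

In practice one limiter instance per channel would be kept in a map keyed by channel ID, and a `false` result would queue the thread creation rather than drop it.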
---
## Research Dimension 4: Integration with V3 Architecture
### Skills Integration (147 Skills → Thread-Aware)
- Each skill activation can create or reply in a thread
- Skill memory spines persist per-thread context
- Combinators (multi-skill squads) get their own threads
- SEA activation logging includes thread context
### Pane Awareness → Thread Awareness
- Pane registry tracks which Claude Code pane is working in which thread
- Cross-pollination detection works across threads (not just panes)
- Claim conflicts prevent two panes from modifying the same thread simultaneously
### Enrichment Engine → Thread Context Layer
Add an 8th layer to the 7-layer enrichment:
```
Layer 1: Identity (AGENTS.md, SOUL.md)
Layer 2: Project (CLAUDE.md, CHECKLIST.md)
Layer 3: Skills (heat-tracked, spine-injected)
Layer 4: Semantic Memory (RAG++ + CIA + BM25/MMR)
Layer 5: Protocols (Research-First, CEF, TEAM)
Layer 6: Quality Gates
Layer 7: Race Protocol
Layer 8: Thread Context (NEW — thread summary, participants, artifacts, related threads)
```
### Build Pipeline → Thread-Aware
- Each `numu shipyard build <app>` creates a thread for the build
- Build logs stream into the thread in real-time
- Errors pin themselves with stack traces
- Success pins the TestFlight URL
---
## Open Questions for Stage 1
1. Rust or TypeScript/Bun for the daemon? The hybrid approach is recommended, but each Stage 1 path should explore what a full commitment to one language looks like.
2. How deep should thread intelligence go? Thread context injection adds latency (50-message fetch + AI summary). What's the right depth vs. speed tradeoff?
3. Forum channels vs. regular channels? Forum channels (GUILD_FORUM) are powerful but change the UX significantly. Should all task channels be forums, or just `#forge-tasks`?
4. openclaw patterns: port or adapt? Some patterns (subagent registry, auth rotation) can be ported almost verbatim. Others (hybrid memory, compaction) need adaptation for NUMU's unique context.
5. Thread-to-memory pipeline: How much of a thread should be preserved? Full transcript? AI summary only? Key artifacts?
---
End of Stage 0 Research Brief (V4). Sourced from: openclaw/openclaw GitHub analysis, Rust crate ecosystem audit, Discord API thread documentation, V2 substrate decisions, V3 NUMU FARE architecture.
Ready for Stage 1: EXPLORE — 3 divergent paths incorporating Rust, threads, and openclaw patterns into NUMU FARE.
Source Anchor
evo-cube-output/v4-rust-threads/stage0-research-brief.md