research note · experiment writeup candidate · score 44

Pulse Plan: Cognitive Twin V2 — Decoupled RLM + Qwen 3.5

Benchmark results (March 4, 2026) using Qwen3-Next-80B-A3B via Together AI:

- Config A (Bare): 29.5% → Config D (Full RLM): 93.6%
- RAG is the biggest lever (+57.7%), RLM adds meaningful value on multi-hop (+3.9%)
- API inference is fast (~1-2s/question) and free (Together serverless)
- Target: 97%+ accuracy with fine-tuned Qwen3.5-35B-A3B on local exo cluster


Extracted abstract or opening context

**Created:** 2026-03-04
**Status:** 🟢 ACTIVE
**Priority:** HIGH
**Estimated Duration:** 5 waves across 5 weeks

Benchmark results (March 4, 2026) using Qwen3-Next-80B-A3B via Together AI:

- Config A (Bare): 29.5% → Config D (Full RLM): 93.6%
- RAG is the biggest lever (+57.7%), RLM adds meaningful value on multi-hop (+3.9%)
- API inference is fast (~1-2s/question) and free (Together serverless)
- Target: 97%+ accuracy with fine-tuned Qwen3.5-35B-A3B on local exo cluster

#### 1.1 — Extract RAGLayer Module

- **Input:** `twin_server_v3.py` monolithic code
- **Output:** `layers/rag.py` with clean interface
- **Spec:**
  - `RAGLayer(kb_paths, embed_fn)` constructor
  - `.search(query, top_k=3) -> list[SearchResult]`
  - `.to_context(results) -> str`
  - Support Gemini + Ollama embedding backends
  - Unit tests with mock embeddings
- **Verify:** Run benchmark with extracted layer, scores unchanged

#### 1.2 — Extract GraphLayer Module

- **Input:** BFS traversal code from `twin_server_v3.py`
- **Output:** `layers/graph.py`
- **Spec:**
  - `GraphLayer(graph_path)` constructor
  - `.traverse(query, max_depth=2) -> str`
  - Add fuzzy node matching via embeddings (not just string match)
  - Unit tests
- **Verify:** Config C score should IMPROVE with fuzzy matching

#### 1.3 — Extract RLMLayer Module

- **Input:** Decomposition logic from both benchmark scripts
- **Output:** `layers/rlm.py`
- **Spec:**
  - `RLMLayer(rag, graph, llm_fn)` constructor
  - `.should_decompose(query) -> bool`
  - `.decompose(query) -> list[str]`
  - `.retrieve(query) -> str` (full pipeline)
  - Configurable decomposition signals
  - Unit tests with mock LLM
- **Verify:** Config D scores unchanged
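The three layer interfaces specified in 1.1 to 1.3 could fit together roughly as sketched below. This is a minimal illustration, not the code in `twin_server_v3.py`. Only the constructor and method signatures come from the spec; everything else is an assumption made for the example: the blank-line chunking of knowledge-base files, the JSON adjacency-list graph format, the substring seeding of the BFS (the planned fuzzy embedding match is not shown), the default decomposition signals, the `signals` keyword, and the prompt wording. The Gemini and Ollama backends are represented only by the injected `embed_fn`.

```python
import json
import math
from collections import deque
from dataclasses import dataclass
from typing import Callable, Sequence

Vector = Sequence[float]


@dataclass
class SearchResult:
    text: str
    score: float


def _cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class RAGLayer:
    def __init__(self, kb_paths: Sequence[str], embed_fn: Callable[[str], Vector]):
        self.embed_fn = embed_fn
        self.chunks = []  # list of (chunk_text, embedding) pairs
        for path in kb_paths:
            with open(path, encoding="utf-8") as fh:
                # Assumption: chunks are separated by blank lines.
                for chunk in fh.read().split("\n\n"):
                    chunk = chunk.strip()
                    if chunk:
                        self.chunks.append((chunk, embed_fn(chunk)))

    def search(self, query: str, top_k: int = 3) -> list[SearchResult]:
        q = self.embed_fn(query)
        scored = [SearchResult(text, _cosine(q, vec)) for text, vec in self.chunks]
        return sorted(scored, key=lambda r: r.score, reverse=True)[:top_k]

    def to_context(self, results: Sequence[SearchResult]) -> str:
        return "\n\n".join(r.text for r in results)


class GraphLayer:
    def __init__(self, graph_path: str):
        # Assumption: the graph file is JSON of the form {"node": ["neighbor", ...]}.
        with open(graph_path, encoding="utf-8") as fh:
            self.adj = json.load(fh)

    def traverse(self, query: str, max_depth: int = 2) -> str:
        # Seed the BFS with nodes named in the query (plain substring match).
        starts = [n for n in self.adj if n.lower() in query.lower()]
        seen = set(starts)
        queue = deque((n, 0) for n in starts)
        facts = []
        while queue:
            node, depth = queue.popleft()
            if depth >= max_depth:
                continue
            for nb in self.adj.get(node, []):
                facts.append(f"{node} -> {nb}")
                if nb not in seen:
                    seen.add(nb)
                    queue.append((nb, depth + 1))
        return "\n".join(facts)


class RLMLayer:
    def __init__(self, rag, graph, llm_fn: Callable[[str], str],
                 signals: Sequence[str] = (" and ", " then ", "compare")):
        self.rag, self.graph, self.llm_fn = rag, graph, llm_fn
        self.signals = signals  # hypothetical default signals, configurable

    def should_decompose(self, query: str) -> bool:
        return any(s in query.lower() for s in self.signals)

    def decompose(self, query: str) -> list[str]:
        reply = self.llm_fn(f"Split into sub-questions, one per line:\n{query}")
        return [line.strip() for line in reply.splitlines() if line.strip()]

    def retrieve(self, query: str) -> str:
        subqs = self.decompose(query) if self.should_decompose(query) else [query]
        parts = []
        for sq in subqs:
            parts.append(self.rag.to_context(self.rag.search(sq)))
            parts.append(self.graph.traverse(sq))
        return "\n\n".join(p for p in parts if p)
```

Because `embed_fn` and `llm_fn` are injected, the unit tests called for in the spec can pass deterministic mocks instead of calling a real embedding backend or model.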

Promotion decision

What has to happen next

Attach run IDs, datasets, metrics, and reproduction commands.

Why this is not always a full paper yet

Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.