Grand Diomande Research ยท Full HTML Reader

RAG++ & Graph Kernel โ€” Detailed Improvement Analysis

| Track | Target | Status | ETA | |-------|--------|--------|-----| | **Track 1: MiniMax Density Scoring** | kimi_memory.db (9.1K user turns) | ๐ŸŸข 62% complete, 0 errors | ~1.5h | | **Track 2: RAG++ & Graph Kernel** | Search quality, entity enrichment | ๐Ÿ”ด Audited, issues documented | Needs work |

Agents That Account for Themselves proposal experiment writeup candidate score 40 .md

Full Public Reader

RAG++ & Graph Kernel โ€” Detailed Improvement Analysis

Generated: 2026-02-16 21:20 EST

---

Current State of Play

Two Parallel Tracks Running

TrackTargetStatusETA
Track 1: MiniMax Density Scoringkimi_memory.db (9.1K user turns)๐ŸŸข 62
Track 2: RAG++ & Graph KernelSearch quality, entity enrichment๐Ÿ”ด Audited, issues documentedNeeds work

---

TRACK 1: MiniMax Density Scoring โ€” Status

Running process: PID 56430, `density_scorer.py --role user --min-length 50 --parallel 4`

**Progress at 62

TierCount
CORE146
ENRICHED559
ACTIVE2,983
PRUNED1,317
ERROR (parse)695

Projected final (at current ratios):
- CORE: ~235 turns
- ENRICHED: ~900 turns
- High-value (CORE+ENRICHED): ~1,135 turns for v9 expansion

Output: `Desktop/cognitive-twin/output/density_scores_20260216_183513.jsonl` (5,792 lines, 1.2MB)

---

TRACK 2: RAG++ Deep Dive

Architecture Overview

RAG++ (Python, FastAPI, :8000)
โ”œโ”€โ”€ Ingestion Layer
โ”‚   โ”œโ”€โ”€ clawdbot_bridge.py     โ†’ Clawdbot sessions โ†’ Supabase
โ”‚   โ”œโ”€โ”€ claude_code_bridge.py  โ†’ Claude Code sessions โ†’ Supabase
โ”‚   โ”œโ”€โ”€ embedder.py            โ†’ Gemini embedding-001 (768-dim)
โ”‚   โ””โ”€โ”€ prompt_bridge.py       โ†’ Orbit prompt tracking
โ”œโ”€โ”€ Retrieval Layer
โ”‚   โ”œโ”€โ”€ query.py               โ†’ MemoryRetriever (Supabase pgvector)
โ”‚   โ”œโ”€โ”€ dual_plane.py          โ†’ Raw turns + Semantic artifacts
โ”‚   โ”œโ”€โ”€ quality.py             โ†’ QualityReranker (stall/exec scoring)
โ”‚   โ”œโ”€โ”€ intent.py              โ†’ QueryIntentAnalyzer
โ”‚   โ””โ”€โ”€ provenance.py          โ†’ Slice-conditioned retrieval
โ”œโ”€โ”€ Generation Layer
โ”‚   โ”œโ”€โ”€ enhanced.py            โ†’ EnhancedGenerationEngine
โ”‚   โ”œโ”€โ”€ context.py             โ†’ Context assembly
โ”‚   โ””โ”€โ”€ synthesis.py           โ†’ Response synthesis
โ”œโ”€โ”€ ML Layer
โ”‚   โ”œโ”€โ”€ cognitivetwin_v2.py    โ†’ 270M param model (MPS)
โ”‚   โ”œโ”€โ”€ inference/             โ†’ Inference server
โ”‚   โ””โ”€โ”€ training/              โ†’ Continuous learning pipeline
โ””โ”€โ”€ Service Layer (API Routes)
    โ”œโ”€โ”€ context.py             โ†’ /api/context/search (general search)
    โ”œโ”€โ”€ rag.py                 โ†’ /api/rag/search, /enhanced, /global, /slice
    โ”œโ”€โ”€ graph_enrichment.py    โ†’ /api/rag/enrich (GK bridge)
    โ”œโ”€โ”€ topology.py            โ†’ /api/topology/* (3D visualization)
    โ””โ”€โ”€ training.py            โ†’ /api/training/*

Data Stats (Supabase `memory_turns`)

MetricValue
Total turns245,499
With embeddings234,731 (95.6
With salience scores245,499 (100
Embedding modelGemini embedding-001 (768-dim)
Search methodpgvector cosine similarity via `search_memory` RPC

Lifecycle Distribution:

TierCount
CORE1,474
ENRICHED24,049
ACTIVE106,408
LOW20,788
PRUNED15,038
(unclassified)77,742

Role split: 62,602 user / 178,201 assistant / ~5K other

Issues Found (Stress Test Results)

#### Issue 1: Search Returns Irrelevant Old AI Outputs
Severity: HIGH

Query: "What is Mo passionate about?"
- Result 1: [0.670] "I have a passion for public speaking..." โ† OLD ChatGPT generic output
- Result 2: [0.663] "I am an avid learner..." โ† More generic AI text
- Result 3: [0.660] "I am a passionate traveler..." โ† Completely wrong

Root cause: No lifecycle filtering in default search. PRUNED turns (15K) and old generic ChatGPT outputs rank by pure cosine similarity without quality weighting. The `min_salience` parameter exists but defaults to 0.0.

Fix: Set `min_salience` default to 0.3 in `search_context()` route AND exclude `lifecycle_status = 'pruned'` by default.

#### Issue 2: Duplicate Results
Severity: MEDIUM

Query: "Why did we choose Supabase?"
- Returns the exact same assistant message 3 times (all [0.707])

Root cause: The RAG Bridge is creating duplicate entries during sync. The `search_memory` RPC doesn't deduplicate by content hash.

Fix: Add `content_hash` column to `memory_turns`, deduplicate during ingestion, and add DISTINCT filtering in search RPC.

#### Issue 3: Enhanced Search Endpoint Broken
Severity: MEDIUM

`GET /api/rag/search/enhanced?q=test` โ†’ 422 "Field required"

Root cause: Endpoint expects `query` parameter, not `q`. Inconsistent with `/api/context/search` which uses `q`.

Fix: Normalize all search endpoints to accept both `q` and `query`.

#### Issue 4: Style Signature Never Computed
Severity: HIGH

`GET /api/rag/signature` โ†’ `{"signature": null, "confidence": 0.0, "update_count": 0}`

Root cause: The `twin/extract-style` endpoint was never called with enough data. The continuous learning pipeline is OFF.

Fix: Run style extraction on CORE+ENRICHED turns, enable continuous learning.

#### Issue 5: No Recency Bias in Default Search
Severity: MEDIUM

Old 2023 ChatGPT outputs rank higher than 2026 conversations because embeddings don't account for temporal relevance.

Root cause: The `recency_boost` parameter exists in enhanced search (default 0.15) but the basic `/api/context/search` doesn't use it.

Fix: Add recency scoring to context search, or route all searches through enhanced search.

Concrete Improvement Plan

Phase 1 โ€” Quick Wins (can do now, code changes only):

1. Default min_salience to 0.3 in `/api/context/search`
- File: `rag_plusplus/service/routes/context.py` line ~28
- Change: `min_salience: float = Query(0.0, ...)` โ†’ `Query(0.3, ...)`

2. Exclude PRUNED from default search
- File: `rag_plusplus/retrieval/query.py` in `_vector_search()`
- Add: `lifecycle_status != 'pruned'` filter to Supabase query

3. Fix enhanced search parameter
- File: `rag_plusplus/service/routes/rag.py` line 926
- Add alias: accept both `q` and `query`

4. Add deduplication to search results
- File: `rag_plusplus/retrieval/query.py` in `search()`
- Post-process: dedupe by content_hash before returning

Phase 2 โ€” Salience-Weighted Search (medium effort):

5. Composite scoring in search: `final_score = 0.7 similarity + 0.2 salience + 0.1 * recency`
- File: `rag_plusplus/retrieval/query.py`
- Or update `search_memory` RPC in Supabase to include salience weighting

6. Run style extraction on top 1,474 CORE turns
- Endpoint: `POST /api/rag/twin/extract-style`
- Will compute Mo's linguistic fingerprint

7. Enable continuous learning pipeline
- Endpoint: `POST /api/rag/twin/continuous/start`
- Will keep style signature updated as new turns arrive

Phase 3 โ€” Wire as Claw's Recall Backend:

8. Create a Clawdbot plugin that calls RAG++ context search before answering recall questions
- Replace broken `memory_search` tool with RAG++ backend
- No need for OpenAI API key โ€” uses Gemini embeddings already

---

TRACK 2b: Graph Kernel Deep Dive

Architecture

Graph Kernel (Rust binary, :8001)
โ”œโ”€โ”€ Source: Desktop/Comp-Core/packages/admissibility-kernel/
โ”œโ”€โ”€ Binary: [home-path]
โ”œโ”€โ”€ DB: Supabase PostgreSQL (same as RAG++)
โ”œโ”€โ”€ Config: DATABASE_URL + KERNEL_HMAC_SECRET
โ””โ”€โ”€ Purpose: Context slicing + admissibility verification

Actual API Routes (from Rust source)

MethodPathPurposeStatus
POST/api/sliceConstruct context slice around anchor turnโœ… Exists
POST/api/slice/batchBatch context slicingโœ… Exists
POST/api/verify_tokenVerify admissibility tokenโœ… Exists
GET/api/policiesList registered policiesโœ… Exists
POST/api/policiesRegister new policyโœ… Exists
GET/healthHealth checkโœ… Working
GET/health/liveLiveness probeโœ… Exists
GET/health/readyReadiness probeโœ… Exists
GET/health/startupStartup probeโœ… Exists

The Disconnect

What RAG++ expects from GK:
- `GET /api/knowledge?subject=X` โ†’ Triple query (entity relationships)
- `POST /api/knowledge/traverse` โ†’ Graph traversal
- `GET /api/knowledge/aliases` โ†’ Entity alias lookup

What GK actually provides:
- `/api/slice` โ†’ Context slicing (admissibility-focused)
- `/api/verify_token` โ†’ Token verification
- `/api/policies` โ†’ Policy management

These are completely different capabilities. The GK is an admissibility kernel (slice-based context management), not a knowledge graph (entity-relationship storage). The RAG++ bridge was written assuming GK would have knowledge graph endpoints that were never implemented.

The /api/knowledge/batch 422 Mystery

Something is hitting `/api/knowledge/batch` every ~5 minutes, getting 422 each time. This endpoint doesn't exist in the Rust routes โ€” it's falling through to a catch-all that returns 422 instead of 404. The caller is likely the Clawdbot ingestion pipeline trying to push extracted triples.

What GK CAN Do (Currently)

1. Context Slicing โ€” Given an anchor turn, construct a context window of related turns
2. Admissibility Tokens โ€” HMAC-signed tokens proving a slice was properly constructed
3. Policy Management โ€” Register and apply retrieval policies

What GK NEEDS to Do (for the RAG++ bridge to work)

Option A: Add knowledge graph endpoints to GK
- Add triple storage (subject, predicate, object) table to Supabase
- Implement `GET /api/knowledge?subject=X`
- Implement `POST /api/knowledge/traverse`
- Implement `GET /api/knowledge/aliases`
- This makes GK a true knowledge graph + admissibility kernel

Option B: Update RAG++ bridge to use GK's actual capabilities
- Replace entity enrichment with slice-based context enrichment
- Use `/api/slice` to get contextually-related turns for any search result
- Use admissibility tokens to filter search results by context coherence
- This leverages what GK already does well

Recommendation: Option B first (faster), then Option A for full knowledge graph.

Concrete GK Improvement Plan

Phase 1 โ€” Fix the bridge (use what exists):

1. Update graph_enrichment.py to use `/api/slice` instead of `/api/knowledge`
- For each search result, ask GK to slice around it
- Use slice turn_ids to fetch related turns as context
- Attach slice context to enriched results

2. Stop the 422 spam โ€” find and fix whatever is calling `/api/knowledge/batch`
- Check launchd services, cron jobs, or the ingestion pipeline

3. Wire slice-conditioned search โ€” RAG++ already has `search_slice()` in query.py
- Currently not exposed through any route handler efficiently
- Make it the default for high-precision queries

Phase 2 โ€” Add knowledge graph to GK (Rust):

4. Add knowledge tables to Supabase:

sql
   CREATE TABLE knowledge_triples (
     id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
     subject TEXT NOT NULL,
     predicate TEXT NOT NULL,
     object TEXT NOT NULL,
     source_turn_id UUID REFERENCES memory_turns(id),
     confidence FLOAT DEFAULT 1.0,
     created_at TIMESTAMPTZ DEFAULT now()
   );
   CREATE TABLE entity_aliases (
     canonical TEXT NOT NULL,
     alias TEXT NOT NULL,
     PRIMARY KEY (canonical, alias)
   );

5. Add routes to Rust service:
- `GET /api/knowledge?subject=X` โ†’ query triples
- `POST /api/knowledge/batch` โ†’ batch insert triples (fix the 422)
- `POST /api/knowledge/traverse` โ†’ BFS/DFS traversal
- `GET /api/knowledge/aliases` โ†’ alias resolution

6. Extract triples from CORE turns โ€” use the existing entity extraction in graph_enrichment.py to populate the knowledge graph

---

Unified Improvement Roadmap

### Tonight (Track 1 finishing)
- [ ] MiniMax scoring completes (~1.5h)
- [ ] Generate v9 expansion from CORE+ENRICHED turns

### Quick Wins (1-2 hours of work)
- [ ] Set min_salience default to 0.3
- [ ] Exclude PRUNED from default search
- [ ] Fix enhanced search parameter bug
- [ ] Add result deduplication
- [ ] Find and fix the /api/knowledge/batch caller

### Medium Effort (4-6 hours)
- [ ] Implement composite scoring (similarity + salience + recency)
- [ ] Run style extraction on CORE turns
- [ ] Update graph_enrichment.py to use /api/slice
- [ ] Wire RAG++ as Claw's recall backend (replace memory_search)

### Deep Work (1-2 days)
- [ ] Add knowledge triple tables to Supabase
- [ ] Implement knowledge graph routes in Rust GK
- [ ] Extract and populate knowledge graph from CORE+ENRICHED turns
- [ ] Enable continuous learning pipeline
- [ ] Full integration test: search โ†’ enrich โ†’ generate

---

Both tracks are complementary. Track 1 (MiniMax scoring) feeds Track 2 (RAG++ quality) โ€” the scored turns become the quality filter for search results. The Graph Kernel becomes useful once it can provide structural context alongside RAG++'s semantic search.

Promotion Decision

Attach run IDs, datasets, metrics, and reproduction commands.

Source Anchor

cognitive-twin/RAG_IMPROVEMENT_ANALYSIS.md

Detected Structure

Method ยท Evaluation ยท References ยท Code Anchors ยท Architecture