RAG++ Architecture
RAG++ is a trajectory-aware retrieval-augmented generation system that maintains a unified knowledge fabric across conversations, ideas, code, and motion data. It powers the CognitiveTwin — a personalized AI that learns user-specific reasoning patterns.
Version: 3.0 (Post-CognitiveTwin V3 + Idea Vault Integration)
Last Updated: January 2025
---
System Overview
┌─────────────────────────────────────────────────────────────────────────────┐
│ USER INTERFACES │
├─────────────────────────────────────────────────────────────────────────────┤
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────────────────┐ │
│ │ ChatGPT (MCP) │ │ Claude Desktop │ │ Cursor IDE │ │
│ │ │ │ Prompt Logger │ │ Prompt Logger │ │
│ └────────┬────────┘ └────────┬────────┘ └──────────────┬──────────────┘ │
└────────────┼────────────────────┼──────────────────────────┼────────────────┘
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ ORBIT SERVER │
│ (Rust / Axum) │
│ Port: 3847 │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │
│ │ Project │ │ Session │ │ Context │ │ Cross-Session │ │
│ │ Management │ │ Tracking │ │ Broker │ │ Correlator │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ RAG++ SERVICE │
│ (Python / FastAPI) │
│ Port: 8000 │
├─────────────────────────────────────────────────────────────────────────────┤
│ ┌─────────────────────────────────────────────────────────────────────────┐│
│ │ INGESTION LAYER ││
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌───────────┐ ││
│ │ │ ChatGPT │ │ Claude │ │ Cursor │ │ Code │ │ Media │ ││
│ │ │ Ingester │ │ Ingester │ │ Ingester │ │ Graph │ │ Ingester │ ││
│ │ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └─────┬─────┘ ││
│ │ │ │ │ │ │ ││
│ │ ▼ ▼ ▼ ▼ ▼ ││
│ │ ┌─────────────────────────────────────────────────────────────────┐ ││
│ │ │ UNIFIED INGESTER │ ││
│ │ │ • Embedding Generation (768-dim) │ ││
│ │ │ • Trajectory Coordinate Computation │ ││
│ │ │ • Primitive Enrichment (stall/exec/domain/clarification) │ ││
│ │ └─────────────────────────────────────────────────────────────────┘ ││
│ └─────────────────────────────────────────────────────────────────────────┘│
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────────┐│
│ │ RETRIEVAL LAYER ││
│ │ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────┐ ││
│ │ │ Query Intent │ │ Vector Search │ │ Quality Reranker │ ││
│ │ │ Analyzer │ │ (pgvector) │ │ (Lifecycle-Aware) │ ││
│ │ └────────┬─────────┘ └────────┬─────────┘ └──────────┬───────────┘ ││
│ │ │ │ │ ││
│ │ ▼ ▼ ▼ ││
│ │ ┌─────────────────────────────────────────────────────────────────┐ ││
│ │ │ TRAJECTORY RETRIEVER │ ││
│ │ │ • IRCP Algorithm (Inverse Ring Contextual Propagation) │ ││
│ │ │ • Graph Expansion (parent/child/sibling turns) │ ││
│ │ │ • Prior Bundle Generation │ ││
│ │ └─────────────────────────────────────────────────────────────────┘ ││
│ └─────────────────────────────────────────────────────────────────────────┘│
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────────┐│
│ │ GENERATION LAYER ││
│ │ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────┐ ││
│ │ │ Policy Injector │ │ Enhanced Engine │ │ Post-Gen Validator │ ││
│ │ │ (Question Policy │ │ (Context Build) │ │ (Stall Detection) │ ││
│ │ │ Format Hints) │ │ │ │ │ ││
│ │ └──────────────────┘ └──────────────────┘ └──────────────────────┘ ││
│ └─────────────────────────────────────────────────────────────────────────┘│
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────────┐│
│ │ ML LAYER ││
│ │ ┌───────────────────────────────────────────────────────────────────┐ ││
│ │ │ COGNITIVETWIN V3 │ ││
│ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ ││
│ │ │ │ Corpus │ │ Repo │ │Conversation │ │ Enhancer │ │ ││
│ │ │ │ Surgery │ │ Worm │ │ Worm │ │ Agent │ │ ││
│ │ │ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │ ││
│ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ ││
│ │ │ │ Dataset │ │ DPO │ │ Training │ │ Evaluation │ │ ││
│ │ │ │ Builder │ │ Factory │ │ Pipeline │ │ Suite │ │ ││
│ │ │ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │ ││
│ │ └───────────────────────────────────────────────────────────────────┘ ││
│ │ ││
│ │ ┌───────────────┐ ┌───────────────┐ ┌───────────────────────────┐ ││
│ │ │ FunctionGemma │ │ T5Gemma-2 │ │ Attention Mechanisms │ ││
│ │ │ 270M │ │ (Embeddings) │ │ (IRCP, Dual Ring, Bias) │ ││
│ │ └───────────────┘ └───────────────┘ └───────────────────────────┘ ││
│ └─────────────────────────────────────────────────────────────────────────┘│
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────────┐│
│ │ IDEA VAULT LAYER ││
│ │ ┌───────────────┐ ┌───────────────┐ ┌───────────────────────────┐ ││
│ │ │ IdeaEnricher │ │ ClaimEnricher │ │ IdeaVaultReranker │ ││
│ │ │ (Quality + │ │ (Lifecycle + │ │ (Chain Coherence + │ ││
│ │ │ Domain) │ │ Evidence) │ │ Evidence Boost) │ ││
│ │ └───────────────┘ └───────────────┘ └───────────────────────────┘ ││
│ │ ┌───────────────────────────────────────────────────────────────────┐ ││
│ │ │ ChainContextFetcher │ ││
│ │ │ • Fetch idea → chain → claims → reviews → artifacts │ ││
│ │ │ • Format for prompt injection │ ││
│ │ └───────────────────────────────────────────────────────────────────┘ ││
│ └─────────────────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ SUPABASE (PostgreSQL) │
├─────────────────────────────────────────────────────────────────────────────┤
│ ┌─────────────────────────────────────────────────────────────────────────┐│
│ │ UNIFIED KNOWLEDGE FABRIC ││
│ │ (memory_turns) ││
│ │ ││
│ │ 107,099+ turns • 768-dim embeddings • 5D trajectory coordinates ││
│ │ ││
│ │ Roles: user | assistant | tool | system | idea | claim | code | image ││
│ └─────────────────────────────────────────────────────────────────────────┘│
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────┐│
│ │ projects │ │ memory_turn │ │ ideas │ │ motion_segments ││
│ │ (Orbit) │ │ _edges │ │ (Vault) │ │ (Echelon) ││
│ └──────────────┘ └──────────────┘ └──────────────┘ └──────────────────────┘│
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────┐│
│ │ chains │ │ claims │ │ reviews │ │ artifacts ││
│ └──────────────┘ └──────────────┘ └──────────────┘ └──────────────────────┘│
└─────────────────────────────────────────────────────────────────────────────┘
---
Core Data Model
memory_turns — The Unified Knowledge Fabric
Every piece of knowledge (conversation, idea, claim, code) flows into this table:
memory_turns (
-- Identity
id UUID PRIMARY KEY,
conversation_id UUID,
-- Content
role TEXT, -- 'user' | 'assistant' | 'idea' | 'claim' | ...
content_text TEXT,
embedding VECTOR(768), -- Semantic embedding
-- Trajectory Coordinates (5D)
trajectory_depth INTEGER, -- 0-987+, distance from root
trajectory_sibling_order INTEGER,
trajectory_homogeneity FLOAT, -- 0.0-1.0, similarity to parent
trajectory_temporal FLOAT, -- 0.0-1.0, time position
trajectory_complexity INTEGER, -- Content parts count
-- Phase Detection
phase TEXT, -- 'exploration' | 'consolidation' | ...
salience_score FLOAT, -- 0.0-1.0, importance
-- V3 Primitives (NEW)
stall_score INTEGER, -- Permission-seeking (0-10)
exec_score INTEGER, -- Execution content (0-10)
clarification_type VARCHAR, -- 'unjustified' | 'justified' | 'neutral'
domain VARCHAR, -- 'code' | 'research' | 'planning' | ...
task_type VARCHAR, -- 'implement' | 'debug' | 'refactor' | ...
-- Idea Vault (NEW)
source_type VARCHAR, -- 'idea_vault' | 'conversation' | ...
lifecycle_status VARCHAR, -- 'verified' | 'active' | 'falsified' | ...
enrichment_pending BOOLEAN, -- Needs Python-side enrichment
-- Metadata
metadata JSONB,
created_at TIMESTAMPTZ
)
---
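The value ranges noted in the `memory_turns` column comments can be checked before a row is written. The following is a minimal sketch, assuming a hypothetical `MemoryTurn` dataclass that mirrors only a subset of the columns; the class and its `validate` method are illustrative and not part of the documented codebase.

```python
from dataclasses import dataclass, field
from typing import Any, Optional
from uuid import UUID, uuid4


@dataclass
class MemoryTurn:
    """Hypothetical in-memory mirror of a subset of memory_turns columns."""
    id: UUID
    role: str
    content_text: str
    trajectory_depth: int = 0
    trajectory_homogeneity: float = 0.0
    trajectory_temporal: float = 0.0
    salience_score: float = 0.0
    stall_score: int = 0
    exec_score: int = 0
    clarification_type: Optional[str] = None
    metadata: dict[str, Any] = field(default_factory=dict)

    def validate(self) -> list[str]:
        """Return a list of range violations; an empty list means the row looks sane."""
        problems = []
        if self.trajectory_depth < 0:
            problems.append("trajectory_depth must be >= 0")
        # Columns documented as 0.0-1.0 in the schema comments.
        for name in ("trajectory_homogeneity", "trajectory_temporal", "salience_score"):
            if not 0.0 <= getattr(self, name) <= 1.0:
                problems.append(f"{name} must be in [0.0, 1.0]")
        # Columns documented as 0-10 in the schema comments.
        for name in ("stall_score", "exec_score"):
            if not 0 <= getattr(self, name) <= 10:
                problems.append(f"{name} must be in 0-10")
        if self.clarification_type not in (None, "unjustified", "justified", "neutral"):
            problems.append("unknown clarification_type")
        return problems
```

Calling `MemoryTurn(id=uuid4(), role="user", content_text="hi").validate()` returns an empty list; out-of-range scores produce one message per offending column.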
Module Reference
1. Ingestion (`rag_plusplus/ingestion/`)
| File | Purpose |
|---|---|
| `chatgpt.py` | ChatGPT conversation ingestion |
| `claude.py` | Claude conversation ingestion |
| `unified.py` | Unified ingestion orchestrator |
| `embedder.py` | Embedding generation |
| `trajectory.py` | Trajectory coordinate computation |
| `primitive_enricher.py` | V3 primitive scoring (stall, exec, domain) |
| `vision.py` | Image/vision content handling |
| `media.py` | Audio/video handling |
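How `trajectory.py` derives coordinates is not specified here. As a rough sketch of three of the five coordinates (depth, sibling order, temporal position), assuming each turn carries a parent link and a creation timestamp, one could write the following. The `Turn` type and `compute_coordinates` function are hypothetical names, every referenced parent is assumed to be present in the input, and the input is assumed non-empty.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Turn:
    id: str
    parent_id: Optional[str]
    created_at: float  # Unix timestamp (assumed representation)


def compute_coordinates(turns: list[Turn]) -> dict[str, dict[str, float]]:
    """Compute depth, sibling order, and temporal position for each turn."""
    by_id = {t.id: t for t in turns}

    # Group turns by parent so siblings can be ranked by creation time.
    children: dict[Optional[str], list[Turn]] = {}
    for t in turns:
        children.setdefault(t.parent_id, []).append(t)
    sibling_order = {}
    for siblings in children.values():
        for rank, t in enumerate(sorted(siblings, key=lambda s: s.created_at)):
            sibling_order[t.id] = rank

    def depth(turn_id: str) -> int:
        # Iterative walk to the root; avoids recursion limits on deep chains.
        d = 0
        current = by_id[turn_id]
        while current.parent_id is not None:
            current = by_id[current.parent_id]
            d += 1
        return d

    start = min(t.created_at for t in turns)
    span = max(t.created_at for t in turns) - start
    return {
        t.id: {
            "depth": depth(t.id),
            "sibling_order": sibling_order[t.id],
            # Normalized to 0.0-1.0 across the conversation; 0.0 if all timestamps match.
            "temporal": (t.created_at - start) / span if span else 0.0,
        }
        for t in turns
    }
```

Homogeneity (similarity to parent) and complexity (content parts count) are left out because they depend on embeddings and message structure not shown in this document.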
2. Retrieval (`rag_plusplus/retrieval/`)
| File | Purpose |
|---|---|
| `query.py` | `MemoryRetriever`, `SearchQuery` |
| `intent.py` | `QueryIntentAnalyzer`, directive detection |
| `quality.py` | `QualityReranker`, lifecycle scoring |
| `priors.py` | Prior bundle generation |
| `idea_retriever.py` | Idea-specific retrieval |
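The retrieval layer also expands vector-search hits along the conversation graph (parent, child, and sibling turns). A minimal sketch of that expansion step, assuming edges are available as `(parent_id, child_id)` pairs, might look like this; `expand_hits` is a hypothetical helper, not the documented `MemoryRetriever` API.

```python
from collections import defaultdict


def expand_hits(hit_ids: list[str], edges: list[tuple[str, str]]) -> list[str]:
    """Return hits plus their parents, siblings, and children, deduplicated in order."""
    parents = defaultdict(list)
    children = defaultdict(list)
    for parent, child in edges:
        parents[child].append(parent)
        children[parent].append(child)

    seen: set[str] = set()
    expanded: list[str] = []

    def add(turn_id: str) -> None:
        if turn_id not in seen:
            seen.add(turn_id)
            expanded.append(turn_id)

    for hit in hit_ids:
        add(hit)
        for parent in parents[hit]:
            add(parent)
            # Siblings are the other children of the same parent.
            for sibling in children[parent]:
                add(sibling)
        for child in children[hit]:
            add(child)
    return expanded
```

The expansion here is one hop only; whether the real retriever walks further is not stated in this document.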
3. Generation (`rag_plusplus/generation/`)
| File | Purpose |
|---|---|
| `engine.py` | Base generation engine |
| `enhanced.py` | `EnhancedGenerationEngine` with primitives |
| `policy.py` | `PolicyInjector` (question policy, format hints) |
| `validator.py` | `PostGenerationValidator` (stall detection) |
| `context.py` | Context building |
| `synthesis.py` | Multi-turn synthesis |
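The generation modules compose into a short pipeline: inject policy, build the prompt, call the model, then validate the response for stalling. The sketch below illustrates that order under stated assumptions: the policy wording, the stall phrases, and all function names are hypothetical, and the LLM is passed in as a plain callable so the example stays self-contained.

```python
from typing import Callable

# Hypothetical policy text and phrase list, for illustration only.
QUESTION_POLICY = "Do not ask for permission when the directive is complete."
STALL_PHRASES = ("would you like me to", "shall i", "do you want me to")


def inject_policy(context: str, format_hint: str) -> str:
    """Prefix retrieved context with the question policy and a format hint."""
    return f"[policy] {QUESTION_POLICY}\n[format] {format_hint}\n\n{context}"


def violates_stall_policy(response: str) -> bool:
    """Flag responses that contain any permission-seeking phrase."""
    lowered = response.lower()
    return any(phrase in lowered for phrase in STALL_PHRASES)


def generate(
    context: str,
    user_query: str,
    llm: Callable[[str], str],
    format_hint: str = "Use code blocks for code.",
) -> dict:
    """Run policy injection, prompt build, LLM call, and post-generation validation."""
    prompt = inject_policy(context, format_hint) + f"\n\nUser: {user_query}"
    response = llm(prompt)
    return {
        "prompt": prompt,
        "response": response,
        "stall_violation": violates_stall_policy(response),
    }
```

What the real `PostGenerationValidator` does on a violation (retry, rewrite, or reject) is not described here, so the sketch only reports the flag.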
4. Idea Vault (`rag_plusplus/idea_vault/`)
| File | Purpose |
|---|---|
| `enricher.py` | `IdeaEnricher`, `ClaimEnricher` |
| `reranker.py` | `IdeaVaultReranker` with chain coherence |
| `chain.py` | `ChainContextFetcher` |
5. CognitiveTwin V3 (`rag_plusplus/ml/cognitivetwin_v3/`)
#### 5.1 Corpus Surgery
| File | Purpose |
|------|---------|
| `classifier.py` | Stall/exec scoring, clarification classification |
| `constants.py` | Phrase lists, thresholds |
| `rewriter.py` | Unjustified → execution rewriting |
| `quarantine.py` | Friction detection, DPO pair extraction |
| `pipeline.py` | Corpus surgery orchestrator |
#### 5.2 Data Generation Worms
| File | Purpose |
|------|---------|
| `repo_worm.py` | Code task generation, diff creation |
| `conversation_worm.py` | Topology-consistent branching |
| `enhancer_agent.py` | Canonicalization, completion |
| `branch_generator.py` | Paraphrase/extension generation |
| `policy_enforcer.py` | Question policy validation |
#### 5.3 Dataset Building
| File | Purpose |
|------|---------|
| `labeler.py` | Directive, policy, format labeling |
| `pair_generator.py` | DPO pair generation (5 failure modes) |
| `exporter.py` | JSONL export |
| `schema.py` | CTv3 record schema |
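To make the pair generation and JSONL export steps concrete, here is a small sketch. The five failure modes are the ones named in the training pipeline diagram; the record fields `prompt`, `chosen`, and `rejected` follow a common DPO convention and are an assumption, since the actual CTv3 record schema is not shown in this document.

```python
import json
from dataclasses import dataclass
from enum import Enum


class FailureMode(Enum):
    CONFIRMATION_REFLEX = "confirmation_reflex"
    OPTION_SPAM = "option_spam"
    OMISSION = "omission"
    FORMAT_DRIFT = "format_drift"
    PERMISSION_SEEKING = "permission_seeking"


@dataclass
class DPOPair:
    """Hypothetical preference pair; field names are assumed, not the CTv3 schema."""
    prompt: str
    chosen: str
    rejected: str
    failure_mode: FailureMode

    def to_record(self) -> dict:
        return {
            "prompt": self.prompt,
            "chosen": self.chosen,
            "rejected": self.rejected,
            "failure_mode": self.failure_mode.value,
        }


def to_jsonl(pairs: list[DPOPair]) -> str:
    """Serialize pairs as one JSON object per line."""
    return "\n".join(json.dumps(p.to_record()) for p in pairs)
```

Each line of the output can be parsed independently with `json.loads`, which is what makes JSONL convenient for streaming large training sets.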
#### 5.4 Training
| File | Purpose |
|------|---------|
| `pipeline.py` | Together AI DPO training |
| `v2_generator.py` | V2 model for preferred responses |
| `batch_generator.py` | 400K context batch generation |
#### 5.5 Evaluation
| File | Purpose |
|------|---------|
| `suite.py` | Regression test suite |
| `scorers.py` | Policy, format, content scoring |
| `reporter.py` | Markdown/JSON reports |
6. FunctionGemma (`rag_plusplus/ml/functiongemma/`)
| File | Purpose |
|---|---|
| `runtime.py` | Inference runtime |
| `kernel_bridge.py` | Rust kernel integration |
| `data_format.py` | FunctionGemma format utils |
| `dataset.py` | Training dataset |
7. Attention Mechanisms (`rag_plusplus/ml/attention/`)
| File | Purpose |
|---|---|
| `inverse.py` | Inverse attention for trajectory weighting |
| `ring.py` | Ring attention for long contexts |
| `dual_ring.py` | Dual ring (local + global) |
| `bias.py` | Position bias computation |
| `contextual.py` | Context-aware attention |
8. TPO (`rag_plusplus/tpo/`)
| File | Purpose |
|---|---|
| `pipeline.py` | Trajectory Preference Optimization |
| `retrieval.py` | TPO-aware retrieval |
| `consolidation/` | Trajectory consolidation |
---
Primitive Scores Reference
Quality Primitives
| Score | Range | Description |
|---|---|---|
| `stall_score` | 0-10 | Permission-seeking behavior |
| `exec_score` | 0-10 | Execution/action content |
| `directive_completeness` | 0.0-1.0 | How complete the request is |
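One simple way to produce bounded 0-10 scores like these is phrase counting with a clamp. The sketch below assumes that approach; the phrase lists and the per-hit weight are hypothetical placeholders, and the real values in `constants.py` and the scoring logic in `classifier.py` may differ.

```python
# Hypothetical phrase lists and weight; not the contents of constants.py.
STALL_PHRASES = ("would you like me to", "shall i", "do you want me to", "let me know if")
EXEC_MARKERS = ("def ", "here is the", "i have updated", "implemented")
HIT_WEIGHT = 3


def _phrase_score(text: str, phrases: tuple[str, ...]) -> int:
    """Count phrase occurrences, weight them, and clamp to the 0-10 range."""
    lowered = text.lower()
    hits = sum(lowered.count(phrase) for phrase in phrases)
    return min(10, hits * HIT_WEIGHT)


def stall_score(text: str) -> int:
    """Higher means more permission-seeking language."""
    return _phrase_score(text, STALL_PHRASES)


def exec_score(text: str) -> int:
    """Higher means more execution or action content."""
    return _phrase_score(text, EXEC_MARKERS)
```

For example, a reply that says "Would you like me to proceed? Let me know if that works." matches two stall phrases and no execution markers under these lists.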
Classification
| Type | Description |
|---|---|
| `unjustified` | Assistant asked when it shouldn't have |
| `justified` | Assistant correctly asked for clarification |
| `neutral` | No question asked |
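The three classes can be illustrated with a small decision rule. This sketch assumes, as a simplification, that a question counts as unjustified when the user's request was already complete enough to act on; the `?` check and the `threshold` value are hypothetical and much cruder than a production classifier would be.

```python
def classify_clarification(
    assistant_text: str,
    directive_completeness: float,
    threshold: float = 0.7,  # hypothetical cutoff
) -> str:
    """Label an assistant turn as 'neutral', 'justified', or 'unjustified'."""
    asked_question = "?" in assistant_text
    if not asked_question:
        return "neutral"
    # A question after a sufficiently complete directive is treated as unjustified.
    if directive_completeness >= threshold:
        return "unjustified"
    return "justified"
```

With these assumptions, asking "Which file?" after a fully specified request is labeled `unjustified`, while the same question after a vague request is labeled `justified`.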
Lifecycle (Idea Vault)
| Role | Status | Quality Adjustment |
|---|---|---|
| idea | inbox | +0.15 |
| idea | workbench | +0.25 |
| idea | resolved | +0.20 |
| idea | deprecated | -0.10 |
| claim | active | +0.10 |
| claim | verified | +0.30 |
| claim | falsified | -0.20 |
| claim | deprecated | -0.40 |
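The adjustments in this table can be applied as a lookup during reranking. The sketch below copies the values from the table and assumes they are added to a base similarity score; the additive combination, the candidate dictionary shape, and the function names are assumptions.

```python
# Values taken from the lifecycle table; keyed by (role, lifecycle_status).
LIFECYCLE_ADJUSTMENT = {
    ("idea", "inbox"): 0.15,
    ("idea", "workbench"): 0.25,
    ("idea", "resolved"): 0.20,
    ("idea", "deprecated"): -0.10,
    ("claim", "active"): 0.10,
    ("claim", "verified"): 0.30,
    ("claim", "falsified"): -0.20,
    ("claim", "deprecated"): -0.40,
}


def adjusted_score(candidate: dict) -> float:
    """Add the lifecycle adjustment to the base score; unknown pairs get 0.0."""
    key = (candidate.get("role"), candidate.get("lifecycle_status"))
    return candidate["score"] + LIFECYCLE_ADJUSTMENT.get(key, 0.0)


def rerank(candidates: list[dict]) -> list[dict]:
    """Sort candidates by adjusted score, highest first."""
    return sorted(candidates, key=adjusted_score, reverse=True)
```

Under this scheme a verified claim with a base score of 0.6 outranks a plain conversation turn at 0.7, while a deprecated claim at 0.9 falls below both.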
---
API Endpoints
RAG++ Service (Port 8000)
| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Health check |
| `/api/rag/search` | GET | Semantic search |
| `/api/rag/ingest` | POST | Ingest new turn |
| `/api/rag/context` | POST | Get trajectory context |
| `/api/cognitivetwin/generate` | POST | Generate with CognitiveTwin |
| `/idea-vault/ideas` | POST/GET | Create/list ideas |
| `/idea-vault/claims` | POST/GET | Create/list claims |
| `/idea-vault/critique` | POST | Generate AI critique |
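As an illustration of calling two of these endpoints, the sketch below builds (but does not send) requests with the Python standard library. The paths, methods, and port come from the table; the host, the query parameter names `q` and `limit`, and the JSON body fields are assumptions, since request schemas are not documented here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # port from the table; host is an assumption


def search_request(query: str, limit: int = 5) -> Request:
    """Build a GET request for /api/rag/search (parameter names are hypothetical)."""
    params = urlencode({"q": query, "limit": limit})
    return Request(f"{BASE_URL}/api/rag/search?{params}", method="GET")


def ingest_request(role: str, content_text: str) -> Request:
    """Build a POST request for /api/rag/ingest (body fields are hypothetical)."""
    body = json.dumps({"role": role, "content_text": content_text}).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/rag/ingest",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing either object to `urllib.request.urlopen` would send it to a locally running service; check the service's own schema before relying on these field names.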
Orbit Server (Port 3847)
| Endpoint | Method | Description |
|---|---|---|
| `/api/projects` | GET/POST | Project management |
| `/api/context/query` | POST | Unified context query |
| `/api/sessions` | GET | Active AI sessions |
| `/api/inbox` | GET/POST | Inbox management |
| `/api/dispatch` | POST | Task dispatch |
---
Data Flow
1. Ingestion Flow
Source (ChatGPT/Claude/Cursor)
│
▼
┌──────────────────┐
│ Source Ingester │ Extract conversations
└────────┬─────────┘
│
▼
┌──────────────────┐
│ Unified Ingester │ Normalize schema
└────────┬─────────┘
│
▼
┌──────────────────┐
│ Embedder │ Generate 768-dim vectors
└────────┬─────────┘
│
▼
┌──────────────────┐
│ Trajectory Calc │ Compute 5D coordinates
└────────┬─────────┘
│
▼
┌──────────────────┐
│Primitive Enricher│ Compute stall/exec/domain
└────────┬─────────┘
│
▼
┌──────────────────┐
│ memory_turns │ Store in database
└──────────────────┘
2. Retrieval Flow
User Query
│
▼
┌──────────────────┐
│ Intent Analyzer │ Detect directive, domain, task
└────────┬─────────┘
│
▼
┌──────────────────┐
│ Vector Search │ pgvector similarity
└────────┬─────────┘
│
▼
┌──────────────────┐
│Quality Reranker │ Apply lifecycle + primitive adjustments
└────────┬─────────┘
│
▼
┌──────────────────┐
│ Graph Expansion │ Fetch parent/child turns
└────────┬─────────┘
│
▼
┌──────────────────┐
│ Prior Bundle │ Generate outcome priors
└──────────────────┘
3. Generation Flow
Retrieved Context
│
▼
┌──────────────────┐
│ Policy Injector │ Add question policy, format hints
└────────┬─────────┘
│
▼
┌──────────────────┐
│Enhanced Engine │ Build final prompt
└────────┬─────────┘
│
▼
┌──────────────────┐
│ LLM Call │ Generate response
└────────┬─────────┘
│
▼
┌──────────────────┐
│ Post Validator │ Check for stall violations
└────────┬─────────┘
│
▼
Response
---
Training Pipeline
CognitiveTwin V3 Training
┌───────────────────────────────────────────────────────────────────┐
│ DATA SOURCES │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────────┐ │
│ │Supabase │ │ Prompt │ │ Claude │ │ OpenAI │ │ Codebase │ │
│ │ Turns │ │ Logger │ │ JSON │ │ JSON │ │ (Repo) │ │
│ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ └──────┬──────┘ │
└───────┼──────────┼──────────┼──────────┼──────────────┼─────────┘
│ │ │ │ │
▼ ▼ ▼ ▼ ▼
┌───────────────────────────────────────────────────────────────────┐
│ UNIFIED INGESTION │
│ • Normalize to UnifiedConversation schema │
│ • Deduplicate by content hash │
└───────────────────────────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────────┐
│ CORPUS SURGERY │
│ • Classify unjustified clarifications │
│ • Quarantine friction segments │
│ • Generate rewritten preferred responses (GPT-5-mini) │
└───────────────────────────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────────┐
│ DATA AUGMENTATION │
│ ┌─────────────┐ ┌─────────────────┐ ┌─────────────────────┐ │
│ │ Repo Worm │ │Conversation Worm│ │ Enhancer Agent │ │
│ │ Code tasks │ │ Branch convos │ │ Canonicalize/Complete│ │
│ └─────────────┘ └─────────────────┘ └─────────────────────┘ │
└───────────────────────────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────────┐
│ DPO PAIR GENERATION │
│ • Confirmation reflex pairs │
│ • Option spam pairs │
│ • Omission pairs │
│ • Format drift pairs │
│ • Permission-seeking pairs │
└───────────────────────────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────────┐
│ TOGETHER AI TRAINING │
│ • Stage 1: SFT on preferred examples │
│ • Stage 2: DPO on preference pairs │
│ • Base: meta-llama/Meta-Llama-3.1-8B-Instruct-Reference │
└───────────────────────────────────────────────────────────────────┘
│
▼
┌───────────────────────────────────────────────────────────────────┐
│ EVALUATION SUITE │
│ • Policy compliance (question policy adherence) │
│ • Format adherence (code blocks, lists) │
│ • Content quality (completeness, accuracy) │
│ • Regression tests (historical annoyances) │
└───────────────────────────────────────────────────────────────────┘
---
Deployment
Local Development
# Start RAG++ service
cd cc-rag-plus-plus
source .venv/bin/activate
python -m rag_plusplus.service.app
# Start Orbit server (separate terminal)
cd cc-orbit
cargo run
Cloud Deployment
| Component | Platform | Notes |
|---|---|---|
| RAG++ Service | Cloud Run | Stateless, auto-scaling |
| Orbit Server | Cloud Run | Rust binary |
| Database | Supabase | PostgreSQL + pgvector |
| Training | Together AI / RunPod | DPO fine-tuning |
| FunctionGemma | Vertex AI | CPU training |
---
Key Invariants
1. Every turn has trajectory coordinates — No orphan data
2. Ideas/claims auto-index to memory_turns — Unified fabric
3. Edges form a DAG — No cycles in conversation graph
4. Salience is bounded [0, 1] — Normalized importance
5. Phases are exhaustive — Every turn has a phase
6. Primitives are computed at ingestion — Ready for retrieval filtering
7. Lifecycle status syncs to memory_turns — Triggers maintain consistency
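Two of these invariants are easy to check mechanically. The sketch below tests invariant 3 with Kahn's topological sort and invariant 4 with a bounds check, assuming edges are `(parent_id, child_id)` pairs; the function names are hypothetical and not part of the documented codebase.

```python
from collections import defaultdict, deque


def is_dag(edges: list[tuple[str, str]]) -> bool:
    """Return True if the directed edges contain no cycle (Kahn's algorithm)."""
    adjacency = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for parent, child in edges:
        adjacency[parent].append(child)
        indegree[child] += 1
        nodes.update((parent, child))

    # Start from nodes with no incoming edges and peel the graph layer by layer.
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in adjacency[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Any node left unvisited sits on a cycle.
    return visited == len(nodes)


def salience_in_bounds(scores: list[float]) -> bool:
    """Return True if every salience score lies in [0, 1]."""
    return all(0.0 <= s <= 1.0 for s in scores)
```

Running `is_dag` over the rows of `memory_turn_edges` and `salience_in_bounds` over `salience_score` would give a quick consistency check after bulk ingestion.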
---
Current Statistics
| Metric | Value |
|---|---|
| Total Memory Turns | 107,099+ |
| Turn Edges | 44,863 |
| Max Trajectory Depth | 987 |
| Embedding Dimension | 768 |
| Trained Models | CognitiveTwin V2 (Llama 3.1 8B) |
| Training Data | 29,348 SFT + 140+ DPO pairs |
Source Anchor
Comp-Core/core/retrieval/cc-rag-plus-plus/docs/ARCHITECTURE.md