Graph Kernel Evo³ — Evolution Cubed

**OpenClaw CompCore — Three-Phase Evolution Roadmap** **Version:** 1.0.0 · **Date:** 2026-02-14 **Baseline:** Graph Kernel v0.1.0, DEP Audit Score 7.4/10 **Author:** Mohamed Diomande

---

Overview

This document presents three evolutionary trajectories for the Graph Kernel, each building on the previous:

1. Evolution 1: Optimization — Maximize performance and correctness within the current architecture.
2. Evolution 2: Expansion — Add new capabilities that extend the system's reach.
3. Evolution 3: Transformation — Reimagine what the Graph Kernel could become.

Each evolution includes concrete implementation steps, estimated effort, dependencies, and risk assessment.

---

Evolution 1: Optimization

What can be improved within the current architecture?

1.1 SQLite Backend Migration (Native Rust)

Problem: 90

Solution: Implement `SqliteGraphStore` as a native Rust backend behind the existing `GraphStore` trait, with automatic sync from Supabase.

Implementation Plan

Phase 1: SqliteGraphStore Implementation (1 week)
├── src/store/sqlite.rs — New GraphStore implementation
│   ├── Use sqlx with sqlite feature (compile-time checked)
│   ├── Same schema as knowledge_graph table
│   ├── WAL mode for concurrent reads
│   ├── In-memory option for testing (`:memory:`)
│   └── File-based for production ([home-path])
├── Cargo.toml — New feature flag: `sqlite`
│   ├── sqlx = { features = ["sqlite", "runtime-tokio"] }
│   └── Feature: sqlite = ["sqlx/sqlite", "tokio"]
└── src/bin/graph_kernel_service.rs — Backend selection
    └── Match on DB_BACKEND env var: "sqlite" | "postgres" | "auto"

Phase 2: Sync Engine (3 days)
├── src/sync/mod.rs — Bidirectional sync module
│   ├── Full dump: Supabase → SQLite (initial load)
│   ├── Incremental: Poll Supabase for changes since last_sync_at
│   ├── Write-through: Local writes → SQLite + queue for Supabase push
│   └── Conflict resolution: Last-writer-wins with source tracking
└── Background task: tokio::spawn sync every 5 minutes

Phase 3: Hybrid Mode (2 days)
├── Reads: Always local SQLite (sub-10ms)
├── Writes: Local SQLite + async push to Supabase
├── Sync status in /health response
└── Manual sync trigger: POST /api/admin/sync

#### Effort: 2 weeks
#### Dependencies: sqlx sqlite feature, existing GraphStore trait
#### Risk: LOW — The trait abstraction already exists. SQLite is well-tested with sqlx. The only risk is sync conflicts, mitigated by last-writer-wins semantics.
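The last-writer-wins rule named above as the sync-conflict mitigation could look roughly like the sketch below. The `SyncedTriple` shape, the `origin` field, and the millisecond timestamp are assumptions made for illustration; the source does not specify the sync row format.

```rust
use std::collections::HashMap;

/// Hypothetical row shape used only for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct SyncedTriple {
    pub subject: String,
    pub predicate: String,
    pub object: String,
    pub confidence: f64,
    /// Which side produced this version, e.g. "local" or "supabase".
    pub origin: String,
    /// Last modification time in Unix milliseconds.
    pub updated_at_ms: i64,
}

type TripleKey = (String, String, String);

fn key_of(t: &SyncedTriple) -> TripleKey {
    (t.subject.clone(), t.predicate.clone(), t.object.clone())
}

/// Merge remote rows into the local set. For each (subject, predicate, object)
/// the version with the newer timestamp wins; ties keep the local version.
pub fn merge_last_writer_wins(
    local: Vec<SyncedTriple>,
    remote: Vec<SyncedTriple>,
) -> Vec<SyncedTriple> {
    let mut merged: HashMap<TripleKey, SyncedTriple> =
        local.into_iter().map(|t| (key_of(&t), t)).collect();

    for incoming in remote {
        let key = key_of(&incoming);
        let keep_existing = merged
            .get(&key)
            .map_or(false, |existing| existing.updated_at_ms >= incoming.updated_at_ms);
        if !keep_existing {
            merged.insert(key, incoming);
        }
    }

    // Sort for a deterministic result; HashMap iteration order is unspecified.
    let mut out: Vec<SyncedTriple> = merged.into_values().collect();
    out.sort_by(|a, b| key_of(a).cmp(&key_of(b)));
    out
}
```

Keeping `origin` on the winning row is what makes the "source tracking" part auditable: after a merge it is still visible which side supplied each surviving version.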

#### Expected Impact
| Metric | Before | After |
|--------|--------|-------|
| Query latency | 291ms | 5–15ms |
| Multi-hop (3 hops) | 874ms | 15–45ms |
| Availability | Depends on Supabase | Works offline |

---

1.2 Server-Side Multi-Hop Traversal

Problem: Multi-hop queries require N HTTP round-trips. A 3-hop traversal costs about 3 × 291ms ≈ 874ms (or 3 × 10ms = 30ms with SQLite).

Solution: New `POST /api/knowledge/traverse` endpoint that performs BFS/DFS traversal server-side.

API Design

json

// Request
POST /api/knowledge/traverse
{
  "start": "clawdbot",           // Starting entity
  "predicates": ["uses", "depends_on"],  // Edge types to follow (null = any)
  "direction": "outgoing",       // "outgoing" | "incoming" | "both"
  "max_hops": 3,                 // Maximum traversal depth
  "max_results": 100,            // Result set cap
  "min_confidence": 0.5,         // Confidence floor
  "return_paths": true           // Include full path for each result
}

// Response
{
  "paths": [
    {
      "entities": ["clawdbot", "graph-kernel", "postgresql"],
      "edges": [
        {"subject": "clawdbot", "predicate": "uses", "object": "graph-kernel"},
        {"subject": "graph-kernel", "predicate": "uses", "object": "postgresql"}
      ],
      "hops": 2,
      "min_confidence": 0.85
    }
  ],
  "stats": {
    "entities_visited": 42,
    "edges_traversed": 67,
    "elapsed_ms": 12
  }
}

Implementation

rust

// src/service/traversal.rs
use std::collections::{HashSet, VecDeque};
use std::sync::Arc;

use axum::{extract::State, http::StatusCode, Json};

pub async fn traverse_handler(
    State(state): State<Arc<AppState>>,
    Json(request): Json<TraversalRequest>,
) -> Result<Json<TraversalResponse>, (StatusCode, Json<ErrorResponse>)> {
    let pool = state.store.pool();
    let mut visited: HashSet<String> = HashSet::new();
    let mut frontier: VecDeque<(String, Vec<TraversalEdge>, u32)> = VecDeque::new();
    let mut paths: Vec<TraversalPath> = Vec::new();

    frontier.push_back((request.start.clone(), vec![], 0));
    visited.insert(request.start.clone());

    'bfs: while let Some((entity, path, depth)) = frontier.pop_front() {
        // An entity already `max_hops` edges from the start is not expanded,
        // so no returned path has more than `max_hops` edges.
        if depth >= request.max_hops {
            continue;
        }

        // Query adjacent triples in one SQL call
        let triples = query_adjacent(pool, &entity, &request).await?;

        for triple in triples {
            // The neighbor is whichever endpoint is not the current entity,
            // which also covers direction == "both".
            let next_entity = if triple.subject == entity {
                triple.object.clone()
            } else {
                triple.subject.clone()
            };

            // `insert` returns false when the entity was already visited
            if visited.insert(next_entity.clone()) {
                let mut new_path = path.clone();
                new_path.push(triple);
                paths.push(TraversalPath::from_edges(&new_path));
                // Enforce the result cap per path, not only per frontier entry
                if paths.len() >= request.max_results {
                    break 'bfs;
                }
                frontier.push_back((next_entity, new_path, depth + 1));
            }
        }
    }

    Ok(Json(TraversalResponse { paths, stats: ... }))
}

#### Effort: 3 days
#### Dependencies: None (uses existing pool)
#### Risk: LOW — Standard BFS over SQL. The `max_hops` and `max_results` caps prevent unbounded traversal.
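To make the effect of the two caps concrete, here is a minimal in-memory sketch of the same breadth-first traversal, with no SQL involved. The function name `bfs_paths` and the tuple edge representation are illustrative assumptions, not part of the service.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first traversal over (subject, predicate, object) edges, following
/// outgoing edges only. Returns one path (as a list of entities) per newly
/// reached entity, never longer than `max_hops` edges, at most `max_results` paths.
pub fn bfs_paths<'a>(
    edges: &[(&'a str, &'a str, &'a str)],
    start: &'a str,
    max_hops: usize,
    max_results: usize,
) -> Vec<Vec<&'a str>> {
    let mut adjacency: HashMap<&'a str, Vec<&'a str>> = HashMap::new();
    for &(subject, _predicate, object) in edges {
        adjacency.entry(subject).or_default().push(object);
    }

    let mut visited: HashSet<&'a str> = HashSet::from([start]);
    let mut frontier: VecDeque<Vec<&'a str>> = VecDeque::from([vec![start]]);
    let mut paths: Vec<Vec<&'a str>> = Vec::new();
    if max_results == 0 {
        return paths;
    }

    while let Some(path) = frontier.pop_front() {
        let hops = path.len() - 1;
        if hops >= max_hops {
            continue; // depth cap: do not expand entities at the hop limit
        }
        let current = path[path.len() - 1];
        let neighbors = adjacency.get(current).map(Vec::as_slice).unwrap_or(&[]);
        for &next in neighbors {
            if visited.insert(next) {
                let mut extended = path.clone();
                extended.push(next);
                paths.push(extended.clone());
                if paths.len() >= max_results {
                    return paths; // result cap
                }
                frontier.push_back(extended);
            }
        }
    }
    paths
}
```

With a chain of three edges starting at `clawdbot`, `max_hops = 2` yields two paths and `max_hops = 3` yields three, so the work done is bounded by the caps rather than by the size of the graph.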

---

1.3 Entity Normalization at Service Level

Problem: Entity normalization lives in a Python middleware outside the Rust service. Clients that bypass the middleware create duplicate entities.

Solution: A `canonicalize_entity()` function called in route handlers before database writes and in query handlers before lookups.

Implementation

rust

// src/service/normalize.rs

use std::collections::HashMap;
use std::sync::LazyLock;

/// Canonical entity normalization rules
static ALIAS_MAP: LazyLock<HashMap<&str, &str>> = LazyLock::new(|| {
    let mut m = HashMap::new();
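    // Normalize common aliases. Keys must already be in normalized form
    // (lowercase, hyphen-separated) because the lookup runs after
    // normalization: "dream weaver" arrives here as "dream-weaver".
    m.insert("dreamweaver", "dream-weaver-engine");
    m.insert("dream-weaver", "dream-weaver-engine");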
    m.insert("clawdbot", "clawdbot");
    m.insert("clawdbot-gateway", "clawdbot");
    m.insert("comp-core", "comp-core");
    m.insert("compcore", "comp-core");
    // ... loaded from config file or embedded resource
    m
});

/// Canonicalize an entity name.
///
/// Steps:
/// 1. Lowercase
/// 2. Trim whitespace
/// 3. Replace spaces/underscores with hyphens
/// 4. Check alias map for known variants
/// 5. Return canonical form
pub fn canonicalize_entity(name: &str) -> String {
    let normalized = name
        .trim()
        .to_lowercase()
        .replace(' ', "-")
        .replace('_', "-");

    ALIAS_MAP
        .get(normalized.as_str())
        .map(|s| s.to_string())
        .unwrap_or(normalized)
}

Apply in route handlers:

rust

// In add_knowledge_handler:
let triple = KnowledgeTriple {
    subject: canonicalize_entity(&triple.subject),
    object: canonicalize_entity(&triple.object),
    predicate: triple.predicate.to_lowercase(),
    ..triple
};

// In query_knowledge_handler:
let subject = params.subject.map(|s| canonicalize_entity(&s));
let object = params.object.map(|s| canonicalize_entity(&s));

#### Effort: 2 days
#### Dependencies: None
#### Risk: LOW — The alias map can start small and grow. False positives (incorrectly merging distinct entities) are the main risk, mitigated by making the map configurable.

#### Expected Impact
| Metric | Before | After |
|--------|--------|-------|
| Unique subjects | 221 (with duplicates) | ~190 (canonical) |
| Relationship relevance | 0.94 | 1.00 |
| Predicate relevance | 0.80 | 0.95+ |

---

1.4 Query Caching Layer

Problem: Identical knowledge graph queries hit PostgreSQL (or SQLite) every time.

Solution: LRU cache keyed on query parameters with TTL-based invalidation.

Implementation

rust

// src/service/cache.rs

use lru::LruCache;
use parking_lot::RwLock;
use std::hash::{Hash, Hasher};
use std::num::NonZeroUsize;
use std::time::{Duration, Instant};

pub struct QueryCache {
    cache: RwLock<LruCache<u64, CachedResult>>,
    ttl: Duration,
}

struct CachedResult {
    response: KnowledgeQueryResponse,
    cached_at: Instant,
}

impl QueryCache {
    pub fn new(max_entries: usize, ttl_secs: u64) -> Self {
        Self {
            cache: RwLock::new(LruCache::new(
                NonZeroUsize::new(max_entries).unwrap()
            )),
            ttl: Duration::from_secs(ttl_secs),
        }
    }

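    pub fn get(&self, key: &KnowledgeQueryParams) -> Option<KnowledgeQueryResponse> {
        let hash = hash_params(key);
        let cache = self.cache.read();
        // `peek` works through a shared reference, so a read lock suffices, but
        // it does not refresh recency: eviction order follows the last `put`,
        // not the last read. Expired entries stay until evicted or cleared.
        cache.peek(&hash)
            .filter(|r| r.cached_at.elapsed() < self.ttl)
            .map(|r| r.response.clone())
    }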

    pub fn put(&self, key: &KnowledgeQueryParams, response: KnowledgeQueryResponse) {
        let hash = hash_params(key);
        let mut cache = self.cache.write();
        cache.put(hash, CachedResult {
            response,
            cached_at: Instant::now(),
        });
    }

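    /// Invalidate all entries (called on write)
    pub fn invalidate(&self) {
        self.cache.write().clear();
    }
}

/// Hash the query parameters into the cache key.
/// Assumes `KnowledgeQueryParams` implements `Hash`. Two distinct queries can
/// collide on a 64-bit hash; store the params next to the response if that matters.
fn hash_params(params: &KnowledgeQueryParams) -> u64 {
    let mut hasher = std::collections::hash_map::DefaultHasher::new();
    params.hash(&mut hasher);
    hasher.finish()
}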

Cache invalidation strategy:
- Reads: Check cache first, fall through to DB on miss
- Writes: `invalidate()` on any triple insert/update/delete
- TTL: 60 seconds default, configurable via `GK_CACHE_TTL_SECS`
- Size: 1,000 entries default, configurable via `GK_CACHE_MAX_ENTRIES`
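The read path and write path in the strategy above can be sketched with the standard library alone. This is a simplified illustration, not the service's cache: `ReadThroughCache`, `get_or_load`, and the `db_hits` counter are hypothetical names, keys and values are plain strings, and the LRU size bound is left out.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Minimal read-through cache with TTL and invalidate-on-write.
pub struct ReadThroughCache {
    entries: HashMap<String, (String, Instant)>,
    ttl: Duration,
    /// Number of times the loader (standing in for the database) was called.
    pub db_hits: usize,
}

impl ReadThroughCache {
    pub fn new(ttl: Duration) -> Self {
        Self { entries: HashMap::new(), ttl, db_hits: 0 }
    }

    /// Reads: return a fresh cached value, otherwise fall through to `load`.
    pub fn get_or_load(&mut self, key: &str, load: impl FnOnce() -> String) -> String {
        if let Some((value, cached_at)) = self.entries.get(key) {
            if cached_at.elapsed() < self.ttl {
                return value.clone();
            }
        }
        self.db_hits += 1;
        let value = load();
        self.entries
            .insert(key.to_string(), (value.clone(), Instant::now()));
        value
    }

    /// Writes: drop everything so no reader sees pre-write results.
    pub fn invalidate(&mut self) {
        self.entries.clear();
    }
}
```

Clearing the whole cache on every write is coarse but simple to reason about; it suits a read-heavy workload where writes are rare compared with repeated queries.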

#### Effort: 1 day
#### Dependencies: `lru`, `parking_lot` (already in Cargo.toml)
#### Risk: LOW — Cache invalidation on writes prevents stale data. TTL provides a safety net.

---

1.5 Connection Pooling Optimization

Quick wins requiring only configuration changes:

| Change | Current | Recommended | Impact |
|--------|---------|-------------|--------|
| `test_before_acquire` | `true` | `false` (local), `true` (remote) | -200ms on first query per connection |
| `min_connections` | 2 | 5 | Faster burst response |
| `max_connections` | 10 | 20 (local SQLite supports it) | 2× concurrent capacity |
| `idle_timeout` | 300s | 600s (local) | Fewer reconnections |

Add env var: `DB_TEST_BEFORE_ACQUIRE=false` for local deployments.
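One way to read that flag is sketched below. The helper name `parse_test_before_acquire` and the accepted spellings (`true`/`false`/`1`/`0`) are assumptions; the source only names the variable.

```rust
/// Interpret the raw value of `DB_TEST_BEFORE_ACQUIRE`.
/// Unset or unrecognized values fall back to `default`.
pub fn parse_test_before_acquire(raw: Option<&str>, default: bool) -> bool {
    match raw.map(|v| v.trim().to_ascii_lowercase()).as_deref() {
        Some("true") | Some("1") => true,
        Some("false") | Some("0") => false,
        _ => default,
    }
}
```

At startup this could be called as `parse_test_before_acquire(std::env::var("DB_TEST_BEFORE_ACQUIRE").ok().as_deref(), true)`, keeping the safer remote behaviour as the default.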

#### Effort: 1 hour
#### Risk: NONE — Configuration-only changes

---

Evolution 1 Summary

| Optimization | Effort | Impact | Priority |
|--------------|--------|--------|----------|
| SQLite backend | 2 weeks | 20× latency improvement | P0 |
| Multi-hop traversal | 3 days | Eliminates N-trip penalty | P0 |
| Entity normalization | 2 days | +0.06–0.15 relevance | P1 |
| Query caching | 1 day | 10× for repeated queries | P1 |
| Pool optimization | 1 hour | Marginal improvement | P2 |
| Total | ~3.5 weeks | | |

Projected DEP Audit Score After Evo 1: 8.6/10 (+1.2 from baseline 7.4)

---

Evolution 2: Expansion

What new capabilities should the Graph Kernel have?

2.1 Real-Time WebSocket Subscriptions

What: Clients subscribe to triple changes via WebSocket and receive push notifications when entities they care about are created, updated, or deleted.

Why: Enables reactive UIs, downstream pipeline triggers, and event-driven architecture without polling.

API Design

# Connect
WS /api/knowledge/subscribe

# Subscribe message
{
  "action": "subscribe",
  "filter": {
    "subjects": ["clawdbot", "graph-kernel"],
    "predicates": ["uses", "depends_on"],
    "min_confidence": 0.7
  }
}

# Notification
{
  "event": "triple_created",
  "triple": {
    "subject": "clawdbot",
    "predicate": "uses",
    "object": "gemini-2.5",
    "confidence": 0.95,
    "source": "topology-ingester"
  },
  "timestamp": "2026-02-14T12:00:00Z"
}

Implementation

rust

// src/service/websocket.rs
use axum::extract::ws::{WebSocket, WebSocketUpgrade};
use tokio::sync::broadcast;

/// Broadcast channel for triple change events
pub struct TripleEventBus {
    sender: broadcast::Sender<TripleEvent>,
}

impl TripleEventBus {
    pub fn new(capacity: usize) -> Self {
        let (sender, _) = broadcast::channel(capacity);
        Self { sender }
    }

    pub fn publish(&self, event: TripleEvent) {
        // Ignore error (no subscribers)
        let _ = self.sender.send(event);
    }

    pub fn subscribe(&self) -> broadcast::Receiver<TripleEvent> {
        self.sender.subscribe()
    }
}

// In route handlers, after successful triple insert:
state.event_bus.publish(TripleEvent::Created(triple));

#### Effort: 3 days
#### Dependencies: `axum` (WebSocket support built-in), `tokio::sync::broadcast`
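#### Risk: LOW — The broadcast channel pattern is well-established. Memory is bounded by channel capacity. A slow subscriber is not blocked on: once it falls more than `capacity` events behind, its next `recv()` returns `RecvError::Lagged` and it resumes from the oldest retained event, so the handler must decide whether to continue or disconnect.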
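Each WebSocket connection still has to decide which broadcast events to forward. A possible matching rule for the `filter` object in the subscribe message is sketched here; `SubscriptionFilter`, `TripleChange`, and the convention that an empty list means "match any" are assumptions, since the source does not define filter semantics.

```rust
/// Mirrors the `filter` object of the subscribe message.
#[derive(Debug, Clone, Default)]
pub struct SubscriptionFilter {
    pub subjects: Vec<String>,
    pub predicates: Vec<String>,
    pub min_confidence: f64,
}

/// The fields of a changed triple that the filter inspects.
#[derive(Debug, Clone)]
pub struct TripleChange {
    pub subject: String,
    pub predicate: String,
    pub object: String,
    pub confidence: f64,
}

impl SubscriptionFilter {
    /// True when the change should be pushed to this subscriber.
    /// An empty `subjects` or `predicates` list is treated as "any".
    pub fn matches(&self, change: &TripleChange) -> bool {
        let subject_ok =
            self.subjects.is_empty() || self.subjects.contains(&change.subject);
        let predicate_ok =
            self.predicates.is_empty() || self.predicates.contains(&change.predicate);
        subject_ok && predicate_ok && change.confidence >= self.min_confidence
    }
}
```

Filtering per connection after the broadcast keeps the publisher side trivial: route handlers publish every event once, and each subscriber task drops what it did not ask for.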

---

2.2 Graph Visualization Endpoint

What: An endpoint that returns graph data in formats consumable by D3.js, Mermaid, or Graphviz for visual exploration.

Why: Dramatically improves developer experience. "Show me what clawdbot connects to" should produce a visual graph, not a JSON array of triples.

API Design

GET /api/knowledge/graph?subject=clawdbot&hops=2&format=d3
GET /api/knowledge/graph?subject=clawdbot&hops=2&format=mermaid
GET /api/knowledge/graph?subject=clawdbot&hops=2&format=dot

Response Formats

D3 (force-directed graph):

json

{
  "nodes": [
    {"id": "clawdbot", "group": "service", "weight": 15},
    {"id": "graph-kernel", "group": "service", "weight": 8},
    {"id": "postgresql", "group": "infrastructure", "weight": 3}
  ],
  "links": [
    {"source": "clawdbot", "target": "graph-kernel", "predicate": "uses", "confidence": 0.95},
    {"source": "graph-kernel", "target": "postgresql", "predicate": "uses", "confidence": 0.90}
  ]
}

Mermaid:

graph LR
    clawdbot -->|uses| graph-kernel
    graph-kernel -->|uses| postgresql
    clawdbot -->|uses| rag-plusplus

DOT (Graphviz):

digraph G {
    "clawdbot" -> "graph-kernel" [label="uses"];
    "graph-kernel" -> "postgresql" [label="uses"];
}

Implementation

This reuses the server-side traversal from §1.2, adding output format rendering:

rust

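// src/service/visualization.rs
use std::collections::HashSet;

use indexmap::IndexSet; // insertion-ordered set, from the `indexmap` crate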

pub fn render_d3(paths: &[TraversalPath]) -> D3Graph {
    let mut nodes = IndexSet::new();
    let mut links = Vec::new();

    for path in paths {
        for edge in &path.edges {
            nodes.insert(edge.subject.clone());
            nodes.insert(edge.object.clone());
            links.push(D3Link {
                source: edge.subject.clone(),
                target: edge.object.clone(),
                predicate: edge.predicate.clone(),
                confidence: edge.confidence,
            });
        }
    }

    D3Graph {
        nodes: nodes.into_iter().map(|id| D3Node {
            id: id.clone(),
            group: infer_group(&id),
            weight: links.iter().filter(|l| l.source == id || l.target == id).count(),
        }).collect(),
        links,
    }
}

pub fn render_mermaid(paths: &[TraversalPath]) -> String {
    let mut lines = vec!["graph LR".to_string()];
    let mut seen = HashSet::new();

    for path in paths {
        for edge in &path.edges {
            let key = format!("{}--{}-->{}", edge.subject, edge.predicate, edge.object);
            if seen.insert(key) {
                lines.push(format!(
                    "    {} -->|{}| {}",
                    sanitize_mermaid(&edge.subject),
                    edge.predicate,
                    sanitize_mermaid(&edge.object)
                ));
            }
        }
    }

    lines.join("\n")
}

#### Effort: 2 days
#### Dependencies: Server-side traversal (§1.2)
#### Risk: LOW — Pure rendering over existing data. No new state management.
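The DOT format is listed in the API design but has no renderer above. A minimal sketch in the same spirit as `render_mermaid` follows; the name `render_dot`, the tuple edge representation, and the escaping helper are illustrative assumptions.

```rust
use std::collections::HashSet;

/// Escape backslashes and double quotes for a DOT quoted string.
fn escape_dot(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Render (subject, predicate, object) edges as a Graphviz digraph,
/// emitting each distinct edge once in first-seen order.
pub fn render_dot(edges: &[(&str, &str, &str)]) -> String {
    let mut out = String::from("digraph G {\n");
    let mut seen = HashSet::new();

    for &(subject, predicate, object) in edges {
        if seen.insert((subject, predicate, object)) {
            out.push_str(&format!(
                "    \"{}\" -> \"{}\" [label=\"{}\"];\n",
                escape_dot(subject),
                escape_dot(object),
                escape_dot(predicate)
            ));
        }
    }

    out.push('}');
    out
}
```

Quoting every identifier sidesteps DOT's rules for bare IDs, so entity names containing hyphens such as `graph-kernel` need no special handling.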

---

2.3 Batch Ingest API (High-Performance)

What: Replace the sequential triple-per-row insert with a bulk `COPY` or multi-row `INSERT` for high-throughput ingestion.

Why: The current batch handler inserts one row at a time in a transaction loop. For the topology ingester's 5,000+ triples, this means 5,000 SQL round-trips (even if within a single transaction).

Implementation

rust

// Replace sequential inserts with a multi-row VALUES clause
pub async fn bulk_insert_triples(
    pool: &PgPool,
    triples: &[KnowledgeTriple],
) -> Result<BulkInsertResult, sqlx::Error> {
    // 500 rows × 5 columns = 2,500 bind parameters per statement, well under
    // PostgreSQL's limit of 65,535 parameters per query.
    let chunk_size = 500;
    // rows_affected() counts inserted and updated rows together, so `added`
    // is really "inserted or updated" and `updated` stays 0 in this version.
    let mut added = 0;
    let updated = 0;

    for chunk in triples.chunks(chunk_size) {
        // Build multi-row INSERT. A chunk must not contain the same
        // (subject, predicate, object) twice: PostgreSQL rejects an
        // ON CONFLICT DO UPDATE that would touch one row a second time.
        let mut query = String::from(
            "INSERT INTO knowledge_graph (subject, predicate, object, confidence, source) VALUES "
        );
        let mut params: Vec<String> = Vec::new();

        // One "($n, $n+1, ...)" placeholder group per row
        for i in 0..chunk.len() {
            let offset = i * 5;
            params.push(format!(
                "(${}, ${}, ${}, ${}, ${})",
                offset + 1, offset + 2, offset + 3, offset + 4, offset + 5
            ));
        }

        query.push_str(&params.join(", "));
        query.push_str(
            " ON CONFLICT (subject, predicate, object) DO UPDATE SET \
              confidence = GREATEST(knowledge_graph.confidence, EXCLUDED.confidence), \
              source = EXCLUDED.source"
        );

        // Execute with binds, in the same order as the placeholders
        let mut q = sqlx::query(&query);
        for triple in chunk {
            q = q.bind(&triple.subject)
                 .bind(&triple.predicate)
                 .bind(&triple.object)
                 .bind(triple.confidence)
                 .bind(&triple.source);
        }

        let result = q.execute(pool).await?;
        added += result.rows_affected() as usize;
    }

    Ok(BulkInsertResult { added, updated, total: triples.len() })
}

Alternatively, for the SQLite backend, enable SQLite's `PRAGMA journal_mode=WAL` and wrap the batch in a single transaction:

rust

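// SQLite-optimized batch
use sqlx::Connection; // brings `begin()` into scope for a single connection

// PRAGMAs are per-connection, so pin one connection for the whole batch.
let mut conn = pool.acquire().await?;
// Set before the transaction starts: SQLite does not allow changing the
// synchronous level inside one. OFF trades crash durability for speed.
sqlx::query("PRAGMA synchronous = OFF").execute(&mut *conn).await?;

let mut tx = conn.begin().await?;
for triple in &triples {
    sqlx::query("INSERT OR REPLACE INTO knowledge_graph ...")
        .bind(...)
        .execute(&mut *tx)
        .await?;
}

tx.commit().await?;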

#### Effort: 2 days
#### Dependencies: None
#### Risk: LOW — Multi-row INSERT is standard PostgreSQL. Chunking prevents parameter limit overflow.
#### Expected Impact: 10–50× faster ingestion (5,000 triples in <1s vs. current ~50s)
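A multi-row upsert has one precondition worth handling before the SQL is built: PostgreSQL rejects an `INSERT ... ON CONFLICT DO UPDATE` that would update the same row twice within one statement, so duplicates inside a batch need to be collapsed first. The sketch below does that with a hypothetical `Triple` struct, keeping the highest-confidence row per key to mirror the `GREATEST(...)` rule; unlike the SQL, it keeps that row's `source` rather than the latest one.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical in-memory triple used only for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Triple {
    pub subject: String,
    pub predicate: String,
    pub object: String,
    pub confidence: f64,
    pub source: String,
}

/// Collapse duplicate (subject, predicate, object) keys, keeping the row with
/// the highest confidence. Output preserves first-seen key order.
pub fn dedupe_for_upsert(triples: Vec<Triple>) -> Vec<Triple> {
    let mut order: Vec<(String, String, String)> = Vec::new();
    let mut best: HashMap<(String, String, String), Triple> = HashMap::new();

    for t in triples {
        let key = (t.subject.clone(), t.predicate.clone(), t.object.clone());
        match best.entry(key) {
            Entry::Occupied(mut slot) => {
                if t.confidence > slot.get().confidence {
                    slot.insert(t);
                }
            }
            Entry::Vacant(slot) => {
                order.push(slot.key().clone());
                slot.insert(t);
            }
        }
    }

    order.into_iter().filter_map(|k| best.remove(&k)).collect()
}
```

Running this over the full input before chunking guarantees that no chunk can contain a repeated conflict key.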

---

2.4 Temporal Versioning (Triple Validity Windows)

What: Add `valid_from` and `valid_until` columns to triples, enabling temporal knowledge management.

Why: Knowledge changes over time. "clawdbot uses gemini-1.5" was true in January but "clawdbot uses gemini-2.5" is true now. Without temporal versioning, stale knowledge persists at full confidence.

Schema Change

sql

ALTER TABLE knowledge_graph
ADD COLUMN valid_from TIMESTAMPTZ DEFAULT NOW(),
ADD COLUMN valid_until TIMESTAMPTZ DEFAULT NULL,
ADD COLUMN superseded_by BIGINT REFERENCES knowledge_graph(id) DEFAULT NULL;

-- Index for temporal queries
CREATE INDEX idx_knowledge_temporal
ON knowledge_graph (subject, predicate, valid_from, valid_until)
WHERE valid_until IS NULL;

API Changes

json

// Query with temporal filter
GET /api/knowledge?subject=clawdbot&at=2026-01-15T00:00:00Z

// Insert with validity
POST /api/knowledge
{
  "subject": "clawdbot",
  "predicate": "uses",
  "object": "gemini-2.5",
  "confidence": 0.95,
  "valid_from": "2026-02-01T00:00:00Z",
  "supersedes": {"subject": "clawdbot", "predicate": "uses", "object": "gemini-1.5"}
}

Implementation

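- Default: `valid_until = NULL` (currently valid)
- Supersession: When a new triple supersedes an old one, set `valid_until = NOW()` on the old triple and `superseded_by = new_id`
- Query: For a point-in-time query, add the filter `WHERE valid_from <= $timestamp AND (valid_until IS NULL OR valid_until > $timestamp)`, so that triples which had not yet become valid at `$timestamp` are excluded as well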

#### Effort: 1 week
#### Dependencies: Schema migration, backward-compatible API changes
#### Risk: MEDIUM — Migration of the existing 3,502 triples. All existing triples get `valid_from = created_at, valid_until = NULL` (an explicit backfill, since the column default would stamp the migration time instead). Queries need to be updated to filter on validity by default.
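The validity window behaves as a half-open interval: a triple is valid at time `t` when `valid_from <= t` and either `valid_until` is unset or `t < valid_until`. A small sketch of that rule and of supersession, using plain integer timestamps (the `Validity` type and its method names are illustrative assumptions):

```rust
/// Validity window of one triple. Timestamps are any monotonic integer
/// scale, for example Unix seconds.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Validity {
    pub valid_from: i64,
    /// `None` means "still valid".
    pub valid_until: Option<i64>,
}

impl Validity {
    /// True when the triple was valid at `at` (half-open: from <= at < until).
    pub fn is_valid_at(&self, at: i64) -> bool {
        self.valid_from <= at && self.valid_until.map_or(true, |until| at < until)
    }

    /// Close the window when a newer triple supersedes this one.
    /// An already closed window is left unchanged.
    pub fn supersede(&mut self, now: i64) {
        if self.valid_until.is_none() {
            self.valid_until = Some(now);
        }
    }
}
```

With the old triple valid from 100 and superseded at 200 by a triple valid from 200, a query at 150 sees only the old one and a query at 250 sees only the new one; at exactly 200 the half-open rule hands over without overlap.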

---

2.5 Community Detection (Leiden/Louvain)

What: Automatic clustering of entities into communities based on graph topology, using the Leiden or Louvain algorithm.

Why: With 221 subjects and 3,502 triples, the knowledge graph has natural clusters (e.g., "audio production projects", "infrastructure services", "personal preferences"). Discovering these clusters enables:
- Automated knowledge organization
- Cluster-level context retrieval ("give me everything about the audio cluster")
- Graph compression for visualization

Implementation

rust

// src/analysis/community.rs

/// Louvain community detection over the knowledge graph
pub struct CommunityDetector {
    adjacency: HashMap<String, Vec<(String, f64)>>,  // entity → [(neighbor, weight)]
}

impl CommunityDetector {
    /// Build adjacency from triples
    pub fn from_triples(triples: &[StoredKnowledgeTriple]) -> Self {
        let mut adj: HashMap<String, Vec<(String, f64)>> = HashMap::new();
        for triple in triples {
            adj.entry(triple.subject.clone())
                .or_default()
                .push((triple.object.clone(), triple.confidence));
            adj.entry(triple.object.clone())
                .or_default()
                .push((triple.subject.clone(), triple.confidence));
        }
        Self { adjacency: adj }
    }

    /// Run Louvain algorithm
    pub fn detect(&self) -> Vec<Community> {
        // Phase 1: Local modularity optimization
        // Phase 2: Community aggregation
        // Iterate until convergence
        // Return communities with member lists
        todo!("Implement Louvain — or use petgraph + community crate")
    }
}

Alternative: Use the `petgraph` crate for graph algorithms and implement Louvain on top, or shell out to a Python script using `networkx.community.louvain_communities`.

API

GET /api/knowledge/communities
→ {
    "communities": [
      {
        "id": 0,
        "name": "Audio Production",  // Auto-generated from dominant predicate
        "members": ["dream-weaver-engine", "cc-echelon", "cc-cinematographer", ...],
        "internal_edges": 45,
        "modularity": 0.72
      },
      ...
    ],
    "modularity_score": 0.65,
    "algorithm": "louvain"
  }

#### Effort: 1 week
#### Dependencies: `petgraph` crate (or custom implementation)
#### Risk: MEDIUM — Algorithmic complexity. The current graph size (221 nodes) is trivial for Louvain, but the implementation needs to be correct.
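Since correctness is the stated risk, it helps to have an independent check on whatever partition the detector returns. Louvain optimizes modularity, so a standalone modularity function gives a number that can be compared across implementations. The sketch below treats the graph as undirected and weighted, using Q = Σ_c [ L_c / m − (d_c / 2m)² ], where m is the total edge weight, L_c the weight of edges inside community c, and d_c the summed degree of its members. The function signature is an assumption for illustration.

```rust
use std::collections::HashMap;

/// Modularity of a partition over undirected weighted edges (u, v, weight).
/// Panics if an edge endpoint has no entry in `community_of`.
pub fn modularity(edges: &[(&str, &str, f64)], community_of: &HashMap<&str, usize>) -> f64 {
    let total_weight: f64 = edges.iter().map(|&(_, _, w)| w).sum();
    if total_weight == 0.0 {
        return 0.0;
    }

    let mut internal: HashMap<usize, f64> = HashMap::new(); // L_c
    let mut degree: HashMap<usize, f64> = HashMap::new(); // d_c

    for &(u, v, w) in edges {
        let (cu, cv) = (community_of[u], community_of[v]);
        *degree.entry(cu).or_default() += w;
        *degree.entry(cv).or_default() += w;
        if cu == cv {
            *internal.entry(cu).or_default() += w;
        }
    }

    degree
        .iter()
        .map(|(community, d)| {
            let inside = internal.get(community).copied().unwrap_or(0.0);
            inside / total_weight - (d / (2.0 * total_weight)).powi(2)
        })
        .sum()
}
```

For two triangles joined by a single edge, splitting at the bridge gives Q = 5/14 ≈ 0.357, while putting every node in one community gives Q = 0; a detector that returns the single-community answer on such a graph is doing something wrong.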

---

2.6 Structural Metrics Endpoint

What: Compute and expose graph-level and node-level structural metrics: PageRank, betweenness centrality, degree distribution, clustering coefficient.

Why: These metrics answer operational questions: "Which entity is most central?", "Which entities are bridges between clusters?", "Is the graph getting more connected or more fragmented?"

API

GET /api/knowledge/metrics
→ {
    "graph": {
      "node_count": 221,
      "edge_count": 3502,
      "density": 0.072,
      "avg_degree": 31.7,
      "clustering_coefficient": 0.43,
      "diameter": 7,
      "connected_components": 3
    },
    "top_pagerank": [
      {"entity": "clawdbot", "score": 0.142},
      {"entity": "mohamed-diomande", "score": 0.098},
      {"entity": "comp-core", "score": 0.076}
    ],
    "top_betweenness": [
      {"entity": "clawdbot", "score": 0.312},
      {"entity": "graph-kernel", "score": 0.187}
    ],
    "bridges": [
      {"entity": "orbit", "connects": ["infrastructure", "agent-services"]}
    ]
  }

Implementation

This reuses the `influence.rs` module in the Atlas subsystem, which already computes `TurnInfluence`, `BridgeTurn`, and `PhaseTopologyStats`. The knowledge graph metrics endpoint would apply the same algorithms to the triple graph instead of the turn DAG.

#### Effort: 3 days (reusing atlas/influence.rs patterns)
#### Dependencies: Server-side traversal for full graph access
#### Risk: LOW — The atlas subsystem already implements these algorithms. Porting to the triple graph is mechanical.
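The two simplest graph-level numbers in the example response can be reproduced directly from the node and edge counts. Assuming `density` is the directed density E / (N·(N−1)) and `avg_degree` counts both endpoints of every edge (2E / N), which is what the sample figures are consistent with, a sketch looks like this (function names are illustrative):

```rust
/// Directed graph density: edges divided by the number of ordered node pairs.
/// Parallel edges with different predicates can push this above 1.0.
pub fn directed_density(nodes: usize, edges: usize) -> f64 {
    if nodes < 2 {
        return 0.0;
    }
    let n = nodes as f64;
    edges as f64 / (n * (n - 1.0))
}

/// Average total degree: every edge contributes to two endpoints.
pub fn average_degree(nodes: usize, edges: usize) -> f64 {
    if nodes == 0 {
        return 0.0;
    }
    2.0 * edges as f64 / nodes as f64
}
```

For 221 nodes and 3,502 edges this gives a density of about 0.072 and an average degree of about 31.7, matching the sample response.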

---

Evolution 2 Summary

| Capability | Effort | Impact | Priority |
|------------|--------|--------|----------|
| WebSocket subscriptions | 3 days | Event-driven architecture | P2 |
| Graph visualization | 2 days | 10× better DX | P1 |
| Batch ingest (bulk) | 2 days | 10–50× faster ingestion | P1 |
| Temporal versioning | 1 week | Knowledge lifecycle management | P2 |
| Community detection | 1 week | Automatic organization | P3 |
| Structural metrics | 3 days | Operational intelligence | P2 |
| Total | ~4 weeks | | |

Projected DEP Audit Score After Evo 1+2: 9.1/10 (+0.5 from Evo 1)

---

Evolution 3: Transformation

What could the Graph Kernel BECOME if we reimagined it?

3.1 Federated Graph Kernel (Cross-Instance Knowledge Sharing)

Vision: Multiple Graph Kernel instances (one per user, team, or organization) can share and merge knowledge while preserving provenance boundaries.

Why this matters: Right now, the Graph Kernel is a single-user system. Mohamed's knowledge graph is isolated. But imagine multiple agents, each with their own Graph Kernel, sharing knowledge about a shared codebase while maintaining cryptographic provenance over who contributed what.

Architecture

┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐
│  GK Instance A   │     │  GK Instance B   │     │  GK Instance C   │
│  (Mohamed)       │     │  (Team Alpha)    │     │  (CI/CD Agent)   │
│                  │     │                  │     │                  │
│  Local triples   │◄───►│  Local triples   │◄───►│  Local triples   │
│  + HMAC signer   │     │  + HMAC signer   │     │  + HMAC signer   │
└────────┬─────────┘     └────────┬─────────┘     └────────┬─────────┘
         │                        │                        │
         └────────────┬───────────┘────────────────────────┘
                      │
              ┌───────▼────────┐
              │  Federation    │
              │  Relay         │
              │                │
              │  - Merge rules │
              │  - Provenance  │
              │  - Conflict    │
              │    resolution  │
              └────────────────┘

Key Design Decisions

1. Provenance preservation: Each triple retains its `source` (instance ID + original source). Federation relay annotates merged triples with `federated_from` metadata.

2. Merge semantics:
- Same (subject, predicate, object) from different instances → take highest confidence, record all sources
- Conflicting objects for same (subject, predicate) → create both triples with source attribution
- HMAC tokens are instance-scoped — federation relay issues a federation-level token

3. Access control: Each instance declares which predicates are shareable. `likes`, `wants_to` → private. `uses`, `depends_on`, `is_a` → shareable.

4. Sync protocol:

   POST /api/federation/offer
   {"triples": [...], "instance_id": "alpha", "since": "2026-02-14T00:00:00Z"}

   POST /api/federation/accept
   {"triple_ids": [1, 2, 3], "instance_id": "alpha"}
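The merge semantics and access-control rule above could be prototyped as follows. This is a sketch under simplifying assumptions: `OfferedTriple`, `MergedTriple`, and `merge_offers` are hypothetical names, and the private predicate list is a fixed constant here, whereas the design has each instance declare its own.

```rust
use std::collections::BTreeMap;

/// A triple as offered by one instance.
#[derive(Debug, Clone)]
pub struct OfferedTriple {
    pub subject: String,
    pub predicate: String,
    pub object: String,
    pub confidence: f64,
    pub instance_id: String,
}

/// A triple after federation merge, with every contributing instance recorded.
#[derive(Debug, Clone, PartialEq)]
pub struct MergedTriple {
    pub subject: String,
    pub predicate: String,
    pub object: String,
    pub confidence: f64,
    pub federated_from: Vec<String>,
}

/// Simplification: one global private list instead of per-instance declarations.
const PRIVATE_PREDICATES: [&str; 2] = ["likes", "wants_to"];

pub fn is_shareable(predicate: &str) -> bool {
    !PRIVATE_PREDICATES.contains(&predicate)
}

/// Identical triples merge to the highest confidence and the union of sources.
/// Different objects for the same (subject, predicate) stay as separate triples.
pub fn merge_offers(offers: &[OfferedTriple]) -> Vec<MergedTriple> {
    let mut merged: BTreeMap<(String, String, String), MergedTriple> = BTreeMap::new();

    for offer in offers.iter().filter(|o| is_shareable(&o.predicate)) {
        let key = (offer.subject.clone(), offer.predicate.clone(), offer.object.clone());
        let entry = merged.entry(key).or_insert_with(|| MergedTriple {
            subject: offer.subject.clone(),
            predicate: offer.predicate.clone(),
            object: offer.object.clone(),
            confidence: offer.confidence,
            federated_from: Vec::new(),
        });
        entry.confidence = entry.confidence.max(offer.confidence);
        if !entry.federated_from.contains(&offer.instance_id) {
            entry.federated_from.push(offer.instance_id.clone());
        }
    }

    // BTreeMap yields results sorted by (subject, predicate, object).
    merged.into_values().collect()
}
```

Keying on the full triple means conflicting objects never overwrite each other; resolving them is left to consumers, who can see which instances asserted each one.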

#### Effort: 3–4 weeks
#### Dependencies: WebSocket subscriptions (Evo 2.1), entity normalization (Evo 1.3)
#### Risk: HIGH — Distributed systems complexity. Merge conflicts, network partitions, eventual consistency. The HMAC model needs extension for multi-party verification.

---

3.2 Universal Context Authority for AI Agents

Vision: The Graph Kernel becomes the standard context authority for ALL AI agents — not just Clawdbot. Any agent framework (LangChain, CrewAI, AutoGen, custom) can request provenance-governed context slices.

Why this matters: The provenance engine concept (Definition 1 from the research paper) is framework-agnostic. Every AI agent that makes consequential decisions needs reproducible, auditable context construction. No one has built this as a service.

Integration Architecture

┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│  Clawdbot   │  │  LangChain  │  │  CrewAI     │  │  Custom     │
│  Agent      │  │  Agent      │  │  Agent      │  │  Agent      │
└──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘
       │                │                │                │
       └────────────────┼────────────────┼────────────────┘
                        │
                ┌───────▼────────┐
                │  Graph Kernel  │
                │  Context API   │
                │                │
                │  /api/context  │  ← Universal context request
                │  /api/verify   │  ← Universal token verification
                │  /api/register │  ← Agent registration
                └────────────────┘

Universal Context Request API

json

POST /api/context/request
{
  "agent_id": "langchain-agent-42",
  "agent_framework": "langchain",
  "conversation_id": "conv-abc-123",
  "anchor": {
    "type": "turn_id",     // or "entity", "timestamp", "content_hash"
    "value": "550e8400-e29b-41d4-a716-446655440000"
  },
  "policy": "default",     // or custom policy hash
  "max_tokens": 4096,      // Token budget for context window
  "format": "messages"     // "messages" | "text" | "triples" | "graph"
}

// Response
{
  "context": {
    "messages": [
      {"role": "user", "content": "...", "turn_id": "..."},
      {"role": "assistant", "content": "...", "turn_id": "..."}
    ],
    "total_tokens": 3847
  },
  "provenance": {
    "slice_id": "a1b2c3...",
    "admissibility_token": "d4e5f6...",
    "graph_snapshot_hash": "789abc...",
    "policy_ref": {"policy_id": "slice_policy_v1", "params_hash": "..."},
    "schema_version": "1.0.0"
  }
}

#### Effort: 6–8 weeks
#### Dependencies: SQLite backend (Evo 1.1), federation (Evo 3.1)
#### Risk: HIGH — Requires defining a universal agent context protocol. Multi-framework testing. Performance at scale with many concurrent agents.

---

3.3 Graph Kernel SDK (Rust + Python + JavaScript)

Vision: A first-party SDK that makes integrating with the Graph Kernel as easy as `pip install graph-kernel` or `npm install @openclaw/graph-kernel`.

Rust Crate (Already Exists — Productize)

The `cc-graph-kernel` crate with `default` feature is already a library. To make it a standalone SDK:

toml

# Published as `graph-kernel` on crates.io
[package]
name = "graph-kernel"
version = "0.2.0"
# ... remove internal paths, make self-contained

Python SDK

python

# pip install graph-kernel
from graph_kernel import GraphKernelClient, Triple

gk = GraphKernelClient("http://localhost:8001")

# Add knowledge
gk.add_triple(Triple("clawdbot", "uses", "gemini-2.5", confidence=0.95))

# Query
triples = gk.query(subject="clawdbot", predicate="uses")

# Traverse
paths = gk.traverse("clawdbot", predicates=["uses"], max_hops=3)

# Verify token
is_valid = gk.verify_token(admissibility_token, slice_id, ...)

# Subscribe to changes
async for event in gk.subscribe(subjects=["clawdbot"]):
    print(f"New triple: {event.triple}")

JavaScript/TypeScript SDK

typescript

// npm install @openclaw/graph-kernel
import { GraphKernel } from '@openclaw/graph-kernel';

const gk = new GraphKernel('http://localhost:8001');

// Add knowledge
await gk.addTriple({
  subject: 'clawdbot',
  predicate: 'uses',
  object: 'gemini-2.5',
  confidence: 0.95
});

// Query
const triples = await gk.query({ subject: 'clawdbot' });

// Traverse with visualization
const graph = await gk.traverse('clawdbot', {
  predicates: ['uses'],
  maxHops: 3,
  format: 'd3'
});

// Render in browser
gk.visualize(graph, document.getElementById('graph-container'));

#### Effort: 4 weeks (2 weeks Python, 2 weeks JS/TS)
#### Dependencies: Stable API (Evo 1+2 complete), OpenAPI spec
#### Risk: MEDIUM — SDK maintenance burden. Breaking API changes require SDK updates. Need CI for all three languages.

---

3.4 Graph Kernel Protocol (Open Standard)

Vision: Define an open protocol specification for provenance-governed context authorities. Any implementation (not just OpenClaw's) can comply.

Why: If the provenance engine category is real (and the research paper argues it is), the protocol should be open. This enables:
- Interoperability between provenance engines
- Academic research on the protocol semantics
- Community implementations in different languages
- Standardization through W3C or IETF

Protocol Specification

Graph Kernel Protocol v1 (GKP/1)

1. CONTEXT REQUEST
   - Client sends anchor, policy reference, budget constraints
   - Server returns evidence bundle with admissibility token

2. TOKEN VERIFICATION
   - Client presents token + provenance metadata to any verifier
   - Verifier confirms token authenticity without accessing signing secret

3. POLICY REGISTRATION
   - Client registers expansion policy with hash-stable fingerprint
   - Server returns policy reference (policy_id + params_hash)

4. KNOWLEDGE ASSERTION
   - Client asserts (subject, predicate, object, confidence, source)
   - Server stores and indexes, returns assertion ID

5. FEDERATION
   - Instances exchange knowledge with provenance preservation
   - Cross-instance tokens use federated verification
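
Step 3 relies on a "hash-stable fingerprint": identical policy parameters must always produce the same `params_hash`, regardless of key order or whitespace. One common way to get that property is to hash a canonical JSON encoding, sketched below. The function names, the example parameter names (`max_hops`, `max_nodes`), and the choice of SHA-256 over sorted-key JSON are assumptions for illustration; the roadmap does not state how the Graph Kernel actually computes the hash.

```python
import hashlib
import json


def params_hash(params: dict) -> str:
    """Hash a canonical JSON encoding so that key order does not affect the result."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def policy_ref(policy_id: str, params: dict) -> dict:
    """Build the (policy_id + params_hash) reference returned by POLICY REGISTRATION."""
    return {"policy_id": policy_id, "params_hash": params_hash(params)}


# Same parameters in a different key order give the same reference.
ref_a = policy_ref("slice_policy_v1", {"max_nodes": 50, "max_hops": 3})
ref_b = policy_ref("slice_policy_v1", {"max_hops": 3, "max_nodes": 50})

# Changing any parameter value gives a different hash.
ref_c = policy_ref("slice_policy_v1", {"max_hops": 4, "max_nodes": 50})
```

A production scheme would also need to pin down number formatting (for example, `1` versus `1.0`), since independent implementations must agree byte for byte before hashing.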

#### Deliverables
1. Protocol spec document (Markdown → RFC-style)
2. Reference implementation (the Graph Kernel itself)
3. Conformance test suite (automated tests any implementation can run)
4. Protocol buffer / JSON Schema definitions for wire format

#### Effort: 4–6 weeks (spec writing + conformance tests)
#### Dependencies: Stable API, research paper published
#### Risk: MEDIUM — Standards work is slow. Risk of over-engineering. Start with an informational document, not a formal standard.

---

3.5 W3C PROV-DM Integration

Vision: Map the Graph Kernel's provenance model to the W3C PROV Data Model, enabling interoperability with the broader provenance ecosystem.

Why: The W3C PROV family of specifications is the established W3C vocabulary for provenance representation: PROV-DM and PROV-O are W3C Recommendations, and PROV-JSON is a W3C Member Submission. Aligning with PROV-DM would:
- Enable export to any PROV-compliant system
- Support academic citation and reproducibility
- Integrate with existing provenance visualization tools (ProvVis, etc.)

Mapping

| Graph Kernel Concept | PROV-DM Concept |
| --- | --- |
| `TurnSnapshot` | `prov:Entity` |
| `ContextSlicer.slice()` | `prov:Activity` |
| `SliceExport` | `prov:Entity` (derived) |
| `AdmissibilityToken` | Custom `gk:admissibilityToken` |
| `SlicePolicyV1` | `prov:Plan` |
| `GraphSnapshotHash` | `prov:Entity` (immutable record) |
| `source` (triple) | `prov:wasAttributedTo` → `prov:Agent` |

PROV-JSON Export

```json
{
  "prefix": {
    "gk": "https://openclaw.org/prov/graph-kernel/",
    "prov": "http://www.w3.org/ns/prov#"
  },
  "entity": {
    "gk:slice/a1b2c3": {
      "prov:type": "gk:SliceExport",
      "gk:sliceId": "a1b2c3",
      "gk:admissibilityToken": "d4e5f6",
      "gk:graphSnapshotHash": "789abc"
    }
  },
  "activity": {
    "gk:slicing/001": {
      "prov:type": "gk:ContextSlicing",
      "gk:policyId": "slice_policy_v1",
      "gk:policyParamsHash": "..."
    }
  },
  "wasGeneratedBy": {
    "gk:gen1": {
      "prov:entity": "gk:slice/a1b2c3",
      "prov:activity": "gk:slicing/001"
    }
  }
}
```

Implementation

New endpoint: `GET /api/provenance/prov-json?slice_id=a1b2c3`
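
The handler behind this endpoint mostly reshapes one slice record into the PROV-JSON layout shown above. A minimal sketch of that mapping follows. The input field names (`slice_id`, `admissibility_token`, and so on), the function name, the `gk:gen/<slice_id>` relation identifier, and the example hash value are assumptions; the real `SliceExport` structure is not given in this section.

```python
import json

PREFIXES = {
    "gk": "https://openclaw.org/prov/graph-kernel/",
    "prov": "http://www.w3.org/ns/prov#",
}


def slice_to_prov_json(slice_export: dict, activity_id: str) -> dict:
    """Build a PROV-JSON document for one slice (input field names are assumed)."""
    slice_id = slice_export["slice_id"]
    entity_key = f"gk:slice/{slice_id}"
    activity_key = f"gk:slicing/{activity_id}"
    return {
        "prefix": dict(PREFIXES),
        "entity": {
            entity_key: {
                "prov:type": "gk:SliceExport",
                "gk:sliceId": slice_id,
                "gk:admissibilityToken": slice_export["admissibility_token"],
                "gk:graphSnapshotHash": slice_export["graph_snapshot_hash"],
            }
        },
        "activity": {
            activity_key: {
                "prov:type": "gk:ContextSlicing",
                "gk:policyId": slice_export["policy_id"],
                "gk:policyParamsHash": slice_export["policy_params_hash"],
            }
        },
        # The generation record links the slice entity to the slicing activity.
        "wasGeneratedBy": {
            f"gk:gen/{slice_id}": {
                "prov:entity": entity_key,
                "prov:activity": activity_key,
            }
        },
    }


example_slice = {
    "slice_id": "a1b2c3",
    "admissibility_token": "d4e5f6",
    "graph_snapshot_hash": "789abc",
    "policy_id": "slice_policy_v1",
    "policy_params_hash": "0f1e2d",  # placeholder value for illustration only
}
doc = slice_to_prov_json(example_slice, activity_id="001")
print(json.dumps(doc, indent=2))
```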

#### Effort: 1 week
#### Dependencies: Stable SliceExport API
#### Risk: LOW — PROV-JSON is a well-defined format. The mapping is straightforward.

---

3.6 Graph Kernel Cloud (Managed Service)

Vision: A hosted, multi-tenant Graph Kernel service where anyone can create a provenance engine for their AI agents. Think "Supabase for provenance."

Why: If the Graph Kernel is genuinely useful (and the evaluation data says it is), most developers won't want to self-host. A managed service removes operational burden and creates a SaaS business model.

Architecture

┌─────────────────────────────────────────────────────────┐
│                   Graph Kernel Cloud                     │
│                                                          │
│  ┌──────────────┐  ┌──────────────┐  ┌───────────────┐ │
│  │  API Gateway  │  │  Auth (JWT)  │  │  Rate Limiter │ │
│  └──────┬───────┘  └──────┬───────┘  └──────┬────────┘ │
│         │                  │                  │          │
│  ┌──────▼──────────────────▼──────────────────▼────────┐│
│  │              GK Instance Pool                        ││
│  │                                                      ││
│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐          ││
│  │  │ Tenant A │  │ Tenant B │  │ Tenant C │  ...     ││
│  │  │ (SQLite) │  │ (SQLite) │  │ (SQLite) │          ││
│  │  └──────────┘  └──────────┘  └──────────┘          ││
│  └─────────────────────────────────────────────────────┘│
│                                                          │
│  ┌──────────────────────────────────────────────────────┐│
│  │  Shared Infrastructure: Metrics, Logging, Billing    ││
│  └──────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────┘

Key Design Decisions

1. Tenant isolation: Each tenant gets their own SQLite database file. No shared state between tenants. HMAC secrets are per-tenant.

2. Instance pooling: Graph Kernel instances are lightweight (20 MB binary + SQLite). On Cloud Run, one container can serve multiple tenants with isolated databases.

3. API key authentication: Each tenant gets an API key. JWTs scope access to that tenant's knowledge graph and slices.

4. Billing model:
- Free tier: 10,000 triples, 1,000 queries/day
- Pro tier: 1M triples, 100K queries/day, federation
- Enterprise: Unlimited, SLA, custom policies

5. Data residency: Tenant data stays in their chosen region. SQLite files can be replicated to R2/S3 for durability.
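
Decisions 1 and 2 can be illustrated with a small sketch: one SQLite file per tenant, one HMAC secret per tenant, all served by a single process. The `TenantStore` class, its method names, the table layout, and the in-memory secret storage are assumptions made for this example, written in Python for brevity; they do not describe the Rust service. A real deployment would keep the secrets in a secret manager.

```python
import hashlib
import hmac
import secrets
import sqlite3
import tempfile
from contextlib import closing
from pathlib import Path


class TenantStore:
    """Hypothetical per-tenant store: one SQLite file and one HMAC secret each."""

    def __init__(self, root: Path):
        self.root = root
        self._secrets = {}  # tenant_id -> secret bytes (in memory only, for the sketch)

    def _execute(self, tenant_id: str, sql: str, args=()):
        if tenant_id not in self._secrets:
            raise KeyError(f"unknown tenant: {tenant_id}")
        db_path = self.root / f"{tenant_id}.db"  # no database file is shared
        with closing(sqlite3.connect(db_path)) as conn:
            with conn:  # commits on success, rolls back on error
                return conn.execute(sql, args).fetchall()

    def create_tenant(self, tenant_id: str) -> None:
        if not tenant_id.isalnum():  # keeps the ID safe to use as a file name
            raise ValueError("tenant_id must be alphanumeric")
        self._secrets[tenant_id] = secrets.token_bytes(32)
        self._execute(tenant_id, "CREATE TABLE IF NOT EXISTS triples "
                      "(subject TEXT, predicate TEXT, object TEXT, confidence REAL)")

    def add_triple(self, tenant_id, subject, predicate, obj, confidence):
        self._execute(tenant_id, "INSERT INTO triples VALUES (?, ?, ?, ?)",
                      (subject, predicate, obj, confidence))

    def count(self, tenant_id: str) -> int:
        return self._execute(tenant_id, "SELECT COUNT(*) FROM triples")[0][0]

    def sign(self, tenant_id: str, slice_id: str) -> str:
        key = self._secrets[tenant_id]
        return hmac.new(key, slice_id.encode("utf-8"), hashlib.sha256).hexdigest()

    def verify(self, tenant_id: str, slice_id: str, token: str) -> bool:
        return hmac.compare_digest(self.sign(tenant_id, slice_id), token)


with tempfile.TemporaryDirectory() as tmp:
    store = TenantStore(Path(tmp))
    store.create_tenant("tenantA")
    store.create_tenant("tenantB")
    store.add_triple("tenantA", "clawdbot", "uses", "gemini-2.5", 0.95)
    count_a = store.count("tenantA")
    count_b = store.count("tenantB")
    token_a = store.sign("tenantA", "a1b2c3")
    same_tenant_ok = store.verify("tenantA", "a1b2c3", token_a)
    cross_tenant_ok = store.verify("tenantB", "a1b2c3", token_a)
```

Because each tenant signs with its own key, a token minted for one tenant does not verify against another, even for the same slice ID.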

#### Effort: 3–4 months
#### Dependencies: All of Evo 1 + most of Evo 2
#### Risk: HIGH — Significant product and infrastructure investment. Multi-tenancy, billing, authentication, compliance. This is a company, not a feature.

---

Evolution 3 Summary

| Transformation | Effort | Impact | Priority | Risk |
| --- | --- | --- | --- | --- |
| Federated Graph Kernel | 3–4 weeks | Multi-agent knowledge sharing | P2 | HIGH |
| Universal Context Authority | 6–8 weeks | Framework-agnostic provenance | P1 | HIGH |
| Graph Kernel SDK | 4 weeks | 10× developer adoption | P1 | MEDIUM |
| Graph Kernel Protocol | 4–6 weeks | Open standard | P3 | MEDIUM |
| W3C PROV-DM Integration | 1 week | Standards compliance | P2 | LOW |
| Graph Kernel Cloud | 3–4 months | SaaS business model | P4 | HIGH |

---

Roadmap Timeline

2026 Q1 (Now)
├── Evo 1.1: SQLite Backend ████████████████ (2 weeks)
├── Evo 1.2: Multi-hop Traversal ██████ (3 days)
├── Evo 1.3: Entity Normalization ████ (2 days)
├── Evo 1.4: Query Caching ██ (1 day)
└── Evo 1.5: Pool Optimization █ (1 hour)

2026 Q2
├── Evo 2.2: Graph Visualization ████ (2 days)
├── Evo 2.3: Batch Ingest ████ (2 days)
├── Evo 2.1: WebSocket Subscriptions ██████ (3 days)
├── Evo 2.6: Structural Metrics ██████ (3 days)
├── Evo 2.4: Temporal Versioning ████████████ (1 week)
└── Evo 2.5: Community Detection ████████████ (1 week)

2026 Q3
├── Evo 3.3: Graph Kernel SDK ████████████████████████████ (4 weeks)
├── Evo 3.5: W3C PROV-DM ████████████ (1 week)
└── Evo 3.2: Universal Context API ████████████████████████████████████████ (6-8 weeks)

2026 Q4
├── Evo 3.1: Federation ████████████████████████████ (3-4 weeks)
├── Evo 3.4: Protocol Spec ████████████████████████████████ (4-6 weeks)
└── Evo 3.6: Cloud (begins) ████████████████████████████████████████████████ (ongoing)

---

Projected Impact

| Phase | DEP Score | Latency | Triples | Features |
| --- | --- | --- | --- | --- |
| Current | 7.4 | 291ms | 3,502 | 12 endpoints |
| After Evo 1 | 8.6 | <15ms | 3,502+ | 14 endpoints |
| After Evo 2 | 9.1 | <15ms | 10K+ | 20 endpoints |
| After Evo 3 | 9.5+ | <15ms | Unlimited | Full platform |

---

Risk Matrix

| Risk | Probability | Impact | Mitigation |
| --- | --- | --- | --- |
| SQLite sync conflicts | MEDIUM | LOW | Last-writer-wins, conflict log |
| Entity normalization false positives | LOW | MEDIUM | Configurable alias map, manual override |
| WebSocket connection storms | LOW | MEDIUM | Connection limits, backpressure |
| Federation merge conflicts | HIGH | MEDIUM | Explicit conflict resolution UI |
| Multi-tenant security | MEDIUM | HIGH | Per-tenant SQLite, isolated HMAC keys |
| Protocol over-engineering | MEDIUM | LOW | Start informal, formalize only with adoption |
| SDK maintenance burden | HIGH | MEDIUM | Code generation from OpenAPI spec |

---

This document is a living roadmap. Priorities will shift based on user feedback, production data, and the evolving AI agent ecosystem.

Source Anchor

Comp-Core/docs/GRAPH-KERNEL-EVO3.md
