Grand Diomande Research · Full HTML Reader

Graph Kernel Evo³ — Evolution Cubed

**OpenClaw CompCore — Three-Phase Evolution Roadmap** **Version:** 1.0.0 · **Date:** 2026-02-14 **Baseline:** Graph Kernel v0.1.0, DEP Audit Score 7.4/10 **Author:** Mohamed Diomande

---

Overview

This document presents three evolutionary trajectories for the Graph Kernel, each building on the previous:

1. Evolution 1: Optimization — Maximize performance and correctness within the current architecture.
2. Evolution 2: Expansion — Add new capabilities that extend the system's reach.
3. Evolution 3: Transformation — Reimagine what the Graph Kernel could become.

Each evolution includes concrete implementation steps, estimated effort, dependencies, and risk assessment.

---

Evolution 1: Optimization

What can be improved within the current architecture?

1.1 SQLite Backend Migration (Native Rust)

Problem: 90

Solution: Implement `SqliteGraphStore` as a native Rust backend behind the existing `GraphStore` trait, with automatic sync from Supabase.

Implementation Plan

Phase 1: SqliteGraphStore Implementation (1 week)
├── src/store/sqlite.rs — New GraphStore implementation
│   ├── Use sqlx with sqlite feature (compile-time checked)
│   ├── Same schema as knowledge_graph table
│   ├── WAL mode for concurrent reads
│   ├── In-memory option for testing (`:memory:`)
│   └── File-based for production ([home-path])
├── Cargo.toml — New feature flag: `sqlite`
│   ├── sqlx = { features = ["sqlite", "runtime-tokio"] }
│   └── Feature: sqlite = ["sqlx/sqlite", "tokio"]
└── src/bin/graph_kernel_service.rs — Backend selection
    └── Match on DB_BACKEND env var: "sqlite" | "postgres" | "auto"

Phase 2: Sync Engine (3 days)
├── src/sync/mod.rs — Bidirectional sync module
│   ├── Full dump: Supabase → SQLite (initial load)
│   ├── Incremental: Poll Supabase for changes since last_sync_at
│   ├── Write-through: Local writes → SQLite + queue for Supabase push
│   └── Conflict resolution: Last-writer-wins with source tracking
└── Background task: tokio::spawn sync every 5 minutes

Phase 3: Hybrid Mode (2 days)
├── Reads: Always local SQLite (sub-10ms)
├── Writes: Local SQLite + async push to Supabase
├── Sync status in /health response
└── Manual sync trigger: POST /api/admin/sync

#### Effort: 2 weeks
#### Dependencies: sqlx sqlite feature, existing GraphStore trait
#### Risk: LOW — The trait abstraction already exists. SQLite is well-tested with sqlx. The only risk is sync conflicts, mitigated by last-writer-wins semantics.
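The last-writer-wins rule with source tracking can be sketched as follows. This is a minimal illustration, not the planned sync module: `SyncedTriple`, its `updated_at_ms` field, and `resolve_conflict` are hypothetical names, and the sketch assumes both sides carry a comparable millisecond timestamp.

```rust
/// Hypothetical row shape for sync; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct SyncedTriple {
    pub subject: String,
    pub predicate: String,
    pub object: String,
    pub confidence: f64,
    /// Which side wrote this version, e.g. "local" or "supabase".
    pub source: String,
    /// Last modification time in milliseconds since the Unix epoch.
    pub updated_at_ms: u64,
}

/// Last-writer-wins: keep whichever version was modified most recently.
/// Ties go to the remote copy (an assumption) so that every replica
/// resolves the same conflict to the same row.
pub fn resolve_conflict(local: SyncedTriple, remote: SyncedTriple) -> SyncedTriple {
    if local.updated_at_ms > remote.updated_at_ms {
        local
    } else {
        remote
    }
}
```

Because the winning row keeps its `source` field, the sync log can still show which side produced the surviving version.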

#### Expected Impact
| Metric | Before | After |
|--------|--------|-------|
| Query latency | 291ms | 5–15ms |
| Multi-hop (3 hops) | 874ms | 15–45ms |
| Availability | Depends on Supabase | Works offline |

---

1.2 Server-Side Multi-Hop Traversal

Problem: Multi-hop queries require N HTTP round-trips. A 3-hop traversal costs three round-trips at about 291ms each, roughly 874ms in total (or about 3 × 10ms = 30ms with SQLite).

Solution: New `POST /api/knowledge/traverse` endpoint that performs BFS/DFS traversal server-side.

API Design

json
// Request
POST /api/knowledge/traverse
{
  "start": "clawdbot",           // Starting entity
  "predicates": ["uses", "depends_on"],  // Edge types to follow (null = any)
  "direction": "outgoing",       // "outgoing" | "incoming" | "both"
  "max_hops": 3,                 // Maximum traversal depth
  "max_results": 100,            // Result set cap
  "min_confidence": 0.5,         // Confidence floor
  "return_paths": true           // Include full path for each result
}

// Response
{
  "paths": [
    {
      "entities": ["clawdbot", "graph-kernel", "postgresql"],
      "edges": [
        {"subject": "clawdbot", "predicate": "uses", "object": "graph-kernel"},
        {"subject": "graph-kernel", "predicate": "uses", "object": "postgresql"}
      ],
      "hops": 2,
      "min_confidence": 0.85
    }
  ],
  "stats": {
    "entities_visited": 42,
    "edges_traversed": 67,
    "elapsed_ms": 12
  }
}

Implementation

rust
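// src/service/traversal.rs
use std::collections::{HashSet, VecDeque};
use std::sync::Arc;
use std::time::Instant;

use axum::{extract::State, http::StatusCode, Json};

pub async fn traverse_handler(
    State(state): State<Arc<AppState>>,
    Json(request): Json<TraversalRequest>,
) -> Result<Json<TraversalResponse>, (StatusCode, Json<ErrorResponse>)> {
    let started = Instant::now();
    let pool = state.store.pool();
    let mut visited: HashSet<String> = HashSet::new();
    let mut frontier: VecDeque<(String, Vec<TraversalEdge>, u32)> = VecDeque::new();
    let mut paths: Vec<TraversalPath> = Vec::new();
    let mut edges_traversed = 0usize;

    frontier.push_back((request.start.clone(), vec![], 0));
    visited.insert(request.start.clone());

    'bfs: while let Some((entity, path, depth)) = frontier.pop_front() {
        // A node at depth == max_hops is a leaf: expanding it would produce
        // paths with max_hops + 1 edges.
        if depth >= request.max_hops {
            continue;
        }

        // Query adjacent triples in one SQL call
        let triples = query_adjacent(pool, &entity, &request).await?;

        for triple in triples {
            edges_traversed += 1;

            // Follow the endpoint that is not the current entity. This covers
            // "outgoing", "incoming", and "both" without a direction check.
            let next_entity = if triple.subject == entity {
                triple.object.clone()
            } else {
                triple.subject.clone()
            };

            // HashSet::insert returns false when the entity was already seen
            if visited.insert(next_entity.clone()) {
                let mut new_path = path.clone();
                new_path.push(triple);
                paths.push(TraversalPath::from_edges(&new_path));
                if paths.len() >= request.max_results {
                    break 'bfs;
                }
                frontier.push_back((next_entity, new_path, depth + 1));
            }
        }
    }

    Ok(Json(TraversalResponse {
        paths,
        // Fields mirror the "stats" object in the API design above;
        // the TraversalStats type name is an assumption.
        stats: TraversalStats {
            entities_visited: visited.len(),
            edges_traversed,
            elapsed_ms: started.elapsed().as_millis() as u64,
        },
    }))
}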

#### Effort: 3 days
#### Dependencies: None (uses existing pool)
#### Risk: LOW — Standard BFS over SQL. The `max_hops` and `max_results` caps prevent unbounded traversal.
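To make the hop and result caps concrete, here is a minimal in-memory sketch of the same breadth-first search. It swaps the SQL lookup for a plain adjacency map, so `bfs_paths` and its signature are illustrative assumptions rather than service code.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first traversal over an in-memory adjacency map.
/// Returns one path (entity names, start included) per newly reached entity.
pub fn bfs_paths(
    adjacency: &HashMap<&str, Vec<&str>>,
    start: &str,
    max_hops: usize,
    max_results: usize,
) -> Vec<Vec<String>> {
    let mut visited: HashSet<String> = HashSet::from([start.to_string()]);
    let mut frontier: VecDeque<Vec<String>> = VecDeque::from([vec![start.to_string()]]);
    let mut paths = Vec::new();

    while let Some(path) = frontier.pop_front() {
        // A path with N entities has N - 1 hops; stop expanding at the cap.
        if path.len() - 1 >= max_hops {
            continue;
        }
        let current = path.last().expect("paths are never empty");
        let neighbors = adjacency
            .get(current.as_str())
            .map(Vec::as_slice)
            .unwrap_or(&[]);
        for &next in neighbors {
            // Only the first (shortest) path to each entity is recorded.
            if visited.insert(next.to_string()) {
                let mut extended = path.clone();
                extended.push(next.to_string());
                paths.push(extended.clone());
                if paths.len() >= max_results {
                    return paths;
                }
                frontier.push_back(extended);
            }
        }
    }
    paths
}
```

With `max_hops = 1` only direct neighbors are returned; the start entity itself is never expanded past the cap, which is the boundary a server-side handler needs to get right.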

---

1.3 Entity Normalization at Service Level

Problem: Entity normalization lives in a Python middleware outside the Rust service. Clients that bypass the middleware create duplicate entities.

Solution: A `canonicalize_entity()` function called in route handlers before database writes and in query handlers before lookups.

Implementation

rust
// src/service/normalize.rs

use std::collections::HashMap;
use std::sync::LazyLock;

/// Canonical entity normalization rules
static ALIAS_MAP: LazyLock<HashMap<&str, &str>> = LazyLock::new(|| {
    let mut m = HashMap::new();
    // Keys must already be in normalized form (lowercase, hyphenated) because
    // the lookup runs after normalization. "dream weaver" and "dream_weaver"
    // both become "dream-weaver" before the lookup, so they need no entry.
    m.insert("dreamweaver", "dream-weaver-engine");
    m.insert("dream-weaver", "dream-weaver-engine");
    m.insert("clawdbot-gateway", "clawdbot");
    m.insert("compcore", "comp-core");
    // Names that are already canonical ("clawdbot", "comp-core") fall through
    // unchanged and need no entry.
    // ... loaded from config file or embedded resource
    m
});

/// Canonicalize an entity name.
///
/// Steps:
/// 1. Trim whitespace
/// 2. Lowercase
/// 3. Replace spaces/underscores with hyphens
/// 4. Check alias map for known variants
/// 5. Return canonical form
pub fn canonicalize_entity(name: &str) -> String {
    let normalized = name
        .trim()
        .to_lowercase()
        .replace(' ', "-")
        .replace('_', "-");

    ALIAS_MAP
        .get(normalized.as_str())
        .map(|s| s.to_string())
        .unwrap_or(normalized)
}

Apply in route handlers:

rust
// In add_knowledge_handler:
let triple = KnowledgeTriple {
    subject: canonicalize_entity(&triple.subject),
    object: canonicalize_entity(&triple.object),
    predicate: triple.predicate.to_lowercase(),
    ..triple
};

// In query_knowledge_handler:
let subject = params.subject.map(|s| canonicalize_entity(&s));
let object = params.object.map(|s| canonicalize_entity(&s));

#### Effort: 2 days
#### Dependencies: None
#### Risk: LOW — The alias map can start small and grow. False positives (incorrectly merging distinct entities) are the main risk, mitigated by making the map configurable.
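One way to make the alias map configurable is to load it from a small text file at startup. The sketch below assumes a hypothetical `alias = canonical` line format; `parse_alias_rules` and `canonicalize_with` are illustrative names, not existing functions.

```rust
use std::collections::HashMap;

/// Same normalization steps as `canonicalize_entity`, without the alias lookup.
fn normalize(name: &str) -> String {
    name.trim().to_lowercase().replace(' ', "-").replace('_', "-")
}

/// Parse alias rules from text with one `alias = canonical` pair per line.
/// Blank lines and lines starting with `#` are skipped. The file format is
/// an assumption made for this sketch.
pub fn parse_alias_rules(text: &str) -> HashMap<String, String> {
    text.lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| line.split_once('='))
        .map(|(alias, canonical)| (normalize(alias), normalize(canonical)))
        .collect()
}

/// Canonicalize a name against a runtime-loaded alias map.
pub fn canonicalize_with(aliases: &HashMap<String, String>, name: &str) -> String {
    let normalized = normalize(name);
    aliases.get(&normalized).cloned().unwrap_or(normalized)
}
```

Normalizing the alias keys at load time means an operator can write `CompCore = comp-core` in the file and still match lookups, which always run on normalized names.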

#### Expected Impact
| Metric | Before | After |
|--------|--------|-------|
| Unique subjects | 221 (with duplicates) | ~190 (canonical) |
| Relationship relevance | 0.94 | 1.00 |
| Predicate relevance | 0.80 | 0.95+ |

---

1.4 Query Caching Layer

Problem: Identical knowledge graph queries hit PostgreSQL (or SQLite) every time.

Solution: LRU cache keyed on query parameters with TTL-based invalidation.

Implementation

rust
// src/service/cache.rs

use lru::LruCache;
use parking_lot::RwLock;
use std::hash::{Hash, Hasher};
use std::num::NonZeroUsize;
use std::time::{Duration, Instant};

pub struct QueryCache {
    cache: RwLock<LruCache<u64, CachedResult>>,
    ttl: Duration,
}

struct CachedResult {
    response: KnowledgeQueryResponse,
    cached_at: Instant,
}

impl QueryCache {
    pub fn new(max_entries: usize, ttl_secs: u64) -> Self {
        Self {
            cache: RwLock::new(LruCache::new(
                NonZeroUsize::new(max_entries).unwrap()
            )),
            ttl: Duration::from_secs(ttl_secs),
        }
    }

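    pub fn get(&self, key: &KnowledgeQueryParams) -> Option<KnowledgeQueryResponse> {
        let hash = hash_params(key);
        // `LruCache::get` updates recency and therefore needs `&mut self`, so
        // a write lock is required. `peek` under a read lock never refreshes
        // recency, which would make eviction follow insertion order instead.
        let mut cache = self.cache.write();
        if let Some(entry) = cache.get(&hash) {
            if entry.cached_at.elapsed() < self.ttl {
                return Some(entry.response.clone());
            }
        }
        // Missing or expired: drop any stale entry so it frees its slot
        cache.pop(&hash);
        None
    }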

    pub fn put(&self, key: &KnowledgeQueryParams, response: KnowledgeQueryResponse) {
        let hash = hash_params(key);
        let mut cache = self.cache.write();
        cache.put(hash, CachedResult {
            response,
            cached_at: Instant::now(),
        });
    }

    /// Invalidate all entries (called on write)
    pub fn invalidate(&self) {
        self.cache.write().clear();
    }
}

Cache invalidation strategy:
- Reads: Check cache first, fall through to DB on miss
- Writes: `invalidate()` on any triple insert/update/delete
- TTL: 60 seconds default, configurable via `GK_CACHE_TTL_SECS`
- Size: 1,000 entries default, configurable via `GK_CACHE_MAX_ENTRIES`
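
The cache code calls a `hash_params` helper that is not shown. A minimal sketch of such a helper, using only the standard library, could look like this. `QueryParams` is a hypothetical stand-in for `KnowledgeQueryParams`, whose real fields may differ.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for `KnowledgeQueryParams`; the real field set may differ.
#[derive(Hash, PartialEq, Eq, Clone, Debug)]
pub struct QueryParams {
    pub subject: Option<String>,
    pub predicate: Option<String>,
    pub object: Option<String>,
    pub limit: u32,
}

/// Derive a cache key by hashing every query field.
/// `DefaultHasher::new()` starts from fixed keys, so equal inputs produce
/// equal keys within one process, which is all an in-memory cache needs.
pub fn hash_params<T: Hash>(params: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    params.hash(&mut hasher);
    hasher.finish()
}
```

A 64-bit key can collide in principle. Storing the full parameters next to the cached response and comparing them on a hit would rule out serving the wrong result.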

#### Effort: 1 day
#### Dependencies: `lru`, `parking_lot` (already in Cargo.toml)
#### Risk: LOW — Cache invalidation on writes prevents stale data. TTL provides a safety net.

---

1.5 Connection Pooling Optimization

Quick wins requiring only configuration changes:

| Change | Current | Recommended | Impact |
|--------|---------|-------------|--------|
| `test_before_acquire` | `true` | `false` (local), `true` (remote) | -200ms on first query per connection |
| `min_connections` | 2 | 5 | Faster burst response |
| `max_connections` | 10 | 20 (local SQLite supports it) | 2× concurrent capacity |
| `idle_timeout` | 300s | 600s (local) | Fewer reconnections |

Add env var: `DB_TEST_BEFORE_ACQUIRE=false` for local deployments.

#### Effort: 1 hour
#### Risk: NONE — Configuration-only changes

---

Evolution 1 Summary

| Optimization | Effort | Impact | Priority |
|--------------|--------|--------|----------|
| SQLite backend | 2 weeks | 20× latency improvement | P0 |
| Multi-hop traversal | 3 days | Eliminates N-trip penalty | P0 |
| Entity normalization | 2 days | +0.06–0.15 relevance | P1 |
| Query caching | 1 day | 10× for repeated queries | P1 |
| Pool optimization | 1 hour | Marginal improvement | P2 |
| Total | ~3.5 weeks | | |

Projected DEP Audit Score After Evo 1: 8.6/10 (+1.2 from baseline 7.4)

---

Evolution 2: Expansion

What new capabilities should the Graph Kernel have?

2.1 Real-Time WebSocket Subscriptions

What: Clients subscribe to triple changes via WebSocket and receive push notifications when entities they care about are created, updated, or deleted.

Why: Enables reactive UIs, downstream pipeline triggers, and event-driven architecture without polling.

API Design

# Connect
WS /api/knowledge/subscribe

# Subscribe message
{
  "action": "subscribe",
  "filter": {
    "subjects": ["clawdbot", "graph-kernel"],
    "predicates": ["uses", "depends_on"],
    "min_confidence": 0.7
  }
}

# Notification
{
  "event": "triple_created",
  "triple": {
    "subject": "clawdbot",
    "predicate": "uses",
    "object": "gemini-2.5",
    "confidence": 0.95,
    "source": "topology-ingester"
  },
  "timestamp": "2026-02-14T12:00:00Z"
}

Implementation

rust
// src/service/websocket.rs
use axum::extract::ws::{WebSocket, WebSocketUpgrade};
use tokio::sync::broadcast;

/// Broadcast channel for triple change events
pub struct TripleEventBus {
    sender: broadcast::Sender<TripleEvent>,
}

impl TripleEventBus {
    pub fn new(capacity: usize) -> Self {
        let (sender, _) = broadcast::channel(capacity);
        Self { sender }
    }

    pub fn publish(&self, event: TripleEvent) {
        // Ignore error (no subscribers)
        let _ = self.sender.send(event);
    }

    pub fn subscribe(&self) -> broadcast::Receiver<TripleEvent> {
        self.sender.subscribe()
    }
}

// In route handlers, after successful triple insert:
state.event_bus.publish(TripleEvent::Created(triple));

#### Effort: 3 days
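#### Dependencies: `axum` (WebSocket support via its `ws` feature), `tokio::sync::broadcast`
#### Risk: LOW — The broadcast channel pattern is well-established. Memory is bounded by channel capacity. A slow subscriber is not disconnected automatically: once it falls behind by more than the channel capacity, its next `recv()` returns `RecvError::Lagged` and it skips the oldest events, so the WebSocket handler must decide whether to resync or close that connection.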
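
Each connection has to decide which published events match its subscription. A minimal sketch of that check is below. `SubscriptionFilter` and `Triple` are hypothetical types modeled on the subscribe message, and treating an empty list as "match anything" is an assumption.

```rust
/// Hypothetical in-memory form of the `filter` object in the subscribe message.
/// An empty list means "match anything" for that field (an assumption).
#[derive(Debug, Default, Clone)]
pub struct SubscriptionFilter {
    pub subjects: Vec<String>,
    pub predicates: Vec<String>,
    pub min_confidence: f64,
}

/// Minimal triple shape for the sketch.
#[derive(Debug, Clone)]
pub struct Triple {
    pub subject: String,
    pub predicate: String,
    pub object: String,
    pub confidence: f64,
}

impl SubscriptionFilter {
    /// True when the triple should be pushed to this subscriber.
    pub fn matches(&self, triple: &Triple) -> bool {
        let subject_ok =
            self.subjects.is_empty() || self.subjects.contains(&triple.subject);
        let predicate_ok =
            self.predicates.is_empty() || self.predicates.contains(&triple.predicate);
        subject_ok && predicate_ok && triple.confidence >= self.min_confidence
    }
}
```

Filtering per connection keeps the event bus itself simple: every receiver sees every event and discards the ones its filter rejects.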

---

2.2 Graph Visualization Endpoint

What: An endpoint that returns graph data in formats consumable by D3.js, Mermaid, or Graphviz for visual exploration.

Why: Dramatically improves developer experience. "Show me what clawdbot connects to" should produce a visual graph, not a JSON array of triples.

API Design

GET /api/knowledge/graph?subject=clawdbot&hops=2&format=d3
GET /api/knowledge/graph?subject=clawdbot&hops=2&format=mermaid
GET /api/knowledge/graph?subject=clawdbot&hops=2&format=dot

Response Formats

D3 (force-directed graph):

json
{
  "nodes": [
    {"id": "clawdbot", "group": "service", "weight": 15},
    {"id": "graph-kernel", "group": "service", "weight": 8},
    {"id": "postgresql", "group": "infrastructure", "weight": 3}
  ],
  "links": [
    {"source": "clawdbot", "target": "graph-kernel", "predicate": "uses", "confidence": 0.95},
    {"source": "graph-kernel", "target": "postgresql", "predicate": "uses", "confidence": 0.90}
  ]
}

Mermaid:

graph LR
    clawdbot -->|uses| graph-kernel
    graph-kernel -->|uses| postgresql
    clawdbot -->|uses| rag-plusplus

DOT (Graphviz):

digraph G {
    "clawdbot" -> "graph-kernel" [label="uses"];
    "graph-kernel" -> "postgresql" [label="uses"];
}

Implementation

This reuses the server-side traversal from §1.2, adding output format rendering:

rust
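// src/service/visualization.rs
use std::collections::HashSet;

use indexmap::IndexSet; // insertion-ordered set keeps node order stable

pub fn render_d3(paths: &[TraversalPath]) -> D3Graph {
    let mut nodes = IndexSet::new();
    let mut links = Vec::new();
    // Paths from the traversal share prefixes, so the same edge can appear in
    // several paths. Deduplicate to avoid repeated links and inflated weights.
    let mut seen = HashSet::new();

    for path in paths {
        for edge in &path.edges {
            let key = (edge.subject.clone(), edge.predicate.clone(), edge.object.clone());
            if !seen.insert(key) {
                continue;
            }
            nodes.insert(edge.subject.clone());
            nodes.insert(edge.object.clone());
            links.push(D3Link {
                source: edge.subject.clone(),
                target: edge.object.clone(),
                predicate: edge.predicate.clone(),
                confidence: edge.confidence,
            });
        }
    }

    D3Graph {
        nodes: nodes.into_iter().map(|id| D3Node {
            id: id.clone(),
            group: infer_group(&id),
            // Weight = number of links touching this node (its degree)
            weight: links.iter().filter(|l| l.source == id || l.target == id).count(),
        }).collect(),
        links,
    }
}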

pub fn render_mermaid(paths: &[TraversalPath]) -> String {
    let mut lines = vec!["graph LR".to_string()];
    let mut seen = HashSet::new();

    for path in paths {
        for edge in &path.edges {
            let key = format!("{}--{}-->{}", edge.subject, edge.predicate, edge.object);
            if seen.insert(key) {
                lines.push(format!(
                    "    {} -->|{}| {}",
                    sanitize_mermaid(&edge.subject),
                    edge.predicate,
                    sanitize_mermaid(&edge.object)
                ));
            }
        }
    }

    lines.join("\n")
}

#### Effort: 2 days
#### Dependencies: Server-side traversal (§1.2)
#### Risk: LOW — Pure rendering over existing data. No new state management.
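
The DOT format is listed in the API but has no renderer above. A minimal sketch, assuming a simplified `Edge` type in place of the service's own edge type, might look like this. It escapes only double quotes, so entity names containing backslashes would need extra handling.

```rust
use std::collections::HashSet;

/// Minimal edge shape for the sketch; the service's edge type may differ.
pub struct Edge {
    pub subject: String,
    pub predicate: String,
    pub object: String,
}

/// Escape double quotes so a value is safe inside a DOT quoted string.
fn escape_dot(value: &str) -> String {
    value.replace('"', "\\\"")
}

/// Render edges as a Graphviz `digraph`, skipping duplicate edges.
pub fn render_dot(edges: &[Edge]) -> String {
    let mut lines = vec!["digraph G {".to_string()];
    let mut seen = HashSet::new();

    for edge in edges {
        // Borrowed tuple key: no cloning needed for deduplication.
        let key = (&edge.subject, &edge.predicate, &edge.object);
        if seen.insert(key) {
            lines.push(format!(
                "    \"{}\" -> \"{}\" [label=\"{}\"];",
                escape_dot(&edge.subject),
                escape_dot(&edge.object),
                escape_dot(&edge.predicate)
            ));
        }
    }

    lines.push("}".to_string());
    lines.join("\n")
}
```

Quoting every node name sidesteps DOT's identifier rules, so hyphenated entities such as `graph-kernel` render without a separate sanitizer.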

---

2.3 Batch Ingest API (High-Performance)

What: Replace the sequential triple-per-row insert with a bulk `COPY` or multi-row `INSERT` for high-throughput ingestion.

Why: The current batch handler inserts one row at a time in a transaction loop. For the topology ingester's 5,000+ triples, this means 5,000 SQL round-trips (even if within a single transaction).

Implementation

rust
// Replace sequential inserts with a multi-row VALUES clause
use sqlx::PgPool;

pub async fn bulk_insert_triples(
    pool: &PgPool,
    triples: &[KnowledgeTriple],
) -> Result<BulkInsertResult, sqlx::Error> {
    // PostgreSQL allows at most 65,535 bind parameters per statement.
    // 500 rows × 5 columns = 2,500 parameters, well under that limit.
    let chunk_size = 500;
    let mut upserted = 0;

    // Note: one INSERT ... ON CONFLICT DO UPDATE fails if the same
    // (subject, predicate, object) key appears twice in a single statement,
    // so `triples` must be deduplicated on that key before this call.
    for chunk in triples.chunks(chunk_size) {
        // Build the "($1, $2, $3, $4, $5), ($6, ...)" placeholder groups
        let placeholders: Vec<String> = (0..chunk.len())
            .map(|i| {
                let offset = i * 5;
                format!(
                    "(${}, ${}, ${}, ${}, ${})",
                    offset + 1, offset + 2, offset + 3, offset + 4, offset + 5
                )
            })
            .collect();

        let query = format!(
            "INSERT INTO knowledge_graph (subject, predicate, object, confidence, source) \
             VALUES {} \
             ON CONFLICT (subject, predicate, object) DO UPDATE SET \
             confidence = GREATEST(knowledge_graph.confidence, EXCLUDED.confidence), \
             source = EXCLUDED.source",
            placeholders.join(", ")
        );

        // Bind values in the same order as the placeholders
        let mut q = sqlx::query(&query);
        for triple in chunk {
            q = q.bind(&triple.subject)
                 .bind(&triple.predicate)
                 .bind(&triple.object)
                 .bind(triple.confidence)
                 .bind(&triple.source);
        }

        let result = q.execute(pool).await?;
        // rows_affected() counts inserted and updated rows together
        upserted += result.rows_affected() as usize;
    }

    // The insert/update split is not available from rows_affected(); report
    // the combined count as `added` and leave `updated` at 0 until the query
    // returns per-row status.
    Ok(BulkInsertResult { added: upserted, updated: 0, total: triples.len() })
}

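Alternatively, for the SQLite backend, combine WAL mode (`PRAGMA journal_mode=WAL`) with a single batch transaction:

rust
// SQLite-optimized batch
// SQLite rejects changes to `synchronous` inside an open transaction, and a
// PRAGMA issued on one pooled connection does not apply to the others. Set
// durability options when the pool is created instead, for example with
// SqliteConnectOptions::journal_mode(SqliteJournalMode::Wal) and
// .synchronous(SqliteSynchronous::Normal). Using Off is faster but risks
// corruption if the machine loses power mid-write.
let mut tx = pool.begin().await?;

for triple in &triples {
    sqlx::query("INSERT OR REPLACE INTO knowledge_graph ...")
        .bind(...)
        .execute(&mut *tx)
        .await?;
}

tx.commit().await?;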

#### Effort: 2 days
#### Dependencies: None
#### Risk: LOW — Multi-row INSERT is standard PostgreSQL. Chunking prevents parameter limit overflow.
#### Expected Impact: 10–50× faster ingestion (5,000 triples in <1s vs. current ~50s)
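
The placeholder arithmetic is the part of a multi-row INSERT that is easiest to get wrong, so it is worth isolating. The helper names below (`values_placeholders`, `max_rows_per_statement`) are illustrative; the 65,535 figure is PostgreSQL's per-statement bind-parameter cap.

```rust
/// PostgreSQL's wire protocol caps bind parameters at 65,535 per statement.
pub const PG_MAX_BIND_PARAMS: usize = 65_535;

/// Largest row count per statement that stays within the bind-parameter cap.
/// `columns` must be nonzero.
pub fn max_rows_per_statement(columns: usize) -> usize {
    PG_MAX_BIND_PARAMS / columns
}

/// Build the `($1, $2, ...), ($n, ...)` placeholder list for a multi-row INSERT.
pub fn values_placeholders(rows: usize, columns: usize) -> String {
    (0..rows)
        .map(|row| {
            // Placeholders are 1-based and numbered continuously across rows.
            let group = (1..=columns)
                .map(|col| format!("${}", row * columns + col))
                .collect::<Vec<_>>()
                .join(", ");
            format!("({})", group)
        })
        .collect::<Vec<_>>()
        .join(", ")
}
```

With five columns per triple, a chunk of 500 rows uses 2,500 parameters, so the chunk size could be raised considerably before hitting the cap.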

---

2.4 Temporal Versioning (Triple Validity Windows)

What: Add `valid_from` and `valid_until` columns to triples, enabling temporal knowledge management.

Why: Knowledge changes over time. "clawdbot uses gemini-1.5" was true in January but "clawdbot uses gemini-2.5" is true now. Without temporal versioning, stale knowledge persists at full confidence.

Schema Change

sql
ALTER TABLE knowledge_graph
ADD COLUMN valid_from TIMESTAMPTZ DEFAULT NOW(),
ADD COLUMN valid_until TIMESTAMPTZ DEFAULT NULL,
ADD COLUMN superseded_by BIGINT REFERENCES knowledge_graph(id) DEFAULT NULL;

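-- Index for temporal queries over currently valid triples.
-- valid_until is left out of the key: it is always NULL inside this partial index.
CREATE INDEX idx_knowledge_temporal
ON knowledge_graph (subject, predicate, valid_from)
WHERE valid_until IS NULL;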

API Changes

json
// Query with temporal filter
GET /api/knowledge?subject=clawdbot&at=2026-01-15T00:00:00Z

// Insert with validity
POST /api/knowledge
{
  "subject": "clawdbot",
  "predicate": "uses",
  "object": "gemini-2.5",
  "confidence": 0.95,
  "valid_from": "2026-02-01T00:00:00Z",
  "supersedes": {"subject": "clawdbot", "predicate": "uses", "object": "gemini-1.5"}
}

Implementation

  • Default: `valid_until = NULL` (currently valid)
  • Supersession: When a new triple supersedes an old one, set `valid_until = NOW()` on the old triple and `superseded_by = new_id`
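  • Query: Filter on the full validity window, `WHERE valid_from <= $timestamp AND (valid_until IS NULL OR valid_until > $timestamp)`, where `$timestamp` is the `at` value (or the current time when `at` is omitted). Without the `valid_from` bound, a point-in-time query would also return triples that only became valid later.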

#### Effort: 1 week
#### Dependencies: Schema migration, backward-compatible API changes
#### Risk: MEDIUM — Migration of existing 3,502 triples. All existing triples get `valid_from = created_at, valid_until = NULL`. Queries need to be updated to filter on validity by default.
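
The validity rules can be expressed as a small predicate. The sketch below is an illustration under assumptions: timestamps are plain epoch seconds, `Validity` is a hypothetical type, and the check includes the `valid_from` lower bound that a point-in-time (`at=`) query needs in addition to the `valid_until` test.

```rust
/// Validity window for a triple, as seconds since the Unix epoch.
/// `valid_until == None` mirrors `valid_until IS NULL` (still valid).
#[derive(Debug, Clone, Copy)]
pub struct Validity {
    pub valid_from: i64,
    pub valid_until: Option<i64>,
}

impl Validity {
    /// True when the triple was valid at instant `at`.
    /// The window is half-open: `valid_from <= at < valid_until`.
    pub fn contains(&self, at: i64) -> bool {
        self.valid_from <= at && self.valid_until.map_or(true, |until| at < until)
    }

    /// Close this window at `at`, as done when another triple supersedes it.
    pub fn supersede_at(&mut self, at: i64) {
        self.valid_until = Some(at);
    }
}
```

A half-open window means that when an old triple is closed at the same instant its successor starts, exactly one of the two is valid at that instant.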

---

2.5 Community Detection (Leiden/Louvain)

What: Automatic clustering of entities into communities based on graph topology, using the Leiden or Louvain algorithm.

Why: With 221 subjects and 3,502 triples, the knowledge graph has natural clusters (e.g., "audio production projects", "infrastructure services", "personal preferences"). Discovering these clusters enables:
- Automated knowledge organization
- Cluster-level context retrieval ("give me everything about the audio cluster")
- Graph compression for visualization

Implementation

rust
// src/analysis/community.rs

/// Louvain community detection over the knowledge graph
pub struct CommunityDetector {
    adjacency: HashMap<String, Vec<(String, f64)>>,  // entity → [(neighbor, weight)]
}

impl CommunityDetector {
    /// Build adjacency from triples
    pub fn from_triples(triples: &[StoredKnowledgeTriple]) -> Self {
        let mut adj: HashMap<String, Vec<(String, f64)>> = HashMap::new();
        for triple in triples {
            adj.entry(triple.subject.clone())
                .or_default()
                .push((triple.object.clone(), triple.confidence));
            adj.entry(triple.object.clone())
                .or_default()
                .push((triple.subject.clone(), triple.confidence));
        }
        Self { adjacency: adj }
    }

    /// Run Louvain algorithm
    pub fn detect(&self) -> Vec<Community> {
        // Phase 1: Local modularity optimization
        // Phase 2: Community aggregation
        // Iterate until convergence
        // Return communities with member lists
        todo!("Implement Louvain — or use petgraph + community crate")
    }
}

Alternative: Use the `petgraph` crate for graph algorithms and implement Louvain on top, or shell out to a Python script using `networkx.community.louvain_communities`.

API

GET /api/knowledge/communities
→ {
    "communities": [
      {
        "id": 0,
        "name": "Audio Production",  // Auto-generated from dominant predicate
        "members": ["dream-weaver-engine", "cc-echelon", "cc-cinematographer", ...],
        "internal_edges": 45,
        "modularity": 0.72
      },
      ...
    ],
    "modularity_score": 0.65,
    "algorithm": "louvain"
  }

#### Effort: 1 week
#### Dependencies: `petgraph` crate (or custom implementation)
#### Risk: MEDIUM — Algorithmic complexity. The current graph size (221 nodes) is trivial for Louvain, but the implementation needs to be correct.
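
Correctness of a Louvain implementation is easiest to test through the quantity it optimizes. The sketch below computes modularity Q for a given partition of an undirected weighted graph; it is a building block and a test oracle, not the Louvain algorithm itself, and the `modularity` signature is an assumption.

```rust
use std::collections::HashMap;

/// Modularity Q of a partition over an undirected weighted graph.
/// `edges` lists each undirected edge once as (a, b, weight);
/// `community` must map every node that appears in `edges` to a community id.
pub fn modularity(edges: &[(&str, &str, f64)], community: &HashMap<&str, usize>) -> f64 {
    let total_weight: f64 = edges.iter().map(|&(_, _, w)| w).sum();
    if total_weight == 0.0 {
        return 0.0;
    }
    let two_m = 2.0 * total_weight;

    // Per community: summed degree of its members, and weight of internal edges.
    let mut degree_sum: HashMap<usize, f64> = HashMap::new();
    let mut internal: HashMap<usize, f64> = HashMap::new();

    for &(a, b, w) in edges {
        let (ca, cb) = (community[a], community[b]);
        *degree_sum.entry(ca).or_default() += w;
        *degree_sum.entry(cb).or_default() += w;
        if ca == cb {
            *internal.entry(ca).or_default() += w;
        }
    }

    // Q = sum over communities of (internal / m) - (degree_sum / 2m)^2
    degree_sum
        .iter()
        .map(|(c, &deg)| {
            let inside = internal.get(c).copied().unwrap_or(0.0);
            inside / total_weight - (deg / two_m).powi(2)
        })
        .sum()
}
```

For two triangles joined by a single edge, splitting at the bridge gives Q = 5/14, while putting every node in one community gives Q = 0. A Louvain implementation should never return a partition scoring below the trivial one.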

---

2.6 Structural Metrics Endpoint

What: Compute and expose graph-level and node-level structural metrics: PageRank, betweenness centrality, degree distribution, clustering coefficient.

Why: These metrics answer operational questions: "Which entity is most central?", "Which entities are bridges between clusters?", "Is the graph getting more connected or more fragmented?"

API

GET /api/knowledge/metrics
→ {
    "graph": {
      "node_count": 221,
      "edge_count": 3502,
      "density": 0.072,
      "avg_degree": 31.7,
      "clustering_coefficient": 0.43,
      "diameter": 7,
      "connected_components": 3
    },
    "top_pagerank": [
      {"entity": "clawdbot", "score": 0.142},
      {"entity": "mohamed-diomande", "score": 0.098},
      {"entity": "comp-core", "score": 0.076}
    ],
    "top_betweenness": [
      {"entity": "clawdbot", "score": 0.312},
      {"entity": "graph-kernel", "score": 0.187}
    ],
    "bridges": [
      {"entity": "orbit", "connects": ["infrastructure", "agent-services"]}
    ]
  }

Implementation

This reuses the `influence.rs` module in the Atlas subsystem, which already computes `TurnInfluence`, `BridgeTurn`, and `PhaseTopologyStats`. The knowledge graph metrics endpoint would apply the same algorithms to the triple graph instead of the turn DAG.

#### Effort: 3 days (reusing atlas/influence.rs patterns)
#### Dependencies: Server-side traversal for full graph access
#### Risk: LOW — The atlas subsystem already implements these algorithms. Porting to the triple graph is mechanical.
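
The two simplest graph-level metrics can be computed directly from node and edge counts. The sample values in the response above (density 0.072, average degree 31.7) are consistent with treating each triple as one directed edge, which is the assumption this sketch makes; the function names are illustrative.

```rust
/// Density of a directed graph without self-loops: E / (N * (N - 1)).
pub fn directed_density(nodes: usize, edges: usize) -> f64 {
    if nodes < 2 {
        return 0.0;
    }
    edges as f64 / (nodes as f64 * (nodes as f64 - 1.0))
}

/// Average total degree (in-degree plus out-degree): 2E / N.
pub fn average_degree(nodes: usize, edges: usize) -> f64 {
    if nodes == 0 {
        return 0.0;
    }
    2.0 * edges as f64 / nodes as f64
}
```

Tracking these two numbers over time gives a first answer to whether the graph is becoming more connected or more fragmented, before any centrality computation is needed.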

---

Evolution 2 Summary

| Capability | Effort | Impact | Priority |
|------------|--------|--------|----------|
| WebSocket subscriptions | 3 days | Event-driven architecture | P2 |
| Graph visualization | 2 days | 10× better DX | P1 |
| Batch ingest (bulk) | 2 days | 10–50× faster ingestion | P1 |
| Temporal versioning | 1 week | Knowledge lifecycle management | P2 |
| Community detection | 1 week | Automatic organization | P3 |
| Structural metrics | 3 days | Operational intelligence | P2 |
| Total | ~4 weeks | | |

Projected DEP Audit Score After Evo 1+2: 9.1/10 (+0.5 from Evo 1)

---

Evolution 3: Transformation

What could the Graph Kernel BECOME if we reimagined it?

3.1 Federated Graph Kernel (Cross-Instance Knowledge Sharing)

Vision: Multiple Graph Kernel instances (one per user, team, or organization) can share and merge knowledge while preserving provenance boundaries.

Why this matters: Right now, the Graph Kernel is a single-user system. Mohamed's knowledge graph is isolated. But imagine multiple agents, each with their own Graph Kernel, sharing knowledge about a shared codebase while maintaining cryptographic provenance over who contributed what.

Architecture

┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐
│  GK Instance A   │     │  GK Instance B   │     │  GK Instance C   │
│  (Mohamed)       │     │  (Team Alpha)    │     │  (CI/CD Agent)   │
│                  │     │                  │     │                  │
│  Local triples   │◄───►│  Local triples   │◄───►│  Local triples   │
│  + HMAC signer   │     │  + HMAC signer   │     │  + HMAC signer   │
└────────┬─────────┘     └────────┬─────────┘     └────────┬─────────┘
         │                        │                        │
         └────────────┬───────────┘────────────────────────┘
                      │
              ┌───────▼────────┐
              │  Federation    │
              │  Relay         │
              │                │
              │  - Merge rules │
              │  - Provenance  │
              │  - Conflict    │
              │    resolution  │
              └────────────────┘

Key Design Decisions

1. Provenance preservation: Each triple retains its `source` (instance ID + original source). Federation relay annotates merged triples with `federated_from` metadata.

2. Merge semantics:
- Same (subject, predicate, object) from different instances → take highest confidence, record all sources
- Conflicting objects for same (subject, predicate) → create both triples with source attribution
- HMAC tokens are instance-scoped — federation relay issues a federation-level token

3. Access control: Each instance declares which predicates are shareable. `likes`, `wants_to` → private. `uses`, `depends_on`, `is_a` → shareable.

4. Sync protocol:

   POST /api/federation/offer
   {"triples": [...], "instance_id": "alpha", "since": "2026-02-14T00:00:00Z"}

   POST /api/federation/accept
   {"triple_ids": [1, 2, 3], "instance_id": "alpha"}

#### Effort: 3–4 weeks
#### Dependencies: WebSocket subscriptions (Evo 2.1), entity normalization (Evo 1.3)
#### Risk: HIGH — Distributed systems complexity. Merge conflicts, network partitions, eventual consistency. The HMAC model needs extension for multi-party verification.
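
The merge semantics from the design decisions can be sketched as a pure function. `OfferedTriple`, `MergedTriple`, and `merge_offers` are hypothetical names; the logic follows the stated rules of keeping the highest confidence for identical triples, recording every contributing instance, and keeping conflicting objects side by side.

```rust
use std::collections::{BTreeSet, HashMap};

/// A triple offered by one instance; shape assumed for this sketch.
#[derive(Debug, Clone)]
pub struct OfferedTriple {
    pub subject: String,
    pub predicate: String,
    pub object: String,
    pub confidence: f64,
    pub instance_id: String,
}

/// Result of merging identical triples from several instances.
#[derive(Debug, Clone, PartialEq)]
pub struct MergedTriple {
    pub confidence: f64,
    /// Every instance that asserted this triple, sorted for stable output.
    pub sources: BTreeSet<String>,
}

/// Merge offers keyed on (subject, predicate, object): keep the highest
/// confidence and record all contributing instances. Different objects for
/// the same (subject, predicate) have different keys, so both are kept.
pub fn merge_offers(
    offers: &[OfferedTriple],
) -> HashMap<(String, String, String), MergedTriple> {
    let mut merged: HashMap<(String, String, String), MergedTriple> = HashMap::new();
    for offer in offers {
        let key = (
            offer.subject.clone(),
            offer.predicate.clone(),
            offer.object.clone(),
        );
        let entry = merged.entry(key).or_insert_with(|| MergedTriple {
            confidence: offer.confidence,
            sources: BTreeSet::new(),
        });
        entry.confidence = entry.confidence.max(offer.confidence);
        entry.sources.insert(offer.instance_id.clone());
    }
    merged
}
```

Because the function is order-independent (max and set union are both commutative), two relays that receive the same offers in a different order end up with the same merged state.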

---

3.2 Universal Context Authority for AI Agents

Vision: The Graph Kernel becomes the standard context authority for ALL AI agents — not just Clawdbot. Any agent framework (LangChain, CrewAI, AutoGen, custom) can request provenance-governed context slices.

Why this matters: The provenance engine concept (Definition 1 from the research paper) is framework-agnostic. Every AI agent that makes consequential decisions needs reproducible, auditable context construction. To our knowledge, no one has built this as a service.

Integration Architecture

┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│  Clawdbot   │  │  LangChain  │  │  CrewAI     │  │  Custom     │
│  Agent      │  │  Agent      │  │  Agent      │  │  Agent      │
└──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘
       │                │                │                │
       └────────────────┼────────────────┼────────────────┘
                        │
                ┌───────▼────────┐
                │  Graph Kernel  │
                │  Context API   │
                │                │
                │  /api/context  │  ← Universal context request
                │  /api/verify   │  ← Universal token verification
                │  /api/register │  ← Agent registration
                └────────────────┘

Universal Context Request API

json
POST /api/context/request
{
  "agent_id": "langchain-agent-42",
  "agent_framework": "langchain",
  "conversation_id": "conv-abc-123",
  "anchor": {
    "type": "turn_id",     // or "entity", "timestamp", "content_hash"
    "value": "550e8400-e29b-41d4-a716-446655440000"
  },
  "policy": "default",     // or custom policy hash
  "max_tokens": 4096,      // Token budget for context window
  "format": "messages"     // "messages" | "text" | "triples" | "graph"
}

// Response
{
  "context": {
    "messages": [
      {"role": "user", "content": "...", "turn_id": "..."},
      {"role": "assistant", "content": "...", "turn_id": "..."}
    ],
    "total_tokens": 3847
  },
  "provenance": {
    "slice_id": "a1b2c3...",
    "admissibility_token": "d4e5f6...",
    "graph_snapshot_hash": "789abc...",
    "policy_ref": {"policy_id": "slice_policy_v1", "params_hash": "..."},
    "schema_version": "1.0.0"
  }
}

#### Effort: 6–8 weeks
#### Dependencies: SQLite backend (Evo 1.1), federation (Evo 3.1)
#### Risk: HIGH — Requires defining a universal agent context protocol. Multi-framework testing. Performance at scale with many concurrent agents.

---

3.3 Graph Kernel SDK (Rust + Python + JavaScript)

Vision: A first-party SDK that makes integrating with the Graph Kernel as easy as `pip install graph-kernel` or `npm install @openclaw/graph-kernel`.

Rust Crate (Already Exists — Productize)

The `cc-graph-kernel` crate with `default` feature is already a library. To make it a standalone SDK:

toml
# Published as `graph-kernel` on crates.io
[package]
name = "graph-kernel"
version = "0.2.0"
# ... remove internal paths, make self-contained

Python SDK

python
# pip install graph-kernel
from graph_kernel import GraphKernelClient, Triple

gk = GraphKernelClient("http://localhost:8001")

# Add knowledge
gk.add_triple(Triple("clawdbot", "uses", "gemini-2.5", confidence=0.95))

# Query
triples = gk.query(subject="clawdbot", predicate="uses")

# Traverse
paths = gk.traverse("clawdbot", predicates=["uses"], max_hops=3)

# Verify token
is_valid = gk.verify_token(admissibility_token, slice_id, ...)

# Subscribe to changes
async for event in gk.subscribe(subjects=["clawdbot"]):
    print(f"New triple: {event.triple}")

JavaScript/TypeScript SDK

typescript
// npm install @openclaw/graph-kernel
import { GraphKernel } from '@openclaw/graph-kernel';

const gk = new GraphKernel('http://localhost:8001');

// Add knowledge
await gk.addTriple({
  subject: 'clawdbot',
  predicate: 'uses',
  object: 'gemini-2.5',
  confidence: 0.95
});

// Query
const triples = await gk.query({ subject: 'clawdbot' });

// Traverse with visualization
const graph = await gk.traverse('clawdbot', {
  predicates: ['uses'],
  maxHops: 3,
  format: 'd3'
});

// Render in browser
gk.visualize(graph, document.getElementById('graph-container'));

#### Effort: 4 weeks (2 weeks Python, 2 weeks JS/TS)
#### Dependencies: Stable API (Evo 1+2 complete), OpenAPI spec
#### Risk: MEDIUM — SDK maintenance burden. Breaking API changes require SDK updates. Need CI for all three languages.

---

3.4 Graph Kernel Protocol (Open Standard)

Vision: Define an open protocol specification for provenance-governed context authorities. Any implementation (not just OpenClaw's) can comply.

Why: If the provenance engine category is real (and the research paper argues it is), the protocol should be open. This enables:
- Interoperability between provenance engines
- Academic research on the protocol semantics
- Community implementations in different languages
- Standardization through W3C or IETF

Protocol Specification

Graph Kernel Protocol v1 (GKP/1)

1. CONTEXT REQUEST
   - Client sends anchor, policy reference, budget constraints
   - Server returns evidence bundle with admissibility token

2. TOKEN VERIFICATION
   - Client presents token + provenance metadata to any verifier
   - Verifier confirms token authenticity without accessing signing secret

3. POLICY REGISTRATION
   - Client registers expansion policy with hash-stable fingerprint
   - Server returns policy reference (policy_id + params_hash)

4. KNOWLEDGE ASSERTION
   - Client asserts (subject, predicate, object, confidence, source)
   - Server stores and indexes, returns assertion ID

5. FEDERATION
   - Instances exchange knowledge with provenance preservation
   - Cross-instance tokens use federated verification
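
Step 3 depends on a fingerprint that stays the same whenever the policy content is the same. The following is a minimal sketch of one way to get that property, assuming canonical JSON (sorted keys, fixed separators) hashed with SHA-256. The source does not specify the hash function or the encoding, and the parameter names `max_hops` and `budget` are hypothetical.

```python
import hashlib
import json


def params_hash(params: dict) -> str:
    """Hash policy parameters via a canonical JSON encoding.

    Sorting keys and fixing separators means two dicts with the same
    content serialize to the same bytes, so the hash is stable.
    (Assumption: SHA-256 over canonical JSON; not confirmed by the spec.)
    """
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def policy_ref(policy_id: str, params: dict) -> dict:
    """Build a policy reference shaped like step 3's (policy_id + params_hash)."""
    return {"policy_id": policy_id, "params_hash": params_hash(params)}


# Hypothetical parameters: key order must not change the fingerprint.
a = policy_ref("slice_policy_v1", {"max_hops": 3, "budget": 50})
b = policy_ref("slice_policy_v1", {"budget": 50, "max_hops": 3})
assert a == b
```

A conformance test suite could use the same check: two implementations comply only if they produce identical `params_hash` values for identical parameters, which is why the canonical encoding would need to be pinned down in the spec.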

#### Deliverables
1. Protocol spec document (Markdown → RFC-style)
2. Reference implementation (the Graph Kernel itself)
3. Conformance test suite (automated tests any implementation can run)
4. Protocol buffer / JSON Schema definitions for wire format

#### Effort: 4–6 weeks (spec writing + conformance tests)
#### Dependencies: Stable API, research paper published
#### Risk: MEDIUM — Standards work is slow. Risk of over-engineering. Start with an informational document, not a formal standard.

---

3.5 W3C PROV-DM Integration

Vision: Map the Graph Kernel's provenance model to the W3C PROV Data Model, enabling interoperability with the broader provenance ecosystem.

Why: The W3C PROV family of specs is the established web standard for provenance representation: PROV-DM and PROV-O are W3C Recommendations, and PROV-JSON is a W3C Member Submission. Aligning with PROV-DM would:
- Enable export to any PROV-compliant system
- Support academic citation and reproducibility
- Integrate with existing provenance visualization tools (ProvVis, etc.)

Mapping

| Graph Kernel Concept | PROV-DM Concept |
|---|---|
| `TurnSnapshot` | `prov:Entity` |
| `ContextSlicer.slice()` | `prov:Activity` |
| `SliceExport` | `prov:Entity` (derived) |
| `AdmissibilityToken` | Custom `openclaw:admissibilityToken` |
| `SlicePolicyV1` | `prov:Plan` |
| `GraphSnapshotHash` | `prov:Entity` (immutable record) |
| `source` (triple) | `prov:wasAttributedTo` → `prov:Agent` |

PROV-JSON Export

```json
{
  "prefix": {
    "gk": "https://openclaw.org/prov/graph-kernel/",
    "prov": "http://www.w3.org/ns/prov#"
  },
  "entity": {
    "gk:slice/a1b2c3": {
      "prov:type": "gk:SliceExport",
      "gk:sliceId": "a1b2c3",
      "gk:admissibilityToken": "d4e5f6",
      "gk:graphSnapshotHash": "789abc"
    }
  },
  "activity": {
    "gk:slicing/001": {
      "prov:type": "gk:ContextSlicing",
      "gk:policyId": "slice_policy_v1",
      "gk:policyParamsHash": "..."
    }
  },
  "wasGeneratedBy": {
    "gk:gen1": {
      "prov:entity": "gk:slice/a1b2c3",
      "prov:activity": "gk:slicing/001"
    }
  }
}
```

Implementation

New endpoint: `GET /api/provenance/prov-json?slice_id=a1b2c3`
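
As a rough sketch of what the handler behind this endpoint might assemble, the function below builds the PROV-JSON document shown above from the fields of one slice export. The function name, its parameters, and the all-zero placeholder hash are assumptions for illustration; the real `SliceExport` type lives in the Rust crate and may carry different fields.

```python
import json

GK_PREFIX = "https://openclaw.org/prov/graph-kernel/"
PROV_PREFIX = "http://www.w3.org/ns/prov#"


def slice_to_prov_json(slice_id, token, snapshot_hash,
                       activity_id, policy_id, policy_params_hash):
    """Assemble a PROV-JSON document for one slice export (hypothetical helper)."""
    entity_key = f"gk:slice/{slice_id}"
    activity_key = f"gk:slicing/{activity_id}"
    return {
        "prefix": {"gk": GK_PREFIX, "prov": PROV_PREFIX},
        # The slice itself is the prov:Entity.
        "entity": {
            entity_key: {
                "prov:type": "gk:SliceExport",
                "gk:sliceId": slice_id,
                "gk:admissibilityToken": token,
                "gk:graphSnapshotHash": snapshot_hash,
            }
        },
        # The slicing run is the prov:Activity, tagged with its policy.
        "activity": {
            activity_key: {
                "prov:type": "gk:ContextSlicing",
                "gk:policyId": policy_id,
                "gk:policyParamsHash": policy_params_hash,
            }
        },
        # The relation that links the entity to the activity that produced it.
        "wasGeneratedBy": {
            "gk:gen1": {"prov:entity": entity_key, "prov:activity": activity_key}
        },
    }


# "0" * 64 is a placeholder; the source elides the real params hash.
doc = slice_to_prov_json("a1b2c3", "d4e5f6", "789abc",
                         "001", "slice_policy_v1", "0" * 64)
print(json.dumps(doc, indent=2))
```

Deriving the entity and activity keys from the slice and activity IDs keeps the `wasGeneratedBy` relation consistent with the records it points to, which hand-written exports can easily get wrong.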

#### Effort: 1 week
#### Dependencies: Stable SliceExport API
#### Risk: LOW — PROV-JSON is a well-defined format. The mapping is straightforward.

---

3.6 Graph Kernel Cloud (Managed Service)

Vision: A hosted, multi-tenant Graph Kernel service where anyone can create a provenance engine for their AI agents. Think "Supabase for provenance."

Why: If the Graph Kernel is genuinely useful (and the evaluation data says it is), most developers won't want to self-host. A managed service removes operational burden and creates a SaaS business model.

Architecture

┌─────────────────────────────────────────────────────────┐
│                   Graph Kernel Cloud                     │
│                                                          │
│  ┌──────────────┐  ┌──────────────┐  ┌───────────────┐ │
│  │  API Gateway  │  │  Auth (JWT)  │  │  Rate Limiter │ │
│  └──────┬───────┘  └──────┬───────┘  └──────┬────────┘ │
│         │                  │                  │          │
│  ┌──────▼──────────────────▼──────────────────▼────────┐│
│  │              GK Instance Pool                        ││
│  │                                                      ││
│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐          ││
│  │  │ Tenant A │  │ Tenant B │  │ Tenant C │  ...     ││
│  │  │ (SQLite) │  │ (SQLite) │  │ (SQLite) │          ││
│  │  └──────────┘  └──────────┘  └──────────┘          ││
│  └─────────────────────────────────────────────────────┘│
│                                                          │
│  ┌──────────────────────────────────────────────────────┐│
│  │  Shared Infrastructure: Metrics, Logging, Billing    ││
│  └──────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────┘

Key Design Decisions

1. Tenant isolation: Each tenant gets their own SQLite database file. No shared state between tenants. HMAC secrets are per-tenant.

2. Instance pooling: Graph Kernel instances are lightweight (20MB binary + SQLite). On Cloud Run, one container can serve multiple tenants with isolated databases.

3. API key authentication: Each tenant gets an API key. JWT tokens scope access to their knowledge graph and slices.

4. Billing model:
- Free tier: 10,000 triples, 1,000 queries/day
- Pro tier: 1M triples, 100K queries/day, federation
- Enterprise: Unlimited, SLA, custom policies

5. Data residency: Tenant data stays in their chosen region. SQLite files can be replicated to R2/S3 for durability.
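
Design decision 1 can be illustrated with a small sketch: one SQLite file per tenant plus a per-tenant HMAC secret. Everything here is an assumption for illustration, including the tenant ID format, the simplified `triples` schema, and the choice to sign only the slice ID (the real admissibility token likely covers more fields). The production service is Rust; Python is used only to keep the sketch short.

```python
import hashlib
import hmac
import re
import sqlite3
from pathlib import Path

# Assumed tenant ID format; rejecting everything else blocks path traversal.
TENANT_ID = re.compile(r"[a-z0-9_-]{1,64}")


def tenant_db_path(root: Path, tenant_id: str) -> Path:
    """Map a tenant ID to its own database file."""
    if not TENANT_ID.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return root / f"{tenant_id}.sqlite"


def open_tenant_db(root: Path, tenant_id: str) -> sqlite3.Connection:
    """Open (or create) the tenant's private database in WAL mode."""
    conn = sqlite3.connect(tenant_db_path(root, tenant_id))
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS triples "
        "(subject TEXT, predicate TEXT, object TEXT, confidence REAL)"
    )
    return conn


def sign_slice(tenant_secret: bytes, slice_id: str) -> str:
    """Sign a slice ID with the tenant's own HMAC secret."""
    return hmac.new(tenant_secret, slice_id.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def verify_slice(tenant_secret: bytes, slice_id: str, token: str) -> bool:
    """Constant-time check that the token was issued under this tenant's secret."""
    return hmac.compare_digest(sign_slice(tenant_secret, slice_id), token)
```

Because each connection points at a different file, a query bug in one tenant's request path cannot read another tenant's rows, and a token signed with tenant A's secret fails verification under tenant B's.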

#### Effort: 3–4 months
#### Dependencies: All of Evo 1 + most of Evo 2
#### Risk: HIGH — Significant product and infrastructure investment. Multi-tenancy, billing, authentication, compliance. This is a company, not a feature.

---

Evolution 3 Summary

| Transformation | Effort | Impact | Priority | Risk |
|---|---|---|---|---|
| Federated Graph Kernel | 3–4 weeks | Multi-agent knowledge sharing | P2 | HIGH |
| Universal Context Authority | 6–8 weeks | Framework-agnostic provenance | P1 | HIGH |
| Graph Kernel SDK | 4 weeks | 10× developer adoption | P1 | MEDIUM |
| Graph Kernel Protocol | 4–6 weeks | Open standard | P3 | MEDIUM |
| W3C PROV-DM Integration | 1 week | Standards compliance | P2 | LOW |
| Graph Kernel Cloud | 3–4 months | SaaS business model | P4 | HIGH |

---

Roadmap Timeline

2026 Q1 (Now)
├── Evo 1.1: SQLite Backend ████████████████ (2 weeks)
├── Evo 1.2: Multi-hop Traversal ██████ (3 days)
├── Evo 1.3: Entity Normalization ████ (2 days)
├── Evo 1.4: Query Caching ██ (1 day)
└── Evo 1.5: Pool Optimization █ (1 hour)

2026 Q2
├── Evo 2.2: Graph Visualization ████ (2 days)
├── Evo 2.3: Batch Ingest ████ (2 days)
├── Evo 2.1: WebSocket Subscriptions ██████ (3 days)
├── Evo 2.6: Structural Metrics ██████ (3 days)
├── Evo 2.4: Temporal Versioning ████████████ (1 week)
└── Evo 2.5: Community Detection ████████████ (1 week)

2026 Q3
├── Evo 3.3: Graph Kernel SDK ████████████████████████████ (4 weeks)
├── Evo 3.5: W3C PROV-DM ████████████ (1 week)
└── Evo 3.2: Universal Context API ████████████████████████████████████████ (6-8 weeks)

2026 Q4
├── Evo 3.1: Federation ████████████████████████████ (3-4 weeks)
├── Evo 3.4: Protocol Spec ████████████████████████████████ (4-6 weeks)
└── Evo 3.6: Cloud (begins) ████████████████████████████████████████████████ (ongoing)

---

Projected Impact

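| Phase | DEP Score | Latency | Triples | Features |
|---|---|---|---|---|
| Current | 7.4 | 291ms | 3,502 | 12 endpoints |
| After Evo 1 | 8.6 | <15ms | 3,502+ | 14 endpoints |
| After Evo 2 | 9.1 | <15ms | 10K+ | 20 endpoints |
| After Evo 3 | 9.5+ | <15ms | Unlimited | Full platform |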

---

Risk Matrix

| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| SQLite sync conflicts | MEDIUM | LOW | Last-writer-wins, conflict log |
| Entity normalization false positives | LOW | MEDIUM | Configurable alias map, manual override |
| WebSocket connection storms | LOW | MEDIUM | Connection limits, backpressure |
| Federation merge conflicts | HIGH | MEDIUM | Explicit conflict resolution UI |
| Multi-tenant security | MEDIUM | HIGH | Per-tenant SQLite, isolated HMAC keys |
| Protocol over-engineering | MEDIUM | LOW | Start informal, formalize only with adoption |
| SDK maintenance burden | HIGH | MEDIUM | Code generation from OpenAPI spec |
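
The "last-writer-wins, conflict log" mitigation for SQLite sync conflicts could look roughly like the sketch below. It assumes each triple version carries an `updated_at` timestamp and that ties favor the local copy; neither detail is specified in this roadmap, and the `TripleVersion` type is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TripleVersion:
    key: tuple          # (subject, predicate, object)
    confidence: float
    updated_at: float   # assumed: Unix timestamp of the last write


def merge_lww(local, remote, conflict_log):
    """Merge two replicas; on a clash the newer write wins and the loser is logged."""
    merged = {v.key: v for v in local}
    for incoming in remote:
        current = merged.get(incoming.key)
        if current is None:
            merged[incoming.key] = incoming      # new triple, no conflict
        elif current != incoming:
            # Strict '>' means the local copy is kept on a timestamp tie.
            if incoming.updated_at > current.updated_at:
                winner, loser = incoming, current
            else:
                winner, loser = current, incoming
            merged[incoming.key] = winner
            conflict_log.append(
                {"key": incoming.key, "kept": winner, "dropped": loser}
            )
    return list(merged.values())
```

Logging the dropped version is what keeps the impact LOW: the overwrite is automatic, but nothing is lost silently, so a bad resolution can be reviewed and reversed later.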

---

This document is a living roadmap. Priorities will shift based on user feedback, production data, and the evolving AI agent ecosystem.

Source Anchor

Comp-Core/docs/GRAPH-KERNEL-EVO3.md
