Deterministic Provenance Engines for Autonomous Agent Systems: Architecture, Implementation, and Evaluation of the Graph Kernel

Extracted abstract or opening context

# Deterministic Provenance Engines for Autonomous Agent Systems: Architecture, Implementation, and Evaluation of the Graph Kernel

> **Manuscript Type:** Full Research Paper (V2, Post-Audit Definitive Edition)
> **Track:** AI Systems & Knowledge Infrastructure
> **Date:** July 2026
> **Revision:** 2.0 (incorporates DEP Audit findings, Evo³ roadmap, and implemented improvements)

Autonomous AI agents making consequential decisions require infrastructure that ensures every reasoning step is traceable, reproducible, and verifiable. We present the **Graph Kernel**, a deterministic provenance engine implemented as a single Rust binary (~15 KLOC) that produces cryptographically signed, policy-governed context windows, termed *admissible evidence bundles*, for autonomous agent reasoning. Unlike general-purpose graph databases, vector stores, or RAG pipelines, the Graph Kernel introduces a formal category of infrastructure we call the **provenance engine**: a service whose output is not information retrieval but the construction of verifiable evidence with unforgeable authorization proofs.

We evaluate the Graph Kernel across 27 queries spanning five categories (factual recall, relationship mapping, multi-hop reasoning, fuzzy/semantic search, and predicate-specific queries) against three baselines (keyword, BM25, vector-similarity RAG). Results demonstrate perfect relevance (1.00) on multi-hop structural queries, returning causally connected knowledge chains rather than keyword-coincidence result sets, at sub-300 ms latency over a remote PostgreSQL backend. A comprehensive Deep Engineering Posture (DEP) audit scoring 7.4/10 identified 47 findings across 12 dimensions; we implemented 10 critical fixes including native Rust entity normalization, parameterized SQL queries, server-side multi-hop traversal, and connection pool optimization, raising the projected score to 8.4/10.
We further present RAG++, a complementary semantic retrieval engine (~26 KLOC Rust core with Python bindings), and describe the hybrid retrieval architecture that bridges structural graph reasoning with vector-similarity search. Comparative analysis against ten industry systems (Neo4j, Amazon Neptune, Apache Jena, Dgraph, TypeDB, Weaviate, LangChain, LlamaIndex, Microsoft GraphRAG, and Zep) establishes that no existing system provides the combination of HMAC-signed deterministic context windows, type-level admissibility enforcement, and policy-governed multi-hop provenance that the Graph Kernel offers. We conclude with a three-phase evolution roadmap spanning optimization, expansion, and transformation of the provenance engine into a universal context authority for heterogeneous agent ecosystems.
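
To make the idea of an HMAC-signed deterministic context window concrete, here is a minimal Python sketch. It is not the Graph Kernel's implementation (which the abstract describes as a Rust binary), and this excerpt does not show the real bundle format. The field names (`query`, `policy`, `evidence`), the demo key, and the helper functions `canonical_bytes`, `sign_bundle`, and `verify_bundle` are all hypothetical. The sketch assumes that determinism can be approximated by canonical JSON serialization (sorted keys, fixed separators), so that equal bundles always produce equal signatures.

```python
import hashlib
import hmac
import json

# Hypothetical demo key; a real deployment would load a secret from secure storage.
SECRET_KEY = b"demo-key-not-for-production"


def canonical_bytes(bundle: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so equal bundles yield equal bytes."""
    return json.dumps(bundle, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_bundle(bundle: dict, key: bytes = SECRET_KEY) -> str:
    """Return the hex HMAC-SHA256 tag over the canonical form of the bundle."""
    return hmac.new(key, canonical_bytes(bundle), hashlib.sha256).hexdigest()


def verify_bundle(bundle: dict, signature: str, key: bytes = SECRET_KEY) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign_bundle(bundle, key), signature)


# Hypothetical evidence bundle: a two-hop chain of (subject, predicate, object) facts.
bundle = {
    "query": "who approved the deploy?",
    "policy": "max_hops=2",
    "evidence": [
        {"subject": "alice", "predicate": "approved", "object": "deploy-42"},
        {"subject": "deploy-42", "predicate": "caused", "object": "incident-7"},
    ],
}
signature = sign_bundle(bundle)
```

Under these assumptions, a consumer holding the same key can call `verify_bundle(bundle, signature)` before passing the evidence to an agent. Any change to the evidence or the policy field invalidates the tag, while reordering dictionary keys does not.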

Promotion decision

What has to happen next

Convert into the standard paper schema, add citations, and render a draft PDF.

Why this is not always a full paper yet

Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.