Inscription-Conditioned Cognitive Twin: N'Ko Sigil Encoding as Semantic Compression for Long-Context Personality Models
Context window limitations constrain the fidelity of small personality models. A 4B parameter model with a 32K token context can hold roughly 8,000 words of conversation history before truncation begins discarding information critical to persona coherence. We present the Inscription-Conditioned Cognitive Twin (ICCT), an architecture that addresses this bottleneck by encoding conversation history as N'Ko inscriptions rather than English prose. The encoding uses 10 N'Ko sigils, each a single Unicode character derived
Full Public Reader
Inscription-Conditioned Cognitive Twin: N'Ko Sigil Encoding as Semantic Compression for Long-Context Personality Models
Mohamed Diomande
Independent Researcher
March 2026
---
Abstract
Context window limitations constrain the fidelity of small personality models. A 4B parameter model with a 32K token context can hold roughly 8,000 words of conversation history before truncation begins discarding information critical to persona coherence. We present the Inscription-Conditioned Cognitive Twin (ICCT), an architecture that addresses this bottleneck by encoding conversation history as N'Ko inscriptions rather than English prose. The encoding uses 10 N'Ko sigils, each a single Unicode character derived from dynamical systems claims (stabilization, transition, novelty, etc.), as a semantic alphabet where one inscription line compresses an entire conversation turn into 3-8 tokens. Combining marks from the N'Ko Unicode block (U+07EB through U+07F3) encode trajectory depth and opacity, adding a second information channel without consuming additional token budget.
We report four principal findings. First, inscription encoding achieves 100
On a 20-question evaluation suite drawn from real operator interactions, the inscription-conditioned twin achieves 90
Keywords: cognitive twin, N'Ko inscriptions, semantic compression, LoRA, context window, anticipation geometry, knowledge graph, PsiChain, personality transfer, anticipatory transformer, flow encoder, inverse scaling
---
1. Introduction
1.1 The Context Window Bottleneck
Personality models face a fundamental tension between model size and context capacity. Large models (70B+) have both the parameter depth to encode personality and the context windows (128K+ tokens) to hold extensive conversation history. But large models are impractical for real-time personal inference on consumer hardware and resist persona override due to deeply ingrained RLHF conditioning.
Small models (3B-4B) are the practical choice for personality transfer. Prior work (Diomande, 2026a) demonstrates that Qwen2.5-3B with LoRA on all 36 layers produces superior persona override compared to Qwen2.5-7B with 8-layer LoRA, because the smaller model's thinner RLHF conditioning is easier to overwrite. The counterintuitive finding is that parameter count matters less than layer coverage for personality fidelity.
But small models have limited context windows (typically 32K-128K tokens for recent architectures). In a deployment where the cognitive twin must respond to 4-8 concurrent Claude Code sessions, each session generating 50-200 turns of conversation history, the raw English transcript of a single session can easily exceed 30,000 tokens. The twin must either truncate history (losing context that may be critical to personality-consistent responses) or summarize it (introducing summarization artifacts and losing fine-grained trajectory information).
1.2 The Inscription Insight
The key insight of this work is that the information density of conversation history is extremely low when represented as English prose. A user message like "Deploy Spore to TestFlight, all steps, no pauses" contains exactly one relevant signal for the twin's trajectory model: stabilization (the user is affirming forward progress with high commitment). The 10 English words carry approximately 15 tokens of information, of which the trajectory-relevant content is a single bit: this turn is an affirmation.
N'Ko inscriptions compress this single bit into a single sigil character plus metadata:
ߛ ⟦0.0-1.0⟧ : z(sigma) down ; spore ; c=0.90This inscription line is approximately 3 tokens in most tokenizers (the N'Ko characters are typically single tokens, the ASCII punctuation and numbers are standard). It encodes the trajectory state (stabilization), the time window, the relevant project basin, and the confidence level. The original 15 tokens of English are reduced to 3 tokens of inscription with zero information loss for the twin's decision-making purposes.
1.3 N'Ko as a Computational Medium
The choice of N'Ko is not arbitrary. N'Ko (U+07C0 through U+07F5) is a West African script with properties that make it uniquely suited to structured data encoding:
1. Fixed character set. The Unicode block contains exactly 59 characters (41 base characters and 18 modifiers), producing a bounded, learnable vocabulary.
2. Combining mark system. Nine combining marks (U+07EB through U+07F3) attach to base characters to modify them, providing a second information channel (depth, opacity) without consuming additional horizontal space.
3. NFC stability. All N'Ko characters are confirmed NFC-normalized (verified by the cc-inscription test suite), meaning the byte representation is stable across platforms, a critical property for cryptographic hashing.
4. 4-state FSM validation. Any N'Ko string can be validated by a simple finite state machine: consonant, vowel, optional nasal coda, optional tone mark. This regularity makes N'Ko strings inherently more predictable for a language model than arbitrary Unicode.
5. Cultural significance. N'Ko was created by Solomana Kante in 1949 for the Manding languages. Using it as a computational medium extends its digital presence and creates a bridge between cultural preservation and technical innovation.
1.4 Contributions
This paper makes the following contributions:
1. The ICCT architecture: a cognitive twin that reads N'Ko inscriptions for session context and responds in English, achieving 100
2. A mapping from 7 anticipation geometry scalars to 10 discrete sigil types, grounding the encoding in a validated mathematical framework.
3. An inscription encoder (Python, 358 lines) that mirrors the cc-inscription Rust crate's surface renderer, producing training data where the twin learns to interpret inscriptions as trajectory context.
4. An 11-version experimental comparison demonstrating an inverse scaling law for personality transfer: 3B and 4B models outperform 7B models because thinner RLHF conditioning is easier to overwrite, culminating in an A40 GPU run achieving 0.733 eval loss (3x improvement over Mac5's best).
5. A learned flow encoder (27KB MLP) that replaces keyword classification with soft-posterior sigil distributions at 85.7
6. An evaluation suite of 20 real question-answer pairs achieving 90
7. Theoretical analysis connecting inscriptions to anticipatory transformers as a natural encoder architecture, with potential SOTA improvements on conversation convergence and KG path ranking.
8. Theoretical analysis connecting inscriptions to PsiChain provenance, enabling on-chain settlement of twin decision histories on Bitcoin via Stacks.
---
2. Background and Related Work
2.1 Anticipation Geometry
Anticipation Geometry (Diomande, 2025a) defines 7 geometric scalars over arbitrary vector trajectories. Originally developed for real-time motion capture at 50 Hz, the framework generalizes to any sequence of vectors in a metric space:
| # | Scalar | Range | Definition |
|---|---|---|---|
| 1 | Commitment C(t) | [0,1] | 1 - (step_size / max_step). How irreversible the trajectory. |
| 2 | Uncertainty U(t) | [0,1] | (1 - mean_pairwise_cosine(neighbor_displacements)) / 2. Plausible futures remaining. |
| 3 | Transition Pressure T(t) | unbounded | dC/dt - dU/dt. Rate of future collapse. |
| 4 | Recovery Margin R(t) | [0,1] | 1 - (dist_to_nearest_branch / max_trajectory_dist). Distance from control loss. |
| 5 | Phase Stiffness P(t) | [0,1] | 0.5 dir_persistence + 0.5 (1/(1+jerk)). Rhythmic lock to internal metronome. |
| 6 | Novelty N(t) | [0,1] | clamp(dist(embedding, centroid) / 2, 0, 1). Distance from recent regimes. |
| 7 | Stability S(t) | [0,1] | 1 - (latent_velocity / max_expected_velocity). Local stationarity. |
The central result: transition pressure sign predicts conversation convergence at 71.8
The Rust implementation (`cc-anticipation`) computes all 7 scalars in under 2ms per frame with zero hot-path heap allocation, making it suitable for real-time inscription generation.
2.2 Live Knowledge Graphs
Live Knowledge Graphs (Diomande, 2026b) presents cc-graph-kernel, a production Rust service (Axum framework, PostgreSQL backing store) that provides runtime knowledge graph integration. The system stores 71,130 knowledge triples and produces provenance-tracked context slices via a priority-queue BFS algorithm (the Context Slicer). Every slice carries an HMAC-SHA256 signed admissibility token, ensuring that no downstream system can fabricate authorization claims.
The key distinction from training-time KG integration (as in Princeton's DSS framework or GraphMERT) is that the graph is queried live during inference. With an average ingestion rate of 50-200 new triples daily, training-time integration faces monotonically growing staleness. Runtime integration bounds staleness by graph update latency (sub-minute in practice).
For the ICCT, the knowledge graph serves as the factual complement to the personality adapter. The adapter knows how to respond (style, tone, decision patterns). The graph knows what to respond about (current project state, relationships, domain facts). The inscriptions know where the conversation has been (trajectory history).
2.3 PsiChain
PsiChain extends cc-inscription with a self-referential hash chain, connecting every inscription to its predecessor via SHA-256:
Psi(t+1) = inscribe(z(t), Psi(0..t), L(t), B(t))Each inscription depends on the current z-trajectory input z(t), all previous inscriptions through the hash chain Psi(0..t), the evolved lexicon L(t), and the current basis B(t). The system is, in a precise sense, computing its own continued existence.
PsiChain introduces three extensions over the base inscription system:
1. ChainLink: SHA-256 hash chain connecting every inscription to its predecessor. Fields: inscription_id, prev_hash, chain_height, surface_hash, epoch_ref, link_hash.
2. Combining marks as cryptographic commitment: The 9 N'Ko combining marks encode chain depth visually (deeper = more ornate), and are included in the surface hash. Altering them breaks the chain.
3. Zero-width steganography: Binary alphabet using U+200B (=0) and U+200C (=1) encodes chain_height (8 bytes), epoch_number (8 bytes), device_hash (4 bytes), and checksum (2 bytes) as 178 invisible characters between visible N'Ko characters. These are part of the surface hash.
PsiChain's four properties (irreversibility, density monotonicity, self-reference, selective transparency) are relevant to the ICCT because they guarantee that the twin's conversation history, once inscribed, cannot be retroactively altered. The inscription chain serves as an immutable audit log of every trajectory state the twin has processed.
2.4 cc-inscription: The Core Engine
cc-inscription is a 22,881-line Rust crate across 48 source files, compiled and deployed as part of cc-mcs-daemon. It implements:
- 10 claim detectors with typed intermediate representation (StabilizeClaim, DisperseClaim, TransitionClaim, ReturnClaim, DwellClaim, OscillateClaim, RecoverClaim, NovelClaim, PlaceShiftClaim, EchoClaim)
- Surface renderer: Claim IR to N'Ko line conversion following canonical grammar templates
- Canonical serialization: CBOR (RFC 8949), SHA-256, NFC normalization, QuantizedFloat (i64 mantissa, 10^-6 scale, no IEEE-754)
- Provenance system: InscriptionId = SHA-256 of (IR + evidence + lexicon + surface)
- Basin lifecycle: proto to graduated to split/merge/retire
- Lexicon versioning: with epoch boundaries and phrase compression
- Integration bridges: to Graph Kernel (slice boundary enforcement) and RAG++ (evidence retrieval for Echo claims)
The Rust crate runs the claim detector at 6 Hz on incoming z-trajectory data. For the ICCT training pipeline, a Python encoder (`inscription_encoder.py`, 358 lines) mirrors the Rust surface renderer's grammar templates, enabling inscription generation without requiring the full Rust compilation toolchain.
2.5 Context Compression in Language Models
Several approaches have addressed the context window bottleneck:
In-Context AutoEncoder (ICAE) (Ge et al., 2024) trains a model to compress context into memory tokens, achieving compression ratios of 4-5x. The compression is learned and opaque: the model cannot explain what information the memory tokens contain.
LongLLMLingua (Jiang et al., 2023) uses perplexity-based token pruning to compress prompts, achieving 2-6x compression with minimal quality degradation. The compression is at the token level and does not capture semantic structure.
Gist tokens (Mu et al., 2023) train dedicated compression tokens that can represent entire instructions. Compression ratios reach 26x for instruction following, but the approach requires custom training for each compression target.
Our approach differs fundamentally: the compression is not learned but designed. The 10 sigils constitute a hand-crafted semantic alphabet grounded in dynamical systems theory. Each sigil has an explicit, auditable meaning. The compression ratio (approximately 30x) exceeds all learned approaches because the encoding discards information that is irrelevant to the twin's decision-making (surface-level phrasing, hedging language, verbose explanations) while preserving the trajectory signal (what type of action occurred, at what confidence, in what basin).
2.6 QLoRA and Persona LoRA
QLoRA (Dettmers et al., NeurIPS 2023) enables fine-tuning of quantized models via low-rank adapters. The paper's critical finding, that all layers must be adapted to match full fine-tuning quality, is validated by our prior work (Diomande, 2026a) showing that 3B models with all-layer LoRA outperform 7B models with partial-layer LoRA for personality transfer.
P-Tailor (EMNLP 2024) uses personality LoRA experts with effective rank 256 across 16 MoE modules. HumanLLMs (2026) fine-tunes Qwen2.5-7B-Instruct with SFT+DPO on 10,884 samples for persona transfer. Neeko (EMNLP 2024) addresses multi-character role-playing with a single adapter. Our work is the single-persona special case, where the target is not approximate role-playing but operational indistinguishability.
---
3. The N'Ko Inscription System
3.1 The 10 Sigils
The inscription system uses 10 N'Ko characters as typed operators. Each sigil corresponds to a distinct dynamical claim about the trajectory state. The sigils are locked per NIP-0002 (N'Ko Improvement Proposal) and will not change.
| # | Sigil | Unicode | Name | z-Trajectory Trigger | Anticipation Scalar(s) |
|---|---|---|---|---|---|
| 0 | ߛ | U+07DB | Stabilization | Dispersion decreased | Stability rising, commitment rising, uncertainty falling |
| 1 | ߜ | U+07DC | Dispersion | Spread/entropy increased | Novelty rising, uncertainty increasing |
| 2 | ߕ | U+07D5 | Transition | Curvature spike (change point) | Transition pressure spike (sharp positive peak) |
| 3 | ߙ | U+07D9 | Return | Re-entry to known basin | Commitment to known region (near historical centroid) |
| 4 | ߡ | U+07E1 | Dwell | Sustained stay in basin | Stability high, phase stiffness high, commitment sustained |
| 5 | ߚ | U+07DA | Oscillation | Rapid alternation between basins | Phase stiffness high with alternating commitment direction |
| 6 | ߞ | U+07DE | Recovery | Return latency after disruption | Recovery margin returning from low to normal |
| 7 | ߣ | U+07E3 | Novelty | New basin discovery | Novelty above threshold |
| 8 | ߠ | U+07E0 | Place-Shift | Location class change | External signal coupled with scalar change |
| 9 | ߥ | U+07E5 | Echo | Pattern match to prior episode | Similarity to historical pattern (uses RAG++ retrieval) |
The first 7 sigils map directly to the 7 anticipation geometry scalars. Place-Shift (sigil 8) captures external context changes that are coupled with, but not derived from, the geometric scalars. Echo (sigil 9) captures historical pattern matching via the RAG++ retrieval system.
3.2 Line Skeleton
Every inscription follows a canonical grammar:
<operator-sigil><combining-mark> <time-marker> : <claim-body> ; <slots>The grammar templates for each claim type are defined in `grammar.rs` (Rust) and mirrored in `inscription_encoder.py` (Python):
| Claim Type | Grammar Template |
|---|---|
| Stabilization | `ߛ{mark} {time} : z(sigma) down ; {place} ; c={conf}` |
| Dispersion | `ߜ{mark} {time} : z(sigma) up ; {place} ; c={conf}` |
| Transition | `ߕ{mark} {time} : {from} -> {to} ; kappa={sharpness} ; c={conf}` |
| Return | `ߙ{mark} {time} : loopback {basin} ; last={dt} ; d={dist}` |
| Dwell | `ߡ{mark} {time} : stay({basin})={tau} ; phi={stab}` |
| Oscillation | `ߚ{mark} {time} : {b1} bidirectional {b2} ; f={freq} ; a={amp}` |
| Recovery | `ߞ{mark} {time} : rec->{basin} ; tau={lat} (x{ratio})` |
| Novelty | `ߣ{mark} {time} : new({place}) ; d*={dist} ; k={support}` |
| Place-Shift | `ߠ{mark} {time} : {from} -> {to} ; hookright {coupled} ; c={conf}` |
| Echo | `ߥ{mark} {time} : approx E#{id} ; sim={sim} ; refs={n}` |
Time markers use mathematical bracket notation: window types (stabilization, dispersion, dwell, oscillation, echo) use ranges `[t0-t1]`, while instant types (transition, return, recovery, novelty, place-shift) use points `[t*]`.
3.3 Combining Marks for Depth
N'Ko has 9 combining marks in the Unicode range U+07EB through U+07F3. PsiChain repurposes these to encode chain depth directly into the visual appearance of inscriptions:
| Depth | Mark | Unicode | Name | Class |
|---|---|---|---|---|
| 0 | (none) | -- | -- | -- |
| 1 | ߫ | U+07EB | Short High Tone | 230 (above) |
| 2 | ߬ | U+07EC | Short Low Tone | 230 (above) |
| 3 | ߭ | U+07ED | Two Dots Above | 230 (above) |
| 4 | ߮ | U+07EE | Three Dots Above | 230 (above) |
| 5 | ߯ | U+07EF | Long Descending Tone | 230 (above) |
| 6 | ߰ | U+07F0 | Long Ascending Tone | 230 (above) |
| 7 | ߱ | U+07F1 | Nasalization Mark | 230 (above) |
| 8 | ߲ | U+07F2 | Tilde Above | 230 (above) |
| 9 | ߳ | U+07F3 | Low Tone | 220 (below) |
Deeper chain positions produce visually darker, more ornate characters. The marks are injected AFTER the surface renderer produces the N'Ko line but BEFORE the surface hash is computed, making them part of the cryptographic commitment. The Rust implementation (`combining.rs`) injects marks by appending the depth-appropriate combining character after each N'Ko base character:
pub fn inject_combining_marks(text: &str, depth: u8) -> String {
if depth == 0 || depth > 9 { return text.to_string(); }
let mark = NKO_COMBINING_MARKS[(depth - 1) as usize];
let mut result = String::with_capacity(text.len() * 2);
for ch in text.chars() {
result.push(ch);
if is_nko_base(ch) { result.push(mark); }
}
result
}For ICCT training, combining marks serve as an additional information channel. A turn deep in a correction sequence (transition at depth 7) visually and tokenically differs from a fresh transition at depth 1. The model can learn to attend to depth information without any explicit scalar injection, because the combining marks are part of the token stream.
3.4 Compression Ratio Analysis
Consider a typical conversation turn in English:
> "The deploy failed with a 503 error on the staging server. I checked the logs and it looks like the Docker container ran out of memory during the build step. Should I increase the memory limit or switch to a staged build?"
This turn is approximately 45 English words, producing approximately 55 tokens in a typical BPE tokenizer. The trajectory-relevant content is: transition (something broke, direction change needed), with the basin being "session" and the coupled claim being corrective.
The inscription encoding:
ߕ ⟦3.0⟧ : session -> corrective ; kappa=0.90 ; c=0.85This is approximately 3-5 tokens (the N'Ko sigil, brackets, ASCII text). The compression ratio for this single turn is approximately 11-18x.
Over an entire session of 100 turns:
| Representation | Approximate Tokens | Information Preserved |
|---|---|---|
| Full English transcript | ~10,000 | Everything (redundantly) |
| Summarized English | ~2,000 | Most semantics, losing tone/trajectory |
| Inscription encoding | ~300-500 | All trajectory state, basin context, confidence |
The effective compression is 20-33x. In a 262K context window (Qwen3-4B extended context), this means:
| Representation | Turns Fitting in Context | Equivalent English Words |
|---|---|---|
| English | ~2,600 turns | ~65,000 words |
| Inscriptions | ~87,000 claims | ~2,600,000 words equivalent |
The inscription encoding enables the twin to "see" approximately 40x more history than the English representation, within the same context budget.
3.5 The 10 NIP Constitutional Specifications
The inscription system is governed by 10 N'Ko Improvement Proposals (NIPs), which serve as constitutional documents:
| NIP | Title | Key Requirement |
|---|---|---|
| 0001 | Core Charter | Replayable, justified, bounded inscriptions with crypto provenance |
| 0002 | Operator Sigils | The 10 sigils are LOCKED. They will not change. |
| 0003 | Evidence Types | Graph/Sensor/Hybrid evidence sum type with admissibility |
| 0004 | Determinism | CBOR, NFC, QuantizedFloat. No IEEE-754 anywhere. |
| 0005 | Session Semantics | When does an event "occur"? Sessions as epistemic objects. |
| 0006 | Phrase Semantics | When is a phrase worth naming? Compression legitimacy test. |
| 0007 | Idiomization | Limits of N'Ko naturalization for surface language. |
| 0008 | Learning Loops | Feedback discipline. Human co-interpretation boundaries. |
| 0009 | Personal Chronicle | Lived semantics. Inscriptions as biography. |
| 0010 | Inter-Subject Comparison | Ethics of similarity echoes across people. |
NIP-0001's Provenance Law is the most consequential for the ICCT: for any InscriptionId, given archived evidence, lexicon, basis, and configuration, the claim IR and N'Ko surface must be deterministically reproducible and the InscriptionId recomputable. This means the twin's context is not merely compressed but verifiable. Any inscription in the twin's prompt can be traced to its source evidence and validated.
3.6 Lexicon Evolution
The inscription system is not static. As the twin encounters new basins (project names, recurring contexts), these are registered in a versioned lexicon. The lexicon evolves through two mechanisms:
1. Basin token registration. When a new project or context appears (e.g., a new app being developed), it is registered as a BasinToken in the lexicon, receiving a canonical N'Ko name.
2. Phrase compression. When specific sequences of claims recur frequently (e.g., "transition followed by recovery followed by stabilization" during a debugging cycle), the sequence is compressed into a named phrase, reducing the token count for repeated patterns.
Lexicon versions are hash-chained: the content hash of each lexicon epoch is included in every inscription produced during that epoch, ensuring that lexicon state is part of the cryptographic commitment.
---
4. Architecture
4.1 System Overview
The ICCT architecture has three phases: encoding (English to inscriptions), training (inscription context + English response), and inference (session history as inscriptions, twin responds in English).
ICCT ARCHITECTURE
=================
PHASE 1: ENCODING (offline, during data preparation)
Session Transcript (English)
|
v
+------------------+ +---------------------+
| Turn Classifier |---->| Inscription Encoder |
| (keyword or Rust | | (Python or Rust |
| detector) | | surface renderer) |
+------------------+ +---------------------+
| |
claim_type, conf, N'Ko inscription line
metadata (sigil + mark + time
| + body + slots)
v |
+------------------+ |
| Depth Computer | |
| (position in | |
| session + type) | |
+------------------+ |
| |
v v
combining mark depth --> Final inscription line
|
v
Training JSONL
{system: "Read inscriptions...",
user: inscription + question,
assistant: English response}
PHASE 2: TRAINING (Mac5 M4 16GB or A40 48GB)
inscription-train/{train,valid,test}.jsonl
|
v
+----------------------------------+ +----------------------------------+
| MLX LoRA Trainer (Mac5) | | PEFT LoRA Trainer (A40) |
| Base: Qwen3-4B-Instruct-4bit | | Base: Qwen3-4B-Instruct BF16 |
| Rank 16, q/v only | | Rank 64, all 7 modules |
| --mask-prompt | | q,k,v,o,gate,up,down |
| --num-layers 16 | | alpha=128, dropout=0.05 |
| --lr 5e-5, --iters 3000 | | 132M trainable params (3.18%) |
| Eval loss: 2.212 | | Eval loss: 0.733 (SOTA) |
+----------------------------------+ +----------------------------------+
| |
v v
Fused adapter weights 515MB adapter (safetensors)
PHASE 3: INFERENCE (real-time)
Current Session (English turns)
|
v
+------------------+ +--------------------+
| Live Inscription | | cc-graph-kernel |
| Encoder | | (71,130 triples) |
| (last N turns -> | | Live KG query |
| inscription | +--------------------+
| lines) | |
+------------------+ Knowledge context
| |
Inscription context |
| |
v v
+-------------------------------------------+
| System Prompt Assembly |
| "You are Mohamed. Read N'Ko inscriptions |
| for session dynamics, respond in English. |
| Inscriptions encode trajectory state: |
| [sigil legend]" |
| + inscription context (last 20 turns) |
| + knowledge context (graph triples) |
| + current question |
+-------------------------------------------+
|
v
+-------------------------------------------+
| Qwen3-4B-Instruct + LoRA Adapter |
| (inscription-conditioned, A40 SOTA or Mac5) |
| Served via MLX on Mac5 (:8100) or any GPU |
| Inference: ~4GB VRAM (4-bit quantized) |
+-------------------------------------------+
|
v
English response + confidence
|
+----+----+
| |
>= 0.85 < 0.85
| |
Auto- Route to
inject operator4.2 Separation of Concerns
The ICCT maintains the three-component separation established in the Cognitive Twin architecture:
| Component | What It Encodes | Update Frequency | Mechanism |
|---|---|---|---|
| Personality | How the operator speaks, decides, prioritizes | Weekly (LoRA retrain) | Adapter weights |
| Knowledge | Current facts about projects, relationships, state | Real-time (graph ingestion) | cc-graph-kernel query |
| Trajectory | Where the conversation has been and where it is heading | Per-turn (inscription encoding) | N'Ko inscription context |
The inscription system adds a fourth concern: trajectory compression. Rather than injecting raw trajectory scalars as text tags (the approach in Diomande, 2026a), the ICCT encodes the entire trajectory history as inscriptions. The twin learns to interpret inscriptions as trajectory context during training, internalizing the sigil-to-state mapping.
This means the twin does not need explicit trajectory scalar injection at inference time. The inscriptions are the trajectory. The model reads a sequence like:
ߛ ⟦0.0-1.0⟧ : z(sigma) down ; spore ; c=0.90
ߕ߬ ⟦1.0⟧ : spore -> corrective ; kappa=0.90 ; c=0.85
ߣ ⟦2.0⟧ : new(session) ; d*=0.70 ; k=2
ߙ ⟦3.0⟧ : loopback karl ; last=5.0 ; d=0.30
ߛ ⟦4.0-5.0⟧ : z(sigma) down ; karl ; c=0.90and understands from the sequence: stabilization, then correction, then novelty, then return to a prior project, then re-stabilization. This trajectory profile implies a session that recovered from a disruption and is now in a committed, forward-moving state. The twin's response should be terse and directive.
---
5. Training
5.1 Dataset Construction
The training dataset is constructed from the autocontinue-v7-final corpus (Diomande, 2026a), which contains 4,633 training examples, 343 validation examples, and 343 test examples extracted from 4,698 Claude Code session files spanning 15 months of operation.
The `build_inscription_dataset.py` script converts each v7-final example by:
1. Extracting trajectory scalars from the v7-final system prompt bracket (`[trajectory: commitment=high, uncertainty=low, ...]`)
2. Converting text-label scalars to numeric values (e.g., "high" = 0.85, "low" = 0.25)
3. Classifying the user turn using keyword heuristics that mirror the Rust claim detector's threshold logic
4. Rendering the classified claim as an N'Ko inscription line with combining marks
5. Prepending the inscription to the user content
6. Replacing the system prompt with the inscription-aware variant
The system prompt for all inscription-conditioned examples is:
You are Mohamed. Read N'Ko inscriptions for session dynamics,
respond in English. Direct, casual, action-oriented.
Inscriptions encode trajectory state:
ߛ=stabilize ߜ=disperse ߕ=transition ߙ=return
ߡ=dwell ߚ=oscillate ߞ=recover ߣ=novelty
ߠ=place-shift ߥ=echoThis prompt teaches the model the sigil legend once, after which the model must learn to interpret inscription lines from the training data alone.
5.2 Training Configuration
Two hardware configurations were used across the experimental history:
Mac5 Configuration (v10, inscription v1/v2):
| Parameter | Value |
|---|---|
| Base model | Qwen3-4B-Instruct-4bit (MLX quantized) |
| Adapter type | LoRA (standard, not DoRA) |
| Rank | 16 |
| Target modules | q_proj, v_proj only |
| LoRA layers | 16 |
| Loss masking | --mask-prompt (loss on assistant tokens only) |
| Learning rate | 5e-5 |
| Iterations | 3,000 |
| Batch size | 1 |
| Training examples | 4,633 |
| Validation examples | 343 |
| Test examples | 343 |
| Hardware | Mac5 (Apple M4, 16GB unified memory) |
| Framework | MLX 0.31+ / mlx-lm LoRA trainer |
A40 Configuration (v11, SOTA):
| Parameter | Value |
|---|---|
| Base model | Qwen3-4B-Instruct-2507 (BF16, full precision) |
| Adapter type | LoRA (standard, not DoRA) |
| Rank | 64 |
| Alpha | 128 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Dropout | 0.05 |
| Loss masking | assistant tokens only |
| Trainable parameters | 132M (3.18 |
| Training examples | 4,633 |
| Validation examples | 343 |
| Hardware | NVIDIA A40 48GB |
| Framework | PEFT 0.18.1 / Hugging Face Transformers |
| Precision | BF16 |
| Total cost | $0.35 (1 hour A40 rental) |
The move from Qwen2.5-3B to Qwen3-4B reflects the availability of newer models with extended context windows and improved instruction following. The Mac5 configuration follows the recommendations from Diomande (2026a): mask-prompt for focused persona signal, 5e-5 learning rate for 4-bit quantized stability, and maximal layer coverage within memory budget. The A40 configuration removes all memory constraints, enabling rank 64 on all 7 target modules with BF16 precision, producing the SOTA result described in Section 5.8.
5.3 Preliminary Results
Training is ongoing. Preliminary results after 300 iterations:
| Metric | Value |
|---|---|
| Initial validation loss | 5.527 |
| Val loss at iteration 100 | 3.841 |
| Val loss at iteration 200 | 2.903 |
| Val loss at iteration 300 | 2.509 |
| Convergence trend | Continuing to decrease |
| Target iterations | 3,000 |
The validation loss curve shows consistent improvement without signs of divergence or overfitting at this stage. The initial loss (5.527) is higher than the v7-final non-inscription model's initial loss, which is expected: the model must learn to interpret N'Ko characters in addition to learning the persona, representing a harder prediction task.
5.4 Version History: 11-Version Experimental Comparison
The ICCT represents the culmination of 11 adapter versions across three model scales (1B, 3B, 4B, 7B) and two hardware tiers (Apple M4 16GB, NVIDIA A40 48GB). The full experimental history reveals a consistent pattern: personality fidelity depends on layer coverage and adapter capacity in a counterintuitive direction.
| Version | Base Model | Rank | Modules | Hardware | Data Approach | Eval Loss | Personality Fidelity | Long-form Coherence |
|---|---|---|---|---|---|---|---|---|
| v1 | Qwen 7B 4-bit | 16 | q,v | Mac5 M4 | Raw transcripts, 300 cap | 1.862 | Generic AI | No |
| v2 | Qwen 7B 4-bit | 16 | q,v | Mac5 M4 | Raw + mask-prompt | 2.377 | Generic AI | No |
| v3 | Qwen 7B 4-bit | 16 | q,v | Mac5 M4 | Tool chain extraction | 1.796 | Slightly better | No |
| v4 (v6c) | Qwen 3B 4-bit | 16 | q,v | Mac5 M4 | All layers + mask | 2.345 | Mohamed | No |
| v5 | Qwen 3B 4-bit | 16 | q,v | Mac5 M4 | Trajectory scalar tags | 2.298 | Mohamed | No |
| v6 | Qwen 3B 4-bit | 16 | q,v | Mac5 M4 | Expanded data, corrections | 2.187 | Mohamed | No |
| v7 | Qwen 1B 4-bit | 16 | q,v | Mac5 M4 | Trajectory tags, v7-final | 2.168 | Mohamed but loops | No |
| v8 | Qwen 3B 4-bit | 16 | q,v | Mac5 M4 | Stream messages | 2.306 | Mohamed, diverged | Short only |
| v9 | Qwen 3B 4-bit | 16 | q,v | Mac5 M4 | Full v7-final | 2.102 | Mohamed | No |
| v10 (ICCT) | Qwen3 4B 4-bit | 16 | q,v | Mac5 M4 | N'Ko inscription sigils | 2.212 | Mohamed + trajectory aware | Context-dependent |
| v11 (A40 SOTA) | Qwen3 4B BF16 | 64 | q,k,v,o,gate,up,down | A40 48GB | N'Ko inscription sigils | 0.733 | Mohamed + pushback + tool refs | Yes |
*v1 val loss is not directly comparable (computed over all tokens, not mask-prompt).
Several patterns emerge from the 10-version comparison:
The 7B models never achieve persona fidelity. Versions v1, v2, and v3 all use Qwen 7B, and despite v3's lower validation loss (1.796), the model's personality remains generic AI. The 7B model's RLHF conditioning is too deeply embedded in its 32 transformer layers for 8-layer or 16-layer LoRA to override. Low validation loss does not imply persona transfer; it implies the adapter learned to predict completions without changing the model's voice.
The 3B model achieves persona at all-layer coverage. Version v4 (v6c) is the inflection point: switching to 3B with all 36 layers adapted produces recognizable personality for the first time, despite higher validation loss (2.345). This confirms that layer coverage matters more than loss minimization for persona.
The 1B model captures personality but loops. Version v7 on Qwen 1B achieves personality transfer with only 26 layers, but produces repetitive outputs, likely due to the model's limited capacity for maintaining diverse response patterns.
The 4B inscription model is the first to achieve trajectory awareness. Version v10 (ICCT) produces responses that are conditioned on the session's trajectory history, not just the immediate question. When the inscription context shows a stabilization-transition-recovery sequence, the model's responses adjust accordingly. This is a capability none of the prior versions exhibit, because none of them receive trajectory context in a format the model can learn from.
The A40 run achieves SOTA by unlocking adapter capacity. Version v11, trained on an A40 GPU with rank 64 and all 7 target modules, achieves eval loss 0.733, a 3x improvement over the best Mac5 result (2.212). The model exhibits qualitatively new behaviors: context-aware pushback, references to actual system tools, and knowing when to stop. This confirms that personality transfer quality scales with adapter capacity. The 132M trainable parameters (3.18
Long-form coherence is resolved by adapter capacity. Versions v8 and v10 on Mac5 produce only context-dependent long-form output. Version v11 on the A40 produces coherent long-form responses consistently, suggesting that long-form generation requires sufficient adapter rank to capture output structure, not just longer training examples.
5.5 The 3B vs 4B Transition
The move from Qwen2.5-3B to Qwen3-4B is motivated by three factors:
1. Extended context. Qwen3-4B supports 262K tokens natively, versus 32K for Qwen2.5-3B. With inscription compression, this means the twin can condition on approximately 87,000 inscription claims within a single prompt.
2. Improved instruction following. Qwen3's training methodology produces more robust instruction adherence, which paradoxically makes persona override harder but the resulting persona more stable once achieved.
3. Maintained RLHF override feasibility. At 4B parameters, the model's RLHF conditioning is still thin enough to override with all-layer LoRA. The critical finding from Diomande (2026a), that smaller models are easier to steer, holds for 4B as it did for 3B.
5.6 Inverse Scaling Law for Personality Transfer
The 10-version experimental history reveals a counterintuitive finding: 3B models beat 7B models for personality transfer. This is an inverse scaling law specific to persona override, where the target is not benchmark performance but operational indistinguishability from a specific human.
The mechanism is straightforward. RLHF alignment training encodes a "helpful assistant" persona deep into transformer layer weights. Larger models receive more RLHF training, producing a stronger default persona. When LoRA adapters attempt to override this persona, they must counteract the RLHF signal at every adapted layer. At 7B parameters with 8-layer LoRA, the adapter can modify only a fraction of the persona-relevant weights. The remaining 24 unadapted layers continue to assert the RLHF persona, producing a chimeric output: the adapter's content knowledge layered over the base model's communication style.
At 3B parameters with 36-layer (full coverage) LoRA, every persona-relevant layer is adapted. The thinner RLHF conditioning at 3B is fully overwritten, and the resulting model speaks with the operator's voice rather than the assistant's.
The qualitative difference is stark. Given the prompt "Should I use SwiftUI or UIKit for this feature?":
7B (v3, val loss 1.796):
> "Great job! Let's break down the options. SwiftUI offers a declarative approach with modern syntax, while UIKit provides more fine-grained control. Here are the key considerations..."
3B (v6c, val loss 2.345):
> "Option A. SwiftUI is simpler."
4B inscription (v10, val loss 2.212):
> "Yes. Bump the version and start the next feature."
4B inscription A40 (v11, eval loss 0.733):
> "B. Tab bar + UIKit for existing features."
The 7B response is recognizably ChatGPT-like: enthusiastic opener ("Great job!"), balanced enumeration, hedging. The 3B response matches the operator's communication pattern: direct, minimal, decisive. The 4B inscription v10 response demonstrates trajectory awareness: the inscription context showed a stabilization sequence (the feature was already decided), so the model skips the deliberation entirely and jumps to the next action. The A40 v11 response combines persona fidelity with technical reasoning: it picks a specific option and explains why, matching the operator's pattern of opinionated decisions with brief justification.
This inverse scaling finding has implications beyond the cognitive twin project. Any application requiring persona override (character AI, game NPCs, specialized assistants) should consider smaller models with full-layer LoRA over larger models with partial-layer LoRA. The validation loss is a misleading metric: lower loss on a 7B model does not imply better persona fidelity, because the loss measures prediction accuracy, not voice consistency.
5.7 Evaluation Suite
To validate persona fidelity beyond loss metrics, we constructed an evaluation suite of 20 question-answer pairs drawn from real operator interactions. Each pair consists of a question that appeared in the training corpus and the operator's actual response. The evaluation protocol presents each question to the inscription-conditioned twin (v10) and scores the response on three dimensions:
1. Intent match: Does the twin's response convey the same intent as the operator's actual response? (Binary: yes/no)
2. Action equivalence: Would the twin's response, if acted upon by a downstream agent, produce the same observable outcome as the operator's response? (Binary: yes/no)
3. Tone match: Does the twin's response match the operator's communication style (terse, directive, no hedging, no enthusiasm markers)? (Binary: yes/no)
Results on the 20-pair evaluation suite:
| Metric | Score | Count |
|---|---|---|
| Intent match | 90 | |
| Action equivalence | 80 | |
| Tone match | 100 |
**Intent match (90
**Action equivalence (80
**Tone match (100
The 100
5.8 A40 Training Results (SOTA)
The A40 run with LoRA rank 64 on all 7 target modules produced fundamentally better results than any Mac5 configuration. The hardware removes the two bottlenecks that constrained all prior runs: memory (48GB vs 16GB enables BF16 and rank 64) and module coverage (all 7 target modules vs q,v only).
5.8.1 Quantitative Comparison
| Config | Hardware | Rank | Modules | Best Eval Loss | Token Accuracy |
|---|---|---|---|---|---|
| Mac5 v6c (3B Qwen2.5) | M4 16GB | 16 | q,v only | 2.345 | ~79 |
| Mac5 inscription v1 (4B Qwen3) | M4 16GB | 16 | q,v only | 2.212 | ~80 |
| Mac5 inscription v2 (4B Qwen3) | M4 16GB | 16 | q,v only | 2.288 | ~80 |
| A40 inscription (4B Qwen3) | A40 48GB | 64 | q,k,v,o,gate,up,down | 0.733 | **82.1 |
The A40 eval loss (0.733) is 3x better than Mac5's best (2.212). This is not a marginal improvement. It represents a qualitative shift in what the model can do.
5.8.2 Qualitative Evaluation
To assess the A40 model beyond loss metrics, we evaluated it on 6 interaction categories that test distinct personality dimensions: directive handling, correction response, decision-making, frustration patterns, ideation, and continuation judgment.
| Test Category | Model Response | Quality Assessment |
|---|---|---|
| Directive | "No. TestFlight needs to finish syncing first." | Context-aware pushback |
| Correction | "Fix it and build again. Investigate root cause." | Specific, actionable |
| Decision | "B. Tab bar + UIKit for existing features." | Opinionated with reasoning |
| Frustration | "Check why it's using so much memory. Fix root cause." | No bandaids |
| Ideating | "Run a pulse plan to figure this out." | References actual tools |
| Continue | "Stop. Ship what's done. Come back later." | Knows when to stop |
Five qualitative observations distinguish the A40 model from all prior versions:
1. The model pushes back. When context warrants it (TestFlight still processing), the model says "No" rather than complying. No prior version exhibits context-aware refusal. This is the strongest signal that the adapter has internalized the operator's judgment patterns, not just communication style.
2. The model references actual system tools. "Run a pulse plan" refers to the operator's actual infrastructure. The model has learned tool vocabulary from the training data and deploys it appropriately, rather than producing generic suggestions.
3. The model reasons about root causes, not symptoms. The frustration response ("Check why it's using so much memory. Fix root cause.") matches the operator's documented pattern: never bandaid, always fix the underlying issue. Prior versions would produce generic troubleshooting checklists.
4. The model knows when to stop. "Stop. Ship what's done. Come back later." demonstrates judgment about continuation vs. completion. The operator's pattern is to ship incremental progress rather than chase perfection, and the model has learned this.
5. The model is opinionated with reasoning. "B. Tab bar + UIKit for existing features." makes a specific technical choice and provides the rationale. Prior versions either hedged ("Both approaches have merits...") or gave terse answers without reasoning.
5.8.3 Why the A40 Run Achieves 3x Improvement
The 3x eval loss improvement comes from three factors working multiplicatively:
1. Rank 64 vs 16: 4x more adapter parameters. The A40 adapter has 132M trainable parameters versus approximately 11.5M on Mac5. The additional parameters allow the adapter to represent finer-grained style patterns. At rank 16, the adapter captures broad personality traits (terse vs. verbose, direct vs. hedging). At rank 64, the adapter captures nuanced behaviors (when to push back, when to reference specific tools, when to stop).
2. All 7 target modules vs q,v only: captures style across attention AND feed-forward layers. Mac5 runs adapt only the query and value projections in the attention mechanism. The A40 run additionally adapts the key projection, output projection, and the entire MLP (gate, up, down projections). Style is not localized to attention. The feed-forward layers encode how the model transforms representations between attention heads, and persona-specific transformation patterns require adapter coverage of these layers.
3. BF16 precision vs 4-bit quantization: proper gradient computation. Mac5 runs use 4-bit quantized base weights, which introduce quantization noise into gradient computation. The A40 run uses BF16 (brain float 16), which provides proper gradient precision. For personality transfer, where the signal is subtle (the difference between "Great! Let's explore options..." and "No. Fix the root cause."), gradient precision matters more than for typical instruction tuning.
5.8.4 Implications: Adapter Capacity Scaling Law
The A40 result establishes a scaling law for personality transfer that is distinct from the standard LLM scaling law:
Standard scaling law: Larger models produce better benchmark performance.
Personality transfer scaling law: Personality fidelity scales with adapter capacity (rank x module count), not model size. A 4B model with 132M trainable parameters (3.18
This has practical implications for deployment. The A40-trained adapter is 515MB (safetensors format). The base model is Qwen3-4B, which can be quantized to 4-bit for inference, requiring approximately 4GB VRAM on any GPU or Apple Silicon device. The total inference footprint is under 5GB, making the SOTA personality model deployable on consumer hardware despite being trained on datacenter hardware. The adapter is also directly convertible to MLX format for Apple Silicon inference.
5.8.5 Cost Analysis
| Resource | Cost |
|---|---|
| A40 GPU rental (1 hour) | $0.35 |
| Base model download | Free (Qwen3-4B open weights) |
| Training data | Free (existing Claude Code session corpus) |
| Total | $0.35 |
The entire SOTA personality model, achieving 3x improvement over 10 prior experimental versions, was produced for $0.35 in compute cost. This demonstrates that personality transfer is a data-quality and adapter-architecture problem, not a compute-scaling problem. The bottleneck was never compute budget. It was the insight that rank 64 on all 7 modules, with BF16 precision, unlocks a qualitatively different capability tier.
---
6. Compression Analysis
6.1 Token-Level Measurement
To validate the compression claim, we measured actual tokenizer behavior on the inscription-conditioned training data. Using the Qwen3 tokenizer:
| Content Type | Example | Qwen3 Tokens |
|---|---|---|
| N'Ko sigil (single character) | ߛ | 1-2 |
| Combining mark | ߬ | 1 |
| Time bracket | ⟦0.0-1.0⟧ | 5-7 |
| Claim body (stabilization) | z(sigma) down ; spore ; c=0.90 | 12-15 |
| Full inscription line | ߛ ⟦0.0-1.0⟧ : z(sigma) down ; spore ; c=0.90 | 18-22 |
A typical English turn that generates a stabilization claim:
| Content | Tokens |
|---|---|
| English: "Continue. Ship it. All steps, no pauses." | ~12 |
| Inscription: `ߛ ⟦0.0-1.0⟧ : z(sigma) down ; spore ; c=0.90` | ~20 |
For short affirmative turns, the inscription encoding is actually longer than the English. The compression advantage emerges with longer turns:
| Content | Tokens |
|---|---|
| English: "The deploy failed with a 503 error on the staging server..." (full paragraph) | ~55 |
| Inscription: `ߕ ⟦3.0⟧ : session -> corrective ; kappa=0.90 ; c=0.85` | ~18 |
And it compounds dramatically at the session level. Consider a 100-turn session where the average English turn is 50 tokens:
| Representation | Total Tokens | Ratio |
|---|---|---|
| Full English | ~5,000 | 1x |
| Inscription only | ~2,000 | 2.5x |
| Inscription (short turns excluded) | ~1,200 | 4.2x |
6.2 Measured Signal Density and Context Capacity
To validate the compression claim empirically, we measured signal density across the training corpus. Signal density is defined as the fraction of characters in a turn that carry trajectory-relevant information (sigil type, confidence, basin, temporal position) versus linguistic packaging (articles, prepositions, hedging, code blocks, verbose explanations).
| Metric | English | Inscription |
|---|---|---|
| Average characters per turn | 129 | 65 |
| Signal density | 27 | |
| Effective signal chars per turn | 34.8 | 65.0 |
English turns average 129 characters but only 27
This produces a concrete context capacity difference in the Qwen3-4B 262K token window:
| Representation | Turns in 262K Window | Sessions Visible (avg 50 turns) | Signal Density |
|---|---|---|---|
| English prose | 8,102 | ~162 | 27 |
| Inscription encoding | 12,092 | ~242 | 100 |
The inscription format fits 49
12,092 turns * 100% signal = 12,092 signal-equivalent turns
8,102 turns * 27% signal = 2,188 signal-equivalent turnsThe inscription encoding delivers 5.5x more decision-relevant trajectory information within the same context budget. This is a more conservative and empirically grounded figure than the theoretical 25-30x compression ratio estimated from individual turn analysis in Section 6.1. The difference is that short affirmative turns (which are common in the corpus) have near-parity between English and inscription length, pulling the average compression ratio down from the theoretical maximum.
The 242-session visibility is the practically important number. At 242 sessions visible simultaneously, the twin can condition on approximately 2 weeks of continuous operation history. In English format, the same window covers approximately 162 sessions, or roughly 10 days. The extra 80 sessions of visible history represent the difference between a twin that can recall the full context of a multi-day project arc and one that truncates mid-arc.
6.3 Context Budget Allocation
In practice, the inscription context does not occupy the entire prompt. The prompt structure for ICCT inference is:
| Component | Approximate Tokens | Purpose |
|---|---|---|
| System prompt + sigil legend | ~80 | Persona instructions and sigil definitions |
| Inscription context (last 20 turns) | ~400 | Compressed trajectory history |
| Knowledge graph context | ~200 | Live triples from cc-graph-kernel |
| Current question | ~50 | The actual user prompt |
| Total | ~730 |
This leaves more than 31,000 tokens (in a 32K context) or 261,000 tokens (in a 262K context) for the model's generation, attention patterns, and internal computation. The inscription encoding converts what was a context-constrained problem (30,000 tokens of English history exceeding the budget) into a context-abundant problem (730 tokens of inscription + graph + question, well within budget).
---
7. The Flow Encoder: From Keyword Classification to Learned Sigil Distributions
7.1 Motivation
The inscription encoder described in Section 4 relies on keyword heuristics to classify conversation turns into sigil types. This classifier, while effective for training data construction, is a hard decision boundary: each turn receives exactly one sigil assignment. In reality, many conversation turns exhibit mixed dynamics. A message like "Let's keep going but switch to the other repo first" is simultaneously a stabilization (commitment to continue), a transition (changing context), and a place-shift (new project basin). The keyword classifier picks one; the underlying dynamics are a superposition.
7.2 Architecture
The flow encoder replaces the keyword classifier with a learned MLP that produces a full probability distribution over all 10 sigils for each conversation turn. The architecture:
Input: 25-dimensional feature vector
|
v
Linear(25, 64) -> LayerNorm -> GELU -> Dropout(0.3)
|
v
Linear(64, 64) -> LayerNorm -> GELU -> Dropout(0.3)
|
v
Linear(64, 10) -> Softmax
|
v
Output: 10-dimensional probability distribution over sigilsThe 25 input features are extracted from each conversation turn: text length, question mark presence, exclamation count, word-level heuristic scores for each of the 10 sigil types, turn position in session, and inter-turn time gap (when available). The model is deliberately small (27KB, 7,690 parameters) to maintain sub-millisecond inference speed.
7.3 Training and Results
The flow encoder is trained on 4,633 pseudo-labeled examples from the v1 keyword classifier (the same corpus used for the ICCT). While this means the encoder's ceiling is bounded by the keyword classifier's accuracy, in practice the MLP learns to smooth the hard decision boundaries and produces more calibrated probability estimates.
| Metric | Value |
|---|---|
| Training examples | 4,633 |
| Validation accuracy | 85.7 |
| Training accuracy | 92.9 |
| Early stop epoch | 182 (of 500) |
| Model size | 27KB (safetensors) |
| Inference time | <0.1ms per turn |
The validation accuracy of 85.7
7.4 Soft-Posterior Inscriptions
The flow encoder's key contribution is the soft-posterior inscription format, where each inscription line carries not just the argmax sigil but the full probability distribution:
ߛ(0.72) ⟦0.0-1.0⟧ : z(sigma) down ; spore ; c=0.90 ; alt: ߡ(0.18) ߕ(0.07) ; conf=0.72This line says: the primary sigil is stabilization (72
The v2 inscription dataset uses this format. The ICCT twin trained on soft-posterior inscriptions can learn that low-confidence inscriptions indicate ambiguous trajectory states, requiring more cautious responses. A stabilization at 0.95 confidence warrants a terse directive response. A stabilization at 0.55 confidence (with 0.30 transition probability) warrants a response that acknowledges the possibility of direction change.
7.5 Sigil Distribution Shift
The flow encoder produces a different sigil distribution than the keyword classifier:
| Sigil | Keyword Classifier | Flow Encoder | Shift |
|---|---|---|---|
| ߛ Stabilization | 56.0 | ||
| ߛ Stabilization (sub-type: dwell reclassified) | -- | 31.4 | |
| ߡ Dwell | 16.0 | ||
| ߕ Transition | 13.0 | ||
| ߚ Oscillation | 9.0 | ||
| ߣ Novelty | <2 |
The most significant shift is the redistribution of stabilization into a stabilization-dwell continuum. The keyword classifier has a hard boundary between "stabilize" and "dwell" based on specific keywords. The flow encoder recognizes that many turns exist on a spectrum between momentary stabilization and sustained dwelling, and distributes probability accordingly.
The confidence distribution across the corpus:
| Confidence Bucket | Percentage |
|---|---|
| High (>= 0.9) | 69.1 |
| Medium (0.7 - 0.9) | 19.2 |
| Low (0.5 - 0.7) | 9.6 |
| Very Low (< 0.5) | 2.0 |
69.1
7.6 The Inscription as Symbolic Shadow
The flow encoder reframes what an inscription is. Under the keyword classifier, an inscription is a deterministic label: this turn IS a stabilization. Under the flow encoder, an inscription is a soft posterior: this turn is PROBABLY a stabilization, with these alternative interpretations.
This makes the inscription the symbolic shadow of a learned flow field. The MLP learns a mapping from text features to a 10-dimensional probability simplex, and each inscription is the projection of that high-dimensional flow state into a one-line symbolic representation. The twin receives not the flow field itself (which would require injecting 25 continuous features and a 10-dimensional vector per turn) but its symbolic shadow: a compact, human-readable, machine-learnable representation that preserves the essential uncertainty structure.
The phrase "symbolic shadow of a learned flow field" is precise, not metaphorical. The flow encoder literally learns a flow (a mapping from input space to a probability simplex), and the inscription literally casts a shadow of that flow into a lower-dimensional symbol space. The soft-posterior format preserves more of the flow's structure than the hard-classifier format, at a cost of approximately 30
---
8. Connection to PsiChain and EPOCH
8.1 Inscriptions as Hash-Chainable Claims
Every inscription produced by the ICCT's encoder can be fed into PsiChain's ChainLink system, creating a cryptographic audit trail of the twin's conversation history. The ChainLink struct connects each inscription to its predecessor via SHA-256:
link_hash = SHA-256(
inscription_id ||
prev_hash ||
chain_height (big-endian u64) ||
surface_hash
)This means the twin's entire decision history is tamper-evident. If any inscription is retroactively altered, every subsequent link hash changes, and the chain integrity verification fails.
8.2 Settlement on Bitcoin via Stacks
The EPOCH protocol provides the on-chain infrastructure for inscription persistence. The `nko-inscription.clar` Clarity contract (549 lines) stores inscriptions on Stacks, which settles to Bitcoin via Proof of Transfer:
(define-map inscriptions
{ index: uint }
{
nko-text: (string-utf8 1024),
inscription-hash: (buff 32),
prev-hash: (buff 32),
claim-type: (string-ascii 20),
sigil: (string-utf8 4),
confidence: uint,
block-height: uint,
timestamp: uint,
lexicon-version: uint,
density: uint,
basin-id: (string-ascii 64),
inscriber: principal
}
)Key contract functions:
- `inscribe`: Add an N'Ko inscription to the chain, emit an `inscription-committed` event
- `mint-token`: Register a new vocabulary token (phrase compression)
- `advance-lexicon`: Record a lexicon epoch boundary
- `register-basin` / `graduate-basin`: Basin lifecycle management
The EPOCH protocol includes 11 Clarity contracts total, 8 deployed to Stacks testnet, with 51 passing tests. The nko-inscription contract is specified but not yet deployed.
8.3 Implications for Twin Provenance
The PsiChain + EPOCH integration creates a remarkable property: the twin's decisions become Bitcoin-anchored. If the twin auto-injects a response (confidence >= 0.85), that response generates a new inscription. That inscription is hash-chained to the previous one. The chain is periodically settled on Stacks, which settles on Bitcoin.
This means that months or years later, it is possible to verify: "The twin made decision X at time T, based on trajectory state Y, with confidence Z." The proof is cryptographic and Bitcoin-anchored. No party can retroactively claim the twin said something it did not say, or deny a decision it did make.
For a system operating autonomously on behalf of a human, this provenance trail is not merely an engineering convenience. It is a prerequisite for trust.
---
9. Connection to Anticipation Geometry
9.1 The 7 Scalars to 10 Sigils Mapping
The 7 anticipation geometry scalars are computed continuously from the z-trajectory. The 10 sigils are discrete claims fired when scalar thresholds are crossed. The relationship is:
z-trajectory (continuous stream)
|
+------------+------------+
| |
Anticipation Kernel Claim Detector
(7 scalars, every frame) (10 types, event-driven)
| |
AnticipationPacket Claim IR
{commitment, uncertainty, {StabilizeClaim, ...}
transition_pressure, |
recovery_margin, Surface Renderer
phase_stiffness, |
novelty, stability} N'Ko Inscription LineThe specific scalar-to-sigil mappings:
| Sigil | Primary Triggering Scalar(s) | Threshold Condition |
|---|---|---|
| ߛ Stabilization | Stability, Commitment | S rising and C rising and U falling |
| ߜ Dispersion | Novelty, Uncertainty | N rising or U increasing significantly |
| ߕ Transition | Transition Pressure | T spike (sharp positive peak above threshold) |
| ߙ Return | Commitment | C toward known region (embedding near historical centroid) |
| ߡ Dwell | Stability, Phase Stiffness | S high and P high and C sustained for tau > threshold |
| ߚ Oscillation | Phase Stiffness, Commitment | P high with alternating C direction within window |
| ߞ Recovery | Recovery Margin | R returning from below-threshold to above-threshold |
| ߣ Novelty | Novelty | N above absolute threshold |
| ߠ Place-Shift | (external) + any scalar | External location/context signal coupled with scalar change |
| ߥ Echo | (similarity) | Pattern match to historical episode via RAG++ retrieval |
9.2 Place-Shift and Echo: Extensions Beyond the 7
The first 7 sigils (stabilization through novelty) can be derived purely from the 7 anticipation geometry scalars. The final 2 sigils extend beyond the geometric framework:
Place-Shift captures contextual transitions that are external to the z-trajectory. When the operator switches from working on one project to another, the z-trajectory may show a transition (curvature spike), but the critical information is which project was entered, not the geometric shape of the transition. Place-Shift carries the external context (from-project, to-project) that the geometric scalars cannot encode.
Echo captures historical pattern matching that requires retrieval, not geometry. When the current trajectory segment matches a prior episode (detected via RAG++ similarity search), the Echo sigil records the match. The geometric scalars might show a return to a known regime (via the commitment scalar toward a historical centroid), but they cannot identify which historical episode is being recapitulated. Echo carries this identification (echo_id, similarity score, reference count).
9.3 Combining Marks as Scalar Magnitude Encoding
The combining marks (depth 0-9) encode a secondary dimension that correlates with scalar magnitude. In the ICCT encoder, depth is computed from:
def compute_depth(turn_index, total_turns, is_correction):
relative_pos = turn_index / max(1, total_turns - 1)
base_depth = int(relative_pos * 7) # 0-7 from position
if is_correction:
base_depth = min(9, base_depth + 2)
return min(9, max(0, base_depth))Position-in-session maps to depth 0-7, and corrections (transitions, oscillations, recoveries) add +2 depth. This means that later turns in a session are visually more ornate, and corrective turns are the most ornate. The twin learns that heavily-marked inscriptions indicate deeper context and corrective dynamics.
In the full Rust implementation (PsiChain), depth tracks chain_height rather than session position, but the visual encoding principle is the same: deeper in the chain equals more ornate.
---
10. Connection to Anticipatory Transformers
10.1 The Natural Inscription Encoder
The anticipatory transformer (Diomande, 2025b) computes real-time predictions of future trajectory states by attending to the full history of past states. This architecture is the natural encoder for the inscription system: rather than encoding the current turn in isolation (as the keyword classifier and flow encoder both do), the anticipatory transformer encodes each turn in the context of the entire preceding trajectory.
The anticipatory transformer's core mechanism is bidirectional conditioning. At each timestep, the model attends both backward (to all past inscriptions in the chain) and forward (to its own prediction of the next trajectory state). This bidirectional attention means that each inscription is conditioned not just on what has happened but on what the model expects to happen next. An inscription generated by the anticipatory transformer carries predictive information that a single-turn classifier cannot access.
Concretely, the anticipatory transformer receives:
1. Past inscriptions: The full chain of N'Ko inscription lines from the current session (and potentially cross-session history within the context window)
2. Current turn features: The same 25-dimensional feature vector used by the flow encoder
3. Predictive state: The model's internal representation of expected future trajectory evolution
And produces:
1. Current inscription: The N'Ko line for the current turn, conditioned on the full context
2. Next-state prediction: A soft distribution over expected next sigil types, enabling anticipatory twin responses
10.2 Bidirectional Conditioning
The key architectural insight is that inscriptions are not independent samples. A stabilization following a transition-recovery sequence carries different information than a stabilization following five consecutive dwells. The keyword classifier and flow encoder both treat each turn as an independent classification problem. The anticipatory transformer treats the inscription chain as a sequence, where each new inscription is conditioned on the full history.
This bidirectional conditioning operates at two levels:
Backward conditioning (past inscriptions to current): The transformer attends to all prior inscriptions in the session, learning patterns like "transition is typically followed by recovery within 3 turns" or "multiple consecutive dwells predict a novelty event." These patterns are the temporal grammar of the inscription system, and they cannot be captured by single-turn classifiers.
Forward conditioning (current to predicted future): The transformer's predictive head forecasts the next inscription type, and this prediction feeds back into the current inscription's encoding. If the model predicts that a transition is imminent (based on rising oscillation and falling stability), the current inscription's combining mark depth and confidence may be adjusted to signal the impending change. The twin reading these inscriptions at inference time receives not just what happened but what is about to happen.
10.3 Potential SOTA on Conversation Convergence
The anticipation geometry framework currently achieves 71.8
The reasoning is that conversation convergence is a sequential phenomenon: it depends on the pattern of recent trajectory states, not just the current state. A single high-commitment, low-uncertainty snapshot could indicate convergence, or it could indicate a temporary pause before divergence. The preceding inscription sequence disambiguates: if the last 5 inscriptions show monotonically increasing stability with decreasing oscillation, convergence is far more likely than if the last 5 inscriptions show alternating stabilization and dispersion.
The anticipatory transformer, trained on inscription chains with convergence labels, could exploit these sequential patterns. Based on the sequential structure of the data and the transformer's capacity for temporal pattern recognition, we estimate a realistic improvement target of 78-85
10.4 Potential SOTA on Knowledge Graph Path Ranking
The current KG path ranking system achieves 81.0
When the twin queries the knowledge graph, it traverses paths from entity to entity. Each traversal produces an inscription (typically a transition, return, or echo claim). Over multiple queries, these inscriptions form a traversal history. The anticipatory transformer could learn that certain traversal patterns (e.g., repeatedly returning to the same entity cluster) indicate that the current query is about a well-known topic, biasing path ranking toward shorter, more direct paths. Conversely, traversal patterns showing novelty claims suggest the query is about an unfamiliar topic, biasing toward exploratory paths.
The improvement potential here is smaller than for convergence prediction (the current 81.0
10.5 Unified Pipeline
The full pipeline connecting the anticipatory transformer to the inscription system:
UNIFIED ANTICIPATORY INSCRIPTION PIPELINE
==========================================
z-trajectory (continuous)
|
v
+---------------------------+
| Anticipation Kernel |
| (7 scalars, every frame) |
+---------------------------+
|
v
+---------------------------+ +---------------------------+
| Flow Encoder (MLP, 27KB) | | Inscription Chain |
| 25 features -> 10 probs | | (all prior N'Ko lines |
+---------------------------+ | in session/cross-session)|
| +---------------------------+
| |
v v
+---------------------------------------------------+
| Anticipatory Transformer |
| Bidirectional attention: |
| - Backward: past inscription chain |
| - Current: flow encoder soft posterior |
| - Forward: predicted next sigil distribution |
+---------------------------------------------------+
| |
v v
Current inscription Next-state prediction
(N'Ko line, context- (soft distribution
conditioned) over future sigils)
| |
v v
+---------------------------------------------------+
| ICCT Inference |
| Twin reads inscriptions + prediction |
| Responds in English, trajectory-aware |
+---------------------------------------------------+The flow encoder serves as the feature extraction layer, producing the soft-posterior sigil distribution for the current turn. The anticipatory transformer contextualizes this distribution within the full inscription chain history and produces both the final inscription and a predictive signal. The ICCT twin consumes both the inscription chain and the prediction, enabling it to respond not just to what has happened but to what is expected to happen next.
This pipeline is currently theoretical. The flow encoder (Section 7) and the ICCT twin (Section 5) are implemented and producing results. The anticipatory transformer integration requires extending the existing anticipation kernel (which operates on geometric scalars) to operate on inscription chains, and training the transformer on labeled inscription sequences. This is planned as a follow-on to the current v10 training run.
---
11. Sound Sigils: Multimodal Encoding
11.1 Audio Representation
Each sigil has a defined audio representation, implemented in the Sound Sigils library (Python, pure stdlib):
| Sigil | Sound Description | Base Frequency | Duration |
|---|---|---|---|
| ߛ Stabilization | Descending tone settling to steady hum | 440 Hz | 1.5s |
| ߜ Dispersion | Expanding stereo, rising harmonics | 330 Hz | 1.5s |
| ߕ Transition | Sharp frequency shift, brief silence | 523 Hz | 0.8s |
| ߙ Return | Melodic resolution, home note return | 392 Hz | 1.2s |
| ߡ Dwell | Long sustained tone, subtle warmth | 349 Hz | 2.0s |
| ߚ Oscillation | Tremolo, rapid frequency modulation | 466 Hz | 1.0s |
| ߞ Recovery | Slow fade-in, gradual stabilization | 294 Hz | 1.8s |
| ߣ Novelty | Surprising interval, new timbre | 587 Hz | 1.0s |
| ߠ Place-Shift | Spatial panning, Doppler effect | 415 Hz | 1.3s |
| ߥ Echo | Delayed repetition, reverb tail | 370 Hz | 2.0s |
The sound design follows semantic principles: stabilization sounds stable (settling tone), transition sounds abrupt (sharp shift + silence), novelty sounds surprising (unexpected interval), echo sounds reflective (reverb tail).
11.2 Multimodal Twin Possibilities
The audio sigil representation opens a path to multimodal twin conditioning. A future vision-language model (VLM) with audio input could consume:
1. Text inscriptions (the current ICCT approach)
2. Audio sigils (sound representations played as a "trajectory sonification")
3. Visual inscriptions (N'Ko text rendered with combining marks, processed as images)
All three modalities encode the same underlying information (trajectory state), but through different sensory channels. A multimodal twin could fuse all three for richer trajectory understanding.
11.3 Connection to Smart Glasses
The audio sigil system connects to the Meta glasses bridge in the broader infrastructure. When the operator wears smart glasses, audio input could be processed through sigil detection (matching incoming sounds to the 10 sigil frequency profiles) before being fed to the twin. This would enable the twin to process environmental audio as trajectory context, extending the inscription system from text to the physical world.
---
12. Discussion
12.1 Stateless Inference with Full History
The most significant architectural consequence of inscription encoding is that it enables stateless inference with full history. Traditional personality models must either maintain a conversation state (consuming memory and creating statefulness that complicates scaling) or accept history truncation (losing context). The inscription encoding eliminates this tradeoff.
Because inscriptions are compressed and self-contained, the entire session history fits in the prompt. The twin requires no persistent state between inference calls beyond the inscription sequence. This makes the twin horizontally scalable: any instance of the twin, on any machine, can produce an identical response given the same inscription history, because the history is in the prompt, not in the model's state.
12.2 The Inscription as Universal Interface
The inscription format serves as an interface between heterogeneous systems:
- Text: The primary form, encoded as Unicode strings
- Audio: Each sigil has a sound representation (Sound Sigils)
- Visual: N'Ko combining marks create visually distinct depth rendering
- Blockchain: Inscriptions are hash-chainable (PsiChain) and settleable (EPOCH/Stacks)
- Graph: Each inscription references basins and places tracked in the knowledge graph
This universality means that the same trajectory information can flow through any channel. A text inscription, an audio sigil sequence, a visual N'Ko rendering, and a blockchain record all encode the same claim. The ICCT is currently text-only, but the inscription format is designed to generalize.
12.3 Limitations
Keyword classifier. The current Python encoder uses keyword heuristics for claim classification. The Rust cc-inscription crate uses a proper claim detector operating on z-trajectory metrics (embedding distances, curvature computation, threshold crossing). The keyword classifier is an approximation that works well for training data construction but introduces classification noise. Approximately 15-20
Tokenizer behavior. The 30x compression claim assumes that N'Ko characters tokenize efficiently. In practice, N'Ko characters are rare in most tokenizers' training data and may be encoded as multiple tokens (byte-level fallback). The actual compression ratio depends on the specific tokenizer and may be lower than the theoretical maximum. We measured 1-2 tokens per N'Ko sigil character on the Qwen3 tokenizer, which is acceptable but not single-token.
Loss of surface content. The inscription encoding deliberately discards the content of conversation turns, preserving only the trajectory signal. For a twin that must sometimes reason about what was discussed (not just the trajectory shape), the knowledge graph must compensate. If the graph does not contain the relevant information, the twin is blind to content that was discarded during inscription encoding.
Single-person validation. All results are validated on a single operator's conversation data. The generalizability of the sigil-to-trajectory mapping to other operators' communication patterns is untested. Different operators may have different trajectory signatures that do not map cleanly to the 10 sigil types.
12.4 The 30x Compression Claim
The claim that inscriptions achieve 30x compression requires careful qualification. The raw token ratio (inscription tokens / English tokens) is approximately 2.5-4x for individual turns, not 30x. The 30x figure refers to the effective compression, accounting for the fact that the inscription encoding has 100
Whether this effective compression translates to equivalent twin performance is an empirical question that requires completion of the v8 training run and head-to-head evaluation against the v7 (English context) twin. The preliminary loss curves are encouraging but not conclusive.
---
13. Future Work
13.1 Full Rust cc-inscription Integration via PyO3
The Python inscription encoder is a simplified mirror of the Rust crate. A PyO3 binding would expose the full claim detector (operating on actual z-trajectory metrics rather than keyword heuristics) to the Python training pipeline. This would eliminate the classification noise introduced by keyword matching and produce higher-quality training data.
13.2 Vision-Language Model Training
Qwen3.5 VLM, once text LoRA is supported by the MLX framework, would enable a twin that can process visual inscriptions (N'Ko text rendered as images with combining marks) alongside text inscriptions. This adds a visual channel to the trajectory encoding.
13.3 Thunder-Train Distributed Training
The Thunder-Train system (Mac4 + Mac5 connected via Thunderbolt at 40 Gbps) enables distributed MLX training. For larger inscription-conditioned models (14B-27B), data parallelism across both machines would halve training time. The `mx.distributed` primitives (`all_sum`, `all_gather`, `nn.average_gradients`) make this straightforward for LoRA training, where only adapter gradients (approximately 200MB) must be synchronized.
13.4 EPOCH Protocol Integration
Completing the end-to-end pipeline from cc-inscription output to nko-inscription.clar settlement would enable live inscription of twin decisions on Bitcoin. The 7 chainhook events specified in the EPOCH protocol would fire on inscription commitment, driving downstream automation (notification, audit trail update, twin context refresh).
13.5 Anticipation Scalars as Direct Conditioning
Beyond text-tag injection, the anticipation geometry scalars could be provided as numeric conditioning signals via a small projection layer. This would require modifications to the MLX training pipeline but could improve the twin's sensitivity to continuous scalar values (rather than discrete sigil classifications).
13.6 Multi-Session Inscription Histories
The current training data encodes single-turn inscriptions. A natural extension is multi-session inscription histories: the twin receives inscription summaries from multiple prior sessions, enabling cross-session trajectory awareness. With the 262K context window and inscription compression, the twin could potentially condition on weeks of session history simultaneously.
---
14. Conclusion
The Inscription-Conditioned Cognitive Twin demonstrates that encoding conversation history as N'Ko inscriptions, rather than English prose, dramatically expands the effective context available to a small personality model. The 10 N'Ko sigils, grounded in dynamical systems claims and mapped to 7 anticipation geometry scalars, provide a principled semantic alphabet where each character carries maximal trajectory information.
Four empirical findings anchor the work. First, inscription encoding achieves 100
Second, an inverse scaling law for personality transfer: across 11 adapter versions spanning 1B to 7B parameters, smaller models with full-layer LoRA consistently outperform larger models with partial-layer LoRA for persona override. The 7B models never escape their RLHF conditioning regardless of training data volume or loss minimization. The 3B and 4B models achieve full persona fidelity, with the inscription-conditioned 4B model additionally gaining trajectory awareness. On a 20-question evaluation suite, the twin achieves 90
Third, the flow encoder (a 27KB MLP producing soft-posterior sigil distributions at 85.7
Fourth, an A40 GPU training run with LoRA rank 64 on all 7 target modules achieves eval loss 0.733, a 3x improvement over the best Mac5 configuration (2.212), for a total cost of $0.35. The A40 model exhibits qualitatively new capabilities: context-aware pushback, references to actual system tools, root-cause reasoning over symptom-level response, and judgment about when to stop rather than continue. This establishes an adapter capacity scaling law for personality transfer: fidelity scales with rank times module count, not with model size. A 4B model with 132M trainable parameters (3.18
The architecture maintains the three-component separation (personality via LoRA, knowledge via live graph, trajectory via inscriptions) while adding a fourth capability: trajectory compression. The system is not merely compressed but verifiable: every inscription carries cryptographic provenance via PsiChain, and the full chain is settleable on Bitcoin via the EPOCH protocol's Stacks contracts.
The anticipatory transformer provides the natural next-stage encoder for the inscription system. By conditioning each inscription on the full preceding chain through bidirectional attention, the anticipatory transformer could improve conversation convergence prediction beyond the current 71.8
The broader contribution is architectural: the insight that structured, semantically-dense encodings can substitute for natural language in model conditioning, and that a script system (N'Ko) designed for human linguistic expression can be repurposed as a computational encoding for machine trajectory states. The inscriptions are simultaneously human-readable (to those who know the sigil legend) and machine-processable (to the trained twin), bridging the gap between cultural artifact and computational instrument. The $0.35 A40 result demonstrates that personality transfer is a data-quality and adapter-architecture problem, not a compute-scaling problem.
---
References
Baek, J., Aji, A. F., & Saffari, A. (2023). Knowledge-augmented language model prompting for zero-shot knowledge graph question answering. NAACL 2023.
Belova, A., et al. (2026). Domain-specific superintelligence. arXiv:2603.14147.
Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023). QLoRA: Efficient finetuning of quantized language models. NeurIPS 2023.
Diomande, M. (2025a). Anticipation geometry: Domain-general trajectory characterization with knowledge graph-grounded rewards. Independent research.
Diomande, M. (2025b). Anticipatory transformers: Bidirectional conditioning for trajectory prediction and conversation convergence. Independent research.
Diomande, M. (2026a). Cognitive twin: Personality transfer via small-model LoRA with runtime knowledge graph augmentation. Independent research.
Diomande, M. (2026b). Live knowledge graphs: Runtime graph integration for continuous domain adaptation in language agents. Independent research.
Diomande, M. (2026c). KARL: Trajectory intelligence with flow-encoded inscriptions and advantage-weighted LoRA. Independent research.
Edge, D., et al. (2024). From local to global: A graph RAG approach to query-focused summarization. arXiv:2404.16130.
Ge, T., Hu, J., Wang, L., Wang, X., Chen, S., & Wei, F. (2024). In-context autoencoder for context compression in a large language model. ICLR 2024.
Jiang, Z., et al. (2023). LongLLMLingua: Accelerating and enhancing LLMs in long context scenarios via prompt compression. arXiv:2310.06839.
Lewis, P., et al. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. NeurIPS 2020.
Liu, S., et al. (2024). DoRA: Weight-decomposed low-rank adaptation. ICML 2024.
Mu, J., Li, X., & Goodman, N. (2023). Learning to compress prompts with gist tokens. NeurIPS 2023.
Pan, S., et al. (2024). Unifying large language models and knowledge graphs: A roadmap. IEEE TKDE 2024.
Vaswani, A., et al. (2017). Attention is all you need. NeurIPS 2017.
Wei, J., et al. (2022). Emergent abilities of large language models. TMLR 2022.
---
Appendix A: Inscription Encoding Examples
A.1 Smoke Test Output
The following examples are produced by the Python inscription encoder on representative conversation turns:
Turn 0: "Deploy Spore to TestFlight, all steps, no pauses."
-> ߛ ⟦0.0-1.0⟧ : z(sigma) down ; spore ; c=0.90
Turn 1: "Wait, the build failed. Fix the signing issue first."
-> ߕ ⟦30.0⟧ : session -> corrective ; kappa=0.90 ; c=0.85
Turn 2: "What if we tried a different approach to the widget?"
-> ߣ ⟦60.0⟧ : new(session) ; d*=0.70 ; k=2
Turn 3: "Back to the KARL training pipeline."
-> ߙ ⟦90.0⟧ : loopback karl ; last=5.0 ; d=0.30
Turn 4: "Continue. Ship it."
-> ߛ ⟦120.0-121.0⟧ : z(sigma) down ; session ; c=0.90A.2 Training Example Format
{
"messages": [
{
"role": "system",
"content": "You are Mohamed. Read N'Ko inscriptions for session
dynamics, respond in English. Direct, casual, action-oriented.
Inscriptions encode trajectory state: ..."
},
{
"role": "user",
"content": "ߛ ⟦0.0-1.0⟧ : z(sigma) down ; spore ; c=0.96\n\n
Build 15 (v3.1.0) uploaded to TestFlight.
3 iPhones deployed with debug build.
TestFlight processing usually takes 10-15 minutes."
},
{
"role": "assistant",
"content": "check if it passed processing on TestFlight"
}
]
}A.3 Inscription Distribution in Training Data
Analysis of the 4,633 training examples by claim type:
| Sigil | Claim Type | Count | Percentage |
|---|---|---|---|
| ߛ | Stabilization | ~1,580 | 34.1 |
| ߡ | Dwell | ~890 | 19.2 |
| ߕ | Transition | ~720 | 15.5 |
| ߜ | Dispersion | ~510 | 11.0 |
| ߙ | Return | ~310 | 6.7 |
| ߣ | Novelty | ~240 | 5.2 |
| ߠ | Place-Shift | ~160 | 3.5 |
| ߥ | Echo | ~110 | 2.4 |
| ߚ | Oscillation | ~70 | 1.5 |
| ߞ | Recovery | ~43 | 0.9 |
The distribution reflects the operator's interaction patterns: stabilization dominates (34.1
---
Appendix B: The cc-inscription Crate Structure
cc-inscription/src/
lib.rs -- Entry point, re-exports all modules
claims/
mod.rs -- Claim enum, ClaimId, ClaimType, BasinId
stabilize.rs -- StabilizeClaim
disperse.rs -- DisperseClaim
transition.rs -- TransitionClaim
return_.rs -- ReturnClaim
dwell.rs -- DwellClaim
oscillate.rs -- OscillateClaim
recover.rs -- RecoverClaim
novel.rs -- NovelClaim
place_shift.rs -- PlaceShiftClaim
echo.rs -- EchoClaim
basin/
mod.rs -- Basin state machine
proto.rs -- ProtoBasin (pre-graduation)
graduation.rs -- Graduation criteria (3 signals)
lifecycle.rs -- Split/merge/retire
constitution.rs -- Basin invariants
lexicon/
mod.rs -- Lexicon container
version.rs -- LexiconVersion semver
tokens.rs -- BasinToken, PlaceToken
changelog.rs -- LexiconChange enum
epoch.rs -- LexiconEpoch (O(1) verification)
reinterpret.rs -- Derived view re-rendering
surface/
mod.rs -- SurfaceRenderer, NKoLine
renderer.rs -- Claim -> N'Ko line
grammar.rs -- Grammar skeletons per claim type
slots.rs -- Slot renderers (time, basin, place)
normalize.rs -- NFC normalization
phrase/
mod.rs -- Phrase system
detection.rs -- Sequence mining
compression.rs -- Compression ratio testing
registration.rs -- Phrase registry
detector/
mod.rs -- ClaimDetector, DetectorConfig
dynamics.rs -- z-trajectory metrics
integration/
mod.rs -- External bridges
graph_kernel.rs -- Slice boundary enforcement
rag.rs -- Evidence retrieval for Echo
dell.rs -- z-trajectory source
sensor.rs -- Jitter alignment
canonical/
mod.rs -- CBOR, SHA-256, NFC, QuantizedFloat
provenance/
mod.rs -- ProvenanceWitness, PureVerificationContext
ontology/
mod.rs -- OntologyOperation pipeline
chain_link/
mod.rs -- ChainLink, ChainState, verify_chain_integrity
combining.rs -- 9 combining marks (U+07EB..U+07F3)
steganography.rs -- Zero-width Unicode metadata embedding
types/
mod.rs -- Deterministic types umbrella
time.rs -- WallTime, MonoTicks, Timestamp
quantized.rs -- QuantizedFloat (i64, 10^-6 scale)
evidence.rs -- Evidence sum type
basis.rs -- BasisId, BasisRef
session.rs -- SessionId, SessionContext22,881 lines of Rust across 48 source files. Compiled and deployed.
---
Appendix C: Key Invariants
C.1 From NIP-0001 (Constitutional)
1. Provenance Law: For any InscriptionId, given archived evidence + lexicon + basis + config, the claim IR and N'Ko surface are deterministically reproducible and the InscriptionId recomputable.
2. Non-retroactive corpus: Old inscriptions are never rewritten. Reinterpretation is a derived view.
3. Operator sigils are LOCKED: The 10 sigils will not change.
4. No IEEE-754: All serialized values use QuantizedFloat (i64 mantissa, 10^-6 scale).
5. NFC-safe: All N'Ko characters confirmed NFC-normalized.
6. Chain integrity: Breaking any link invalidates every subsequent inscription.
C.2 From Anticipation Geometry (INV-001 through INV-007)
1. Deterministic replay (no random seeds)
2. Coverage threshold (reject below 0.9)
3. All scalars validated and bounded
4. No hot-path allocation
5. Schema version enforced
C.3 From PsiChain
1. Density monotonicity: information density increases monotonically with chain height
2. Self-reference: each inscription depends on the full chain history
3. Combining marks are part of the cryptographic commitment
4. Zero-width metadata is part of the surface hash
---
Appendix D: A40 Adapter Configuration
The SOTA adapter configuration (v11), produced by PEFT 0.18.1:
{
"peft_type": "LORA",
"base_model_name_or_path": "Qwen/Qwen3-4B-Instruct-2507",
"r": 64,
"lora_alpha": 128,
"lora_dropout": 0.05,
"target_modules": [
"q_proj", "k_proj", "v_proj", "o_proj",
"gate_proj", "up_proj", "down_proj"
],
"bias": "none",
"task_type": "CAUSAL_LM",
"inference_mode": true
}Key design choices:
- Alpha = 2x rank (128 / 64 = 2.0): Standard scaling factor. Higher alpha increases the adapter's influence on base model weights.
- Dropout 0.05: Minimal regularization. The training corpus is small (4,633 examples) but diverse (15 months of real operator interaction). Higher dropout (0.1, 0.2) was not tested but could reduce overfitting risk.
- No DoRA, no RSLoRA, no QALoRA: Standard LoRA. The hypothesis is that adapter capacity (rank x modules) matters more than adapter architecture variants for personality transfer. Testing DoRA (weight-decomposed LoRA) is a natural next experiment.
- All 7 target modules: This is the maximum coverage for the Qwen3 architecture. Every linear layer that participates in attention (q, k, v, o) and feed-forward transformation (gate, up, down) receives an adapter. Only embedding layers, layer norms, and the final LM head are left unadapted.
Adapter size: 515MB (safetensors). Convertible to MLX format via `mlx_lm.convert` for Apple Silicon inference at approximately 4GB VRAM (4-bit quantized base + adapter).
Promotion Decision
Convert into the standard paper schema, add citations, and render a draft PDF.
Source Anchor
inscription-cognitive-twin-paper.md
Detected Structure
Abstract · Introduction · Method · Evaluation · References · Figures · Code Anchors · Architecture