Grand Diomande Research · Full HTML Reader

N'Ko as Computational Infrastructure — Program Map

**N'Ko is not a decorative or interchangeable rendering of Manding. For machine-learning systems it is computational infrastructure.** A designed, bijective, tone-marking script changes four things that are usually treated as fixed:

Language as Infrastructure proposal experiment writeup candidate score 36 .md

Full Public Reader

N'Ko as Computational Infrastructure — Program Map

> Front door for the whole research line. One thesis, several pillars, one release plan.
> Last updated: 2026-05-28.

The thesis

N'Ko is not a decorative or interchangeable rendering of Manding. For
machine-learning systems it is computational infrastructure. A designed,
bijective, tone-marking script changes four things that are usually treated as
fixed:

1. What models can represent — tokenizers and hidden circuits behave
differently on N'Ko than on Latin Bambara.
2. How acoustic evidence aligns to symbols — a phoneme-transparent target
changes CTC decoding.
3. Whether a reported error rate is meaningful — N'Ko CER is phonemically
interpretable in a way Latin WER is not.
4. How sound itself can be encoded — the same script that makes recognition
interpretable makes a featural acoustic code possible.

Everything below is a pillar of that one claim.

---

Architecture (the unifying figure)

The trajectory geometry `z_t` is reused at three levels — decode, govern, correct.
This is the same figure that anchors the FAC tone-resolution paper.

                 N'KO AS COMPUTATIONAL INFRASTRUCTURE
       one featural codebook · one trajectory geometry · many signals

   signal (mic / mocap) ─► front-end features ─► N'Ko featural codebook
                                                 (3,024 tonal syllables)
   ┌───────────────────────────────────────────────────────────────┐
   │ 1. DECODE   Anticipatory Transformer + CTC                      │
   │     features ─► z_t (7-dim trajectory) ─► TONELESS N'Ko (~20% CER)│
   │     z_t = commitment·uncertainty·transition·novelty·…           │
   └───────────────────────────────┬───────────────────────────────┘
                                    │ toneless inscription
   ┌────────────────────────────────▼──────────────────────────────┐
   │ 2. GOVERN (AGP)  same z_t as a correction policy                │
   │     partition spans → stable / boundary / uncertain / novelty   │
   │     trust gate: train / review / exclude                        │
   └───────────────────────────────┬───────────────────────────────┘
                                    │ eligible spans
   ┌────────────────────────────────▼──────────────────────────────┐
   │ 3. CORRECT  conservative, AGP-gated                             │
   │     tone = text prior (Paper 8) × acoustic register (FAC)       │
   │           ↑ "what tone fits"     ↑ F0 read · 99% non-contour    │
   │     ─► TONED N'Ko                                               │
   └───────────────────────────────────────────────────────────────┘

   SELF-IMPROVING LOOP
     OCR lessons ─► VLM ─► toned corpus ─► AGP partition ─► stable pairs
          ▲                                                      │
          └────────────── retrain ◄── lower CER ◄───────────────┘

   Two arrows on the codebook: ANALYSIS (signal→N'Ko, the stack above) and
   SYNTHESIS (N'Ko→signal). FAC's decoder proves the synthesis arrow for sound;
   it is the template for N'Ko-as-score → body motion (computational choreography),
   which is a distinct generator, not a front-end swap.

The pillars

The program is three linguistic pillars plus an orthogonal systems track.
Read the linguistic pillars as the science; read the systems track as "how we
trained it cheaply." Do not mix them in the narrative.

### Pillar 1 — Representation: can a model see N'Ko?
Whether N'Ko is visible inside model internals (tokenization, activation geometry,
circuit duplication).
- Dead Circuits — N'Ko invisibility inside LLM activations. `[consolidated]`
- Script Invisibility Is Structural — the pattern holds across model families.
`[consolidated]`
- Artifact line: brain-scan activation profiling, custom 512-merge N'Ko BPE
(2.75x compression), vocabulary-extension surgery.

### Pillar 2 — Recognition + Measurement: can a model hear N'Ko, and is the metric real?
The core of the program. Script-native audio-to-N'Ko ASR, plus the argument that
its error metric is scientifically meaningful.
- Living Speech — construction of the script-native N'Ko ASR stack.
`[consolidated]`
- Does Script Design Matter? / Paper 4 — Script Advantage in CTC-ASR —
trajectory-conditioned CTC; N'Ko's bijective script gives a ~5.25pp CER
advantage over Latin. `[written]`
- Paper 5 — Generalization & Speaker Adaptation — unseen-word generalization,
per-speaker TTT, Djoko consensus. `[written]`
- Paper 6 — Trajectory Attention Residuals (TAR) — is depth-wise attention
routing script-dependent? `[training]`
- Transparent-Script Proposition — if the normalized script map is bijective
over the phoneme inventory, character edit distance preserves phoneme-edit
structure (up to normalization). Latin digraphs + optional tone + spelling
variation do not. This makes N'Ko CER a phonemically interpretable metric.
`[the measurement thesis — the strongest single idea]`

### Pillar 3 — Reconstruction + Tone: can a model render N'Ko and resolve its tone?
The generative dual of Pillar 2, and the place the program's biggest open hedge
(tone) gets closed.
- Paper 8 — Contextual Tone Resolution — a language model trained on
OCR-extracted toned N'Ko resolves tone from text context. `[planned]`
- FAC — Featural Acoustic Coding (`Desktop/nko-acoustic-coding/`) — resolves
tone from acoustic F0, and treats the N'Ko syllable codebook as a
sound-encoding substrate (sound → readable, editable, tone-native N'Ko →
sound). `[proposed; pitch-channel pilot done]`
- These two are halves of one problem. Paper 8 is the prior (what tone makes
linguistic sense); FAC is the evidence (what pitch the speaker produced). Tone
resolution = prior + evidence.

### Systems track (orthogonal enablers — not the linguistic thesis)
- Paper 7 — Neural Engine Offload — M4 ANE runs Whisper-scale projections at
11.6 TFLOPS via reverse-engineered private APIs. `[spike done]`
- Paper 9 — Distributed Training on Apple Neural Engine — mesh training across
Macs/iPhones/iPads. `[future]`
- Paper 10 — TurboQuant for Low-Resource Retrieval — 4-bit VQ replacing ANN
indexes for RAG++. `[future]`

---

The canonical ASR anchor (state it with its provenance)

The strongest retained ASR artifact is an archived checkpoint on a
Bambara corpus snapshot (232,476 / 29,060 / 29,060 train/val/test; lr 3e-4,
batch 32, dropout 0.1, seed 42) reporting **test CER ≈ 20
public anchor for "the 20

Honesty bounds (carry these everywhere, as the canonical paper does):
- It is not a fresh strict reproduction.
- It is not a result for later internal decoder variants (other checkpoints
exist at different CER on different corpora; e.g. a 297K trajectory CTC at
~27
- It is not a closed proof that N'Ko beats Latin under every matched setting.
- N'Ko CER still depends on normalization, reference quality, and tone/diacritic
policy — which is exactly the hole Pillar 3 fills.

The value is the measurement thesis around the anchor, not a leaderboard number.

---

The ASR architecture, and how FAC attaches

Anticipatory Transformer CTC (the central ASR object):
frozen Whisper large-v3 features `h_{1:T}` → project to 768d + temporal downsample
→ `u_{1:T'}` → 6-layer Transformer CTC head → N'Ko characters (toneless, 65
classes) → FSM syllable validation. A trajectory module estimates a 7-dim
state `z_t ∈ [0,1]^7` (commitment, uncertainty, transition pressure, recovery
margin, phase stiffness, novelty, stability) injected as an attention-logit
bias `B_ij(z_i, z_j)` before CTC emission.

Three ways FAC can attach. Choose deliberately.

- ✗ Do not swap the CTC target for the full FAC featural codebook (~3,640 tonal
syllables + diacritics). It is data-starved and would raise CER — the same
"richer target needs more data" tension found with morpheme-constrained BPE.

- ✓ Build first — parallel acoustic head (low risk). Same shared encoder, two
heads. Head A = the existing toneless CTC (the 20
Head B = an acoustic tone/register predictor trained on F0. Multitask loss
`L = L_CTC + λ · L_FAC`. Head B's tone posteriors feed the Paper-8 fusion
(text-LM prior + acoustic evidence → toned output). This attacks the tone hedge
without risking the anchor.

- 🚀 Write next — trajectory-state fusion (the new-architecture paper). Augment
`z_t` with FAC's interpretable acoustic coordinates (F0 register, pitch
contour, spectral centroid, onset transient, harmonicity). The attention bias
`B_ij` then conditions on named acoustic geometry rather than only learned latent
dynamics. This grounds the trajectory in interpretable features — the same spirit
as the transparent-script thesis — and literally fuses FAC into the anticipatory
transformer. Uniquely ours because we own both halves.

---

Why FAC matters to CER (the honest causal path)

FAC does not lower CER by hacking the decoder. It lowers CER through the
self-improving loop, whose bottleneck is tone:

train ASR → label 253K AfVoices audio → OCR toned N'Ko (Paper 8)
   → tone model fixes labels  ← FAC adds ACOUSTIC tone evidence here
   → retrain on 300K+ clean → lower CER → repeat

Better tone resolution (text prior + acoustic evidence) → cleaner labels → better
retrain → lower CER. That is the flywheel, and FAC is a new input to its weakest
step.

Caveat from the real-speech pilot (parents-audio, Manding): conversational tone is
register-dominated with only ~40-cent within-syllable excursions, so expect gains
in tone-diacritic accuracy, not a dramatic CER drop. The decisive test needs
tone-bearing material (lesson-video / sung tone), not conversation.

---

Release strategy (nothing is released yet)

1. Flagship, release now: the canonical consolidation — *N'Ko as Computational
Infrastructure* (`paper/current/paper_canonical_nko_agp_20cer.tex`). Mature,
carefully hedged, self-contained around the 20
Do not gate it on FAC. This defines the program on arXiv.
2. Companion focused papers, by venue:
- Recognition (Papers 4/5/6) → Interspeech / ICASSP.
- Representation (Dead Circuits / Script Invisibility) → ACL / EMNLP.
- Tone + Reconstruction (Paper 8 fused with FAC) → the reconstruction/tone
paper, after the tone-seam experiment.
- Systems (Paper 7) → an efficiency / MLSys venue.
3. FAC is not a standalone flagship. Empirically thin today (small real-speech
pitch effect, ties a contour-augmented lexical baseline, no synthesizer, H1
unrun). The "FAC beats LAC" framing is a section or workshop note, not a
headline. FAC ships as the Reconstruction pillar + acoustic half of tone once
the tone channel earns its place.

Labels. Program: N'Ko as Computational Infrastructure. Pillars by function:
Representation · Recognition · Measurement · Governance · Reconstruction.
Systems track named separately.

---

Status snapshot

Pillar	Papers	Status
Representation	Dead Circuits, Script Invisibility	consolidated; releasable
Recognition	4 (written), 5 (written), 6 (training)	20
Measurement	transparent-script proposition	written; the spine
Governance	AGP / Deployment	consolidated
Reconstruction	Paper 8 (planned), FAC (proposed, pilot done)	next build
Systems	7 (spike), 9/10 (future)	orthogonal

## Next moves
1. Tone-seam experiment — acoustic F0 → tone cues fused with the Paper-8 text
LM, measured on lesson-video audio with OCR'd toned N'Ko as ground truth.
Metric: Tone-Diacritic Error Rate (TDER) + toned CER. Cheap v0: pyin F0 →
register-relative tone classifier, no training. (`nko-acoustic-coding/experiments/`)
2. Release the canonical consolidation as the flagship arXiv paper.
3. Parallel FAC acoustic head on the shared encoder (Build-first option above).
4. Trajectory-state fusion writeup (the new-architecture paper).

See also: `Desktop/nko-acoustic-coding/` (FAC paper + pitch-channel experiments),
`PIPELINE.md` (full paper roadmap + self-improving loop), `paper/current/`
(canonical consolidation).

Promotion Decision

Attach run IDs, datasets, metrics, and reproduction commands.

Source Anchor

nko-brain-scanner/PROGRAM.md

Detected Structure

Method · Evaluation · Figures · Code Anchors · Architecture