Grand Diomande Research · Full HTML Reader

A Field Guide to the N'Ko Claim

In 1949, in Kankan, Guinea, Solomana Kante designed a writing system for Manding languages. Not a borrowed alphabet. Not a colonial compromise. A script built from the sound structure of the languages themselves. N'Ko means "I say." That name is not ornamental. It is a statement about who gets to write a language on its own terms.

Language as Infrastructure research note experiment writeup candidate score 18 .md

Full Public Reader

A Field Guide to the N'Ko Claim

Before the acronyms, before the tables, before the 20.57
simple scene.

In 1949, in Kankan, Guinea, Solomana Kante designed a writing system for Manding
languages. Not a borrowed alphabet. Not a colonial compromise. A script built from
the sound structure of the languages themselves. N'Ko means "I say." That name is
not ornamental. It is a statement about who gets to write a language on its own
terms.

The project started from a suspicion: if N'Ko was designed so cleanly for Manding
speech, then modern language technology should not treat it as an afterthought.
The experiments did not unfold in a straight line. First we looked inside language
models. Then we built script conversion tools. Then we trained speech decoders.
Then we tested trajectory ideas. Then we tried to understand what a transcript
should be allowed to become after the model emits it.

This field guide explains the claim so the rest of the series does not feel like a
wall of private shorthand.

The first finding: script invisibility

A model can display N'Ko without understanding N'Ko.

That is the central lesson from the brain-scan work. Unicode support means the
characters can enter the system. It does not mean the model has learned useful
internal representations for those characters. In the older blog drafts, we called
this the translation tax: the model was working with much weaker internal energy
for N'Ko than for English.

In one Qwen3-8B protocol, N'Ko carried an average representation tax around 2.94x.
In a cross-model protocol, the tax was 3.30x for Qwen3-8B, 3.59x for Qwen2.5-7B,
and 2.67x for Mistral-7B. The exact values depend on the protocol. The pattern
does not: the models accepted the script, but they did not process it as a script
they knew.

Arabic is the important control. Arabic is also written right to left. Modern
models handle Arabic far better because Arabic has much more training data and
tokenizer allocation. So the problem is not direction. The problem is exposure.

The second finding: N'Ko is a better measurement target for Manding ASR

Automatic speech recognition, or ASR, turns audio into text. Most Bambara speech
systems output Latin text because that is where the datasets are. But Latin
Bambara is a noisy scientific target. It can hide tone, split a single sound into
digraphs, vary apostrophes, and change word boundaries depending on transcription
convention.

N'Ko is different. It was designed around Manding sound structure. That does not
make every scoring problem disappear. Tone marks, normalization, dialect, and
reference quality still matter. But N'Ko gives character error rate, or CER, a
much closer relationship to the acoustic question than Latin word error rate, or
WER: did the model hear the sounds?

That is why the metric work matters. It is not only a cultural argument. It is a
measurement argument.

The third finding: the 20.57

The public number is 20.57

The precise sentence is:

> An archived script-native N'Ko ASR checkpoint reports 20.57
> rate on a 290,596-pair Bambara corpus snapshot under recorded settings.

The arithmetic is not vague:

text

216,225 character edits / 1,050,967 reference characters = 20.57%

The run used a 290,596-pair corpus snapshot split into 232,476 training rows,
29,060 validation rows, and 29,060 test rows. It used learning rate 0.0003, batch
size 32, dropout 0.1, seed 42, and reached best validation loss
0.6358872798606507 after 47 trained epochs.

That is the anchor. It should be talked about plainly. It should not be inflated
into a claim that the whole project is complete.

The branches that should not be merged

The project has several related names that can easily get tangled.

CTC means connectionist temporal classification. It is the training method used by
the decoder to learn character sequences without hand-aligning every audio frame
to every character.

Trajectory means a compact description of how the speech signal is moving:
commitment, uncertainty, transition pressure, recovery margin, phase stiffness,
novelty, and stability. The idea came from treating speech as motion rather than
as isolated frames.

TAR means trajectory-attention residual. It is a heavier later branch that pushed
trajectory information deeper into attention.

TTT means test-time training or test-time adaptation. It changes behavior during
evaluation or inference by adapting on the fly.

AGP means Anticipation Geometry Partition. It is not the acoustic model. It is the
post-ASR governance layer that decides what to do with transcript rows after they
exist.

Those names belong in the project, but they are not interchangeable. TAR did not
produce the 20.57
20.57
its recorded training regime.

That separation is not weakness. It is how the story stays credible.

The shape of the series

The first essay explains why models that can render N'Ko may still be blind to it.

The second explains why CER in N'Ko is a better instrument than WER in Latin
Bambara for the specific scientific question we care about.

The third walks through the model that listened in N'Ko: Whisper features,
Transformer CTC decoding, trajectory state, and the later branches that should be
kept separate.

The fourth asks what happens after the transcript. That is where AGP enters. A
low-resource speech system is not only judged by a score. It is judged by what it
does with uncertain rows, novel words, names, noisy soap-opera audio, overlapping
speakers, and corrections that look fluent but are wrong.

The whole project can be summarized like this:

text

N'Ko exposes what language models missed.
N'Ko gives Manding ASR a cleaner target.
A direct N'Ko ASR checkpoint reached 20.57% CER under recorded settings.
AGP is the discipline needed before those transcripts become corpus material.

That is the claim. It is strong enough.

Promotion Decision

Attach run IDs, datasets, metrics, and reproduction commands.

Source Anchor

nko-brain-scanner/paper/blog-series/00-field-guide-to-the-claim.md

Detected Structure

Method · Evaluation · Architecture