N'Ko as an Extensible Phonemic Substrate for Governed Low-Resource Speech
Mohamed Diomande
Draft date: 2026-06-01
Abstract
Low-resource speech systems usually fail twice: first because there is not enough
audio/text data, and second because the available evaluation scripts do not preserve
the phonemic structure of the language being measured. This paper argues that N'Ko
offers a different path. Because N'Ko is a phonetic, right-to-left script designed for
Manding languages and equipped with tone, nasalization, and documented foreign-sound
diacritics, it can function as an extensible phonemic substrate: a deterministic
sound-code for constructing labels, auditing errors, and governing self-correction.
We validate the representation layer with a coverage evaluator over Manding, French,
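and English phoneme inventories. Baseline N'Ko covers Manding completely and covers
63.9% of the French inventory and 58.5% of the English inventory. Unicode-documented
foreign-sound combinations raise French to 80.6% and English to 73.2%, and the
full-compositional layer reaches 100.0% for all three languages, clearing the
90% pass threshold. We then connect the representation result
to a governed correction experiment on 500 N'Ko ASR rows. An ungoverned Gemma-based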
proposer degrades CER from 0.3106 to 0.4701 (+15.94pp), while the AGP gate reduces
the damage to 0.3120 (+0.14pp). A minimal-edit LoRA reduces blind harm but fails
after gating (0.3156, +0.50pp) because small wrong edits slip through an edit-size
gate. The resulting conclusion is precise: N'Ko representation coverage can be
extended mechanically, but direct audio recognition and self-improving correction
still require acoustic evidence and correctness-aware governance.
1. Introduction
The narrow version of this work is "build an N'Ko ASR system." That framing is too small. The larger
problem is that low-resource languages often lack the corpora required for modern
speech and language models, while the scripts used to measure them can collapse
phonemic distinctions into orthographic conventions. In that setting, the core
question becomes: can the system manufacture better training data from its own
errors without poisoning itself?
This paper proposes N'Ko as the substrate for that loop. The script is not treated as
a decorative output format. It is treated as a computational object: a representation
where character-level operations can be interpreted phonemically, where tone and
nasalization are explicit, and where foreign sounds can be represented by documented
diacritic rules and bounded composition.
The paper's thesis is:
> N'Ko can serve as an extensible phonemic substrate for governed low-resource speech
> systems: a mechanically auditable representation layer that constructs labels,
> measures errors, and supports self-correction under governance.
This thesis deliberately separates representation from recognition. If the input is
text or IPA, N'Ko encoding is deterministic and requires no retraining. If the input
is audio, a model must still hear the relevant features. The bridge between those
layers is Featural Acoustic Coding (FAC): a hypothesis that an acoustic model
predicting place, manner, voicing, vowel height, rounding, nasality, and tone could
compose unseen phonemes from heard features. That is the next acoustic proof, not a
completed result.
2. Background: N'Ko as Script and Standard
The Unicode Core Specification documents N'Ko in Chapter 19 as a script devised by
Solomana Kante in 1949 for Manden languages. Unicode also states that N'Ko is
right-to-left and phonetic in nature, with seven vowels, tone diacritics,
nasalization, 19 consonants, two abstract consonants, and diacritics for foreign
sounds. It further documents specific foreign-sound combinations in Table 19-3,
including sounds relevant to Arabic and French.
This standards anchor matters. The extension mechanism used in this work is not an
arbitrary invented alphabet. It has three layers:
1. Baseline N'Ko: current local IPA-to-N'Ko mappings for Manding-oriented sounds.
2. Unicode extensions: combinations documented by Unicode for foreign sounds.
3. Full compositional layer: internal computational encodings for remaining
sounds, formed by deterministic composition of existing N'Ko primitives.
The third layer is not presented as official orthographic reform. It is an internal
phonemic label layer, analogous in purpose to an ASR label alphabet or a phonemic
transcription scheme.
3. Formal Claim: Representation Coverage
Let \(P_L\) be a finite phoneme inventory for language \(L\), and let \(M_k\) be an
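N'Ko mapping layer. The coverage of \(M_k\) on \(L\) is the fraction of phonemes in
\(P_L\) to which \(M_k\) assigns a label:
\[
\mathrm{Cov}(M_k, L) = \frac{\lvert \{\, p \in P_L : M_k(p) \text{ is defined} \,\} \rvert}{\lvert P_L \rvert}
\]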
The experiment evaluates three mappings:
- \(M_0\): baseline local N'Ko map;
- \(M_1\): \(M_0\) plus Unicode-documented foreign-sound combinations;
- \(M_2\): \(M_1\) plus bounded internal composition.
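The pass criterion is that the full layer \(M_2\) labels at least 90% of each
evaluated inventory:
\[
\frac{\lvert \{\, p \in P_L : M_2(p) \text{ is defined} \,\} \rvert}{\lvert P_L \rvert} \ge 0.90
\quad \text{for each } L \in \{\text{Manding}, \text{French}, \text{English}\}
\]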
This is a representation theorem/evaluation, not a neural training result. It asks
whether every phoneme has a deterministic N'Ko label. It does not ask whether an
acoustic model can hear that phoneme.
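The coverage measure and the three nested layers can be sketched in a few lines of Python. This is a minimal illustration, not the evaluator in `phonemic_extensions.py`: the toy inventory, the placeholder labels such as `L_a`, and the names `coverage`, `m0`, `m1`, `m2`, and `THRESHOLD` are all assumptions made for the example.

```python
from typing import Mapping


def coverage(inventory: set[str], mapping: Mapping[str, str]) -> tuple[float, list[str]]:
    """Return (fraction of the inventory that has a label, sorted missing phonemes)."""
    missing = sorted(p for p in inventory if p not in mapping)
    covered = len(inventory) - len(missing)
    return covered / len(inventory), missing


# Toy inventory with placeholder labels. These are NOT real N'Ko mappings.
inventory = {"a", "k", "v", "θ", "ɹ"}
m0 = {"a": "L_a", "k": "L_k"}                  # baseline layer
m1 = {**m0, "v": "L_f+dia", "θ": "L_s+dia"}    # plus documented diacritic combinations
m2 = {**m1, "ɹ": "L_r+dia"}                    # plus bounded internal composition

THRESHOLD = 0.90  # same value as the --threshold flag used in Section 4.1

for name, layer in [("M0", m0), ("M1", m1), ("M2", m2)]:
    fraction, missing = coverage(inventory, layer)
    print(f"{name}: {fraction:.1%} missing={missing} pass={fraction >= THRESHOLD}")
```

Because each layer is built by extending the previous one, coverage can only stay the same or grow from \(M_0\) to \(M_2\).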
4. Coverage Experiment
4.1 Implementation
The implementation lives in:
[home]/Desktop/NKo/nko/phonemic_extensions.py
[home]/Desktop/NKo/scripts/evaluate_phoneme_coverage.py
The regression tests live in:
[home]/Desktop/NKo/tests/test_phonemic_extensions.py
Verified commands:
cd [home]/Desktop/NKo
python3 scripts/evaluate_phoneme_coverage.py --threshold 0.90
python3 -m pytest -q tests/test_phonemic_extensions.py tests/test_phonetics.py tests/test_transliterate.py
The test suite reports:
164 passed in 0.16s
A corpus-label harness now operationalizes the next step:
[home]/Desktop/nko-brain-scanner/experiments/phonemic_substrate/label_ipa_corpus.py
On the built-in English/French/Manding examples it reports 29/29 covered IPA symbols,
zero unknown symbols, and writes JSONL labels plus a JSON coverage report.
4.2 Results
| Language | Layer | Covered | Coverage | Missing |
|---|---|---|---|---|
| manding | baseline | 27/27 | 100.0% | - |
| manding | unicode_extensions | 27/27 | 100.0% | - |
| manding | full_compositional | 27/27 | 100.0% | - |
| french | baseline | 23/36 | 63.9% | y, ø, ə, œ, ɑ̃, ɛ̃, ɔ̃, œ̃, v, ʃ, ʒ, ʁ, ɥ |
| french | unicode_extensions | 29/36 | 80.6% | ø, œ, ɑ̃, ɛ̃, ɔ̃, œ̃, ɥ |
| french | full_compositional | 36/36 | 100.0% | - |
| english | baseline | 24/41 | 58.5% | ɪ, æ, ə, ʌ, ʊ, ɑ, v, θ, ð, ʃ, ʒ, ɹ, aɪ, aʊ, ɔɪ, eɪ, oʊ |
| english | unicode_extensions | 30/41 | 73.2% | ɪ, æ, ʌ, ʊ, ɑ, ɹ, aɪ, aʊ, ɔɪ, eɪ, oʊ |
| english | full_compositional | 41/41 | 100.0% | - |

The 90% threshold is met for French and English only at the full-compositional
layer, but the more informative result is
the shape of the improvement. The Unicode layer already handles six additional
sounds in each language, including /v/, /ʃ/, and /ʒ/. The full layer handles the
remaining vowels, rhotics, diphthongs, nasal vowels, and the glide /ɥ/ by composition.
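The percentages in the results can be re-derived from the reported missing-symbol lists. The sketch below copies the inventory sizes and the missing lists from the results above and recomputes coverage, then lists which sounds the Unicode layer resolved. The names `SIZES`, `MISSING`, `percent_covered`, and `resolved_by_unicode` are illustrative and do not come from the repository.

```python
# Inventory sizes and missing-symbol lists copied from the coverage results.
# "\u0303" is the combining tilde used for the French nasal vowels.
SIZES = {"french": 36, "english": 41}
MISSING = {
    ("french", "baseline"):
        "y ø ə œ ɑ\u0303 ɛ\u0303 ɔ\u0303 œ\u0303 v ʃ ʒ ʁ ɥ".split(),
    ("french", "unicode_extensions"):
        "ø œ ɑ\u0303 ɛ\u0303 ɔ\u0303 œ\u0303 ɥ".split(),
    ("english", "baseline"):
        "ɪ æ ə ʌ ʊ ɑ v θ ð ʃ ʒ ɹ aɪ aʊ ɔɪ eɪ oʊ".split(),
    ("english", "unicode_extensions"):
        "ɪ æ ʌ ʊ ɑ ɹ aɪ aʊ ɔɪ eɪ oʊ".split(),
}


def percent_covered(language: str, layer: str) -> float:
    """Coverage in percent, rounded to one decimal as in the results."""
    total = SIZES[language]
    return round(100 * (total - len(MISSING[(language, layer)])) / total, 1)


def resolved_by_unicode(language: str) -> list[str]:
    """Symbols missing at baseline that the Unicode layer covers."""
    still_missing = set(MISSING[(language, "unicode_extensions")])
    return [p for p in MISSING[(language, "baseline")] if p not in still_missing]


for language in SIZES:
    print(language,
          percent_covered(language, "baseline"),
          percent_covered(language, "unicode_extensions"),
          resolved_by_unicode(language))
```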
4.3 Examples
English "think very good"
IPA: θɪŋk vɛɹi gʊd
N'Ko label: ߛ߳ߌ߳ߧߞ ߝ߭ߐߙ߳ߌ ߜߎ߳߫ߘ
French "tu es un bon ami"
IPA: ty e œ̃ bɔn ami
N'Ko label: ߕߎ߳ ߍ ߐ߲߳ ߓߏߣ ߊߡߌ
Manding "n ko"
IPA: n ko
N'Ko label: ߣ ߞߋ
These examples matter because they show the practical data pipeline: a transcript can
be phonemized, converted to IPA, and then rendered into N'Ko labels without requiring
manual N'Ko transcription.
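Before an IPA string can be mapped to labels it has to be split into inventory symbols, and that step is where unknown symbols surface. The following is one possible approach, a greedy longest-match tokenizer, and is not necessarily how `label_ipa_corpus.py` works. The function name `tokenize_ipa` and the mini-inventory are assumptions for the example.

```python
import unicodedata


def tokenize_ipa(text: str, inventory: set[str]) -> tuple[list[str], list[str]]:
    """Greedy longest-match split of an IPA string into inventory symbols.

    Returns (tokens, unknown). Whitespace is treated as a word boundary and
    skipped. Characters that match no inventory symbol go to `unknown`.
    """
    def norm(s: str) -> str:
        # NFD keeps base letter and combining marks in a predictable order.
        return unicodedata.normalize("NFD", s)

    symbols = sorted({norm(s) for s in inventory}, key=len, reverse=True)
    text = norm(text)
    tokens, unknown, i = [], [], 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        match = next((s for s in symbols if text.startswith(s, i)), None)
        if match is None:
            unknown.append(text[i])
            i += 1
        else:
            tokens.append(match)
            i += len(match)
    return tokens, unknown


# Hypothetical mini-inventory, just large enough for the two calls below.
INVENTORY = {"t", "y", "e", "œ\u0303", "b", "ɔ", "n", "a", "m", "i", "aɪ"}

print(tokenize_ipa("ty e œ\u0303 bɔn ami", INVENTORY))  # French example from the text
print(tokenize_ipa("taɪm θ", INVENTORY))                # diphthong wins over plain "a"
```

Longest-match ordering matters for diphthongs and nasal vowels: without it, "aɪ" would be split into "a" plus an unknown "ɪ".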
5. What Coverage Means, and What It Does Not Mean
The coverage result has a strong implication:
> We can construct N'Ko phonemic labels for English and French using deterministic
> rules, without training a model.
It does not imply:
> A Manding-trained acoustic model can recognize English or French audio without
> adaptation.
The distinction is the center of the paper.
5.1 Writing/encoding layer
For text or IPA:
text -> phonemizer -> IPA -> N'Ko compositional label
No retraining is required. This is a rule system.
5.2 Recognition/hearing layer
For audio:
audio -> acoustic model -> phoneme/N'Ko sequence
Training or adaptation is usually required. A symbol for /θ/ does not give an
acoustic model the ability to hear /θ/ if the model has never learned that
feature.
5.3 Featural acoustic layer
The possible exception is a featural acoustic model. If a model predicts features
instead of whole phonemes, then unseen phonemes can become compositions of seen
features:
v = labial + fricative + voiced
θ = dental + fricative + voiceless
French nasal vowel = vowel quality + nasality
This is why FAC matters. FAC is not just tone reconstruction. It is the acoustic-side
version of the same compositional principle that makes N'Ko extensible on the writing
side. This remains a hypothesis until tested with audio.
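The compositional idea can be made concrete with a small feature table. The sketch below checks whether every feature value of a target phoneme has already been heard in some training phoneme, which is a necessary condition for composing it from seen features. The `SEEN` inventory is hypothetical, the feature names follow the decompositions given above, and nothing here models acoustics.

```python
# Feature bundles for phonemes that a hypothetical model has heard in training.
SEEN = {
    "f": {"place": "labial", "manner": "fricative", "voicing": "voiceless"},
    "b": {"place": "labial", "manner": "stop", "voicing": "voiced"},
    "s": {"place": "alveolar", "manner": "fricative", "voicing": "voiceless"},
}

# Targets the model has never heard as whole phonemes.
TARGETS = {
    "v": {"place": "labial", "manner": "fricative", "voicing": "voiced"},
    "θ": {"place": "dental", "manner": "fricative", "voicing": "voiceless"},
}


def unseen_feature_values(target: dict[str, str],
                          seen: dict[str, dict[str, str]]) -> list[tuple[str, str]]:
    """Return the (feature, value) pairs of `target` that no seen phoneme carries."""
    known = {pair for bundle in seen.values() for pair in bundle.items()}
    return sorted(pair for pair in target.items() if pair not in known)


for phoneme, bundle in TARGETS.items():
    gaps = unseen_feature_values(bundle, SEEN)
    status = "composable from seen features" if not gaps else f"blocked by {gaps}"
    print(phoneme, "->", status)
```

In this toy setup /v/ is composable because labial, fricative, and voiced each occur somewhere in training, while /θ/ is blocked until some training phoneme supplies a dental place. Feature coverage is therefore a precondition for the FAC hypothesis, not a proof of it.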
6. Governed Correction Experiment
The representation result explains how to make labels. The AGP experiment explains
why a low-resource system cannot safely trust an ungoverned language model to correct
its own outputs.
6.1 Setup
The 500-row pilot compares:
- ASR baseline;
- blind acceptance of Gemma-based N'Ko proposals;
- AGP-gated acceptance of those proposals;
- a new minimal-edit LoRA trained from mined SFT examples.
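All four conditions are compared by character error rate. As a reference point, a standard CER computation is sketched below: Levenshtein distance divided by reference length, micro-averaged over the corpus. Whether the pilot micro-averages, and how it counts N'Ko combining marks, is not stated here, so both choices in this sketch are assumptions; the strings are toy Latin examples.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate of one row. Each code point counts as one character."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


def corpus_cer(pairs: list[tuple[str, str]]) -> float:
    """Micro-averaged CER: total edits over total reference characters."""
    edits = sum(levenshtein(ref, hyp) for ref, hyp in pairs)
    chars = sum(len(ref) for ref, _ in pairs)
    return edits / chars


print(cer("abcd", "abxd"))
print(corpus_cer([("abcd", "abxd"), ("ab", "ab")]))
```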
The archived artifacts are stored in:
[home]/Desktop/nko-brain-scanner/artifacts/agp_pilot/
6.2 Results
| Run | Blind CER | Blind Delta | Better/Same/Worse | Gated CER | Gated Delta | Accepted Worse |
|---|---|---|---|---|---|---|
| Old adapter | 0.4701 | +15.94pp | 21/169/310 | 0.3120 | +0.14pp | 18 |
| Min-edit adapter | 0.4269 | +11.63pp | 14/225/261 | 0.3156 | +0.50pp | 69 |
The old adapter demonstrates the harm result: a model can generate valid-looking N'Ko
and still destroy the transcript, worsening 310 of 500 rows when accepted blindly. The
gate demonstrates the safety result: AGP cuts that damage from +15.94pp to +0.14pp.
The minimal-edit adapter demonstrates the next failure: if the model learns to make
small wrong edits, an edit-size gate admits more of them (69 accepted-worse proposals
versus 18).
6.3 Interpretation
The first loop did not close. That is an important negative result.
Old model failure:
large wrong edits -> rejected by edit-size budget
Minimal-edit model failure:
small wrong edits -> accepted by edit-size budget
Therefore the next gate must evaluate correctness/evidence. Size is necessary but not
sufficient.
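The two failure modes can be reproduced with a toy gate. The sketch below contrasts a pure edit-size gate with a gate that also requires an evidence score. The `evidence` field, its threshold, and the names `Proposal`, `size_gate`, and `size_and_evidence_gate` are hypothetical; they illustrate the shape of a correctness-aware gate and are not the AGP implementation.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    edit_size: int    # character edits between the ASR output and the proposal
    evidence: float   # hypothetical support score in [0, 1], e.g. acoustic agreement
    improves: bool    # ground truth, available only during evaluation


def size_gate(p: Proposal, cap: int = 2) -> bool:
    """Accept any proposal whose edit size fits the budget."""
    return p.edit_size <= cap


def size_and_evidence_gate(p: Proposal, cap: int = 2, min_evidence: float = 0.8) -> bool:
    """Accept only small edits that are also supported by evidence."""
    return p.edit_size <= cap and p.evidence >= min_evidence


large_wrong = Proposal(edit_size=9, evidence=0.1, improves=False)
small_wrong = Proposal(edit_size=1, evidence=0.2, improves=False)
small_right = Proposal(edit_size=1, evidence=0.9, improves=True)

for name, p in [("large_wrong", large_wrong),
                ("small_wrong", small_wrong),
                ("small_right", small_right)]:
    print(name, "size:", size_gate(p), "size+evidence:", size_and_evidence_gate(p))
```

The size gate rejects the large wrong edit but accepts the small wrong one, which is the minimal-edit failure. The combined gate rejects both wrong edits, but only if the evidence score actually tracks correctness, and that is the open problem.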
7. Oracle Headroom
The oracle run asks what would happen if proposals were trustworthy. It uses reference
text as the proposal, so it is not a deployable result. It is a ceiling estimate.
Rows: 29,060
Accepted at cap=2: 3,302
Rejected: 25,758
edit_too_large rejections: 22,914
Median rejected edit size: 9
Accepted-worse: 0
Oracle cap sweep:
cap=2 -0.46pp
cap=8 -5.57pp
cap=12 -9.97pp
cap=999 -29.15pp
Real proposal cap sweep:
cap=2 +0.14pp
cap=8 +2.36pp
cap=12 +3.99pp
cap=999 +15.59pp
This is the two-regime proof:
- with trustworthy proposals, relaxing the budget exposes large CER headroom;
- with untrustworthy proposals, relaxing the budget admits drift.
The bottleneck is proposal correctness.
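The reported oracle numbers can be checked and summarized with a few lines of arithmetic. The sketch below uses only the figures listed in this section; the variable names are illustrative.

```python
# Figures copied from the oracle run and the two cap sweeps in this section.
ROWS, ACCEPTED, REJECTED, TOO_LARGE = 29_060, 3_302, 25_758, 22_914
ORACLE = {2: -0.46, 8: -5.57, 12: -9.97, 999: -29.15}  # CER delta in pp
REAL = {2: 0.14, 8: 2.36, 12: 3.99, 999: 15.59}        # CER delta in pp

counts_consistent = ACCEPTED + REJECTED == ROWS
acceptance_rate = round(100 * ACCEPTED / ROWS, 1)        # share of rows accepted at cap=2
too_large_share = round(100 * TOO_LARGE / REJECTED, 1)   # share of rejections due to size

caps = sorted(ORACLE)
oracle_improves_with_cap = all(ORACLE[a] > ORACLE[b] for a, b in zip(caps, caps[1:]))
real_degrades_with_cap = all(REAL[a] < REAL[b] for a, b in zip(caps, caps[1:]))

# Distance between what real proposals deliver and the oracle ceiling, per cap.
gap_pp = {cap: round(REAL[cap] - ORACLE[cap], 2) for cap in caps}

print(counts_consistent, acceptance_rate, too_large_share)
print(oracle_improves_with_cap, real_degrades_with_cap)
print(gap_pp)
```

The gap widens at every step of the sweep, which restates the two regimes in one number per cap: relaxing the budget only helps when the proposals themselves are correct.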
8. Direct ASR Extension Plan
The representation result gives the data construction path for English and French.
We do not need English-N'Ko or French-N'Ko human pairs. We need ordinary speech
corpora with transcripts.
Pipeline:
audio.wav + transcript
-> phonemizer
-> IPA
-> full-compositional N'Ko label
-> ASR fine-tune target
Recommended first corpora:
- English: LibriSpeech or Common Voice English;
- French: Common Voice French or MLS French;
- Manding: existing N'Ko/ASR bridge data for in-family comparison.
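The pipeline above amounts to writing one training record per utterance. A minimal sketch of that record construction follows. The phonemizer is a stub backed by a two-word lexicon, the label function emits placeholder tokens rather than real N'Ko, and the JSON field names are assumptions; a real run would substitute an actual phonemizer and the full-compositional mapping.

```python
import json


def phonemize_stub(transcript: str) -> list[list[str]]:
    """Hypothetical stand-in for a real phonemizer: one phoneme list per word."""
    lexicon = {"think": ["θ", "ɪ", "ŋ", "k"], "good": ["g", "ʊ", "d"]}
    return [lexicon[word] for word in transcript.lower().split()]


def label_stub(phoneme: str) -> str:
    """Placeholder label. A real pipeline would return the N'Ko label here."""
    return f"<{phoneme}>"


def manifest_line(audio_path: str, transcript: str) -> str:
    """Build one JSONL record: audio path, transcript, IPA, and label target."""
    words = phonemize_stub(transcript)
    record = {
        "audio": audio_path,
        "transcript": transcript,
        "ipa": " ".join("".join(word) for word in words),
        "target": " ".join("".join(label_stub(p) for p in word) for word in words),
    }
    # ensure_ascii=False keeps IPA (and later N'Ko) readable in the file.
    return json.dumps(record, ensure_ascii=False)


print(manifest_line("clip_0001.wav", "think good"))
```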
Pass criteria:
- at least 90% phoneme coverage;
- zero unmapped IPA symbols after normalization;
- reversible IPA/N'Ko audit on sampled rows;
- direct ASR fine-tune reaches a non-trivial CER/PER regime against generated labels;
- error analysis separates representation failures from acoustic failures.
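The reversibility criterion depends on the label mapping being one-to-one: if two phonemes share a label, the audit cannot recover the IPA. One way to check this at the token level is sketched below. The function names and the placeholder labels are assumptions, and the check deliberately works on label sequences rather than concatenated strings, so it does not address segmentation ambiguity in rendered text.

```python
def build_decoder(mapping: dict[str, str]) -> dict[str, str]:
    """Invert a phoneme -> label mapping, refusing labels shared by two phonemes."""
    decoder: dict[str, str] = {}
    for phoneme, label in mapping.items():
        if label in decoder:
            raise ValueError(
                f"label {label!r} is shared by {decoder[label]!r} and {phoneme!r}"
            )
        decoder[label] = phoneme
    return decoder


def round_trips(phonemes: list[str], mapping: dict[str, str]) -> bool:
    """Encode a phoneme sequence to labels, decode it, and compare."""
    decoder = build_decoder(mapping)
    labels = [mapping[p] for p in phonemes]
    return [decoder[label] for label in labels] == list(phonemes)


# Placeholder labels, not real N'Ko.
ok_mapping = {"t": "T", "y": "U+dia", "u": "U"}
bad_mapping = {"t": "T", "y": "U", "u": "U"}  # /y/ and /u/ collide

print(round_trips(["t", "y"], ok_mapping))
try:
    round_trips(["t", "y"], bad_mapping)
except ValueError as err:
    print("collision:", err)
```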
This experiment moves the claim from "N'Ko can represent the sound inventory" to
"N'Ko can serve as an ASR target for another language."
9. Phrase-Level Transfer
This work also raises a larger question: if N'Ko can carry sounds across languages,
can it also carry phrase structures or ways of expressing things?
The answer is: possibly, but that is a second layer. Sound transfer and meaning
transfer are different.
Sound layer:
spoken form -> phonemic N'Ko representation
Phrase/semantic layer:
expression -> meaning/role/frame -> target-language realization
N'Ko can help because it provides a stable phonemic substrate and a culturally
meaningful script layer. But phrase borrowing, idiom transfer, and semantic
equivalence require a language model or structured semantic representation. That
should be framed as future work, not as proven by phoneme coverage.
10. The Paper's Implication
The immediate implication is not "we never need to retrain." The correct implication
is sharper:
> N'Ko lets us separate representation scarcity from acoustic scarcity.
Representation scarcity is solved mechanically: generate N'Ko labels from text/IPA.
Acoustic scarcity remains, but the generated labels let us train or fine-tune ASR
models without requiring human N'Ko transcription for every new language.
That gives the contribution its shape:
1. N'Ko becomes a reusable phonemic target, not just a Manding output script.
2. Low-resource systems can manufacture labels from existing transcripts.
3. The governance loop can decide which corrections become training data.
4. FAC offers a path toward zero-shot acoustic composition, but must be tested.
11. Limitations
1. The English and French inventories are finite working inventories, not exhaustive
dialectal inventories.
2. Full-compositional N'Ko is an internal computational encoding, not an official
orthographic standard.
3. The coverage experiment proves representation, not acoustic recognition.
4. The AGP min-edit loop failed to improve CER, so the self-improving loop remains
open.
5. The current gate checks edit size rather than correctness, so it is too weak against small wrong edits.
6. Cross-lingual phrase transfer is conceptual and needs its own evaluation.
12. Conclusion
The validated result is strong but bounded. N'Ko can be extended into a high-coverage
phonemic substrate for Manding, French, and English inventories using documented
foreign-sound mechanisms and deterministic composition. That means we can generate
N'Ko phonemic labels without retraining, and those labels can become targets for ASR,
correction, and audit.
The unvalidated leap is acoustic universality. Direct audio-to-N'Ko still requires a
model that can hear the target sounds. A phoneme-based model needs training data. A
featural acoustic model may compose unseen phonemes from seen features, but that is
the next experiment.
The paper's contribution is therefore not a finished universal recognizer. It is the
architecture and evidence for one: N'Ko as the representation layer, FAC as the
acoustic-composition hypothesis, AGP as the governance layer, and SFT recycling as
the data engine. For low-resource speech, the model is not just a checkpoint. The
model is the governed loop that learns which of its own outputs deserve to become
data.
References and Source Anchors
- Unicode Consortium. Unicode Core Specification, Chapter 19, section 19.4.1 N'Ko.
https://www.unicode.org/versions/latest/core-spec/chapter-19/
- ScriptSource. N'Ko script description.
https://scriptsource.org/scr/Nkoo
- Library of Congress. N'Ko romanization table.
https://www.loc.gov/catdir/cpso/romanization/N
Reproducibility Pointers
- Coverage evaluator:
`[home]/Desktop/NKo/scripts/evaluate_phoneme_coverage.py`
- Coverage module:
`[home]/Desktop/NKo/nko/phonemic_extensions.py`
- Coverage tests:
`[home]/Desktop/NKo/tests/test_phonemic_extensions.py`
- Research bundle:
`[home]/Desktop/nko-brain-scanner/experiments/phonemic_substrate/run_research_bundle.py`
- IPA label harness:
`[home]/Desktop/nko-brain-scanner/experiments/phonemic_substrate/label_ipa_corpus.py`
- Bundle output:
`[home]/Desktop/nko-brain-scanner/artifacts/phonemic_substrate/overnight_bundle_2026-06-01/bundle_summary.md`
- Label examples:
`[home]/Desktop/nko-brain-scanner/artifacts/phonemic_substrate/overnight_bundle_2026-06-01/label_examples.jsonl`
- Validation report:
`[home]/Desktop/nko-brain-scanner/NKO-VALIDATION-REPORT.md`
Source Anchor
nko-brain-scanner/paper/current/nko_phonemic_substrate_paper.md