working paper · 2026 · Phonemic substrate manuscript

N'Ko as an Extensible Phonemic Substrate for Governed Low-Resource Speech

Low-resource speech systems fail when they lack both data and a measurement substrate that preserves phonemic structure. This paper evaluates N'Ko as an extensible sound-code, then connects the representation result to governed ASR correction experiments in which ungoverned correction proposals degrade the transcript and AGP governance prevents most of that damage.

Paper workspace

Live draft structure

live-draft

Artifacts

Editable Markdown manuscript

The current manuscript is source-first. A public PDF render should be produced after release review.

source-only

Related split paper: phonemic evaluation

Related final split-paper render for N'Ko phonemic evaluation.


Editable source

Markdown source exists. The representation result is concrete, while the acoustic/FAC bridge remains the next proof gate.

Source anchors

nko-brain-scanner/paper/current/nko_phonemic_substrate_paper.md

NKo/nko/phonemic_extensions.py

NKo/scripts/evaluate_phoneme_coverage.py

NKo/tests/test_phonemic_extensions.py

Method tags

phonemic substrate, coverage evaluation, governance, AGP

Ingest intersections

nko, phonemic, coverage, substrate, agp, fac

Status

Drafted. The representation result is strong; the correction loop still requires acoustic evidence.

Key claims

01

Baseline N'Ko covers Manding completely and can be mechanically extended for other inventories.

02

Representation coverage is not equivalent to recognition accuracy.

03

Self-improving correction must be governed by acoustic evidence.
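Claim 03 can be illustrated with a minimal gating sketch. This is not the AGP mechanism itself, which this summary does not specify. The `Proposal` type, the `acoustic_score` field, and the 0.8 threshold are all hypothetical names and values chosen for illustration. The only idea taken from the claim is that a correction is applied when acoustic evidence supports it and rejected otherwise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """A hypothetical correction proposal for one transcript row."""
    original: str          # transcript text before correction
    corrected: str         # text the correction loop wants to write
    acoustic_score: float  # assumed support from audio for `corrected`, in [0, 1]


def apply_governed(proposal: Proposal, threshold: float = 0.8) -> str:
    """Accept the correction only when acoustic evidence meets the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    if proposal.acoustic_score >= threshold:
        return proposal.corrected
    # Without enough acoustic support, keep the transcript unchanged.
    return proposal.original


def apply_ungoverned(proposal: Proposal) -> str:
    """Ungoverned baseline: every proposal is written to the transcript."""
    return proposal.corrected


weak = Proposal(original="kalan", corrected="salan", acoustic_score=0.2)
strong = Proposal(original="kalam", corrected="kalan", acoustic_score=0.95)

governed_weak = apply_governed(weak)      # stays "kalan"
governed_strong = apply_governed(strong)  # becomes "kalan"
ungoverned_weak = apply_ungoverned(weak)  # becomes "salan"
```

The contrast between `apply_governed` and `apply_ungoverned` on the weakly supported proposal is the point of the sketch: the ungoverned path writes an unsupported change, while the gated path leaves the row alone.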

Public reading note

Readable summary public; full draft needs release review.

Standard skeleton

What this paper must keep proving

Schema

problem

Low-resource speech systems need a label substrate that preserves phonemic distinctions before recognition and correction can be governed.

method

Evaluate baseline, Unicode-extension, and bounded compositional N'Ko mappings over Manding, French, and English phoneme inventories.

implementation

Coverage evaluator, phonemic extension rules, regression tests, and AGP correction experiment comparison.

data

Finite phoneme inventories plus governed ASR correction rows. Audio recognition remains outside the representation proof.

evaluation

Coverage percentage against a 90 percent gate, regression tests, and correction CER deltas under ungoverned versus governed proposals.

references

Unicode Core Specification for N'Ko, phoneme inventory sources, ASR correction and governance literature.

openQuestions

Whether FAC feature heads can hear the same representational distinctions from audio rather than IPA/text input.
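The method and evaluation fields above describe layered mappings scored as a coverage percentage against a 90 percent gate. The following is a minimal sketch of that computation, not the code in `evaluate_phoneme_coverage.py`. The function names and the toy phoneme sets are assumptions made for illustration, and the sets do not reflect which phonemes N'Ko actually covers at each layer.

```python
from typing import Iterable, List, Set

COVERAGE_GATE = 90.0  # percent, the gate named in the evaluation field


def coverage_percent(inventory: Iterable[str], covered: Set[str]) -> float:
    """Percentage of distinct phonemes in `inventory` that have a mapping."""
    phonemes = set(inventory)
    if not phonemes:
        raise ValueError("inventory must not be empty")
    return 100.0 * len(phonemes & covered) / len(phonemes)


def evaluate_layers(inventory: Iterable[str], layers: List[Set[str]]) -> List[float]:
    """Cumulative coverage as each mapping layer is added in order."""
    phonemes = list(inventory)
    covered: Set[str] = set()
    results: List[float] = []
    for layer in layers:
        covered |= layer
        results.append(coverage_percent(phonemes, covered))
    return results


# Toy data only: placeholder sets, not the paper's inventories or mappings.
toy_inventory = ["p", "b", "t", "d", "k", "g", "s", "z", "ʃ", "ʒ"]
toy_baseline = {"p", "b", "t", "d", "k", "g", "s"}
toy_extension = {"z"}
toy_compositional = {"ʃ", "ʒ"}

scores = evaluate_layers(
    toy_inventory, [toy_baseline, toy_extension, toy_compositional]
)
passes_gate = [score >= COVERAGE_GATE for score in scores]
```

With these toy sets, only the final cumulative layer clears the gate. Treating the layers as cumulative is a design assumption here; the manuscript may score each layer differently.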

Checkpoints and references

Proof chain

experiment · proven

Representation coverage gate

NKo phoneme coverage evaluator

The manuscript records 100 percent coverage under the bounded compositional layer on the tested inventories.

experiment · partial

Correction governance boundary

500-row governed correction comparison

Supports the claim that representation alone is not enough; correction still needs governance.

experiment · pending

Acoustic feature proof

FAC feature-head/read-speech lane

The paper explicitly separates deterministic representation from the unsolved acoustic hearing problem.
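The correction checkpoint in the proof chain compares character error rate (CER) deltas under ungoverned and governed proposals. As a hedged sketch of how such a delta could be computed, the code below uses the standard Levenshtein-based CER definition. The helper names and the toy strings are assumptions for illustration and are not rows from the 500-row comparison.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def cer_delta(ref: str, before: str, after: str) -> float:
    """Change in CER caused by a correction; negative means improvement."""
    return cer(ref, after) - cer(ref, before)


# Toy row only: hypothetical strings, not data from the paper.
reference = "kalan"
asr_output = "kalam"         # one character wrong before correction
ungoverned_output = "salam"  # an unsupported proposal adds a second error
governed_output = "kalan"    # a supported proposal fixes the error

ungoverned_delta = cer_delta(reference, asr_output, ungoverned_output)
governed_delta = cer_delta(reference, asr_output, governed_output)
```

In this toy row the ungoverned delta is positive (the transcript got worse) and the governed delta is negative (the transcript improved). Aggregating such deltas across rows is one way to express the comparison, though the manuscript may aggregate differently.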