N'Ko Acoustic Coding — Featural Acoustic Coding (FAC)
Research code and paper for the N'Ko tone-resolution seam: using acoustic evidence, especially F0, to restore tone marks that a toneless N'Ko ASR pipeline cannot recover from text alone.
Full Public Reader
N'Ko Acoustic Coding — Featural Acoustic Coding (FAC)
Research code and paper for the N'Ko tone-resolution seam: using acoustic
evidence, especially F0, to restore tone marks that a toneless N'Ko ASR pipeline
cannot recover from text alone.
The one-line thesis
The recognizer emits toneless N'Ko, but N'Ko tone is written in the signal:
speaker-relative F0 supplies evidence that a text-only prior cannot see. FAC is
the acoustic likelihood side of that estimator; AGP is the governance gate that
decides where a correction is safe.
Why N'Ko actually wins (the grounded version, not hype)
| N'Ko slot | Acoustic axis it encodes | Native? |
|---|---|---|
| Tone marks (7 marks, `U+07EB`–`U+07F1`) | pitch register + contour, with short/long variants | native |
| Nucleus (7 vowels) | spectral centroid / formant color | native |
| Onset (consonant manner) | attack transient type | native |
| Nasal coda | resonant sustain | native |
| harmonicity / spread / roughness / dynamics | timbre | designed extension |
The correction that matters: falling tone is native. Unicode names U+07EE as
`NKO COMBINING LONG DESCENDING TONE`; no designed pitch mark is needed for
falling. Designed FAC extensions are reserved for higher timbral descriptors,
not pitch.
The measured corpus prior is now generated by code, not copied by hand:
4,139 parsed syllables across 105 entries show roughly 65.8
33.3
problem register-first and confidence-gated, not a broad claim that contour
dominates all tone.
What makes this real, not speculative
It's built on assets that already exist in your stack:
- `Desktop/NKo/nko/syllable_codebook.py` — already enumerates a few thousand tonal
N'Ko syllables as a discrete featural codebook (onset/nucleus/coda/tone),
explicitly built as "the retrieval target for joint-embedding ASR." The
sound→N'Ko-code direction already exists. FAC is the inverse: code→sound.
- `Desktop/NKo/nko/phonetics.py` — the IPA / tone / Unicode layer.
- N'Ko Brain Scanner (`Desktop/nko-brain-scanner`) — your released
N'Ko-adapted model: custom BPE tokenizer (2.75× compression) + finite-state
CV/CVN constraint giving 99.8–100
obvious objection** ("frontier models can't read N'Ko"): FAC's decoder is a
constrained-decoding problem on a model that already manipulates the codebook.
Honest status
- The Unicode inventory, tone prior, text-only baseline, synthetic tone seam,
fusion self-test, and pitch-channel representational studies are runnable.
- The decisive acoustic TDER is still blocked by data: a read-speech WAV aligned
to known tone-marked N'Ko text.
- The end-to-end toned-CER claim is still blocked by one checkpoint inference
pass after that read-speech input exists.
- No result is fabricated from TTS or unaligned lesson commentary.
Build
cd Desktop/nko-acoustic-coding
export PATH="/Library/TeX/texbin:$PATH"
pdflatex main.tex && bibtex main && pdflatex main.tex && pdflatex main.texExpected: compiles clean under MacTeX pdflatex, 6 pages, references resolved,
zero undefined citations. N'Ko is shown via Unicode codepoints + Latin
transliteration + IPA (tipa) so it compiles everywhere without an N'Ko font.
Verification
cd Desktop/nko-acoustic-coding/experiments
python3 -W ignore fac_implementation_scorecard.pyOr from the repo root:
make verify # FAC scorecard + paper compile
make verify-full # also checks the local NKo Python/Swift tone layers
make test # FAC contract tests only
make clean # remove local LaTeX/Python build productsThe scorecard refreshes `experiments/corpus/tone_prior.json`, reruns the core
scripts, scans for stale tone-inventory claims, and writes:
- `experiments/artifacts/fac_implementation_scorecard.json`
- `experiments/artifacts/fac_implementation_scorecard.md`
Files
- `main.tex` — the paper (ACL format, matches your nko-brain-scanner conventions)
- `references.bib` — LAC, brain-scanner self-cite, codecs, featural-script lit
- `acl.sty`, `acl_natbib.bst` — copied from nko-brain-scanner
- `main.pdf` — built output, regenerated by `make paper`
- `figures/` — preview render
Next moves (ranked)
1. Record or obtain `read.wav` plus `gold_nko.txt`: the same speaker reading the
exact tone-marked N'Ko text.
2. Run `make read-speech AUDIO=/path/read.wav TEXT=/path/gold_nko.txt` for real
H1-H3 acoustic TDER. See `experiments/READ_SPEECH_RUNBOOK.md`.
3. Run the archived checkpoint once to convert the same correction into an
end-to-end toned-CER number. No retraining is needed for that pass.
Promotion Decision
Attach run IDs, datasets, metrics, and reproduction commands.
Source Anchor
nko-acoustic-coding/README.md
Detected Structure
Method · Evaluation · References · Figures · Code Anchors · Architecture