working paper · preprint render candidate · score 100

Beyond Controlled Comparison: Deployment Properties of Script-Aware ASR for N'Ko

Extracted abstract or opening context

Controlled experiments show that phonetically transparent scripts yield lower character error rate (CER) for CTC-based ASR. But ASR systems are not deployed under controlled conditions: they encounter unseen vocabulary, new speakers, and domain shift. This paper assembles deployment-relevant evidence for Bambara ASR systems using N'Ko (a bijective script) and Latin (a many-to-many script), anchored by the verified 20.57% N'Ko trajectory checkpoint but drawing on both current and historical experiments. First, compositional generalization: models trained only on high-frequency words are evaluated on utterances containing rare words. N'Ko's generalization gap is 3.65pp smaller than Latin's (37.81pp vs 41.46pp), consistent with the hypothesis that character-phoneme bijection enables composition of known units into unknown words. Second, vocabulary expansion: full-data training recovers 13.75pp of the generalization gap equally for both scripts, but N'Ko retains a 2.58pp structural advantage on rare-word utterances that is stable across training conditions. Third, test-time training (TTT): in an earlier internal deployment experiment, we transcribe 32,826 segments from the Djoko soap opera (an out-of-domain Malinke broadcast), apply consensus filtering to extract 5,492 candidate pairs, and measure per-speaker adaptation via online weight updates. Those historical deployment experiments suggest that the N'Ko model produces usable transcriptions on 99.4% of out-of-domain segments, whereas the Latin model suffers distribution collapse, and that speaker-level TTT can reduce loss sharply for the best-adapting speaker. Taken together, these results suggest that N'Ko's phonetic transparency advantage is not limited to static benchmarks. It may extend to the deployment properties that determine whether an ASR system improves with use: generalization to new words, expansion without retraining, and adaptation to new speakers and domains. Total compute cost: under $6 across all experiments.
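
The arithmetic behind the compositional-generalization comparison can be sketched in a few lines of Python. This is a minimal illustration, not the paper's evaluation code. It assumes that CER is computed at corpus level as character edit distance over reference length, and that the "generalization gap" is the CER on rare-word utterances minus the CER on high-frequency utterances; the abstract does not spell out either definition. The function names `edit_distance`, `cer`, and `generalization_gap` are hypothetical. Only the two gap values (37.81pp and 41.46pp) are taken from the abstract.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference character
                            curr[j - 1] + 1,      # insert a hypothesis character
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Corpus-level CER in percent (assumed definition):
    total character edits divided by total reference characters."""
    edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    chars = sum(len(r) for r in refs)
    return 100.0 * edits / chars


def generalization_gap(cer_rare: float, cer_frequent: float) -> float:
    """Assumed definition: how much worse (in percentage points) the model
    is on rare-word utterances than on high-frequency ones."""
    return cer_rare - cer_frequent


# Gap values reported in the abstract, in percentage points.
nko_gap = 37.81
latin_gap = 41.46

# Difference between the two scripts' gaps.
advantage = round(latin_gap - nko_gap, 2)
print(f"N'Ko gap is {advantage}pp smaller than Latin's")
```

Under these assumptions, the difference between the two reported gaps reproduces the 3.65pp figure quoted in the abstract.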

Promotion decision

What has to happen next

Compile/render the source, verify references and figures, then add to the curated atlas.

Why this is not always a full paper yet

Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.