Retrieval-Centric ASR for N'Ko: Exploiting Script Structure to Beat Sequence-to-Sequence
We present a retrieval-centric automatic speech recognition (ASR) architecture for Bambara, targeting N'Ko script output directly rather than routing through Latin transcription. The central insight is structural: N'Ko enforces a strict 1:1 phoneme-to-grapheme mapping, explicit tonal diacritics, and a mathematically complete syllable inventory of 3,024 entries (all V, VN, CV, and CVN patterns across five tones). This finite, well-structured output space makes retrieval a better fit than sequence-to-sequence decoding...
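To make the idea of a closed syllable inventory concrete, the following is a minimal sketch of how such an inventory could be enumerated and then searched by nearest-neighbour lookup. Everything in it is an assumption for illustration: the toy vowel, consonant, and tone lists are Latin placeholders rather than the real N'Ko inventory (this extract does not give the counts behind the 3,024 figure), the names build_inventory, make_codebook, and retrieve are hypothetical, the embeddings are random stand-ins for an acoustic encoder that is not described here, and cosine similarity is only one plausible retrieval score.

```python
import math
import random
from itertools import product

# Toy placeholder inventories (hypothetical, NOT the real N'Ko inventory).
VOWELS = ["a", "e", "i"]
CONSONANTS = ["b", "d"]
TONES = ["1", "2"]  # stand-ins for tonal diacritics
NASAL = "N"         # stand-in for the nasalization mark


def build_inventory(vowels, consonants, tones, nasal=NASAL):
    """Enumerate every V, VN, CV, and CVN syllable under every tone."""
    onsets = [""] + list(consonants)  # "" covers the V and VN patterns
    codas = ["", nasal]               # "" covers the V and CV patterns
    return [
        f"{onset}{vowel}{tone}{coda}"
        for onset, vowel, tone, coda in product(onsets, vowels, tones, codas)
    ]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def make_codebook(inventory, dim=16, seed=0):
    """Assign each syllable a random vector (placeholder for learned embeddings)."""
    rng = random.Random(seed)
    return {syl: [rng.gauss(0.0, 1.0) for _ in range(dim)] for syl in inventory}


def retrieve(query, codebook):
    """Return the syllable whose codebook vector is most similar to the query."""
    return max(codebook, key=lambda syl: cosine(query, codebook[syl]))


inventory = build_inventory(VOWELS, CONSONANTS, TONES)
codebook = make_codebook(inventory)

# A rescaled copy of one entry stands in for an encoded audio segment.
query = [2.0 * x for x in codebook["ba1N"]]
print(len(inventory), retrieve(query, codebook))
```

Under this uniform-product assumption the inventory size is 2 × |V| × (1 + |C|) × |T|, so the output space is closed and every candidate can be scored directly instead of being generated token by token.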
Promotion decision
What has to happen next
Convert into the standard paper schema, add citations, and render a draft PDF.
Why this is not always a full paper yet
Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.