research note | experiment writeup candidate | score 18

Why Word Error Rate Is the Wrong Metric for Bambara Speech Recognition

Every Bambara ASR system published today reports Word Error Rate. MALIBA-AI's bambara-asr-v3 reports 45.73% WER. Normalized, that drops to 13.23% WER. Those numbers sound meaningful. They are not.

WER counts the word-level insertions, deletions, and substitutions needed to transform a predicted transcript into the reference, divided by the number of words in the reference. It was designed with languages like English in mind, where words are separated by spaces, have a standardized spelling, and carry stable meaning across contexts.

Take the Bambara word for "goat." Depending on who transcribed it, you might see: *ba*, *baa*, *bà*, or *bâ*. Those are all the same word. But WER treats each spelling as a distinct token. If your model outputs "ba" and the reference says "baa", that is a substitution error: 100% WER on that word, for a correct prediction.

It gets worse. Bambara has digraphs: "ny" for the palatal nasal, "ng" for the velar nasal. Is "nyuman" one word or two? Does "n'ka" contain an apostrophe or not? Different transcribers make different choices. WER punishes the model for disagreeing with the transcriber, not for getting the language wrong.

And then there is tone. Bambara is a tonal language. The word "ba" means mother, goat, or river depending on its tone. The Latin-script orthography has no consistently applied convention for marking tone. Some transcribers use diacritics. Most do not. WER has no way to distinguish "got the word right but missed the tone" from "got a completely different word." It collapses a gradient of correctness into binary right/wrong.
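To make the spelling and tone examples concrete, here is a minimal sketch of plain word-level WER in Python. It is an illustration written for this note, not the scoring code behind the figures quoted above: the function name `word_error_rate` is made up here, tokens are split on whitespace only, and no normalization is applied. The token "xyz" is an arbitrary stand-in for a completely different word.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()   # assumption: whitespace tokenization only
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1  # exact string match only
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


# Spelling variant of the same word: scored as fully wrong.
spelling_variant = word_error_rate("baa", "ba")

# Missing tone diacritic and an unrelated token get the same penalty.
missed_tone = word_error_rate("bà", "ba")
different_word = word_error_rate("bà", "xyz")

print(spelling_variant, missed_tone, different_word)
```

All three comparisons score 1.0, because the only test the metric applies to a pair of tokens is exact string equality. A spelling variant, a missing tone mark, and an unrelated word are indistinguishable to it.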

Promotion decision

What has to happen next

Attach run IDs, datasets, metrics, and reproduction commands.

Why this is not always a full paper yet

Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.