Mohamed Diomande

Full HTML reader

Read the full artifact

Extracted abstract or opening context

This document is the dense, self-contained account of a research direction that began as a reaction to a single paper and turned into a structural extension of an existing N'Ko language-technology program. The reaction paper is Lexical Acoustic Coding, abbreviated LAC, which proposes that a short sound can be transmitted between two language-model agents as a readable English sentence and then re-rendered from that sentence, trading exact sample recovery for perceptual similarity. The reaction was that natural English is the weakest part of that design and that a script engineered for sound, specifically N'Ko, would carry the same information more faithfully and far more compactly. That reaction was developed into a method called Featural Acoustic Coding, abbreviated FAC, a written paper that compiles cleanly, a battery of experiments that were actually run rather than asserted, and a strategic reconciliation with the brain-scanner research program that already produced a script-native N'Ko speech recognizer at roughly twenty percent character error rate. The honest conclusion, stated up front so nothing here oversells, is that FAC has real merit but not as a standalone competitor to LAC; its merit is highest as the acoustic tone channel and the generative dual of the recognizer that already exists, and the standalone codec framing should be demoted to a section rather than promoted to a flagship. LAC works by analyzing a waveform into a small set of interpretable acoustic descriptors that span temporal, spectral, harmonic, and psychoacoustic properties, quantizing each descriptor into a word-like label drawn from a shared vocabulary, and composing those labels into an English sentence such as the one about a mid-power punch with a moderate-onset envelope that stays front-loaded and clipped. A receiver parses the sentence back into acoustic intervals and synthesizes the nearest waveform, with a closed-loop refinement step that nudges the synthesis toward the target. LAC situates itself against three neighbors in the audio representation space: captions, which are readable but too loose to resynthesize from; neural codecs such as SoundStream, EnCodec, and the Descript Audio Codec, which reconstruct extremely well but emit opaque integer tokens with no human meaning; and raw acoustic descriptor vectors, which are interpretable but live in a continuous space that neither people nor models communicate naturally. LAC's contribution is to quantize the descriptor vector into a lexical codebook and serialize it as natural-language text so that it is simultaneously readable, editable, grounded, and native to the text channel a model already speaks. The disagreement is narrow and it is about the carrier, not the idea. Natural orthographies were never designe

Promotion decision

What has to happen next

Attach run IDs, datasets, metrics, and reproduction commands.

Why this is not always a full paper yet

Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.