Mohamed Diomande


Extracted abstract or opening context

This integration uses the **RobotsMali NVIDIA NeMo models** for Bambara automatic speech recognition, complementing our English↔Bambara translation system.

### **Available Models**

| Model | Architecture | Parameters | WER | Description |
|-------|-------------|------------|-----|-------------|
| **QuartzNet** | QuartzNet-15x5 | 19M | 46.5% | Faster, smaller model |
| **Soloni** | FastConformer-TDT-CTC | 114M | 40.6% | More accurate, larger model |

### **3. Add Audio Files**

Place Bambara audio files in the `audio_samples/` directory:

- Formats: WAV, MP3, FLAC, OGG, M4A
- Language: Bambara speech
- Quality: Clear speech, minimal noise

### **Models Used**

- **Base Models**: RobotsMali's fine-tuned NVIDIA NeMo models
- **Training Data**: 37 hours of Bambara speech (bam-asr-all dataset)
- **Framework**: NVIDIA NeMo toolkit
- **License**: CC-BY-4.0

### **Audio Processing**

- **Input**: 16kHz mono WAV files (auto-converted if different)
- **Preprocessing**: Resampling, normalization, format conversion
- **Validation**: Audio quality checks and recommendations
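The preprocessing steps listed under Audio Processing (downmix to mono, resample to 16 kHz, normalize) can be sketched in plain Python. This is only an illustration under stated assumptions: the function names `to_mono`, `resample_linear`, `peak_normalize`, and `preprocess` are hypothetical and not taken from the project, samples are assumed to be floats in [-1, 1], and linear interpolation stands in for whatever resampler the integration actually uses (a production resampler would also apply an anti-aliasing filter).

```python
from typing import List, Sequence

TARGET_RATE = 16_000  # Hz; the input rate stated in the Audio Processing list


def to_mono(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average the channels of each frame into one mono sample."""
    return [sum(frame) / len(frame) for frame in frames]


def resample_linear(
    samples: Sequence[float], src_rate: int, dst_rate: int = TARGET_RATE
) -> List[float]:
    """Resample by linear interpolation (assumption: no anti-aliasing filter)."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # source samples advanced per output sample
    out = []
    for i in range(n_out):
        pos = i * step
        left = min(int(pos), len(samples) - 1)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


def peak_normalize(samples: Sequence[float], peak: float = 0.95) -> List[float]:
    """Scale so the loudest sample has magnitude `peak`; silence is left unchanged."""
    max_abs = max((abs(s) for s in samples), default=0.0)
    if max_abs == 0.0:
        return list(samples)
    scale = peak / max_abs
    return [s * scale for s in samples]


def preprocess(frames: Sequence[Sequence[float]], src_rate: int) -> List[float]:
    """Hypothetical pipeline: mono downmix, then resample, then normalize."""
    return peak_normalize(resample_linear(to_mono(frames), src_rate))
```

For example, `preprocess(frames, 44_100)` on a list of stereo `(left, right)` frames returns a mono list at 16 kHz. Decoding MP3, FLAC, OGG, or M4A into frames is out of scope here and would need an audio library.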

Promotion decision

What has to happen next

Attach run IDs, datasets, metrics, and reproduction commands.

Why this is not always a full paper yet

Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.