N'Ko Uncertainty Packet Execution Plan

Extracted abstract or opening context

- a loose chain of scripts passing text around
- a real speech system with explicit uncertainty, provenance, and partition-aware routing

- trajectory ASR on Vast
- AGP/Gemma correction on Mac4/Mac5
- ASR partitioning (`stable|boundary|uncertain|novelty`)
- a canonical segment corpus at `artifacts/corpus/segments.{jsonl,parquet}`

What is still missing is the formal interface. Right now the stack is too text-centric. The next step is to make uncertainty and routing first-class.

1. **Audio evidence stays upstream** - Gemma does not replace the acoustic model.
2. **Correction stays bounded** - AGP proposes; the gate decides.
3. **Every row is attributable** - raw ASR, corrected text, and decisions remain inspectable.
4. **Partitions are policy, not decoration** - `stable`, `boundary`, `uncertain`, and `novelty` drive downstream behavior.
5. **Search and TTS consume different slices** - not every corrected utterance is valid TTS training data.

- `feat_id`
- `audio_id`
- `audio_path`
- `episode_id`
- `segment_id`
- `split`
- `script`
- `mode`
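To make the principles and field list above concrete, here is a minimal Python sketch of what a segment row and its partition-aware routing might look like. Only the eight identity fields and the four partition names come from the extract, and the field list may be incomplete because the extract stops at `mode`. Everything else (`Segment`, `Partition`, `raw_asr`, `proposed_text`, `gate_accepted`, `final_text`, `consumers`, and the routing rules themselves) is a hypothetical illustration, not the project's actual interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Set


class Partition(str, Enum):
    """The four ASR partitions named in the extract."""
    STABLE = "stable"
    BOUNDARY = "boundary"
    UNCERTAIN = "uncertain"
    NOVELTY = "novelty"


@dataclass(frozen=True)
class Segment:
    # Identity fields listed in the extract (the list may be incomplete).
    feat_id: str
    audio_id: str
    audio_path: str
    episode_id: str
    segment_id: str
    split: str
    script: str
    mode: str
    # Hypothetical fields, added here only to illustrate the principles.
    partition: Partition
    raw_asr: str                          # raw ASR text is always kept
    proposed_text: Optional[str] = None   # AGP proposal, if any
    gate_accepted: bool = False           # the gate's decision on the proposal


def final_text(seg: Segment) -> str:
    """AGP proposes; the gate decides. Fall back to raw ASR otherwise."""
    if seg.proposed_text is not None and seg.gate_accepted:
        return seg.proposed_text
    return seg.raw_asr


def consumers(seg: Segment) -> Set[str]:
    """Hypothetical routing policy: the partition drives downstream use."""
    if seg.partition is Partition.STABLE:
        return {"search", "tts"}
    if seg.partition is Partition.BOUNDARY:
        return {"search"}
    # Assumed here: uncertain and novelty rows go to review, not to consumers.
    return {"review"}
```

Because the row keeps `raw_asr`, `proposed_text`, and `gate_accepted` side by side, each decision stays inspectable after the fact. The assumed policy sends only `stable` rows to TTS while search also accepts `boundary` rows, which is one way to express that the two consumers read different slices.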

Promotion decision

What has to happen next

Attach run IDs, datasets, metrics, and reproduction commands.

Why this is not always a full paper yet

Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.
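The promotion rule above can be read as a simple completeness check. The sketch below is illustrative only: the six requirements come from the paragraph, but the key names, the dict representation, and the functions `missing_attachments` and `is_promotable` are assumptions, not the actual paper schema.

```python
from typing import Any, Dict, List

# The six attachments named in the text; the key spellings are hypothetical.
REQUIRED_ATTACHMENTS = (
    "editable_source",
    "evidence_checkpoints",
    "references",
    "figures",
    "render_path",
    "release_status",
)


def missing_attachments(item: Dict[str, Any]) -> List[str]:
    """Return the required attachments that are absent or empty on a corpus item."""
    return [key for key in REQUIRED_ATTACHMENTS if not item.get(key)]


def is_promotable(item: Dict[str, Any]) -> bool:
    """A corpus item qualifies as a polished paper only when nothing is missing."""
    return not missing_attachments(item)
```

Returning the list of missing keys, instead of a bare boolean, lets a reader page state exactly what still has to be attached.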