Grand Diomande Research · Full HTML Reader

AGP-MLX for N'Ko Anticipation Partitioning




Date: 2026-04-21

Release Gate

The canonical ship bar for calling this a fully live learned N'Ko corrector at
scale is defined in:

- `docs/research/agp-nko-live-corrector-release-gate-v1.md`

Current status remains `NO-GO` until the learned live proposal path clears that
gate on the authoritative five-run post-TTT corpus.

Position

`agp_mlx` is the correct substrate for the N'Ko speech-language fusion idea, but it should not replace the acoustic ASR stack yet.

The correct integration is:

```text
audio
  -> Whisper large-v3 encoder
  -> trajectory CTC decoder
  -> N'Ko hypotheses
  -> anticipation partition router
  -> AGP-MLX corrective / language-prior lanes
  -> final N'Ko + Latin + optional translation
```

The acoustic model remains responsible for sound-to-script evidence. AGP-MLX becomes the partitioned language-prior and corrective layer.

Current AGP-MLX Runtime Ground Truth

Current promoted topology is healthy on `mac5`.

```text
fast lane:       9430  hidden/q8_0
corrective lane: 9442  summary/q8_0  promoted
fallback lane:   9434  summary/q8_0
alternate lane:  9440  summary/q8_0
```

The promoted corrective lane is backed by:

```text
model: mlx-community/gemma-4-e2b-4bit
adapter: experiments/agp_mlx/train/runs/gemma4_e2b_domain_thunder_stage1_seq512
corrective run: experiments/agp_mlx/transfer/runs/gap_focus_transfer_fm007_len32_v1_20260420_034935
```

This means the AGP stack already has the three things needed for an ASR integration:

  • a small local language model backbone
  • promoted corrective lanes
  • a topology-aware runtime with health/reconcile/watchdog machinery

Anticipation Partitioning

Anticipation partitioning should be the router between acoustic evidence and language-prior correction.

The existing trajectory/anticipation scalars should be converted into discrete routing regimes:

```text
stable
boundary
uncertain
recovery
novelty
```
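
One way to turn the scalars into regimes is a fixed-precedence threshold cascade. The sketch below is illustrative only: the scalar names come from the dataset schema in this document, but the thresholds, the precedence order, and the reading that a high `recovery_margin` means repair is warranted are all assumptions that would need tuning on validation data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrajectoryScalars:
    commitment: float = 0.0
    uncertainty: float = 0.0
    transition_pressure: float = 0.0
    recovery_margin: float = 0.0
    phase_stiffness: float = 0.0
    novelty: float = 0.0
    stability: float = 0.0


# Hypothetical thresholds, not taken from the AGP-MLX runtime.
NOVELTY_MIN = 0.7
RECOVERY_MIN = 0.6
UNCERTAINTY_MIN = 0.5
BOUNDARY_MIN = 0.5


def classify_partition(s: TrajectoryScalars) -> str:
    """Map one chunk's scalars to a routing regime.

    Precedence is an assumption: the regimes that restrict the language
    model most (novelty) are checked first, and stable is the default.
    """
    if s.novelty >= NOVELTY_MIN:
        return "novelty"
    if s.recovery_margin >= RECOVERY_MIN:
        return "recovery"
    if s.uncertainty >= UNCERTAINTY_MIN:
        return "uncertain"
    if s.transition_pressure >= BOUNDARY_MIN:
        return "boundary"
    return "stable"
```

Checking novelty first means a chunk that is both novel and uncertain is never sent to a corrective lane, which matches the intent of keeping the language model away from unfamiliar material.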

Stable

The ASR decoder should dominate. Do not let the language model rewrite stable acoustic evidence.

Runtime policy:

```text
emit CTC result directly
skip AGP correction
```

Boundary

The phoneme boundary is active. Use AGP only to rank plausible continuations.

Runtime policy:

```text
send n-best N'Ko hypotheses to fast lane
accept only if top-1 agreement improves and edit distance stays bounded
```
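
The "edit distance stays bounded" half of this policy can be sketched as a plain Levenshtein check. The function names and the `max_edits` default are hypothetical, and the sketch does not attempt the "top-1 agreement improves" test, which depends on how agreement is measured in the harness. Distance is counted over Unicode code points, so an N'Ko base letter and a combining tone mark count as separate symbols.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance over Unicode code points (two-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def accept_boundary_rerank(asr_top1: str, agp_choice: str,
                           n_best: list[str], max_edits: int = 2) -> str:
    """Return the AGP choice only if it is an n-best member within budget."""
    if agp_choice not in n_best:
        return asr_top1  # rerank only: never accept text the decoder did not propose
    if edit_distance(asr_top1, agp_choice) > max_edits:
        return asr_top1  # change too large for a boundary-regime decision
    return agp_choice
```

Restricting acceptance to members of the n-best list keeps the fast lane in a pure ranking role, as the policy describes.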

Uncertain

The acoustic signal is weak or ambiguous. Use the promoted corrective lane.

Runtime policy:

```text
send N'Ko prefix + candidate hypotheses to corrective lane 9442
accept if route confidence and margin pass threshold
```
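
A minimal sketch of the confidence-and-margin gate, assuming the corrective lane returns one score in [0, 1] per candidate. That response shape, the function name, and both threshold defaults are assumptions, not the lane's documented interface.

```python
def passes_route_gate(scores: list[float],
                      min_confidence: float = 0.6,
                      min_margin: float = 0.15) -> bool:
    """Accept only when the best candidate is both strong and well separated."""
    if not scores:
        return False
    ranked = sorted(scores, reverse=True)
    top = ranked[0]
    # With a single candidate there is no runner-up; treat its score as 0.
    runner_up = ranked[1] if len(ranked) > 1 else 0.0
    return top >= min_confidence and (top - runner_up) >= min_margin
```

The margin term rejects cases where two candidates are nearly tied, which is where a language prior is most likely to overrule weak acoustics for the wrong reason.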

Recovery

The decoder likely made an earlier mistake. Let AGP propose bounded repair.

Runtime policy:

```text
allow local edit repair over the last chunk only
never rewrite the entire utterance
```
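
The last-chunk constraint can be enforced structurally, so that a repair proposal has no way to touch earlier chunks. The sketch below uses the standard-library `difflib.SequenceMatcher` to count changed characters; the function names and the `max_changed` budget are hypothetical.

```python
from difflib import SequenceMatcher


def changed_chars(a: str, b: str) -> int:
    """Count characters touched by non-equal opcodes between a and b."""
    ops = SequenceMatcher(None, a, b, autojunk=False).get_opcodes()
    return sum(max(i2 - i1, j2 - j1)
               for tag, i1, i2, j1, j2 in ops if tag != "equal")


def repair_last_chunk(chunks: list[str], proposal: str,
                      max_changed: int = 2) -> list[str]:
    """Replace only the final chunk, and only if the change is small."""
    if not chunks:
        return []
    if changed_chars(chunks[-1], proposal) > max_changed:
        return list(chunks)  # proposal rewrites too much; keep ASR output
    return [*chunks[:-1], proposal]
```

Because earlier chunks are copied through unchanged, a whole-utterance rewrite is impossible by construction rather than by policy check.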

Novelty

The utterance contains unknown or low-frequency material. Reduce language-model confidence.

Runtime policy:

```text
preserve acoustic output
mark for later corpus review
avoid hallucinated semantic completion
```
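
Taken together, the five runtime policies amount to a small routing table. The sketch below uses the lane ports from the promoted topology listed earlier. Sending `recovery` to the corrective lane is an assumption, since the policy above does not name a lane for it, and `lane_port_for` is a hypothetical helper.

```python
from typing import Optional

# Ports as listed in the promoted topology on mac5.
LANES = {"fast": 9430, "corrective": 9442, "fallback": 9434, "alternate": 9440}

# None means: emit the ASR output and make no AGP call.
PARTITION_POLICY = {
    "stable": None,
    "boundary": "fast",
    "uncertain": "corrective",
    "recovery": "corrective",  # assumption: lane not specified in the policy
    "novelty": None,
}


def lane_port_for(partition: str) -> Optional[int]:
    """Return the AGP lane port for a partition, or None to skip AGP."""
    # Unknown labels fall through to None, so the router fails closed to ASR.
    lane = PARTITION_POLICY.get(partition)
    return None if lane is None else LANES[lane]
```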

Why This Fits N'Ko

N'Ko is phonetically transparent. That makes it unusually compatible with a partitioned speech-language stack.

The acoustic layer asks:

```text
which phoneme / glyph did the speaker produce?
```

The AGP layer asks:

```text
given this N'Ko prefix and these candidate glyph sequences, which continuation is structurally plausible?
```

That separation matters. If AGP is allowed to dominate too early, it can hallucinate fluent text that is acoustically wrong. If ASR is forced to operate alone, it loses the benefit of N'Ko lexical and morphological priors. Anticipation partitioning decides where each layer is allowed to act.

What Not To Do Yet

Do not train one monolithic ASR+SLM model first.

Reasons:

  • it hides failure modes
  • it makes latency harder to control
  • it risks language-model hallucination over acoustic evidence
  • it discards the already verified 20.57
  • it makes ablation unclear

The current verified trajectory CTC model should stay as the teacher/anchor while AGP is added as a bounded correction layer.

First Concrete Experiment

Build an `asr_agp_rescore_v1` harness.

Inputs:

  • audio file
  • current N'Ko ASR output
  • CTC n-best alternatives if available
  • trajectory scalar summary for each chunk
  • confidence / entropy / boundary metrics

Process:

```text
1. run current Mac5 ASR backend
2. segment output into chunks
3. classify each chunk into anticipation partitions
4. for stable chunks: preserve ASR
5. for boundary/uncertain/recovery chunks: call AGP lanes
6. compare AGP suggestion against ASR candidate set
7. accept only bounded edits
8. emit final N'Ko + audit trace
```
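
Steps 4 through 8 can be sketched as a loop over pre-segmented, pre-classified chunks. Steps 1 to 3 are assumed to have run upstream. The `propose` and `accept` callables are hypothetical stand-ins for the AGP lane call and the bounded-edit check, and chunks are assumed to be contiguous substrings that concatenate back into the utterance.

```python
from typing import Callable, Optional

CORRECTABLE = {"boundary", "uncertain", "recovery"}


def rescore_utterance(
    chunks: list[str],
    partitions: list[str],
    propose: Callable[[str, str, str], Optional[str]],  # (prefix, chunk, partition)
    accept: Callable[[str, str], bool],                 # (asr_chunk, suggestion)
) -> tuple[str, list[dict]]:
    """Apply partition-gated correction and return final text plus audit trace."""
    if len(chunks) != len(partitions):
        raise ValueError("one partition label is required per chunk")
    final: list[str] = []
    trace: list[dict] = []
    for index, (chunk, partition) in enumerate(zip(chunks, partitions)):
        suggestion = None
        if partition in CORRECTABLE:
            # The prefix is the text already committed, including accepted repairs.
            suggestion = propose("".join(final), chunk, partition)
        accepted = bool(suggestion is not None
                        and suggestion != chunk
                        and accept(chunk, suggestion))
        final.append(suggestion if accepted else chunk)
        trace.append({"chunk_index": index, "partition": partition,
                      "asr": chunk, "agp": suggestion, "accepted": accepted})
    return "".join(final), trace
```

Stable and novelty chunks never reach `propose`, so the audit trace records `agp: None` for them and the ASR text passes through untouched.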

Metrics:

  • CER before/after AGP correction
  • edit distance introduced by AGP
  • hallucination rate
  • average AGP roundtrip latency
  • partition-level acceptance rate
  • rejection rate for unsafe corrections

Success criterion:

```text
CER improves without increasing hallucination or unbounded rewrite rate.
```
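
The criterion can be written as an explicit check over before and after metric summaries. This sketch assumes per-utterance edit counts have already been computed against the references, and the dictionary keys (`cer`, `hallucination_rate`, `unbounded_rewrite_rate`) are hypothetical names for the metrics listed above.

```python
def corpus_cer(edit_counts: list[int], ref_lengths: list[int]) -> float:
    """Corpus-level CER: total character edits over total reference characters."""
    total_ref = sum(ref_lengths)
    if total_ref == 0:
        raise ValueError("reference corpus is empty")
    return sum(edit_counts) / total_ref


def passes_success_criterion(before: dict, after: dict) -> bool:
    """CER must strictly improve; the two safety rates must not get worse."""
    return (after["cer"] < before["cer"]
            and after["hallucination_rate"] <= before["hallucination_rate"]
            and after["unbounded_rewrite_rate"] <= before["unbounded_rewrite_rate"])
```

Pooling edits over the whole corpus, rather than averaging per-utterance CER, keeps short utterances from dominating the result.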

Minimal Data Needed

Building the first dataset does not require retraining any model.

Create examples shaped like:

```json
{
  "audio_id": "...",
  "chunk_index": 0,
  "partition": "uncertain",
  "nko_prefix": "...",
  "asr_candidate": "...",
  "n_best": ["...", "..."],
  "reference": "...",
  "trajectory_scalars": {
    "commitment": 0.0,
    "uncertainty": 0.0,
    "transition_pressure": 0.0,
    "recovery_margin": 0.0,
    "phase_stiffness": 0.0,
    "novelty": 0.0,
    "stability": 0.0
  }
}
```

This can be built from the current ASR corpus by running the existing checkpoint over validation/test audio and pairing errors with references.
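
A record in this shape could be serialized as one JSON line per chunk. The sketch below is a hypothetical helper, not the planned `build_asr_agp_eval_set_v1.py`. It validates that all seven scalars are present and writes N'Ko text unescaped so the file stays readable.

```python
import json

SCALAR_KEYS = ("commitment", "uncertainty", "transition_pressure",
               "recovery_margin", "phase_stiffness", "novelty", "stability")


def make_record(audio_id: str, chunk_index: int, partition: str,
                nko_prefix: str, asr_candidate: str, n_best: list[str],
                reference: str, scalars: dict) -> str:
    """Return one JSONL line matching the example schema."""
    missing = [k for k in SCALAR_KEYS if k not in scalars]
    if missing:
        raise ValueError(f"missing trajectory scalars: {missing}")
    record = {
        "audio_id": audio_id,
        "chunk_index": chunk_index,
        "partition": partition,
        "nko_prefix": nko_prefix,
        "asr_candidate": asr_candidate,
        "n_best": list(n_best),
        "reference": reference,
        "trajectory_scalars": {k: float(scalars[k]) for k in SCALAR_KEYS},
    }
    # ensure_ascii=False keeps N'Ko glyphs as-is instead of \uXXXX escapes.
    return json.dumps(record, ensure_ascii=False)
```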

Training Path

Wave 1: rescoring only.

```text
ASR output -> AGP candidate ranking -> bounded accept/reject
```

Wave 2: correction adapter.

```text
noisy N'Ko ASR output -> clean N'Ko reference
```

Wave 3: anticipation-conditioned adapter.

```text
partition + scalars + N'Ko prefix + candidates -> correction decision
```

Wave 4: joint acoustic-language reranker.

```text
Whisper/CTC confidence + AGP route confidence -> final decode policy
```
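
As a rough illustration of what a Wave 4 decode policy might look like, the sketch below fuses the two confidences by log-linear interpolation. This is one common reranking scheme chosen for illustration, not a design the document commits to. The function names, the `lm_weight` default, and the assumption that both scores are log-probabilities on comparable scales are all hypothetical.

```python
def fused_score(ctc_logprob: float, agp_logprob: float,
                lm_weight: float = 0.3) -> float:
    """Weighted sum of acoustic and language-prior log scores."""
    return (1.0 - lm_weight) * ctc_logprob + lm_weight * agp_logprob


def pick_hypothesis(candidates: list[tuple[str, float, float]],
                    lm_weight: float = 0.3) -> str:
    """candidates: (text, ctc_logprob, agp_logprob). Returns the best text."""
    if not candidates:
        raise ValueError("no candidates to rank")
    return max(candidates,
               key=lambda c: fused_score(c[1], c[2], lm_weight))[0]
```

Setting `lm_weight` to 0 recovers pure acoustic ranking, which gives a direct ablation against the existing CTC output.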

System Name

Working name:

```text
AP-AGP-N'Ko
```

Expanded:

```text
Anticipation-Partitioned AGP for N'Ko Speech-Language Decoding
```

Paper title candidate:

```text
Anticipation-Partitioned Graph Prior Routing for Phonetically Transparent Speech Recognition
```

Immediate Next Step

Create:

```text
experiments/agp_mlx/asr_bridge/
```

with:

```text
build_asr_agp_eval_set_v1.py
run_asr_agp_rescore_v1.py
evaluate_asr_agp_rescore_v1.py
README.md
```

The first implementation should call the existing Mac5 ASR service and the existing AGP promoted topology. No model retraining is required for the first proof.

Promotion Decision

Attach run IDs, datasets, metrics, and reproduction commands.

Source Anchor

Comp-Core/docs/research/agp-mlx-nko-anticipation-partitioning-v1.md
