AGP-MLX N'Ko ASR Bridge
This directory is the first executable bridge between the verified N'Ko ASR model and AGP-MLX.
The bridge keeps the acoustic model authoritative. It classifies ASR chunks into anticipation partitions, allows AGP-style correction only for non-stable regimes, and rejects corrections that exceed a bounded edit budget.
The edit budget uses both a relative limit and a small absolute floor. This matters for short N'Ko chunks where one missing vowel can be a large relative edit but still a safe local repair.
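The interaction between the relative limit and the absolute floor can be sketched as below. This is a minimal illustration, not the bridge's implementation: `RELATIVE_LIMIT`, `ABSOLUTE_FLOOR`, and both function names are assumptions chosen for the example.

```python
import math

# Hypothetical parameters for illustration only; the bridge's real values
# are not stated in this document.
RELATIVE_LIMIT = 0.25  # assumed: max edits as a fraction of chunk length
ABSOLUTE_FLOOR = 1     # assumed: number of edits that is always permitted


def edit_budget(asr_len: int,
                relative_limit: float = RELATIVE_LIMIT,
                absolute_floor: int = ABSOLUTE_FLOOR) -> int:
    """Return how many character edits a chunk of asr_len characters may receive."""
    relative_budget = math.floor(relative_limit * asr_len)
    # The floor keeps short chunks repairable even when the relative
    # budget rounds down to zero.
    return max(absolute_floor, relative_budget)


def within_budget(edits: int, asr_len: int) -> bool:
    """True when the proposed number of edits fits the budget for this chunk."""
    return edits <= edit_budget(asr_len)
```

Under these assumed values, a three-character chunk has a relative budget of zero, so one missing vowel (a 33% relative edit) would be rejected without the floor and is allowed with it, while a second edit on the same chunk is still refused.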
Every correction decision also carries a provisional Graph Kernel-style admissibility witness. That gives each accepted or rejected AGP correction a slice id, graph snapshot hash, policy hash, and 32-hex token so the bridge can be audited before the live kernel service is wired in.
Release Gate
The canonical release decision for this bridge lives in:
- `docs/research/agp-nko-live-corrector-release-gate-v1.md`
Use that document as the authoritative `GO` / `NO-GO` bar for the learned live
N'Ko correction path. Large `oracle_guardrail` replay wins are research-valid,
but they do not satisfy the learned live ship gate by themselves.
Current Pieces
- `partition_policy.py` maps ASR telemetry into `stable`, `boundary`, `uncertain`, `recovery`, and `novelty`.
- `admissibility.py` issues local provisional Graph Kernel-shaped admissibility witnesses.
- `context_adapters.py` adapts ASR correction decisions into existing RAG++ retrieval provenance shape.
- `rust_control_plane.py` delegates accept/reject decisions to the Rust `cc-agp-bridge` crate.
- `agp_runtime.py` resolves and health-checks the promoted AGP topology.
- `agp_text_proposal.py` is the bounded text-proposal adapter for Gemma/MLX-style correction candidates.
- `expert_router.py` maps anticipation partitions into Mixture of Anticipatory Orthogonal Experts lanes.
- `evaluate_expert_router_v1.py` emits the pre-neural expert-routing report for a bridge JSONL file.
- `build_paper4_matrix_bridge.py` converts current Paper 4 prediction/reference artifacts into bridge rows.
- `run_paper4_maoe_replay.py` runs the Paper 4 converter, MAOE router, and oracle Rust guardrail in one command.
- `run_paper4_matrix_batch_replay.py` replays every locally collected same-snapshot Paper 4 run in one pass.
- `build_gemma_nko_correction_sft.py` converts bridge rows into Gemma correction-SFT examples.
- `schema.py` defines the typed packet, correction decision, CER aggregation, and AGP prompt shape.
- `evaluate_bridge_policy_v1.py` evaluates the bounded correction policy over JSONL rows.
Expected Row Shape
{
  "audio_id": "sample-001",
  "chunk_index": 0,
  "asr_candidate": "NKO_TEXT",
  "reference": "REFERENCE_NKO_TEXT",
  "proposed_text": "OPTIONAL_AGP_PROPOSAL",
  "trajectory_scalars": {
    "confidence": 0.7,
    "uncertainty": 0.2,
    "transition_pressure": 0.4,
    "recovery_margin": 0.1,
    "novelty": 0.0,
    "stability": 0.8
  }
}
Smoke Run
python3 experiments/agp_mlx/asr_bridge/evaluate_bridge_policy_v1.py \
--input experiments/agp_mlx/asr_bridge/fixtures/policy_smoke.jsonl \
--output-dir experiments/agp_mlx/asr_bridge/reports/policy_smoke \
--oracle-proposal
The `--oracle-proposal` flag uses references as proposed corrections only to test the guardrails. It is not a deployable correction source.
Run the same guardrail through the Rust control plane:
python3 experiments/agp_mlx/asr_bridge/evaluate_bridge_policy_v1.py \
--input experiments/agp_mlx/asr_bridge/fixtures/policy_smoke.jsonl \
--output-dir experiments/agp_mlx/asr_bridge/reports/policy_smoke_rust_gate \
--oracle-proposal \
--rust-control-plane
Generate proposal rows without loading MLX:
python3 experiments/agp_mlx/asr_bridge/agp_text_proposal.py \
--input experiments/agp_mlx/asr_bridge/fixtures/policy_smoke.jsonl \
--output /tmp/agp_text_proposals_dry_run.jsonl \
--dry-run
Generate proposal rows from the canonical promoted corrective lane over `/resume`:
python3 experiments/agp_mlx/asr_bridge/agp_text_proposal.py \
--input experiments/agp_mlx/asr_bridge/fixtures/policy_smoke.jsonl \
--output /tmp/agp_text_proposals_resume.jsonl \
--promoted-corrective-resume
The older `/propose` text adapter is still available for smoke comparisons:
python3 experiments/agp_mlx/asr_bridge/agp_text_proposal.py \
--input experiments/agp_mlx/asr_bridge/fixtures/policy_smoke.jsonl \
--output /tmp/agp_text_proposals_http.jsonl \
--promoted-corrective-http
Admissibility Witnesses
The witness mirrors the Comp-Core Graph Kernel provenance contract:
- `slice_id`
- `graph_snapshot_hash`
- `policy_id`
- `policy_params_hash`
- `admissibility_token`
- `query_hash`
- `decision_hash`
The token uses the `admissibility_token_v2_hmac` shape and is truncated to 32 hex characters to match the Graph Kernel convention. It is marked `local_provisional`; production should replace this local issuer with a real Graph Kernel slice export.
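A local provisional issuer for the fields listed above might look like the sketch below. It is illustrative only: the canonical JSON hashing, the `|`-joined message layout, the `issuer` key, and the function names are assumptions, and the real `admissibility_token_v2_hmac` message format is not specified in this document. The only behavior taken from the text is an HMAC-derived token truncated to 32 hex characters and marked `local_provisional`.

```python
import hashlib
import hmac
import json


def sha256_hex(payload: dict) -> str:
    """Hash a dict via canonical JSON (assumed canonicalization)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def issue_provisional_witness(secret: bytes, slice_id: str,
                              graph_snapshot_hash: str, policy_id: str,
                              policy_params: dict, query: dict,
                              decision: dict) -> dict:
    """Build a witness dict carrying the seven fields named in this section."""
    witness = {
        "slice_id": slice_id,
        "graph_snapshot_hash": graph_snapshot_hash,
        "policy_id": policy_id,
        "policy_params_hash": sha256_hex(policy_params),
        "query_hash": sha256_hex(query),
        "decision_hash": sha256_hex(decision),
    }
    # Assumed message layout: field values joined in sorted-key order.
    message = "|".join(witness[key] for key in sorted(witness)).encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).hexdigest()
    witness["admissibility_token"] = digest[:32]  # truncated to 32 hex chars
    witness["issuer"] = "local_provisional"       # hypothetical key name
    return witness
```

Because the token covers `decision_hash`, flipping a decision from accepted to rejected changes the token, which is what makes each correction decision auditable after the fact.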
Mixture of Anticipatory Orthogonal Experts
The MoVE-style takeaway for this bridge is not emotional speech generation. It
is expert routing. For N'Ko, each anticipation partition becomes a different
authority lane:
stable -> acoustic_preservation
boundary -> boundary_completion
uncertain -> uncertain_repair
recovery -> recovery_context
novelty -> novelty_quarantine
Run the deterministic routing gate:
python3 experiments/agp_mlx/asr_bridge/evaluate_expert_router_v1.py \
--input experiments/agp_mlx/asr_bridge/fixtures/policy_smoke.jsonl \
--output-dir experiments/agp_mlx/asr_bridge/reports/expert_router_smoke
This emits:
- `expert_router_report.json`
- `expert_routes.jsonl`
The report does not claim CER improvement. It proves that every ASR chunk has a
specific expert lane, compute budget, TurboQuant mode, accelerator status, and
safety contract before any neural correction proposal is allowed.
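The partition-to-lane mapping listed earlier in this section is a total lookup, which is what lets the report assert that every chunk has a lane. A minimal sketch follows, assuming each row already carries a `partition` field; that field name and both function names are hypothetical, and the sketch omits the compute budget, TurboQuant mode, and accelerator status that the real report also records.

```python
# Lane names are taken from the mapping in this section.
EXPERT_LANES = {
    "stable": "acoustic_preservation",
    "boundary": "boundary_completion",
    "uncertain": "uncertain_repair",
    "recovery": "recovery_context",
    "novelty": "novelty_quarantine",
}


def route(partition: str) -> str:
    """Return the expert lane for a partition, failing loudly on unknown input."""
    try:
        return EXPERT_LANES[partition]
    except KeyError:
        raise ValueError(f"unknown anticipation partition: {partition!r}") from None


def route_rows(rows: list) -> list:
    """Attach a lane to every row; an unroutable row aborts the whole pass."""
    return [{**row, "expert_lane": route(row["partition"])} for row in rows]
```

Raising on an unknown partition, rather than defaulting to a lane, keeps the guarantee deterministic: a chunk either has a named lane or the gate fails.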
Once the same-snapshot A100 run emits paired predictions and references, build
the MAOE evaluation input with:
python3 experiments/agp_mlx/asr_bridge/build_paper4_matrix_bridge.py \
--predictions /path/to/test_predictions.jsonl \
--references /path/to/test_references.jsonl \
--output experiments/agp_mlx/asr_bridge/reports/paper4_matrix_bridge/paper4_bridge.jsonl
Then run both gates:
python3 experiments/agp_mlx/asr_bridge/evaluate_expert_router_v1.py \
--input experiments/agp_mlx/asr_bridge/reports/paper4_matrix_bridge/paper4_bridge.jsonl \
--output-dir experiments/agp_mlx/asr_bridge/reports/paper4_matrix_expert_router
python3 experiments/agp_mlx/asr_bridge/evaluate_bridge_policy_v1.py \
--input experiments/agp_mlx/asr_bridge/reports/paper4_matrix_bridge/paper4_bridge.jsonl \
--output-dir experiments/agp_mlx/asr_bridge/reports/paper4_matrix_oracle_guardrail \
--oracle-proposal \
--rust-control-plane
Or run the full replay chain in one command:
python3 experiments/agp_mlx/asr_bridge/run_paper4_maoe_replay.py \
--run-dir /path/to/completed/paper4/run \
--output-dir experiments/agp_mlx/asr_bridge/reports/paper4_maoe_replay \
--rust-control-plane
Run the learned live replay path instead of the oracle proposal path. The
canonical live mode is the packet-based `/resume` path:
python3 experiments/agp_mlx/asr_bridge/run_paper4_learned_replay.py \
--run-dir /path/to/completed/paper4/run \
--output-dir experiments/agp_mlx/asr_bridge/reports/paper4_maoe_learned_replay \
--proposal-mode promoted-corrective-resume \
--rust-control-plane
The older `/propose` adapter path is still available with
`--proposal-mode promoted-corrective-http`.
If multiple same-snapshot runs have already been collected locally, replay the
whole completed matrix in one pass:
python3 experiments/agp_mlx/asr_bridge/run_paper4_matrix_batch_replay.py \
--results-root Desktop/nko-brain-scanner/results/paper4_same_snapshot_20260422_safe_lr1e4 \
--output-root experiments/agp_mlx/asr_bridge/reports/paper4_same_snapshot_batch_replay \
--rust-control-plane
Run the learned batch replay across the same local matrix:
python3 experiments/agp_mlx/asr_bridge/run_paper4_matrix_batch_learned_replay.py \
--results-root Desktop/nko-brain-scanner/results/paper4_same_snapshot_20260422_safe_lr1e4 \
--output-root experiments/agp_mlx/asr_bridge/reports/paper4_same_snapshot_batch_learned_replay \
--proposal-mode promoted-corrective-resume \
--rust-control-plane
Build a Gemma correction adapter dataset from the same bridge rows:
python3 experiments/agp_mlx/asr_bridge/build_gemma_nko_correction_sft.py \
--input experiments/agp_mlx/asr_bridge/reports/paper4_matrix_bridge/paper4_bridge.jsonl \
--output-dir experiments/agp_mlx/asr_bridge/reports/gemma_nko_correction_sft
The reference text is used as the teacher only after the same bounded policy
decides whether that correction would be admissible. If the oracle correction is
blocked, the target repeats the ASR candidate. That teaches Gemma both how to
correct and when not to correct.
This does not improve CER directly. It improves replayability and authority boundaries: if CER improves, we can trace which ASR chunk, trajectory scalars, partition, policy parameters, and evidence slice allowed the correction.
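The teacher-target rule for the Gemma correction-SFT rows can be sketched as follows. This is a hedged illustration, not the logic of `build_gemma_nko_correction_sft.py`: the `is_admissible` callable stands in for the bounded policy, and the output keys `prompt_input`, `target`, and `label` are hypothetical.

```python
from typing import Callable


def build_sft_example(row: dict,
                      is_admissible: Callable[[str, str], bool]) -> dict:
    """Pick the training target for one bridge row.

    The reference becomes the target only when the policy would admit it
    as a correction; otherwise the target repeats the ASR candidate.
    """
    asr = row["asr_candidate"]
    reference = row["reference"]
    corrected = reference != asr and is_admissible(asr, reference)
    return {
        "prompt_input": asr,
        "target": reference if corrected else asr,
        "label": "correct" if corrected else "preserve",
    }
```

With a toy policy such as `lambda asr, ref: abs(len(asr) - len(ref)) <= 1`, a row whose reference differs by one character yields a `correct` example, and a row whose reference is much longer yields a `preserve` example with the ASR candidate as its own target.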
RAG++ Fit
RAG++ is the instruction/language-context layer, not the acoustic authority. The useful existing pieces are:
- `rag_plusplus.slice.client.SliceExport`
- `rag_plusplus.slice.enforcing_client.RetrievalProvenance`
- `rag_plusplus.retrieval.provenance.SliceInfo`
- `rag_plusplus.retrieval.provenance.RetrievalProvenance`
The bridge now emits `retrieval_provenance` beside `admissibility`, using the RAG++ provenance class when importable. This lets future live AGP calls carry the same chain of custody as normal RAG++ slice-scoped retrieval: query hash, slice id, policy ref, graph snapshot, admissibility token, and result hash.
Current local state: the RAG++ Python modules are importable from `core/retrieval/cc-rag-plus-plus`, but the service is not healthy on `:8000`. The PyO3 `admissibility-kernel-py` package exists, but its native extension is not built in this environment and `maturin` is not installed, so the bridge keeps the local provisional issuer until that build path is restored.
AGP Runtime Boundary
The promoted `9442` corrective lane is healthy. It still exposes `/resume` for encoded AGP hidden-state packets, and it now also exposes `/propose` for bounded text proposal smoke tests when the backbone model is loaded.
The text endpoint is an adapter surface, not the canonical AGP packet path. Early smoke tests show the current Gemma adapter is conservative/echo-biased for N'Ko correction prompts: it tends to repeat the ASR candidate instead of repairing short boundary errors. Rust correctly rejects unchanged proposals as `no_effect`.
With the few-shot N'Ko correction prompt in `agp_text_proposal.py`, the live Mac5 proposal path now matches the oracle guardrail smoke:
{
  "total": 4,
  "accepted": 2,
  "rejected": 2,
  "cer_before": 0.14285714285714285,
  "cer_after": 0.047619047619047616,
  "rag_plusplus_provenance": 4
}
The meaningful behavior is not just the CER delta. The model over-repaired the novelty row, and Rust blocked it with `novelty_partition_blocks_language_prior`. That is the architecture working as intended: neural proposal, symbolic/trajectory gate, admissibility record.
Synthetic Supervised Stress
`build_synthetic_correction_set.py` creates a small controlled correction set from real N'Ko validation text. This is not a substitute for same-provenance ASR heldout audio. It is a pressure test for policy shape: stable rows should preserve ASR, boundary/uncertain rows should permit bounded local repairs, and novelty rows should not let the language model override acoustic authority.
Latest live Mac5 `/propose` plus Rust gate stress run:
{
  "total": 16,
  "accepted": 8,
  "rejected": 8,
  "accepted_improved": 3,
  "accepted_neutral": 5,
  "accepted_worse": 0,
  "rejected_would_improve": 3,
  "rejected_safe": 5,
  "cer_before": 0.13333333333333333,
  "cer_after": 0.1
}
The additional supervised audit counters are important. Runtime admissibility does not see the reference, so it cannot know whether an accepted proposal improved CER. The report can know that afterward. In this stress run, the gate avoided harmful accepted edits, but it also accepted several neutral edits and blocked three novelty edits that would have matched the synthetic reference. That is an explicit tradeoff: the current policy favors acoustic authority over language-prior completion in novelty regimes.
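The audit counters can be computed after the fact from each row's gate decision and its edit distance to the reference before and after the proposal. The sketch below assumes the bucket definitions (for example, that "neutral" means an unchanged edit distance and that "rejected_safe" covers every rejected proposal that would not have improved the row); the counter names come from the report, but the function names and input layout are hypothetical.

```python
from collections import Counter


def audit_bucket(accepted: bool, edits_asr: int, edits_proposal: int) -> str:
    """Classify one row using reference edit distances (post hoc only).

    edits_asr: edit distance from the ASR candidate to the reference.
    edits_proposal: edit distance from the proposal to the reference.
    """
    if accepted:
        if edits_proposal < edits_asr:
            return "accepted_improved"
        if edits_proposal > edits_asr:
            return "accepted_worse"
        return "accepted_neutral"
    if edits_proposal < edits_asr:
        return "rejected_would_improve"
    return "rejected_safe"


def summarize(rows: list) -> Counter:
    """Count buckets over (accepted, edits_asr, edits_proposal) tuples."""
    return Counter(audit_bucket(*row) for row in rows)
```

Integer edit distances are used instead of per-row CER so the "neutral" comparison does not depend on floating-point equality.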
For real CER improvement, the bridge still needs stronger proposal quality and a real heldout ASR error set:
- tune the N'Ko correction prompt or adapter so accepted boundary/uncertain proposals are more often helpful than neutral
- build the canonical bridge from ASR telemetry to AGP packets so the live path can call `/resume`
- keep `cc-agp-bridge` as the final authority for accepting or rejecting proposals
Prediction-Only Probe and Hard Edit Cap
`build_prediction_probe.py` builds deployment probes from prediction-only ASR JSONL. These probes do not have gold references, so they cannot report CER. Their purpose is to test whether the proposal model tries to rewrite live ASR text and whether the Rust gate blocks unsafe edits.
The local Djoko prediction probe exposed an important failure mode: after prompt and token-budget changes, the promoted corrective lane stopped truncating as aggressively, but it still proposed multi-character rewrites on longer N'Ko strings. Those rewrites can be below the relative edit threshold while still being too large for a deployable ASR correction.
The bridge now applies a hard bounded-edit policy in both Python and Rust:
- stable partitions preserve ASR
- novelty partitions block language-prior overrides
- one-character repairs can pass even when the relative ratio is high on short tokens
- proposals above the absolute edit cap are rejected regardless of relative distance
- proposals with more than one edit must also satisfy the relative edit threshold
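The five rules above can be read as an ordered decision function. The sketch below is a Python illustration under stated assumptions, not the `cc-agp-bridge` logic: the cap and threshold values, the order of the checks, the `stable_partition_preserves_asr` and `edit_too_large:relative` reason strings, and the reading of the numeric suffix in `edit_too_large:absolute:N` as the edit count are all assumed. The `no_effect` and `novelty_partition_blocks_language_prior` reasons are quoted from this document.

```python
from typing import Tuple

# Hypothetical values for illustration; the real cap and threshold are not
# stated in this document.
ABSOLUTE_EDIT_CAP = 3
RELATIVE_EDIT_LIMIT = 0.25


def decide(partition: str, edits: int, asr_len: int) -> Tuple[bool, str]:
    """Return (accepted, reason) for a proposal that is `edits` edits away
    from an ASR candidate of `asr_len` characters."""
    if partition == "stable":
        return False, "stable_partition_preserves_asr"  # assumed reason string
    if partition == "novelty":
        return False, "novelty_partition_blocks_language_prior"
    if edits == 0:
        return False, "no_effect"
    if edits > ABSOLUTE_EDIT_CAP:
        # Rejected regardless of how small the relative distance is.
        return False, f"edit_too_large:absolute:{edits}"
    if edits == 1:
        # One-character repairs pass even on very short tokens.
        return True, "accepted"
    if edits / max(asr_len, 1) > RELATIVE_EDIT_LIMIT:
        return False, "edit_too_large:relative"  # assumed reason string
    return True, "accepted"
```

Under these assumed values, four edits on a 40-character string are a 10% relative change, which passes the relative threshold but is still rejected by the absolute cap. That is the long-string failure mode the Djoko probe exposed.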
Regression coverage lives in `core/semantic/cc-agp-bridge/src/lib.rs`, including a long-string case that rejects many small relative edits as `edit_too_large:absolute:4`.
Verified reports after the hardcap fix:
{
  "smoke": {
    "total": 4,
    "accepted": 2,
    "rejected": 2,
    "accepted_improved": 2,
    "accepted_worse": 0,
    "cer_before": 0.14285714285714285,
    "cer_after": 0.047619047619047616
  },
  "synthetic_stress": {
    "total": 16,
    "accepted": 8,
    "rejected": 8,
    "accepted_improved": 3,
    "accepted_neutral": 5,
    "accepted_worse": 0,
    "cer_before": 0.13333333333333333,
    "cer_after": 0.1
  },
  "prediction_only_probe": {
    "total": 16,
    "accepted": 0,
    "rejected": 16,
    "reference_rows": 0,
    "admissibility_tokens": 16,
    "rag_plusplus_provenance": 16
  }
}
This result should be read conservatively. The architecture is doing the right thing by refusing unsafe prediction-only rewrites, but the proposal lane is not yet proven as a CER-improving production corrector beyond the small reference-backed smoke and synthetic stress. The next real benchmark must use same-provenance ASR outputs with references.
Archived Reference-Backed ASR Eval Probe
`build_eval_results_bridge.py` converts `nko-brain-scanner/asr/eval-results-2026-03-20.json` into bridge packets. This artifact is an older ASR evaluation run, not the Paper 4 20.57
The file-head slice is mostly catastrophic ASR output. The hardcap gate rejects all proposals:
{
  "total": 12,
  "accepted": 0,
  "rejected": 12,
  "rejected_would_improve": 8,
  "cer_before": 1.7804878048780488,
  "cer_after": 1.7804878048780488
}
That is a useful negative result: the model can sometimes propose a reference-improving cleanup, but the edit is too broad to trust without a dedicated recovery policy.
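A CER above 1.0, as in this slice, is possible because CER divides the edit distance by the reference length, and a hypothesis much longer than its reference needs more deletions than the reference has characters. The sketch below shows the arithmetic with a standard Levenshtein distance over Unicode code points. Pooling edits and reference lengths across rows (micro-averaging) is an assumption about how the bridge aggregates CER, and the function names are illustrative.

```python
from typing import Iterable, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def corpus_cer(pairs: Iterable[Tuple[str, str]]) -> float:
    """Micro-averaged CER over (hypothesis, reference) pairs (assumed aggregation)."""
    pairs = list(pairs)
    total_edits = sum(levenshtein(hyp, ref) for hyp, ref in pairs)
    total_ref_chars = sum(len(ref) for _, ref in pairs)
    return total_edits / total_ref_chars if total_ref_chars else 0.0
```

For example, a nine-character hypothesis scored against a two-character reference that it contains as a subsequence needs seven deletions, giving a CER of 3.5 for that row.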
The lower-CER archived slice gives a small positive result:
{
  "total": 5,
  "accepted": 1,
  "rejected": 4,
  "accepted_improved": 1,
  "accepted_worse": 0,
  "rejected_would_improve": 1,
  "cer_before": 0.7603686635944701,
  "cer_after": 0.7511520737327189
}
The accepted edit is a bounded one-character cleanup on an uncertain partition. This is the right current operating envelope: allow narrow local repairs, block broad language-model recovery until the proposal model is trained and evaluated for that regime.
Promotion Decision
Attach run IDs, datasets, metrics, and reproduction commands.
Source Anchor
Comp-Core/experiments/agp_mlx/asr_bridge/README.md