experiment · experiment writeup candidate · score 24

Runbook — Phase 1: regenerate proposals on the CLEAN anchor base (mac5)

**Why:** the acoustic-gate pilot's reference-dependent numbers (proposer hit rate 1.9%, flywheel harvest precision) were measured on the 297k model against **contaminated** ane references. To make them trustworthy, regenerate Gemma correction proposals against the **anchor's clean hypotheses + clean HF references**, then re-run the gate.

- `proposer_input_anchor.jsonl` (1,381 rows): anchor clean hyps as `asr_candidate`, clean HF refs as `reference`. Already compatible with `ASRBridgePacket.from_mapping` (accepts `asr_candidate`/`reference`/`n_best`; trajectory scalars default to neutral 0.0, which is acceptable; alternatively, recompute the anchor's `scalar_computer` outputs first for a fully self-consistent packet).
- Proposer: `Desktop/Comp-Core/experiments/agp_mlx/asr_bridge/agp_text_proposal.py` (+ sibling `schema.py`). Lives on **mac1**; must be copied to mac5 with its package dir.

1. **Base vs LoRA.** Task #10 trained a minimal-edit LoRA adapter (SFT from rejected pairs). Decide: run base Gemma-3n-E2B-4bit (clean baseline) **or** the #10 adapter (the intended corrector). Recommend running **both** to isolate the adapter's effect.
2. **Gemma model path on mac5.** Resolve the exact `--model` (Gemma-3n-E2B-4bit) + optional `--adapter-path`. Use `[home-path]` (mlx_lm 0.31.2, the only venv that loads gemma-3n E2B; system py3.9/mlx 0.29.3 does NOT).

1. `scp mac5:[home-path] .`
2. `reextract.py` → `proposals_anchor_clean_extracted.jsonl` (harness fix, clean N'Ko).
3. Re-run the gate against **clean** refs: adapt `robust_eval.py` to point at `decoded_anchor_native.jsonl` + the clean proposals → the trustworthy 4-condition table (baseline / raw+gate / clean+gate / clean+preserve+gate) with bootstrap CIs.
4. Compare to the contaminated-substrate numbers in `TECHNICAL-REPORT.md §8`.
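The bootstrap CIs for the 4-condition table could be computed along the following lines. This is a minimal sketch, not the contents of `robust_eval.py`: the function names `edit_distance` and `paired_bootstrap_delta`, the percentile method, and the choice of corpus-level CER as the metric are all assumptions made for illustration. It resamples utterances with replacement and reports the CER change of one condition relative to the baseline, so both conditions are always scored on the same resampled utterances.

```python
import random


def edit_distance(hyp: str, ref: str) -> int:
    """Character-level Levenshtein distance (insert/delete/substitute cost 1)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        cur = [i]
        for j, r in enumerate(ref, 1):
            cur.append(min(prev[j] + 1,              # drop a hyp character
                           cur[j - 1] + 1,           # add a ref character
                           prev[j - 1] + (h != r)))  # match or substitute
        prev = cur
    return prev[-1]


def paired_bootstrap_delta(refs, base_hyps, cond_hyps, n_boot=1000, seed=0):
    """Return (point, lo, hi) for corpus CER(cond) - CER(base).

    Hypothetical helper: assumes at least one non-empty reference and
    three lists of equal length, aligned by utterance.
    """
    base_err = [edit_distance(h, r) for h, r in zip(base_hyps, refs)]
    cond_err = [edit_distance(h, r) for h, r in zip(cond_hyps, refs)]
    ref_len = [len(r) for r in refs]
    n = len(refs)

    point = (sum(cond_err) - sum(base_err)) / sum(ref_len)

    rng = random.Random(seed)
    deltas = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample utterances
        chars = sum(ref_len[i] for i in idx)
        diff = sum(cond_err[i] - base_err[i] for i in idx)
        deltas.append(diff / chars)
    deltas.sort()

    # 95% percentile interval over the resampled deltas.
    lo = deltas[int(0.025 * n_boot)]
    hi = deltas[int(0.975 * n_boot) - 1]
    return point, lo, hi
```

A negative delta means the condition lowers CER relative to the baseline; an interval that excludes zero is the kind of evidence the table is meant to provide. Running the same function once per non-baseline condition (raw+gate, clean+gate, clean+preserve+gate) would fill the three comparison rows.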
The honest question this answers: **does the corrector help at all once the references are clean and the base is the 20.57% anchor?**

- Anchor = `UnifiedCTCHead(num_classes=66, use_trajectory=True, use_tar=False, use_ttt=False)`; native features at `/Volumes/HD1/anchor_bam_feats` (1500-frame).
- Clean refs: HF `Diomande/bambara-whisper-features/corrected_pairs_290k.jsonl` (== `pairs.jsonl` for bam).
- Split is seed-42 deterministic; reconstruction verified (232476/29060/29060). The 1,381 pilot utts are NOT all train; they scatter ~80/10/10. No memorization (held-out CER 0.3112 vs train 0.3081).
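A seeded split of this shape can be sketched as follows. The actual split code is not shown in this runbook, so everything here is an assumption: the helper name `split_indices`, the use of `random.Random(seed).shuffle`, and the rounding rule (train = floor of 80%, validation = half of the remainder). The total of 290,596 is simply the sum of the three reported sizes, and this rounding rule happens to reproduce them; the shuffled order itself will match the original split only if the original used the same shuffling procedure.

```python
import random


def split_indices(n_items: int, seed: int = 42):
    """Deterministic 80/10/10 split over row indices (hypothetical helper)."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # same seed, same order every run

    n_train = n_items * 8 // 10           # integer floor of 80%
    n_val = (n_items - n_train) // 2      # half of what is left

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


def count_membership(utt_indices, train, val, test):
    """Count how many of the given row indices land in each partition."""
    train_set, val_set, test_set = set(train), set(val), set(test)
    return {
        "train": sum(i in train_set for i in utt_indices),
        "val": sum(i in val_set for i in utt_indices),
        "test": sum(i in test_set for i in utt_indices),
    }


train, val, test = split_indices(232476 + 29060 + 29060)
print(len(train), len(val), len(test))
```

Passing the row indices of the 1,381 pilot utterances to `count_membership` is one way to check the roughly 80/10/10 scatter described above.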

