technical note | experiment writeup candidate | score 32
Codex Handoff: Partial-Real Thunder Train Lane
This handoff is for one bounded job only: get the first completed partial-real local Thunder training window to finish cleanly and verify that it writes a real checkpoint. Do not expand scope beyond that.
The system context matters because this lane sits inside a larger architecture. The canonical acoustic science lane is running separately on Vast and is responsible for the paper-grade CER claims. That lane is the same-snapshot Paper 4 matrix on the A100 instance and it should be treated as read-only from this handoff. The local Thunder lane is different. It is the AGP and Gemma correction-adapter lane. Gemma is not the authority model. Gemma is the bounded proposal model that only acts after the AGP partition router says a chunk deserves additional compute. Rust admissibility is still the final authority for whether a proposed correction is allowed.
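The order of authority described above can be sketched as a small control flow. This is only an illustration under stated assumptions: `Chunk`, `correct_chunk`, and the three callables are hypothetical stand-ins for the AGP partition router, the Gemma adapter, and the Rust admissibility check, not the project's real interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Chunk:
    """Hypothetical container for one transcript chunk."""
    chunk_id: str
    text: str


def correct_chunk(
    chunk: Chunk,
    router_escalates: Callable[[Chunk], bool],    # stand-in for the AGP partition router
    propose: Callable[[Chunk], Optional[str]],    # stand-in for the Gemma proposal model
    is_admissible: Callable[[Chunk, str], bool],  # stand-in for the Rust admissibility check
) -> str:
    """Return the text to keep for this chunk."""
    # The router decides first: chunks it does not escalate never reach the proposal model.
    if not router_escalates(chunk):
        return chunk.text
    # The proposal model only suggests; it may also decline to propose anything.
    proposal = propose(chunk)
    if proposal is None:
        return chunk.text
    # Admissibility has the final say on whether the proposal replaces the original.
    return proposal if is_admissible(chunk, proposal) else chunk.text
```

In this sketch a rejected or missing proposal always falls back to the original text, which mirrors the point that the proposal model is never the authority.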
The current goal is therefore narrow and practical. We already proved the mechanics on a synthetic correction set. We already proved that the launchers, chunk runner, and local distributed setup can work. The next real milestone is to prove that a first bounded training window can run on a partial-real correction corpus built from actual A100 replay outputs. This is the first time the local Gemma adapter lane will be training on real replay-derived correction rows rather than the small synthetic mechanics set.
In scope:
- the local and remote Thunder dataset path for the partial-real corpus
- the Gemma cache warm state on `mac4` and `mac5`
- the partial-real Thunder launcher and chunk runner
- the dedicated adapter output path for the partial-real run
- checkpoint verification and artifact mirroring back to local
Out of scope:
- the Vast closeout watcher
- the A100 instance lifecycle
- the paper text or manuscript framing
- the MAOE replay scripts, except to read field names or confirm input format
- routing logic redesign
- changes to the correction policy
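For the checkpoint verification and mirroring step, a minimal sketch might look like the following. It assumes a checkpoint is a directory of files; the function names and any file names passed in are hypothetical placeholders, since the handoff does not specify the adapter output layout.

```python
import hashlib
from pathlib import Path
from typing import List, Sequence


def verify_checkpoint(ckpt_dir: Path, required: Sequence[str], min_bytes: int = 1) -> List[str]:
    """Return a list of problems; an empty list means every required file is present and non-trivial."""
    if not ckpt_dir.is_dir():
        return [f"missing checkpoint directory: {ckpt_dir}"]
    problems = []
    for name in required:  # file names are supplied by the caller (placeholders here)
        path = ckpt_dir / name
        if not path.is_file():
            problems.append(f"missing file: {name}")
        elif path.stat().st_size < min_bytes:
            problems.append(f"empty or too small: {name}")
    return problems


def sha256_of(path: Path) -> str:
    """Hex SHA-256 of a file, read in 1 MiB blocks so large weight files are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()
```

One way to use this: run `verify_checkpoint` against the adapter output path on the training host, then compare `sha256_of` for each file before and after mirroring, so that a truncated copy is caught rather than mistaken for a real checkpoint.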
Promotion decision
What has to happen next
Attach run IDs, datasets, metrics, and reproduction commands.