technical note, experiment writeup candidate, score 32

Codex Handoff: Partial-Real Thunder Train Lane

This handoff is for one bounded job only: get the first completed partial-real local Thunder training window to finish cleanly and verify that it writes a real checkpoint. Do not expand scope beyond that.

Extracted abstract or opening context

The system context matters because this lane sits inside a larger architecture. The canonical acoustic science lane runs separately on Vast and is responsible for the paper-grade CER claims. That lane is the same-snapshot Paper 4 matrix on the A100 instance, and it should be treated as read-only from this handoff.

The local Thunder lane is different. It is the AGP and Gemma correction-adapter lane. Gemma is not the authority model. Gemma is the bounded proposal model that acts only after the AGP partition router says a chunk deserves additional compute. Rust admissibility is still the final authority on whether a proposed correction is allowed.

The current goal is therefore narrow and practical. We already proved the mechanics on a synthetic correction set. We already proved that the launchers, chunk runner, and local distributed setup can work. The next real milestone is to prove that a first bounded training window can run on a partial-real correction corpus built from actual A100 replay outputs. This is the first time the local Gemma adapter lane will train on real replay-derived correction rows rather than the small synthetic mechanics set.

In scope:

- the local and remote Thunder dataset path for the partial-real corpus
- the Gemma cache warm state on `mac4` and `mac5`
- the partial-real Thunder launcher and chunk runner
- the dedicated adapter output path for the partial-real run
- checkpoint verification and artifact mirroring back to local

Out of scope:

- the Vast closeout watcher
- the A100 instance lifecycle
- the paper text or manuscript framing
- the MAOE replay scripts, unless needed only to read field names or confirm input format
- routing logic redesign
- changing the correction policy
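
The checkpoint verification step named above could be sketched roughly as follows. This is a minimal illustration, not the lane's actual verifier: the note does not define what counts as a "real" checkpoint, so the function names, the file suffixes, and the minimum-size threshold are all assumptions made for this example. Here a checkpoint is treated as real if the adapter output directory contains at least one weight file that is not empty or placeholder-sized.

```python
from pathlib import Path

# Assumption: weight files smaller than this are placeholders, not real checkpoints.
MIN_BYTES = 1024

# Assumption: common weight-file suffixes; the real run may use something else.
WEIGHT_SUFFIXES = (".safetensors", ".bin", ".pt")


def find_real_checkpoints(output_dir, suffixes=WEIGHT_SUFFIXES, min_bytes=MIN_BYTES):
    """Return weight files under output_dir that are at least min_bytes in size."""
    root = Path(output_dir)
    if not root.is_dir():
        # A missing output directory means nothing was written.
        return []
    found = []
    for path in sorted(root.rglob("*")):
        if (
            path.is_file()
            and path.suffix in suffixes
            and path.stat().st_size >= min_bytes
        ):
            found.append(path)
    return found


def verify_checkpoint(output_dir, **kwargs):
    """True if at least one plausible weight file exists under output_dir."""
    return len(find_real_checkpoints(output_dir, **kwargs)) > 0
```

A check like this only confirms that something weight-shaped was written to disk. It does not confirm that the file loads or that the weights changed during training, so a stricter verifier would also load the file and compare it against the starting adapter state.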

Promotion decision

What has to happen next

Attach run IDs, datasets, metrics, and reproduction commands.

Why this is not always a full paper yet

Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.