Grand Diomande Research · Full HTML Reader

N'Ko Rerun Recovery — 2026-04-26

Decision

The `20.57%` …

The newer `paper4_same_snapshot_20260422_safe_lr1e4` matrix also ran correctly, but it was not a faithful reproduction of the `20.57%` trajectory regime. It was a separate low-learning-rate safety matrix.

That means the `31.12%` result is valid for the safe low-LR matrix but is not comparable to the `20.57%` anchor.

Verified Facts

Old anchor

Source:
- `results/paper4_reproduction_35205256/results.json`
- `results/paper4_reproduction_35205256/train.log`

Key values:
- `script=nko`
- `mode=trajectory`
- `use_trajectory=true`
- `use_ttt=false`
- `lr=0.0003`
- `batch_size=32`
- `epochs_trained=47`
- `best_val_loss=0.6358872798606507`
- `test_cer=0.2057`

Safe matrix best run

Source:
- `results/paper4_same_snapshot_20260422_safe_lr1e4/nko_trajectory_ttt_290596/results.json`
- `results/paper4_same_snapshot_20260422_safe_lr1e4/nko_trajectory_ttt_290596/train.log`

Key values:
- `script=nko`
- `mode=trajectory+ttt`
- `use_trajectory=true`
- `use_ttt=true`
- `lr=0.0001`
- `batch_size=32`
- `epochs_trained=30`
- `best_val_loss=0.9368296321338029`
- `test_cer=0.3112`
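
The two value lists above can be compared mechanically. The following is a minimal sketch, assuming each `results.json` exposes these fields as top-level keys (the real nesting may differ); the helper name `diff_runs` is hypothetical, and the dictionaries simply restate the values listed in this note.

```python
# Values copied from the "Key values" lists in this note. In practice they
# would be read with json.load() from each run's results.json; the flat,
# top-level key layout used here is an assumption.
anchor = {
    "script": "nko", "mode": "trajectory", "use_trajectory": True,
    "use_ttt": False, "lr": 0.0003, "batch_size": 32, "epochs_trained": 47,
    "best_val_loss": 0.6358872798606507, "test_cer": 0.2057,
}
safe_ttt = {
    "script": "nko", "mode": "trajectory+ttt", "use_trajectory": True,
    "use_ttt": True, "lr": 0.0001, "batch_size": 32, "epochs_trained": 30,
    "best_val_loss": 0.9368296321338029, "test_cer": 0.3112,
}

def diff_runs(old, new):
    """Return {key: (old_value, new_value)} for every key whose value differs."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}

for key, (old_value, new_value) in diff_runs(anchor, safe_ttt).items():
    print(f"{key}: anchor={old_value} safe={new_value}")
```

The diff separates configuration drift (`lr`, `mode`, `use_ttt`) from outcome differences (`best_val_loss`, `test_cer`, `epochs_trained`), while `script` and `batch_size` drop out because they match.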

Safe matrix launcher

Source:
- `docs/handoffs/vast-safe-required-matrix-2026-04-22.sh`

Key values:
- `--lr 0.0001`
- `--patience 8`

Original intended matrix defaults

Source:
- `docs/handoffs/vast-training-run-matrix-2026-04-21.json`
- `docs/handoffs/run-vast-paper4-matrix.generated.sh`

Key values:
- `--lr 0.0003`
- `--patience 10`
- `nko_trajectory_ttt_290596` was a required run
- no plain `nko_trajectory_290596` run was included in the safe five-run matrix
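
A pre-launch check of the launcher script against the intended regime is one way to surface this kind of drift early. The sketch below is illustrative only: the helper names are hypothetical, the sample launcher line (including `train.py`) is invented for the example because the real script contents are not reproduced in this note, and only the `--lr` and `--patience` values come from the lists above.

```python
import re

def extract_flag(script_text, flag):
    """Return every value passed to `flag` (e.g. "--lr") in a shell script."""
    pattern = re.compile(re.escape(flag) + r"[ =]+([^\s\\]+)")
    return pattern.findall(script_text)

def _same(a, b):
    """Compare numerically when both parse as numbers, else as strings."""
    try:
        return float(a) == float(b)
    except ValueError:
        return a == b

def check_regime(script_text, expected):
    """Return (flag, expected_value, found_values) for each mismatch."""
    mismatches = []
    for flag, want in expected.items():
        found = extract_flag(script_text, flag)
        if not found or not all(_same(value, want) for value in found):
            mismatches.append((flag, want, found))
    return mismatches

# Illustrative launcher line, not the contents of the real script.
sample = "python train.py --use-trajectory --use-ttt --lr 0.0001 --patience 8\n"
intended = {"--lr": "0.0003", "--patience": "10"}

for flag, want, found in check_regime(sample, intended):
    print(f"{flag}: intended {want}, launcher has {found}")
```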

What Was Correct

The safe matrix was not garbage. These parts were correct:

1. Same snapshot size was preserved:
- `train=232476`
- `val=29060`
- `test=29060`

2. Heldout artifacts were written correctly:
- `results.json`
- `test_predictions.jsonl`
- `test_references.jsonl`
- `test_metrics_by_partition.json`

3. TTT training itself completed cleanly:
- early stop
- `nan=0`
- full heldout predictions and references present

4. The result is valid for the safe low-LR matrix:
- it is just not comparable to the `20.57%` trajectory regime
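
The artifact checks in items 1 to 3 can be scripted. This is a minimal sketch, assuming the two `.jsonl` files hold one JSON record per test example (an assumption about the file layout, not something this note states); the function names are hypothetical.

```python
import json
from pathlib import Path

# File names from the artifact list above.
REQUIRED = ["results.json", "test_predictions.jsonl",
            "test_references.jsonl", "test_metrics_by_partition.json"]

def count_jsonl(path):
    """Count non-empty lines in a JSONL file, checking each parses as JSON."""
    count = 0
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                json.loads(line)
                count += 1
    return count

def check_run_dir(run_dir, expected_test_size=29060):
    """Return a list of problems found in a run directory (empty if none)."""
    run_dir = Path(run_dir)
    problems = [f"missing {name}" for name in REQUIRED
                if not (run_dir / name).is_file()]
    if problems:
        return problems
    n_pred = count_jsonl(run_dir / "test_predictions.jsonl")
    n_ref = count_jsonl(run_dir / "test_references.jsonl")
    if n_pred != n_ref:
        problems.append(f"predictions ({n_pred}) != references ({n_ref})")
    if n_pred != expected_test_size:
        problems.append(f"predictions ({n_pred}) != test split ({expected_test_size})")
    return problems
```

Calling `check_run_dir` on a run directory such as the `nko_trajectory_ttt_290596` one listed earlier would return an empty list when all four files exist and both JSONL files contain `29060` records.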

Root Cause

The main visible drift is the training regime:

  • old anchor: `lr=3e-4`
  • safe matrix: `lr=1e-4`

That alone is enough to explain a large loss in convergence quality here, and the best validation losses bear this out:

  • old anchor `best_val_loss = 0.6359`
  • safe TTT `best_val_loss = 0.9368`
  • safe baseline `best_val_loss = 0.9560`

This is a real model-quality difference, not a logging or artifact issue.
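
To put numbers on the gap, the short calculation below uses the anchor and safe TTT values from the Verified Facts section. It is only arithmetic on reported figures. The two runs also differ in mode (`trajectory` versus `trajectory+ttt`), so the comparison is not like-for-like.

```python
# Reported values from the two results.json files cited in this note.
anchor = {"best_val_loss": 0.6358872798606507, "test_cer": 0.2057}
safe_ttt = {"best_val_loss": 0.9368296321338029, "test_cer": 0.3112}

def relative_increase(old, new):
    """Fractional increase of `new` over `old`."""
    return (new - old) / old

val_gap = relative_increase(anchor["best_val_loss"], safe_ttt["best_val_loss"])
cer_abs = safe_ttt["test_cer"] - anchor["test_cer"]
cer_rel = relative_increase(anchor["test_cer"], safe_ttt["test_cer"])

print(f"best_val_loss: +{val_gap:.1%} relative")
print(f"test_cer: +{cer_abs * 100:.2f} points absolute, +{cer_rel:.1%} relative")
```

The safe TTT run's best validation loss is roughly 47% higher than the anchor's, and its test CER is about 10.5 points higher in absolute terms.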

Minimal Corrective Rerun

The next scientifically correct rerun is only two experiments:

1. `nko_trajectory_290596`
- same split
- `script=nko`
- `--use-trajectory`
- `lr=0.0003`
- `patience=10`

2. `nko_trajectory_ttt_290596`
- same split
- `script=nko`
- `--use-trajectory --use-ttt`
- `lr=0.0003`
- `patience=10`
- `ttt_chunk_size=16`
- `ttt_lr=0.01`
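
The two run specifications can be kept in one place and expanded into command lines, so that both runs share the same `lr` and `patience` by construction. This is a sketch under stated assumptions: `--lr`, `--patience`, `--use-trajectory`, and `--use-ttt` appear in this note, but the entry point `train.py` and the flag spellings `--run-name`, `--script`, `--ttt-chunk-size`, and `--ttt-lr` are hypothetical and should be checked against the real trainer.

```python
import shlex

# Settings shared by both reruns (original 3e-4 regime).
COMMON = {"script": "nko", "lr": 0.0003, "patience": 10}

# Per-run settings; boolean True becomes a bare switch such as --use-ttt.
RUNS = {
    "nko_trajectory_290596": {"use_trajectory": True},
    "nko_trajectory_ttt_290596": {
        "use_trajectory": True, "use_ttt": True,
        "ttt_chunk_size": 16, "ttt_lr": 0.01,
    },
}

def build_argv(run_name, entry_point="train.py"):
    """Build an argv list for one run. `entry_point` is a hypothetical name."""
    config = {**COMMON, **RUNS[run_name]}
    argv = ["python", entry_point, "--run-name", run_name]
    for key, value in config.items():
        flag = "--" + key.replace("_", "-")
        if value is True:
            argv.append(flag)
        elif value is not False:
            argv += [flag, str(value)]
    return argv

for name in RUNS:
    print(shlex.join(build_argv(name)))
```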

This is the smallest rerun that answers the real question:

"Does TTT improve the actual strong N'Ko trajectory regime?"

GPU Recommendation

Current trainer reality

The current ASR trainer stack is single-GPU. There is no verified DDP / multi-GPU training path in the code currently driving these Paper 4 runs.

That means:

  • adding more GPUs to one job will not automatically make one run faster
  • the immediate speed win comes from running multiple experiments in parallel

Fastest practical option without trainer surgery

Use 2 x A100 80GB and run the two reruns in parallel:

  • GPU 1: `nko_trajectory_290596`
  • GPU 2: `nko_trajectory_ttt_290596`

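Wall-clock time is then set by the longer of the two runs rather than their sum, which roughly halves it relative to sequential single-GPU execution when the runs take similar time.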
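
One way to pin each rerun to its own device is to start two processes with different `CUDA_VISIBLE_DEVICES` values. The sketch below assumes the trainer honors that environment variable, as CUDA applications normally do; the function names are hypothetical. CUDA device indices start at 0, so "GPU 1" and "GPU 2" above correspond to indices 0 and 1.

```python
import os
import subprocess

def launch_on_gpu(argv, gpu_index, log_path):
    """Start one job that can see only one GPU; return (process, log file)."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_index))
    log = open(log_path, "w", encoding="utf-8")
    proc = subprocess.Popen(argv, env=env, stdout=log, stderr=subprocess.STDOUT)
    return proc, log

def run_parallel(jobs):
    """Run jobs concurrently.

    `jobs` is a list of (argv, gpu_index, log_path) tuples.
    Returns the exit codes in the same order.
    """
    started = [launch_on_gpu(*job) for job in jobs]
    codes = []
    for proc, log in started:
        codes.append(proc.wait())
        log.close()
    return codes

# Example call (command lines are placeholders for the real trainer invocations):
# run_parallel([
#     (trajectory_argv, 0, "nko_trajectory_290596.log"),
#     (trajectory_ttt_argv, 1, "nko_trajectory_ttt_290596.log"),
# ])
```

Because each process sees exactly one device, this needs no changes to a single-GPU trainer.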

If forced onto 1 GPU

Run trajectory first, then TTT.

If willing to engineer distributed training

That is a separate task. It is not a launch-time flag; it requires code changes and validation. Do not assume a 2-GPU or 4-GPU box accelerates a single Paper 4 run with the current trainer.

Recommendation

Do not rerun the whole five-run safe matrix.

Run only the two N'Ko trajectory-family experiments above under the original `3e-4` regime, on two A100s in parallel if speed matters.

Promotion Decision

Attach run IDs, datasets, metrics, and reproduction commands.

Source Anchor

nko-brain-scanner/docs/handoffs/nko-rerun-recovery-2026-04-26.md
