# 🧠 Cognitive Twin: Training Runbook
> Repeatable pipeline for fine-tuning the Cognitive Twin on any platform.
> Last updated: 2026-02-18
## Overview

| Platform | Model | Cost | Time | Quality |
|---|---|---|---|---|
| Mac4 Local | gemma-3-1b-it (4-bit) | $0 | 5 min | ⭐⭐ (proof of concept) |
| Google Colab Pro | gemma-3-12b-it (4-bit) | $0 (subscription) | 1-2h | ⭐⭐⭐⭐ |
| Together AI | Qwen3-Next-80B-A3B | ~$16-20 | 2-4h | ⭐⭐⭐⭐⭐ |
| Together AI | Qwen3-235B-A22B | ~$100-200 | 8-12h | ⭐⭐⭐⭐⭐⭐ (future) |
## Prerequisites

### Data Preparation (run once, then reuse)

```bash
# 1. Merge all SFT data
cd Desktop/Comp-Core/packages/cognitive-twin
python3 scripts/local_finetune.py
# Output: local_finetune/data/train.jsonl, valid.jsonl, test.jsonl
# Stats: ~16K deduplicated records from 41K raw
```

### Data Location
- Train: `Desktop/Comp-Core/packages/cognitive-twin/local_finetune/data/train.jsonl` (192MB, 16,360 records)
- Val: `Desktop/Comp-Core/packages/cognitive-twin/local_finetune/data/valid.jsonl` (10MB, 909 records)
- Test: `Desktop/Comp-Core/packages/cognitive-twin/local_finetune/data/test.jsonl` (10MB, 909 records)
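The merge script itself is not reproduced in this runbook. As a rough illustration of what "deduplicated" and a roughly 90/5/5 train/valid/test split can look like, here is a minimal sketch. The function names (`record_key`, `dedupe`, `split`), the whitespace normalization, and the exact fractions are assumptions for illustration, not the behavior of `scripts/local_finetune.py`.

```python
import hashlib
import json


def record_key(record):
    """Stable hash of a record's (role, content) pairs; content is stripped."""
    normalized = [(m["role"], m["content"].strip()) for m in record["messages"]]
    payload = json.dumps(normalized, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def dedupe(records):
    """Keep the first occurrence of each distinct conversation, preserving order."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def split(records, val_frac=0.05, test_frac=0.05):
    """Slice records into (train, valid, test); the fractions are illustrative."""
    n_val = int(len(records) * val_frac)
    n_test = int(len(records) * test_frac)
    n_train = len(records) - n_val - n_test
    return (
        records[:n_train],
        records[n_train:n_train + n_val],
        records[n_train + n_val:],
    )
```

Shuffling before the split is omitted here to keep the sketch deterministic; a real pipeline would normally shuffle with a fixed seed first.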
### Data Format
Standard ChatML JSONL:

```json
{"messages": [{"role": "system", "content": "..."}, {"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}
```

⚠️ Gemma models require strict alternation: `system? (user assistant)+`
The merge script handles this automatically.
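To spot-check a file by hand, the alternation rule `system? (user assistant)+` can be expressed as a small validator. This is a minimal sketch; `is_strictly_alternating` is a hypothetical helper name and is not part of the merge script.

```python
def is_strictly_alternating(messages):
    """Return True if the roles match the pattern: system? (user assistant)+"""
    roles = [m["role"] for m in messages]
    # An optional leading system message is allowed exactly once.
    if roles and roles[0] == "system":
        roles = roles[1:]
    # At least one full (user, assistant) pair is required.
    if not roles or len(roles) % 2 != 0:
        return False
    expected = ["user", "assistant"] * (len(roles) // 2)
    return roles == expected
```

Running it over every line of `train.jsonl` and counting the failures gives a quick sanity check before uploading data to any platform.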
---
## Path A: Mac4 Local (MLX)
### When to use
- Quick iteration, testing data quality
- Zero cost, ~5 minutes
- Limited to 1B model (16GB RAM constraint)
### Steps

```bash
# 1. Ensure MLX is installed on Mac4
ssh mac4 'pip3 install mlx mlx-lm'

# 2. Rsync data
rsync -avz Desktop/Comp-Core/packages/cognitive-twin/local_finetune/ mac4:[home-path]

# 3. Train
ssh mac4 'bash -lc "
export PATH=\$HOME/Library/Python/3.9/bin:\$PATH
mlx_lm.lora \
  --model mlx-community/gemma-3-1b-it-4bit \
  --data [home-path] \
  --adapter-path [home-path] \
  --train \
  --batch-size 1 \
  --num-layers 4 \
  --iters 500 \
  --learning-rate 5e-5 \
  --max-seq-length 256 \
  --steps-per-report 10 \
  --grad-checkpoint
"'

# 4. Test inference
ssh mac4 'bash -lc "
export PATH=\$HOME/Library/Python/3.9/bin:\$PATH
mlx_lm.generate \
  --model mlx-community/gemma-3-1b-it-4bit \
  --adapter-path [home-path] \
  --max-tokens 200 \
  --prompt \"What projects are you working on?\"
"'

# 5. Backup adapters
rsync -avz mac4:[home-path] Desktop/Comp-Core/packages/cognitive-twin/local_finetune/adapters/
```

### Key Parameters (Mac4)
- Model: gemma-3-1b-it-4bit (only model that fits in 16GB with LoRA)
- Seq length: 256 max (higher = OOM)
- LoRA layers: 4 (more = OOM)
- Peak memory: ~1.6 GB
- Speed: ~2 iter/sec
### Known Issues
- gemma-3-4b OOMs even at 1024 seq length
- Validation pass can OOM; set eval steps very high (1000+)
- Sequences >256 tokens get truncated silently
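Because truncation is silent, it can help to estimate how much of the dataset is affected before training. The sketch below counts records whose total length exceeds the limit. `count_truncated` and `whitespace_token_count` are hypothetical helpers; the whitespace-based count is only a rough proxy that typically undercounts real tokenizer output, so pass the model tokenizer's counting function as `count_tokens` for an accurate figure.

```python
def whitespace_token_count(text):
    """Rough proxy for a token count: number of whitespace-separated words."""
    return len(text.split())


def count_truncated(records, max_seq_length=256, count_tokens=whitespace_token_count):
    """Return how many records exceed max_seq_length tokens in total.

    count_tokens is any callable mapping text -> int. Chat-template overhead
    (role markers, special tokens) is ignored in this sketch.
    """
    over = 0
    for record in records:
        total = sum(count_tokens(m["content"]) for m in record["messages"])
        if total > max_seq_length:
            over += 1
    return over
```

A high count suggests the 256-token limit on Mac4 is discarding most of each conversation, which is one reason this path is treated as a proof of concept.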
---
## Path B: Google Colab Pro
### When to use
- Best option at no extra cost for quality training (covered by the Colab Pro subscription)
- A100 GPU handles up to 12B models easily
- ~1-2 hours
### Account
- Email: [email]
- Plan: Colab Pro
- GPU: A100 (40GB) preferred, T4/V100 fallback
### Steps

1. Go to https://colab.research.google.com
2. Sign in as [email]
3. File → Upload notebook → select `notebooks/twin_finetune_colab.ipynb`, or create a new notebook and upload `train_twin.py`
4. Runtime → Change runtime type → A100 GPU
5. Upload files:
   - `train.jsonl` (192MB)
   - `valid.jsonl` (10MB)
   - `train_twin.py`
6. Run cells in order, or just:
   ```
   !pip install -q unsloth trl peft accelerate bitsandbytes
   !python train_twin.py
   ```
7. Download `twin-lora-adapter.zip` when done

### Script auto-detects GPU
- A100 (40GB) → gemma-3-12b-it, seq 2048, batch 2
- V100/T4 (16GB) → gemma-3-4b-it, seq 1024, batch 2
- Anything smaller → gemma-3-1b-it, seq 512, batch 1
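The mapping above can be sketched as a small selection function. This is an illustration of the documented tiers, not the code in `train_twin.py`. The name `pick_config` and the cutoffs of 38 GB and 14 GB are assumptions, chosen because GPUs usually report slightly less usable memory than their nominal size.

```python
def pick_config(vram_gb):
    """Map available GPU memory (in GB) to the documented training tier."""
    if vram_gb >= 38:  # assumed cutoff for an A100 (40GB)
        return {"model": "gemma-3-12b-it", "max_seq_length": 2048, "batch_size": 2}
    if vram_gb >= 14:  # assumed cutoff for a V100/T4 (16GB)
        return {"model": "gemma-3-4b-it", "max_seq_length": 1024, "batch_size": 2}
    # Anything smaller falls back to the 1B model.
    return {"model": "gemma-3-1b-it", "max_seq_length": 512, "batch_size": 1}
```

In a Colab session the memory figure would typically come from the CUDA device properties reported by the training framework.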
### Files
- `notebooks/twin_finetune_colab.ipynb`: Interactive notebook
- `notebooks/train_twin.py`: Standalone script
- `colab-upload/`: Pre-packaged folder with data + script
### HuggingFace Auth (required for Gemma)
You need to accept the Gemma license at https://huggingface.co/google/gemma-3-12b-it
Then log in from a Colab cell: `!huggingface-cli login`
---
## Path C: Together AI
### When to use
- Highest quality (80B+ parameter models)
- Serverless LoRA deployment (instant inference after training)
- ~$16-20 for the 80B MoE, ~$100-200 for the 235B
### Account
- API Key: In vault (`[home-path]`)
- Billing: https://api.together.xyz/settings/billing
### Steps

```bash
export TOGETHER_API_KEY=$(grep TOGETHER [home-path] | cut -d= -f2)

# 1. Upload data (if not already uploaded)
python3 -c "
from together import Together
import httpx
client = Together([sensitive field redacted], timeout=httpx.Timeout(600.0))
train = client.files.upload(file='$HOME/Desktop/Comp-Core/packages/cognitive-twin/local_finetune/data/train.jsonl', purpose='fine-tune')
val = client.files.upload(file='$HOME/Desktop/Comp-Core/packages/cognitive-twin/local_finetune/data/valid.jsonl', purpose='fine-tune')
print(f'Train: {train.id}')
print(f'Val: {val.id}')
"

# 2. Launch fine-tuning job
curl -X POST https://api.together.xyz/v1/fine-tunes \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3-Next-80B-A3B-Instruct",
    "training_file": "<TRAIN_FILE_ID>",
    "validation_file": "<VAL_FILE_ID>",
    "suffix": "twin-alpha-v2",
    "n_epochs": 2,
    "n_evals": 10,
    "n_checkpoints": 1,
    "batch_size": 16,
    "learning_rate": 2e-5,
    "training_type": {
      "type": "Lora",
      "lora_r": 16,
      "lora_alpha": 32
    }
  }'

# 3. Monitor
curl https://api.together.xyz/v1/fine-tunes/<JOB_ID> \
  -H "Authorization: Bearer $TOGETHER_API_KEY" | python3 -m json.tool

# 4. After completion: deploy as serverless LoRA
# The model_output_name from the job response is your endpoint
```

### Current Files on Together AI
- Train: `file-55fa0beb-e510-48be-87c9-429186266cc5` (202MB, 16,360 records)
- Val: `file-1c315355-576a-4bc4-9507-174f653ed5fd` (10.9MB, 909 records)
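To avoid hand-editing the `<TRAIN_FILE_ID>` and `<VAL_FILE_ID>` placeholders in the launch request, the JSON body can be generated from the file IDs listed above. This is a minimal sketch: `build_finetune_payload` is a hypothetical helper, and its field values simply mirror the curl example in the steps. It only builds the body and does not send the request.

```python
import json


def build_finetune_payload(train_file_id, val_file_id, suffix):
    """Return the JSON body for the fine-tune launch request, mirroring the curl example."""
    return {
        "model": "Qwen/Qwen3-Next-80B-A3B-Instruct",
        "training_file": train_file_id,
        "validation_file": val_file_id,
        "suffix": suffix,
        "n_epochs": 2,
        "n_evals": 10,
        "n_checkpoints": 1,
        "batch_size": 16,
        "learning_rate": 2e-5,
        "training_type": {"type": "Lora", "lora_r": 16, "lora_alpha": 32},
    }


payload = build_finetune_payload(
    "file-55fa0beb-e510-48be-87c9-429186266cc5",
    "file-1c315355-576a-4bc4-9507-174f653ed5fd",
    "twin-alpha-v2",
)
print(json.dumps(payload, indent=2))
```

The printed JSON can be saved to a file and passed to curl with `-d @payload.json` in place of the inline body.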
### Current Jobs
- `ft-91cf6122-efc5`: Qwen3-Next-80B-A3B-Instruct, twin-alpha-v2 (pending)
### Pricing (LoRA SFT)
| Model Size | $/M tokens |
|-----------|-----------|
| Up to 16B | $0.48 |
| 17-69B | $1.50 |
| 70-100B | $2.90 |
| Qwen3-235B (specialized) | $6.00 |
### Available Models for Fine-Tuning
Best value for Cognitive Twin:
1. Qwen3-Next-80B-A3B-Instruct: 80B total, 3B active MoE. $0.48/Mtok tier (classified as ≤16B due to active params). Serverless inference at $0.15/$1.50 per Mtok.
2. Qwen3-30B-A3B-Instruct-2507: 30B total, 3B active. Same price tier.
3. Gemma-3-4b-it: Dense 4B. Cheapest training.
4. Qwen3-235B-A22B-Instruct-2507: The big one. $6/Mtok but highest quality.
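For budgeting, a back-of-the-envelope estimate can be derived from the per-token prices in the pricing table. The sketch below assumes cost scales as dataset tokens × epochs × price per million tokens and ignores validation tokens; that billing model is an assumption, so check it against the billing page. The 18M-token dataset size is a hypothetical placeholder, since this runbook does not state the token count.

```python
def estimate_training_cost(dataset_tokens, n_epochs, price_per_mtok):
    """Estimated USD cost: tokens processed (in millions) times the price per million."""
    tokens_processed = dataset_tokens * n_epochs
    return tokens_processed / 1_000_000 * price_per_mtok


# HYPOTHETICAL dataset size of 18M tokens, 2 epochs, $0.48/Mtok tier.
cost = estimate_training_cost(18_000_000, n_epochs=2, price_per_mtok=0.48)
print(f"Estimated cost: ${cost:.2f}")
```

Under these assumed inputs the estimate lands inside the ~$16-20 range quoted for the 80B MoE model. Substituting the real token count from the training file gives a more useful number.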
---
## Post-Training: Evaluation

### Quick Test (20 prompts)

```python
test_prompts = [
    "What projects are you currently working on?",
    "How would you approach building a new iOS app?",
    "Explain the N'Ko keyboard architecture",
    "What's the status of the BWB POS app?",
    "How does the Dream Garden work?",
    # ... add domain-specific prompts
]
```

### Twin Fidelity Score (TFS)
Compare Twin output vs ground truth on:
1. Factual accuracy: Does it know Mo's projects?
2. Style match: Does it sound like the training data?
3. Task completion: Can it actually help with real tasks?
4. Hallucination rate: How much does it make up?

Target: TFS ≥ 0.80 (baseline was 0.772 from V1)
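This runbook does not spell out how the four criteria combine into a single TFS. One simple possibility, shown here only as an illustration, is an unweighted mean with the hallucination rate inverted so that higher is always better. The function name, the equal weighting, and the [0, 1] scaling are all assumptions, not the project's actual formula.

```python
def twin_fidelity_score(factual_accuracy, style_match, task_completion, hallucination_rate):
    """Illustrative TFS: unweighted mean of four components, each in [0, 1].

    hallucination_rate is inverted (1 - rate) so that higher is better.
    The equal weighting is an assumption, not the project's formula.
    """
    components = [
        factual_accuracy,
        style_match,
        task_completion,
        1.0 - hallucination_rate,
    ]
    for value in components:
        if not 0.0 <= value <= 1.0:
            raise ValueError("all component scores must lie in [0, 1]")
    return sum(components) / len(components)
```

With this aggregation, a model that scores 0.8 on the first three criteria and hallucinates on 20% of prompts sits right at the 0.80 target.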
---
## Maintenance: Retraining Pipeline
### When to retrain
- Every 2-4 weeks (new conversation data accumulates)
- After major project changes
- When TFS drops below 0.75
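The three triggers above can be combined into a single check, for example in a scheduled job. This is a minimal sketch: `should_retrain` is a hypothetical helper, the 14-day default takes the lower end of the 2-4 week window, and the major-change flag is assumed to be set manually.

```python
from datetime import date


def should_retrain(last_trained, today, current_tfs,
                   major_project_change=False, max_age_days=14, tfs_floor=0.75):
    """Return True if any documented retraining trigger fires."""
    age_days = (today - last_trained).days
    return (
        age_days >= max_age_days      # enough new conversation data has accumulated
        or major_project_change       # flagged manually after a major change
        or current_tfs < tfs_floor    # quality has dropped below the floor
    )
```

For instance, `should_retrain(date(2026, 2, 1), date(2026, 2, 18), current_tfs=0.81)` fires on age alone, because 17 days have passed since the last run.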
### Steps
1. Run density scorer on new data: `python3 scripts/density_v4.py`
2. Run V9 generators for new domains: `python3 scripts/gen_v9_*.py`
3. Re-run merge: `python3 scripts/local_finetune.py`
4. Upload new data to Together AI / Colab
5. Launch new training job with `suffix: twin-alpha-v3` (increment)
6. Evaluate and deploy if TFS ≥ 0.80
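Incrementing the suffix in step 5 is easy to get wrong by hand. A small helper can derive the next value from the previous one. This is a sketch that assumes suffixes always end in `-v<N>`, as `twin-alpha-v2` does; `next_suffix` is a hypothetical name.

```python
import re


def next_suffix(suffix):
    """Increment the trailing -v<N> of a job suffix."""
    match = re.fullmatch(r"(.*-v)(\d+)", suffix)
    if match is None:
        raise ValueError(f"suffix has no trailing -v<N>: {suffix!r}")
    prefix, version = match.groups()
    return f"{prefix}{int(version) + 1}"
```

Calling `next_suffix("twin-alpha-v2")` yields `"twin-alpha-v3"`, which matches the increment described in step 5.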
---
## File Inventory

```
cognitive-twin/
├── scripts/
│   ├── local_finetune.py          # Data merge + MLX prep
│   ├── density_v4.py              # Density scoring
│   └── gen_v9_*.py                # SFT data generators
├── notebooks/
│   ├── twin_finetune_colab.ipynb  # Interactive Colab notebook
│   └── train_twin.py              # Standalone Colab script
├── colab-upload/                  # Pre-packaged for Colab
│   ├── train.jsonl
│   ├── valid.jsonl
│   └── train_twin.py
├── local_finetune/
│   ├── data/                      # Merged, deduplicated data
│   ├── adapters/                  # Mac4 local LoRA adapters
│   └── manifest.json
├── data/
│   ├── ctv3_export_v3/            # Base corpus (41K records)
│   └── expansion_v9/              # V9 generators output
├── MAC4_BENCHMARK.md              # Local model benchmark results
└── TRAINING_RUNBOOK.md            # This file
```

## Promotion Decision
Attach run IDs, datasets, metrics, and reproduction commands.