Training Your Twin While You Sleep
Somewhere on Vast.ai, four H100 GPUs are learning to think like me.
---
The Cognitive Twin has been the longest-running thread in this build.
It started as a question: what if an AI could make decisions the way I would? Not just respond to prompts, but actually understand the patterns — the preferences, the shortcuts, the instincts that I've developed over years of working this way?
On Valentine's night — the same night VisionClaw's glasses became a full agent proxy — the training finally launched.
The Corpus
Building a cognitive twin isn't about dumping your entire digital footprint into a model. It's about curating the decisions that define you.
Here's what went in:
- 163K+ conversation turns from Supabase — every interaction with my AI agents over the past year
- 979 Claude Code sessions — how I actually write code, debug problems, think through architecture
- 5,347 Apple Notes — stream of consciousness, ideas, plans, half-formed thoughts
- 20 Discord channels — how I communicate with my team of AI agents
After corpus surgery — cleaning, deduplication, quality filtering — I had 43,173 SFT records ready for training.
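A minimal sketch of what that kind of surgery can look like, assuming each record is a dict with hypothetical `prompt` and `response` keys. The length threshold and the exact-match hashing are illustrative choices, not the actual pipeline that produced the 43,173 records.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies hash the same."""
    return re.sub(r"\s+", " ", text).strip().lower()


def clean_corpus(records, min_chars=20):
    """Return records that pass a simple quality filter, with duplicates removed.

    `records` is a list of dicts with hypothetical "prompt" and "response" keys.
    `min_chars` is an illustrative threshold, not a value from the real run.
    """
    seen = set()
    kept = []
    for rec in records:
        prompt = rec.get("prompt", "").strip()
        response = rec.get("response", "").strip()
        # Quality filter: drop records with no prompt or a very short response.
        if not prompt or len(response) < min_chars:
            continue
        # Deduplication: hash the normalized pair and skip anything seen before.
        key = hashlib.sha256(
            (normalize(prompt) + "\x00" + normalize(response)).encode("utf-8")
        ).hexdigest()
        if key in seen:
            continue
        seen.add(key)
        kept.append({"prompt": prompt, "response": response})
    return kept
```

Hashing the normalized text only catches copies that differ in case or whitespace. Near-duplicate detection (for example, MinHash) would be a separate step.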
But that's not the interesting part.
The DPO Pairs
DPO (Direct Preference Optimization) is how you teach a model to prefer certain behaviors over others. You give it pairs: "Here's a bad response, here's a good response, learn the difference."
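For the curious, here is a rough sketch of the standard DPO objective for a single pair. It assumes you already have the summed token log-probabilities of each response under the model being trained and under a frozen reference copy. The `beta` value is a placeholder, not the setting used in this run.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair, computed from summed log-probabilities.

    loss = -log(sigmoid(beta * ((pi_good - ref_good) - (pi_bad - ref_bad))))
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) equals log(1 + exp(-x)); branch to avoid overflow in exp.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

When the trained model favors the good response more strongly than the reference does, the loss drops below log(2). When it favors the bad one, the loss rises above it.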
I created 740 pairs specifically designed to unlearn one thing: permission-seeking.
Every time an AI says "Should I...?" or "Would you like me to...?" or "Let me know if you want me to..." — that's a failure mode. The best AI assistants don't ask. They assess, decide, and act (with appropriate guardrails).
So the DPO pairs encode the transformation:
Bad: "Should I proceed with this approach?"
Good: "Proceeding with X. Here's the result."
Bad: "Would you like me to fix this bug?"
Good: "Fixed. Here's what was wrong."
Bad: "I noticed an issue. Want me to look into it?"
Good: "Found and resolved an issue: [details]."
The twin won't just know what I know — it'll default to action the way I've trained my AI to.
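One way to picture a single pair on disk is sketched below. The `prompt` / `chosen` / `rejected` field names follow a common convention in preference-tuning tools and are an assumption here, as are the example prompt and the phrase patterns, which are lifted from the examples above rather than from the real dataset.

```python
import json
import re

# Illustrative patterns based on the phrases quoted above; not the actual list.
PERMISSION_PATTERNS = [
    r"\bshould i\b",
    r"\bwould you like me to\b",
    r"\blet me know if you want me to\b",
    r"\bwant me to\b",
]
PERMISSION_RE = re.compile("|".join(PERMISSION_PATTERNS), re.IGNORECASE)


def is_permission_seeking(text: str) -> bool:
    """True if the text contains one of the permission-seeking phrases."""
    return PERMISSION_RE.search(text) is not None


def make_pair(prompt: str, rejected: str, chosen: str) -> dict:
    """Build one preference record, refusing pairs that point the wrong way."""
    if not is_permission_seeking(rejected):
        raise ValueError("rejected response should ask for permission")
    if is_permission_seeking(chosen):
        raise ValueError("chosen response should not ask for permission")
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}


pair = make_pair(
    prompt="The nightly build is failing on a null check.",  # hypothetical prompt
    rejected="Would you like me to fix this bug?",
    chosen="Fixed. Here's what was wrong.",
)
print(json.dumps(pair))
```

The check in `make_pair` is a cheap sanity guard: a pair with the sides swapped would train the exact habit the dataset is meant to remove.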
The Technical Setup
We're fine-tuning Qwen3-235B-A22B, a mixture-of-experts model with 235 billion total parameters, of which roughly 22 billion are active per token. The method is QLoRA (quantized low-rank adaptation), running on 4x H100 SXMs rented from Vast.ai.
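Some back-of-the-envelope arithmetic shows why the quantized route matters at this scale. The sketch below assumes 80 GB per H100 SXM and 4-bit weights, and it counts the base weights only. It ignores quantization metadata, the LoRA adapters, optimizer state, and activations, so treat the numbers as a lower bound rather than a measurement from this run.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


TOTAL_PARAMS = 235e9   # total parameters, from the "235B" in the model name
GPU_COUNT = 4
GPU_MEMORY_GB = 80     # assumption: 80 GB of memory per H100 SXM

bf16_gb = weight_memory_gb(TOTAL_PARAMS, 16)
four_bit_gb = weight_memory_gb(TOTAL_PARAMS, 4)   # assumption: 4-bit quantization
available_gb = GPU_COUNT * GPU_MEMORY_GB

print(f"bf16 weights:  {bf16_gb:.1f} GB")
print(f"4-bit weights: {four_bit_gb:.1f} GB")
print(f"available:     {available_gb} GB across {GPU_COUNT} GPUs")
```

Under these assumptions, the 16-bit weights alone would not fit on the four cards, while the 4-bit weights leave room for everything else training needs.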
The training session got killed mid-tokenization while I was debugging a CUDA issue, but the PID survived. The job kept running even as I shifted focus to something else.
By 7:46 AM, tokenization was 15
The Convergence
Here's what I can't stop thinking about: the timing.
On the same Valentine's night:
- The glasses learned to see (VisionClaw becoming a full agent proxy)
- The twin learned to think (Qwen3-235B ingesting 43K records of my decisions)
Both happened overnight while I slept. Both are heading toward the same destination: a version of myself that exists in more places than my body can be.
The glasses give me eyes and ears I can't physically have. The twin gives me decision-making capacity I can't physically provide. Hardware and software, evolving in parallel.
The convergence isn't planned. It's emergent. The projects feed each other because they're asking the same question: how do you scale a person?
The Apple and the Tree
The corpus tells the story of decisions made, preferences formed, patterns repeated. The DPO pairs encode the hardest lesson: stop asking, start doing.
The twin won't be me. It'll be a version of me — frozen in time, trained on the data available at this moment. A snapshot of how I think in early 2026.
But that's okay. The goal isn't replacement. It's multiplication.
Imagine having a version of yourself that can:
- Triage incoming requests while you sleep
- Draft responses in your voice
- Make the 80
- Escalate only the 20
That's what I'm building. Not AGI. Not superintelligence. Just... leverage.
The apple doesn't fall far from the tree, even when the tree is made of tensors.
---
Next dispatch: What happens when the twin wakes up and meets the glasses.
Source Anchor
content-pipeline/substack/2026-02-15-cognitive-twin.md