LUME Multi-Sensor Motion Fusion — Phased Plan
The plan to go from **one camera** (today) to **four fused pose sources**: K11 Femto Bolt + Mac4 Femto Mega + 2 iPhones running MotionMix.
Generated 2026-05-20. Status: not started — post-launch track.
## What this is
The plan to go from one camera (today) to four fused pose sources:
K11 Femto Bolt + Mac4 Femto Mega + 2 iPhones running MotionMix.
## Hard rule: do not start before launch
The bar opens single performer, single Bolt. One camera = one timeline, no
fusion, no clock sync required. This entire plan is the post-launch "smarter
track." The trigger to begin Phase 1 is: the bar has launched on the single
Bolt and is stable. Starting earlier is the scope-creep we explicitly cut.
## Current state (the starting line)
| Source | State |
|---|---|
| K11 Femto Bolt | ✅ Live — pyorbbecsdk COLOR_SENSOR → MediaPipe BlazePose → emits canonical pose JSON on UDP 9705, LUMM 27-bone on 9702, MJPG recording. Verified 2026-05-19. |
| Mac4 Femto Mega | ❌ No pose. Orbbec RGB is not a UVC webcam; needs the pyorbbecsdk-color path (same fix already proven on the Bolt). |
| iPhone ×2 (MotionMix) | ❌ Not wired. MotionMix does on-device pose but keeps it for its own synth. |
Canonical capture-node contract (frozen — the Bolt publisher already emits it):
`{"frame":int,"t":float,"w":int,"h":int,"landmarks":[{"x","y","z","vis"}×33]}`
as UDP datagrams. Every capture node emits this, tagged with a `source` id.
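
To make the contract concrete, here is a minimal sketch of encoding and validating one datagram. The helper names `encode_pose_frame` and `decode_pose_frame` are hypothetical, and carrying `source` as a top-level key is an assumption: the contract says frames are tagged with a `source` id but does not show where the tag lives.

```python
import json

NUM_LANDMARKS = 33  # BlazePose landmark count, per the contract above
LANDMARK_KEYS = {"x", "y", "z", "vis"}


def encode_pose_frame(source, frame, t, w, h, landmarks):
    """Serialize one pose frame to UTF-8 JSON bytes for a single UDP datagram.

    landmarks is a sequence of 33 (x, y, z, vis) tuples.
    """
    if len(landmarks) != NUM_LANDMARKS:
        raise ValueError(f"expected {NUM_LANDMARKS} landmarks, got {len(landmarks)}")
    payload = {
        "source": source,  # assumption: source id carried as a top-level key
        "frame": int(frame),
        "t": float(t),
        "w": int(w),
        "h": int(h),
        "landmarks": [
            {"x": float(x), "y": float(y), "z": float(z), "vis": float(vis)}
            for (x, y, z, vis) in landmarks
        ],
    }
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")


def decode_pose_frame(datagram):
    """Parse a datagram and reject anything that does not match the contract."""
    msg = json.loads(datagram.decode("utf-8"))
    for key in ("frame", "t", "w", "h", "landmarks"):
        if key not in msg:
            raise ValueError(f"missing key: {key}")
    if len(msg["landmarks"]) != NUM_LANDMARKS:
        raise ValueError("landmark count mismatch")
    for lm in msg["landmarks"]:
        if set(lm) != LANDMARK_KEYS:
            raise ValueError("landmark keys do not match the contract")
    return msg
```

A publisher would pass the returned bytes to `socket.sendto`; a receiver would call `decode_pose_frame` on each datagram before handing it to fusion.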
Architecture decisions already locked (from the 2026-05-19 redundancy audit):
- `cc-collection` is the canonical fusion path (1,027-line Kalman engine, parses 33-landmark frames).
- `cc-window-aligner` is parked for LUME: it is real code (20,215 LOC) but mocopi-shaped, and its `PacketPayload` has no MediaPipe variant.
- `femto-bridge`'s 128D `skeleton_128d::Encoder` is kept — the only 128D code in the stack.
## Phases
### Phase 1 — Mega as a second capture node (Mac4)
- Apply the pyorbbecsdk `COLOR_SENSOR` fix to the Femto Mega on Mac4 — identical
class of fix to the Bolt (Orbbec RGB not exposed as UVC).
- Write `mega_pose_pub.py`: Mega color stream → MediaPipe BlazePose → emit the
canonical pose JSON over UDP with `source="mega"`.
- Gate: Mac4 emits 33-landmark pose JSON at ~15-30 fps, confirmed with a
UDP listener.
- Effort: ~half a session.
- Risk: pyorbbecsdk color stream on macOS arm64 is unverified (the Bolt fix
was on Windows). If it fails, fall back to the Mega RGB via a different
capture path. Flag early.
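
The Phase 1 gate can be checked with a small UDP listener. The following is a sketch under assumptions: `listen` and `estimate_fps` are hypothetical names, and the port is whatever `mega_pose_pub.py` ends up sending to (9705 is the Bolt's pose port; the Mega's port is not decided here).

```python
import json
import socket
import time


def estimate_fps(arrival_times):
    """Mean frames per second over a list of arrival timestamps (seconds)."""
    if len(arrival_times) < 2:
        return 0.0
    span = arrival_times[-1] - arrival_times[0]
    return (len(arrival_times) - 1) / span if span > 0 else 0.0


def listen(port, duration_s=5.0, host="0.0.0.0"):
    """Collect datagrams for duration_s seconds.

    Returns (fps over all datagrams, count of datagrams carrying 33 landmarks).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(0.5)  # wake up regularly so the deadline is honored
    arrivals = []
    valid = 0
    deadline = time.monotonic() + duration_s
    try:
        while time.monotonic() < deadline:
            try:
                data, _addr = sock.recvfrom(65535)
            except socket.timeout:
                continue
            arrivals.append(time.monotonic())
            try:
                msg = json.loads(data)
            except ValueError:  # malformed JSON or bad UTF-8
                continue
            lms = msg.get("landmarks") if isinstance(msg, dict) else None
            if isinstance(lms, list) and len(lms) == 33:
                valid += 1
    finally:
        sock.close()
    return estimate_fps(arrivals), valid
```

Calling `listen(port=9705)` on a machine that receives the stream returns the measured rate and the number of well-formed frames; the gate passes if the rate sits in the ~15-30 fps band and every datagram is valid.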
### Phase 2 — iPhone capture-node mode (MotionMix)
- Add a "capture node" mode to the MotionMix iOS app: emit pose JSON over UDP
(same schema, `source="iphone-A"` / `"iphone-B"`) instead of / alongside the
on-device synth path.
- The K11 Bolt publisher is the reference implementation for the emit format.
- Gate: each iPhone emits canonical pose JSON, visible on the LAN.
- Effort: ~half a session (iOS build + UDP emit).
- Risk: Wi-Fi latency/jitter for the iPhone datagrams — measure it; wired
is not an option for a phone.
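
One way to measure the Wi-Fi risk is to log arrival times on the receiver and summarize the gaps between consecutive datagrams. This sketch only captures jitter; one-way latency needs a shared timebase, which is Phase 3. The function name and the returned field names are hypothetical.

```python
import statistics


def interarrival_stats(arrival_times):
    """Summarize gaps between consecutive datagram arrivals.

    arrival_times: receiver-side timestamps in seconds, in arrival order.
    """
    gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    if not gaps:
        raise ValueError("need at least two arrivals")
    return {
        "mean_gap_ms": statistics.fmean(gaps) * 1000.0,
        # Jitter here is the population standard deviation of the gaps.
        "jitter_ms": statistics.pstdev(gaps) * 1000.0,
        "max_gap_ms": max(gaps) * 1000.0,
    }
```

A steady 30 fps stream would show a mean gap near 33 ms with low jitter; a large `max_gap_ms` points at bursts or dropped datagrams.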
### Phase 3 — Clock sync layer
- Four independent device clocks drift. Fusion is only meaningful on a common
timebase. Two options: NTP-discipline every machine to one server, OR a sync
beacon — a cue all cameras see/hear at t0 to zero each offset.
- This is a dependency of Phase 4 (the aligner needs synced clocks to align
against).
- Gate: the four sources' timestamps agree within one frame (~33 ms) after
sync.
- Effort: ~half to one session.
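
The sync-beacon option reduces to simple arithmetic: every source records its local timestamp for the shared cue, and the difference from a reference source is that source's offset. A minimal sketch follows, with hypothetical function names. Note that a single beacon corrects offset only, not drift, so a long session would likely need the cue repeated.

```python
def beacon_offsets(beacon_times, reference):
    """Per-source clock offset relative to the reference source.

    beacon_times maps source id -> local timestamp (seconds) at which that
    source observed the shared cue.
    """
    t_ref = beacon_times[reference]
    return {src: t - t_ref for src, t in beacon_times.items()}


def to_common_time(t_local, source, offsets):
    """Map a source-local timestamp onto the reference source's timebase."""
    return t_local - offsets[source]


def within_one_frame(common_times, frame_s=1.0 / 30.0):
    """Gate check: do all sources' stamps for the same event agree within one frame?"""
    values = list(common_times.values())
    return max(values) - min(values) <= frame_s
```

After the cue, stamping one later shared event on every source and passing the converted times to `within_one_frame` gives a direct pass/fail for the Phase 3 gate.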
### Phase 4 — N-source temporal aligner (cc-collection)
- Engineering gap #1. Generalize `cc-collection/fusion/temporal_align.rs`
from 2 streams to N keyed sources — each keyed by `source` id, aligned to a
canonical timestamp.
- Gate: the aligner ingests 4 keyed streams, outputs time-aligned frame
sets; existing tests stay green + new N-source tests pass.
- Effort: ~one session — genuine Rust work.
- Depends on: Phase 3 (synced clocks), and Phases 1-2 for real test streams.
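
To illustrate what "N keyed sources aligned to a canonical timestamp" could mean, here is a Python sketch of nearest-timestamp alignment against a reference source. It is not the `temporal_align.rs` algorithm, which is Rust and not shown in this plan; the function names, the choice of a reference stream, and the tolerance are all assumptions.

```python
from bisect import bisect_left


def nearest_index(times, t):
    """Index of the value in the sorted, non-empty list `times` closest to t."""
    i = bisect_left(times, t)
    if i == 0:
        return 0
    if i == len(times):
        return len(times) - 1
    return i if times[i] - t < t - times[i - 1] else i - 1


def align_sources(streams, reference, tolerance_s=1.0 / 30.0):
    """Build one frame set per reference frame.

    streams maps source id -> list of (t, frame) sorted by t, with t already
    on the common timebase. A source with no frame within tolerance_s of the
    reference timestamp is left out of that frame set.
    """
    times = {src: [t for t, _ in frames] for src, frames in streams.items()}
    aligned = []
    for t_ref, ref_frame in streams[reference]:
        frame_set = {reference: ref_frame}
        for src, frames in streams.items():
            if src == reference or not frames:
                continue
            j = nearest_index(times[src], t_ref)
            if abs(times[src][j] - t_ref) <= tolerance_s:
                frame_set[src] = frames[j][1]
        aligned.append({"t": t_ref, "frames": frame_set})
    return aligned
```

Leaving a source out of a frame set, rather than reusing a stale frame, keeps the downstream fusion honest about which cameras actually saw the performer at that instant.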
### Phase 5 — FusedSkeleton → FemtoSkeleton adapter
- Engineering gap #2. `cc-collection` produces `FusedSkeleton`;
`femto-bridge`'s encoder consumes `FemtoSkeleton` and currently assumes a
single camera. Build the adapter + the multi-source merge.
- Gate: a fused multi-source skeleton encodes to 128D dynamics that match
the single-source path within tolerance on overlapping joints.
- Effort: ~half a session.
- Depends on: Phase 4.
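
The Phase 5 gate needs a definition of "within tolerance on overlapping joints". The layout of the 128D vector is not described in this plan, so the sketch below compares at the joint level before encoding, treating a joint as overlapping when both skeletons report it above a visibility threshold. The names, the 0.5 threshold, and the per-axis error metric are assumptions.

```python
def max_joint_error(fused, single, min_vis=0.5):
    """Largest per-axis difference over joints visible in both skeletons.

    Each skeleton is a list of dicts with keys x, y, z, vis, in the same
    joint order. Returns None when no joint is visible in both.
    """
    worst = None
    for a, b in zip(fused, single):
        if a["vis"] < min_vis or b["vis"] < min_vis:
            continue  # not an overlapping joint
        err = max(abs(a[k] - b[k]) for k in ("x", "y", "z"))
        worst = err if worst is None else max(worst, err)
    return worst


def matches_within(fused, single, tol, min_vis=0.5):
    """True when at least one joint overlaps and all overlapping joints agree within tol."""
    err = max_joint_error(fused, single, min_vis)
    return err is not None and err <= tol
```

The same comparison could be applied to the encoder output once the mapping from joints to 128D components is known.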
### Phase 6 — End-to-end integration + the payoff
- Wire the full chain: 4 publishers → `cc-collection` fuse → FemtoSkeleton
adapter → `femto-bridge` encoder → `cc-brain` SAN (that last seam is already
wired).
- The payoff this unlocks: occlusion robustness (one camera blocked, the
others cover), 360° capture, and materially better SAN training data.
- Gate: end-to-end run with all 4 sources; the performer stays tracked
through a deliberate single-camera occlusion.
- Effort: ~half a session.
- Depends on: Phases 1-5.
## Dependency order
Phase 1 (Mega) ─┐
Phase 2 (iPhone)┼─> Phase 4 (N-aligner) ─> Phase 5 (adapter) ─> Phase 6 (integrate)
Phase 3 (clocks)┘ ▲
└─ Phase 3 must land before Phase 4 is meaningful

Phases 1, 2, 3 are independent of each other and can run in parallel.
Total: roughly 2 focused sessions of real work, plus the capture-node
plumbing.
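
The dependency order can also be written down as data and checked mechanically. This sketch uses the standard-library `graphlib` (Python 3.9+); the `DEPENDS_ON` table is transcribed from the "Depends on" lines in the phases, and `parallel_batches` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# Phase number -> set of phases it depends on.
DEPENDS_ON = {
    1: set(),
    2: set(),
    3: set(),
    4: {1, 2, 3},
    5: {4},
    6: {1, 2, 3, 4, 5},
}


def parallel_batches(deps):
    """Group phases into batches; every phase in a batch can run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the plan has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


print(parallel_batches(DEPENDS_ON))  # [[1, 2, 3], [4], [5], [6]]
```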
## What makes this tractable
1. The Bolt publisher finished 2026-05-19 is the proven template every other
capture node copies — the hard "Orbbec RGB isn't a webcam" discovery is done.
2. The Mega blocker is a known fix (pyorbbecsdk color), solved once already.
3. `cc-collection` already has the Kalman fusion engine — Phase 4 generalizes
existing code, it doesn't write fusion from scratch.
4. The 128D encoder and the SAN seam already exist and are unchanged.
The only genuinely new engineering is Phase 4 (2→N aligner) and Phase 5 (the
adapter). Everything else is replicate-the-pattern, a known fix, or wiring.
## Related memory
- `k11-bolt-record-hotkey-2026-05-19.md` — the Bolt publisher (the template)
- `lume-sensor-capture-architecture-2026-05-19.md` — the redundancy audit + cc-collection decision
- `motion-score-composer-2026-05-19.md` — the music runtime the fused pose ultimately feeds
Source Anchor
lume-commerce/docs/LUME_MULTI_SENSOR_FUSION_PLAN.md