Path F: Live Performance Optimized -- Minimize Latency at All Costs
Concept
Every design decision is filtered through one question: does this add latency? If yes, reject it. Direct UDP everywhere. No middleware, no brokers, no HTTP coordination. Publishers bind [ip] and consumers hardcode IPs. The system is static: every connection is pre-configured at deploy time. No dynamic discovery, no subscriptions, no health checks. The absolute minimum number of network hops between any sensor and any output. Where possible, merge publisher and consumer into the same process.
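As a rough illustration of what "static, pre-configured, fire-and-forget" can look like in code, here is a minimal sketch of a publisher with a fixed destination table. The names `DESTINATIONS`, `make_sender`, and `publish` are hypothetical, and the loopback addresses are placeholders for whatever IPs a deployment hardcodes; only the stream tags and ports come from the architecture below.

```python
import socket

# Hypothetical static routing table, fixed at deploy time.
# Stream tags and ports follow the architecture diagram; the loopback
# address is a placeholder for the IPs a real deployment would hardcode.
DESTINATIONS = {
    "LUMF": [("127.0.0.1", 9701)],
    "LUMM": [("127.0.0.1", 9702)],
}


def make_sender() -> socket.socket:
    """Create one UDP socket that is reused for every send."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setblocking(False)  # never stall the sensor loop on the network
    return sock


def publish(sock: socket.socket, stream: str, payload: bytes) -> int:
    """Send payload to each pre-configured destination: no ACK, no retry.

    Returns the number of destinations the datagram was handed off to.
    """
    sent = 0
    for addr in DESTINATIONS[stream]:
        try:
            sock.sendto(payload, addr)
            sent += 1
        except OSError:
            pass  # fire-and-forget: a failed send is simply dropped
    return sent
```

Adding a consumer in this sketch means editing `DESTINATIONS` and redeploying, which is exactly the rigidity the design accepts in exchange for having no discovery or subscription step on the hot path.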
Architecture
K11 (all sensor -> visual, ZERO network hops)
+-----------------------------------------------------------+
| Femto Bolt USB-C |
| -> pyorbbecsdk in-process |
| -> UDP loopback :9700 LUMD |
| -> Unity LumeUdpReceiver (same machine, <1ms) |
| |
| UMA-8 USB-A |
| -> sounddevice in-process |
| -> UDP loopback :9701 LUMF |
| -> Unity LumeAudioFftReceiver (same machine, <1ms) |
| |
| mocopi BLE |
| -> Sony app -> :12351 |
| -> mocopi_bridge.py |
| -> UDP loopback :9702 LUMM |
| -> Unity LumeMocopiReceiver (same machine, <1ms) |
| |
| Unity: sensor -> processing -> visuals (HDMI) |
| Total sensor-to-pixel: 5-15ms (camera latency dominant) |
+-----------------------------------------------------------+
|
| UDP multicast: LUMF + LUMM to Tailscale peers
| (fire-and-forget, no ACK, no retry)
v
MotionMix iOS (receives LUMM + LUMF, processes locally)
+-----------------------------------------------------------+
| LUMM -> MocopiFeatureExtractor -> 128D[76:100] |
| LUMF -> ambient audio features -> 128D[104:108] (NEW) |
| CoreMotion -> EchelonBridge -> 128D[0:75] |
| Vision -> pose features -> 128D[63:69] |
| 128D canonical -> SAN -> music params |
| music params -> StrudelWebEngine -> Mac5 :9600 |
| (direct UDP push, not HTTP, NEW) |
+-----------------------------------------------------------+
|
| UDP: raw Strudel params (6 floats, 24 bytes)
v
Mac5 (receives params, synthesizes audio, ZERO processing)
+-----------------------------------------------------------+
| Strudel.js: receives params, plays music, speakers out |
| Total motion-to-sound: 30-60ms |
| breakdown: mocopi BLE ~15ms + bridge ~2ms + |
| Tailscale ~10ms + iOS processing ~5ms + |
| Tailscale ~10ms + Strudel render ~8ms |
+-----------------------------------------------------------+
Latency Budget
| Hop | Path | Latency |
|---|---|---|
| 1 | Femto ToF capture | ~5ms |
| 2 | pyorbbecsdk -> Python | ~1ms |
| 3 | UDP loopback :9700 | <0.5ms |
| 4 | Unity receive + reproject | ~2ms |
| 5 | Unity render + display | ~5ms |
| Visual total | Sensor -> pixel | ~13ms |
| 1 | mocopi BLE -> Sony app | ~15ms |
| 2 | Sony -> :12351 -> bridge | ~2ms |
| 3 | LUMM loopback :9702 | <0.5ms |
| 4 | LUMM Tailscale -> iOS | ~10ms |
| 5 | iOS MocopiExtractor | ~1ms |
| 6 | iOS SAN + ParamMapper | ~3ms |
| 7 | Params Tailscale -> Mac5 | ~10ms |
| 8 | Strudel.js render | ~8ms |
| Audio total | Skeleton -> sound | ~50ms |
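The totals in the table can be checked by summing the per-hop figures. The sketch below does that, entering each "<0.5ms" hop at its 0.5 ms upper bound (an assumption, which is why the sums land half a millisecond above the rounded totals). The variable and function names are illustrative only.

```python
# Per-hop latencies in milliseconds, copied from the latency budget table.
# "<0.5ms" entries are entered at their 0.5 ms upper bound (assumption).
VISUAL_HOPS = [
    ("Femto ToF capture", 5.0),
    ("pyorbbecsdk -> Python", 1.0),
    ("UDP loopback :9700", 0.5),
    ("Unity receive + reproject", 2.0),
    ("Unity render + display", 5.0),
]
AUDIO_HOPS = [
    ("mocopi BLE -> Sony app", 15.0),
    ("Sony -> :12351 -> bridge", 2.0),
    ("LUMM loopback :9702", 0.5),
    ("LUMM Tailscale -> iOS", 10.0),
    ("iOS MocopiExtractor", 1.0),
    ("iOS SAN + ParamMapper", 3.0),
    ("Params Tailscale -> Mac5", 10.0),
    ("Strudel.js render", 8.0),
]
SYNC_WINDOW_MS = 80.0  # audio-visual sync window quoted under Strengths


def total_ms(hops):
    """Sum the latency column of a list of (name, milliseconds) pairs."""
    return sum(ms for _, ms in hops)


visual_total = total_ms(VISUAL_HOPS)
audio_total = total_ms(AUDIO_HOPS)
# Time spent on the two Tailscale hops of the audio path.
tailscale_ms = sum(ms for name, ms in AUDIO_HOPS if "Tailscale" in name)

print(f"visual total: {visual_total} ms")
print(f"audio total:  {audio_total} ms (headroom {SYNC_WINDOW_MS - audio_total} ms)")
print(f"Tailscale share of audio path: {tailscale_ms / audio_total:.0%}")
```

With these inputs the two Tailscale hops account for 20 ms of the audio path, more than any other single hop in the table.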
Strengths
- Fastest possible visual response. Of the ~13ms sensor-to-pixel total, ~5ms is the camera's own capture latency. The rest of the pipeline, including Unity render and display, adds roughly 8ms.
- Fastest possible audio response. ~50ms skeleton-to-sound falls inside the window in which humans perceive audio and visuals as synchronous (~80ms).
- No infrastructure dependencies. No NATS, no message brokers, no health check services. Just UDP sockets.
- Battle-tested. This is essentially how the system works today (minus LUMM to iOS). The existing code IS this architecture.
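To show how little machinery "just UDP sockets" implies, here is a sketch of the 24-byte parameter push from the architecture diagram. The wire format is an assumption: six little-endian float32 values with no header, which is one layout that matches "6 floats, 24 bytes". The function names are hypothetical and the existing code may encode the parameters differently.

```python
import socket
import struct

# Assumed wire format: six little-endian float32 values, no header.
PARAM_FORMAT = "<6f"
PARAM_SIZE = struct.calcsize(PARAM_FORMAT)  # 6 * 4 = 24 bytes


def pack_params(params) -> bytes:
    """Encode exactly six floats as one 24-byte datagram."""
    if len(params) != 6:
        raise ValueError("expected exactly 6 parameters")
    return struct.pack(PARAM_FORMAT, *params)


def unpack_params(datagram: bytes) -> tuple:
    """Decode a 24-byte datagram back into six floats."""
    if len(datagram) != PARAM_SIZE:
        raise ValueError(f"expected {PARAM_SIZE} bytes, got {len(datagram)}")
    return struct.unpack(PARAM_FORMAT, datagram)


def push_params(sock: socket.socket, dest, params) -> None:
    """Direct UDP push of one parameter frame; no ACK is awaited."""
    sock.sendto(pack_params(params), dest)
```

A sender would call `push_params(sock, (mac5_ip, 9600), params)` once per frame, where `mac5_ip` stands for the hardcoded Tailscale address of Mac5.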
Weaknesses
- Rigid topology. Adding a new consumer means editing publisher code to add a destination. No dynamic discovery.
- No monitoring. If packets drop, nobody knows. If a publisher crashes, silence is the only signal.
- Hardcoded IPs. Tailscale IPs are stable, but they are still configuration. If Mac5's IP changes (for example after a Tailscale reinstall), every hardcoded reference to it must be updated manually.
- Fire-and-forget UDP. No delivery guarantees. In practice, local loopback is reliable, but Tailscale hops can have 100
- No coordination layer. No session start/stop, no state synchronization, no beat clock across machines. Each machine runs independently.
- Music-to-visual sync gap. K11 visuals react to local sensors. Mac5 audio reacts to iOS-processed sensors. These two reaction paths have different latencies and no shared clock. A transient beat hit on the audio may not visually sync with the depth-triggered impulse.
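The "nobody knows" problem with dropped packets can be made concrete with a small sketch. The design as described carries no sequence header, so everything here is hypothetical: a 4-byte counter prefixed by the sender and a receiver-side `DropCounter` that tallies gaps. It illustrates the kind of visibility raw fire-and-forget UDP lacks and is not a description of the existing code.

```python
import struct

SEQ_HEADER = struct.Struct("<I")  # hypothetical 4-byte sequence prefix
MASK = 0xFFFFFFFF
HALF = 0x80000000


def stamp(seq: int, payload: bytes) -> bytes:
    """Sender side: prefix the payload with a wrapping 32-bit counter."""
    return SEQ_HEADER.pack(seq & MASK) + payload


class DropCounter:
    """Receiver side: count sequence gaps without blocking or replying."""

    def __init__(self):
        self.expected = None  # next sequence number we expect to see
        self.lost = 0         # datagrams presumed dropped so far

    def observe(self, datagram: bytes) -> bytes:
        """Update the loss tally and return the payload without the header."""
        (seq,) = SEQ_HEADER.unpack_from(datagram)
        payload = datagram[SEQ_HEADER.size:]
        if self.expected is None:
            self.expected = (seq + 1) & MASK
            return payload
        gap = (seq - self.expected) & MASK
        if gap < HALF:
            # Forward jump: everything skipped is presumed lost.
            # A packet that later arrives out of order stays counted as lost.
            self.lost += gap
            self.expected = (seq + 1) & MASK
        # Otherwise the packet is late or duplicated: pass it through as-is.
        return payload
```

The counter adds four bytes per datagram and no extra round trip, so it stays within the latency-first constraint, but something still has to read and report `lost`.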
Verdict
ADOPT as the data plane. Direct UDP is correct for all real-time sensor data. The latency budget is the ground truth that every other architecture must beat or match. But the control plane needs something more than fire-and-forget: session management, health monitoring, and beat synchronization are real requirements that raw UDP alone cannot solve.
Promotion Decision
Promote into a technical note or architecture paper with implementation anchors.
Source Anchor
evo-cube-output/lume-full-system-architecture/stage1-path-f.md