LUME Agent Contracts — v1

**Status:** authoritative as of 2026-05-24. Lock these contracts before parallel agent work continues. If anything below changes, **update this doc first**, then the agents.

This document exists because the bar's runtime is forked across three agents
(K11 audio, Mac4 visuals/cameras, Mac1 dev) and they will silently drift unless
they all code against the same packet schema, IPs, ports, and conventions.

---

1. Architecture overview

┌─────────────────────────────┐         UDP pose JSON          ┌─────────────────────────────┐
│            Mac4             │  ────────────────────────────> │             K11             │
│  Femto Bolt (vertical)      │      Tailscale [ip]            │  Rekordbox + loopMIDI       │
│  Femto Mega (horizontal)    │            :9705               │  body→MIDI/keyboard bridge  │
│  MediaPipe pose publisher   │                                │  Speakers + 115 stems       │
│  Unity / DYK visuals (local)│                                │                             │
│  MotionMix iPhone (optional)│                                │                             │
└─────────────────────────────┘                                └─────────────────────────────┘
                                                                              │
                                                                              ▼
                                                                       Bar speakers

Mac1 is dev only — not in the live runtime. Its job is the repo, this doc,
and ensuring the bridge code is committed.

Twitch / OBS / NDI / hotspot are explicitly out of scope for v1. The bar
runs local. Stream comes back later.

Known constraints (verified 2026-05-22)

- Mac4 cannot run Bolt depth. `pyorbbecsdk` on macOS reports "Femto Bolt is
unavailable on macOS due to Depth Engine." Mega depth works (~30 fps), Bolt
is RGB-only on Mac4. Dual-depth on Mac4 is not pursued.
- K11 is where Bolt depth lives if/when needed (the original Bolt depth
pipeline was proven on K11/Windows).
- K11 Pose Coach Viewer is the current Bolt owner. While
`C:\temp\lume_pose_viewer.py` is open it owns the Bolt color stream, emits
`source=orbbec_bolt`, `role=bolt-rgb` pose JSON to local UDP `:9705`, and
overlays body plus hand landmarks for live debugging.
- Mac4 keeps the Mega depth → Unity baseline at `[ip]:9700`. Don't
touch it.
- Mac4 RGB publisher uses the Mega, not the Bolt, for v1. Bolt RGB +
MediaPipe is a future path; permissions need to be granted to the runtime
process before it can capture.

---

2. Network endpoints

| Edge | From | To | Protocol |
|---|---|---|---|
| Pose feed | Mac4 publisher | `[ip]:9705` (K11 Tailscale) | UDP |
| Wide pose assist | MotionMix iPhone | `[ip]:9705` (K11 Tailscale) | UDP |
| Bridge bind | K11 bridge | `[ip]:9705` | UDP listen |

Why Tailscale, not LAN: UDP pose JSON over Tailscale routes between machines
regardless of subnet and sidesteps the hotspot / NDI mDNS problem entirely.
Mac4 must have Tailscale running.

Firewall — K11 (Windows Defender will block inbound UDP by default):

```powershell
New-NetFirewallRule -DisplayName "LUME pose UDP 9705" `
  -Direction Inbound -Protocol UDP -LocalPort 9705 `
  -Action Allow -RemoteAddress [ip]/10 -Profile Any
```

Scoped to the Tailscale CGNAT range (`[ip]/10`) — not the public internet.

Status 2026-05-22: ✅ End-to-end verified. K11 firewall rule
`LUME pose UDP 9705` added (`Get-NetFirewallRule -DisplayName 'LUME pose UDP 9705'`
returns enabled). Bridge listens on `[ip]:9705`, loopMIDI port named `LUME`
is the active MIDI out. Mac1→Tailscale→K11 sender test pushed
151/151 packets through to the bridge cleanly, 0 bad, MIDI emitted at 15 Hz.
Mac4's publisher hitting `[ip]:9705` will be picked up without any
further K11-side config.

Mac4 Tailscale source IP: `[ip]`. If the K11 agent wants to
scope the firewall tighter than the whole CGNAT range, use this single
`-RemoteAddress`.

Reachability sanity check (Mac4-side, before writing the publisher):

```bash
# from Mac4:
echo '{"frame":0,"t":0.0,"w":1,"h":1,"landmarks":[]}' | nc -u -w1 [ip] 9705
# Then on K11, confirm bridge logs "bad" count incremented (landmarks!=33 is OK -- proves packet arrived).
```
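
The same reachability probe can be sketched in Python for machines without `nc`. This is a minimal sketch, not the repo's `pose_replay.py`: the function names `make_probe_packet` and `send_probe` are hypothetical, the landmark values are placeholders, and tagging the probe as `"replay"` is an assumption about how a test sender should identify itself. Unlike the empty-landmarks probe above, it sends 33 landmarks, so a bridge that accepts the `"replay"` source should count it as a valid packet rather than a bad one.

```python
import json
import socket
import time


def make_probe_packet(frame: int = 0) -> bytes:
    """Build one schema-shaped pose datagram with 33 placeholder landmarks."""
    landmarks = [{"x": 0.5, "y": 0.5, "z": 0.0, "vis": 0.0} for _ in range(33)]
    packet = {
        "frame": frame,
        "t": time.time(),       # Unix seconds, publisher clock
        "w": 1280,
        "h": 720,
        "landmarks": landmarks,
        "source": "replay",     # assumption: probes identify as synthetic replay
    }
    return json.dumps(packet).encode("utf-8")


def send_probe(host: str, port: int = 9705, count: int = 5) -> int:
    """Send `count` probe datagrams; return how many were handed to the OS."""
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for frame in range(count):
            sock.sendto(make_probe_packet(frame), (host, port))
            sent += 1
    return sent
```

Call it as `send_probe("<K11 Tailscale IP>")`. UDP gives no delivery confirmation, so the return value only proves the send side worked; the bridge log on K11 remains the source of truth.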

---

3. Pose JSON schema — FROZEN, do not extend without updating this doc

Each UDP datagram is a single UTF-8 JSON object:

```json
{
  "frame":     12345,
  "t":         1748053200.123,
  "w":         1280,
  "h":         720,
  "landmarks": [
    {"x": 0.512, "y": 0.428, "z": -0.011, "vis": 0.99},
    ... (EXACTLY 33 entries, in BlazePose order)
  ],
  "source":    "bolt",
  "presence":  0.95
}
```

| Field | Type | Required | Meaning |
|---|---|---|---|
| `frame` | int | yes | Monotonic per-publisher frame counter |
| `t` | float | yes | Unix seconds, publisher clock |
| `w` | int | yes | Image width in pixels |
| `h` | int | yes | Image height in pixels |
| `landmarks` | array[33] | yes | BlazePose 33-landmark array (order: see MediaPipe docs) |
| `landmarks[i].x` | float [0,1] | yes | Normalized image x (left=0, right=1) |
| `landmarks[i].y` | float [0,1] | yes | Normalized image y (top=0, bottom=1) |
| `landmarks[i].z` | float | yes | Relative depth, hips≈0, negative=closer to camera |
| `landmarks[i].vis` | float [0,1] | yes | BlazePose visibility |
| `source` | string | OPTIONAL v1, REQUIRED v2 | `"bolt"`, `"orbbec_bolt"`, `"mega"`, `"fused"`, `"iphone_motionmix"`, `"insta360_virtual"`, or `"replay"` (see §4) |
| `image_landmarks` | array[33] | optional | Normalized image-space BlazePose landmarks. K11 live viewer emits this; bridge prefers it for framing and hand/arm geometry. |
| `hands` | array | optional | MediaPipe Hands detections. Each hand has `handedness`, `landmarks` (21 normalized image points), and optional `world_landmarks`. |
| `segmentation` | object | optional | Compact person/torso/hand segmentation evidence. K11 viewer emits schema `lume.segmentation.v1`; bridge stores summary columns plus raw JSON. |
| `presence` | float [0,1] | optional | Overall body-present confidence |
| `confidence` | float [0,1] | optional | Synonym for `presence` (publishers may emit either or both; consumers prefer `presence` when both are present) |
| `ts_ns` | int | optional | High-resolution publisher timestamp in nanoseconds (`time.time_ns()`) |
| `center` | `{x,y}` | optional | Body center in normalized image space |
| `body_center` | `{x,y}` | optional | Same as `center`; prefer this name going forward |
| `bounds` | `{min_x,min_y,max_x,max_y}` | optional | Body bounding box in normalized image space |
| `role` | string | optional | Publisher role, e.g. `"wide_pose"` for MotionMix/iPhone range assist |
| `stage` | string | optional | Human-readable stage/room id for apartment capture |
| `camera` | object | optional | Source camera metadata for virtual/programmable camera feeds; consumers may ignore |
| `body` | object | optional | MotionMix body metrics: center, span, coverage, framing |
| `motion` | object | optional | MotionMix motion metrics: energy, core, legs, wrist velocity, hip velocity |
| `capabilities` | array[string] | optional | Source capability hints, e.g. `["pose","range","velocity"]` |

Coordinate notes:
- `x` is NOT mirror-flipped on the publisher. The vertical Bolt and the
horizontal Mega will produce different "left/right" meanings — the fuser
reconciles this in v2; v1 picks one camera as canonical.
- `y` is image-space (top=0). The bridge inverts to get "height in space" =
`1 - y`.
- `z` is BlazePose's relative scale, not metric depth.

Bridge tolerance: unknown extra fields are ignored. Missing `source` is
tolerated in v1 and treated as `"bolt"`. Missing or wrong-size `landmarks`
increments the `bad` counter and the packet is dropped.
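
The tolerance rules above can be sketched as a small parser. This is an illustrative sketch under stated assumptions, not the bridge's actual code: `parse_pose_packet` is a hypothetical name, and treating a missing `frame`, `t`, `w`, or `h` as a bad packet is an assumption (the text only names missing or wrong-size `landmarks` explicitly).

```python
import json
from typing import Optional

REQUIRED_FIELDS = ("frame", "t", "w", "h", "landmarks")
LANDMARK_KEYS = ("x", "y", "z", "vis")


def parse_pose_packet(datagram: bytes) -> Optional[dict]:
    """Return a normalized packet dict, or None if it should count as bad."""
    try:
        packet = json.loads(datagram.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None
    if not isinstance(packet, dict):
        return None
    # Assumption: any missing required field makes the packet bad.
    if any(field not in packet for field in REQUIRED_FIELDS):
        return None
    landmarks = packet["landmarks"]
    if not isinstance(landmarks, list) or len(landmarks) != 33:
        return None
    for lm in landmarks:
        if not isinstance(lm, dict) or any(key not in lm for key in LANDMARK_KEYS):
            return None
    # Missing source is tolerated in v1 and treated as "bolt".
    packet.setdefault("source", "bolt")
    # Consumers prefer presence; fall back to its synonym confidence.
    if "presence" not in packet and "confidence" in packet:
        packet["presence"] = packet["confidence"]
    # Unknown extra fields are left in place and simply ignored downstream.
    return packet
```

A caller would increment its `bad` counter whenever this returns `None` and drop the datagram.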

---

4. Source-tagging convention

| `source` | Used by | When |
|---|---|---|
| `"bolt"` | Mac4 Bolt publisher | each camera publishes independently |
| `"orbbec_bolt"` | K11 LUME Pose Coach Viewer | current live-control source when the K11 viewer owns the Bolt camera; role is `"bolt-rgb"` |
| `"mega"` | Mac4 Mega publisher | each camera publishes independently |
| `"fused"` | Mac4 fuser (v2) | a single stream that already merged both cameras |
| `"iphone_motionmix"` | MotionMix iPhone publisher | optional wide-angle/range pose assist for room-scale staging; `role` identifies placement such as `"phone_front"`, `"phone_side_left"`, `"phone_side_right"`, or `"phone_torso"` |
| `"insta360_virtual"` | MotionMix + Insta360 SDK preview | programmable 360 source; `role` identifies the active virtual crop such as `"insta360-front-wide"` |
| `"replay"` | Synthetic replay tool | bridge tests and Rekordbox MIDI LEARN |

v1 live-control expectation: exactly ONE raw source is accepted for live
controls. On K11 today that is typically `orbbec_bolt:bolt-rgb`. Additional raw
sources, including `iphone_motionmix:*`, may share `:9705` for database capture
only. This works because the bridge still records filtered packets when
`--record-motion-enable` is active. Keep them filtered out of live control with
`--accept-source` until a fuser emits a single canonical stream.

v2 live-control expectation: a fuser sends `"fused"` packets. Accepting two raw
streams for control on the same port without fusion is explicitly not
supported: the bridge would compute landmark velocity across two cameras'
frames and produce garbage. Raw multi-source data is safe as recorded evidence,
not as the direct live gesture source.
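
The accept-one, record-the-rest rule can be sketched as a routing decision per packet. This is a sketch under stated assumptions: `source_key` and `route_packet` are hypothetical names, and matching `source:role` keys with shell-style wildcards (so that `iphone_motionmix:*` works as a pattern) is an assumption about how `--accept-source` behaves, not a documented fact.

```python
from fnmatch import fnmatchcase


def source_key(packet: dict) -> str:
    """Build a "source:role" key; missing source is treated as "bolt" (v1)."""
    source = packet.get("source", "bolt")
    role = packet.get("role") or ""
    return f"{source}:{role}"


def route_packet(packet: dict, accept_pattern: str, recording_enabled: bool) -> dict:
    """Decide whether a packet drives live control and whether it is recorded."""
    accepted = fnmatchcase(source_key(packet), accept_pattern)
    return {
        "control": accepted,            # only the accepted source drives gestures/CC
        "record": recording_enabled,    # filtered packets are still recorded
        "status": "accepted" if accepted else "filtered",
    }
```

With `accept_pattern="orbbec_bolt:bolt-rgb"`, a MotionMix packet comes back as `{"control": False, "record": True, "status": "filtered"}` while recording is on, so velocity is never computed across frames from two cameras.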

---

5. Component locations — where things live

| Component | Machine | Path |
|---|---|---|
| LUME Pose Coach Viewer (current live Bolt owner) | K11 / repo | repo: `core/audio-media/cc-echelon/tools/lume-rekordbox-bridge/lume_pose_viewer.py`; runtime: `C:\temp\lume_pose_viewer.py`; task: `LumePoseViewer` |
| Bolt publisher (template, K11 runtime) | K11 | `C:\lume\services\bolt-skeleton-pub\bolt_rgb_blazepose_pub_v2.py` |
| Bolt publisher (canonical, in repo) | repo | `core/motion/lume-pose-publishers/bolt_rgb_blazepose_pub_v2.py` (SCP'd from K11 2026-05-22) |
| Mac4 publisher (Mega RGB → MediaPipe, v1) | Mac4 | `mac4_pose_udp_publisher.py` (working; to be committed by Mac4 agent into `core/motion/lume-pose-publishers/`) |
| MotionMix LUME Sensor Mode | iPhone / repo | `Desktop/MotionMixApp/MotionMixApp/Services/JointDataUploader.swift` (`LumeSensorPublisher`) |
| MotionMix SAN training stream | iPhone / repo | `Desktop/MotionMixApp/MotionMixApp/Services/SANTrajectoryLogger.swift` posts NDJSON to `http://[ip]:9471/san-frame` |
| `pose_replay.py` (synthetic packet emitter) | repo | `core/audio-media/cc-echelon/tools/lume-rekordbox-bridge/pose_replay.py` |
| `lume_san_receiver.py` | repo / K11 | K11 HTTP receiver task `LumeSanReceiver`; stores MotionMix SAN training frames in `body_motion_san.sqlite3` |
| `lume_multisource_report.py` | repo / K11 | checks `body_motion.sqlite3` and `body_motion_san.sqlite3` for source coverage, accepted vs filtered streams, multi-angle overlap, and SAN training arrival |
| `lume_segmentation_report.py` | repo / K11 | summarizes recent `lume.segmentation.v1` evidence from `body_motion.sqlite3` |
| Bridge source | repo (Comp-Core) | `core/audio-media/cc-echelon/tools/lume-rekordbox-bridge/lume_rekordbox_bridge.py` |
| Bridge runtime | K11 | `C:\temp\lume_rekordbox_bridge.py`; launcher `C:\temp\run_lume_bridge_logged.cmd`; task `MotionMixLumeBridge` |
| Motion evidence DB | K11 | `C:\lume\data\body_motion.sqlite3` |
| SAN training evidence DB | K11 | `C:\lume\data\body_motion_san.sqlite3` |
| K11 setup script | repo | `core/audio-media/cc-echelon/tools/lume-rekordbox-bridge/setup-bridge.ps1` |
| `lume-stem-live` binary | K11 | `C:\lume\build\cc-echelon\target\release\lume-stem-live.exe` (dormant in v1) |
| Stem library | K11 | `C:\lume\stems\{soundcloud,bandcamp}\<track>\{bass,drums,other,vocals}.wav` |

Mac4's publisher must be ported from the K11 file, not written from scratch.
The K11 file is the proven template — it emits exactly the schema in §3 and was
verified live for the lume-stem-live test.

K11 motion evidence recorder

When the bridge is launched with `--record-motion-enable`, it writes valid pose
traffic and bridge events to SQLite at `C:\lume\data\body_motion.sqlite3`.

Tables:

- `motion_sessions` - one row per bridge runtime session.
- `pose_frames` - raw pose JSON plus source, role, peer, accepted/filter status,
frame dimensions, landmark counts, segmentation summaries, and derived bridge features.
- `pose_events` - gesture fires, voice-coach guidance, and other discrete bridge
events.
- `motion_labels` - labeled training/calibration segments with start/end times
for examples such as `hand_raise`, `head_nod`, `seated_idle`, and torso leans.

K11 hand gesture runtime

As of 2026-05-24, the K11 Pose Coach Viewer also loads
`C:\lume\services\bolt-skeleton-pub\hand_landmarker.task` and emits optional
`hands` in the same UDP packet as the body landmarks. The bridge uses this
field when available for raised-hand play/pause, then falls back to body-pose
wrists if no hands are detected.
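
The hands-first, body-wrists-second fallback can be sketched as below. Several details are assumptions made for illustration: the function names are hypothetical, each hand landmark is assumed to be a dict with `x` and `y` keys, and "raised" is defined here as a wrist sitting above the nose by a margin, which is not necessarily the bridge's real criterion. The landmark indices are the standard ones: BlazePose nose 0, left wrist 15, right wrist 16, and MediaPipe Hands wrist 0.

```python
NOSE, LEFT_WRIST, RIGHT_WRIST = 0, 15, 16   # BlazePose indices
HAND_WRIST = 0                              # MediaPipe Hands wrist index


def body_points(packet: dict) -> list:
    """Prefer image-space landmarks when the publisher provides them."""
    return packet.get("image_landmarks") or packet["landmarks"]


def wrist_points(packet: dict) -> tuple:
    """Return (wrist points, origin), preferring hand detections over body pose."""
    hands = packet.get("hands") or []
    points = [h["landmarks"][HAND_WRIST] for h in hands if h.get("landmarks")]
    if points:
        return points, "hands"
    body = body_points(packet)
    return [body[LEFT_WRIST], body[RIGHT_WRIST]], "body"


def hand_is_raised(packet: dict, margin: float = 0.05) -> bool:
    """Assumed criterion: any wrist above the nose by `margin` (y grows downward)."""
    nose_y = body_points(packet)[NOSE]["y"]
    points, _origin = wrist_points(packet)
    return any(p["y"] < nose_y - margin for p in points)
```

Because image `y` has top=0, "above" means a smaller `y`; the comparison would flip if the inverted `1 - y` height were used instead.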

Current live gesture command path:

- `LumePoseViewer` owns the Bolt camera and sends body + hand JSON to `:9705`.
- `MotionMixLumeBridge` accepts only `orbbec_bolt:bolt-rgb` for live control.
- The launcher includes `--gesture-enable --hand-raise-enable`.
- A deliberate hand raise fires `Z` into Rekordbox for play/pause.

K11 Pose Coach segmentation runtime

As of 2026-05-24, the K11 Pose Coach Viewer also loads
`C:\lume\services\bolt-skeleton-pub\selfie_segmenter.tflite` and runs MediaPipe
`ImageSegmenter` every two frames by default.

The viewer does not send full masks over UDP. It overlays the person mask in
the K11 window, then emits compact `segmentation` evidence:

- `person`: mask coverage, mean score, and bbox.
- `torso`: mask overlap with the pose-derived torso ROI.
- `hands`: mask overlap with hand-landmark ROIs.
- `quality`: a combined long-running confidence score.

The bridge recorder auto-migrates `pose_frames` with:

- `segmentation_quality`
- `segmentation_person_coverage`
- `segmentation_torso_coverage`
- `segmentation_hand_coverage`
- `segmentation_hand_count`
- `segmentation_json`
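
An additive auto-migration of this kind can be sketched with SQLite's `PRAGMA table_info` and `ALTER TABLE ... ADD COLUMN`. The column names come from the list above; the column types and the function name `migrate_pose_frames` are assumptions, since the recorder's real DDL is not shown here.

```python
import sqlite3

# Assumption: types chosen for illustration; the recorder's real types may differ.
SEGMENTATION_COLUMNS = {
    "segmentation_quality": "REAL",
    "segmentation_person_coverage": "REAL",
    "segmentation_torso_coverage": "REAL",
    "segmentation_hand_coverage": "REAL",
    "segmentation_hand_count": "INTEGER",
    "segmentation_json": "TEXT",
}


def migrate_pose_frames(conn: sqlite3.Connection) -> list:
    """Add any missing segmentation columns to pose_frames; return those added."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute("PRAGMA table_info(pose_frames)")}
    added = []
    for name, sql_type in SEGMENTATION_COLUMNS.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE pose_frames ADD COLUMN {name} {sql_type}")
            added.append(name)
    conn.commit()
    return added
```

The migration is idempotent: running it against an already-migrated database adds nothing and returns an empty list, so it is safe to call on every bridge start.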

Check the live lane on K11:

```powershell
python C:\temp\lume_segmentation_report.py --window-seconds 120 --limit 3
```

This is evidence capture, not model training yet. Training and calibration should
consume this database later to learn Mohamed-specific seated, torso, hand,
head-nod, and room-position patterns.

MotionMix SAN trajectory capture is recorded by a separate K11 scheduled task:

- task: `LumeSanReceiver`
- endpoint: `POST http://[ip]:9471/san-frame`
- DB path: `C:\lume\data\body_motion_san.sqlite3`
- tables: `san_receiver_sessions`, `san_training_batches`,
`san_training_frames`

The sidecar DB is intentional. The live bridge writes pose frames continuously to
`body_motion.sqlite3`; SAN training uses its own SQLite file to avoid blocking
live gesture control when long capture sessions are running. MotionMix also
keeps phone-local JSONL backup files under `Documents/san-training`.

The bridge label-control port is local to K11 by default:

```powershell
python C:\temp\lume_motion_label.py segment hand_raise --seconds 5
python C:\temp\lume_motion_label.py mark bad_tracking
```

Guided capture and calibration tools:

```powershell
python C:\temp\lume_guided_capture.py --profile quick
python C:\temp\lume_motion_calibrate.py
python C:\temp\lume_multisource_report.py --window-seconds 300
```

`lume_motion_calibrate.py` writes `C:\lume\data\gesture_calibration.json` and
only proposes thresholds when enough baseline and positive labels exist.

---

6. MIDI / keyboard contract (K11 bridge → Rekordbox)

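Bridge emits on MIDI channel 1, via the first MIDI out port whose name
contains the substring `loopMIDI` (the default loopMIDI port name). Pass
`--midi-port` to select a differently named port; the verified K11 setup uses
`--midi-port LUME`.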

| CC | Source feature | Range | Suggested Rekordbox target |
|---|---|---|---|
| 1 | body energy (movement intensity) | 0-127 | Low-pass filter cutoff / FX wet |
| 7 | arms height (wrists up = high) | 0-127 | Master volume / deck gain |
| 10 | hand spread (wide arms) | 0-127 | Crossfader |
| 11 | body height (head up / crouched) | 0-127 | Reverb wet |
| 74 | hip sway (left ↔ right) | 0-127 | Low EQ / filter motion |

CC output is rate-limited to 30 Hz, smoothed (EMA, α=0.25), and re-sent only when the value changes.
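
The smoothing, rate limit, and send-on-change rules can be sketched per CC as follows. This is a minimal sketch, assuming each feature arrives normalized to [0, 1]; the class name `CcChannel` and the helper `cc_message` are hypothetical, and the order in which the checks run is an assumption. The one fixed fact is the MIDI framing: a Control Change on channel 1 is the status byte `0xB0` followed by the controller number and the value.

```python
from typing import Optional


def cc_message(cc: int, value: int) -> bytes:
    """Raw MIDI Control Change on channel 1: status 0xB0, controller, value."""
    return bytes([0xB0, cc & 0x7F, value & 0x7F])


class CcChannel:
    """EMA-smoothed, rate-limited CC output that only emits on value change."""

    def __init__(self, cc: int, alpha: float = 0.25, max_hz: float = 30.0):
        self.cc = cc
        self.alpha = alpha
        self.min_interval = 1.0 / max_hz
        self.smoothed: Optional[float] = None
        self.last_sent_value: Optional[int] = None
        self.last_sent_time: Optional[float] = None

    def update(self, feature: float, now: float) -> Optional[int]:
        """Feed one feature sample; return a 0-127 value to send, or None."""
        feature = min(1.0, max(0.0, feature))
        if self.smoothed is None:
            self.smoothed = feature
        else:
            # Exponential moving average: s += alpha * (x - s)
            self.smoothed += self.alpha * (feature - self.smoothed)
        value = int(round(self.smoothed * 127))
        if value == self.last_sent_value:
            return None     # unchanged: stay silent
        if self.last_sent_time is not None and now - self.last_sent_time < self.min_interval:
            return None     # too soon: respect the rate limit
        self.last_sent_value = value
        self.last_sent_time = now
        return value
```

The smoothing state keeps updating even when a send is suppressed, so the next permitted message carries the current smoothed value rather than a stale one.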

Keyboard gestures (only when bridge launched with `--gesture-enable`):

| Gesture | Key | Rekordbox default |
|---|---|---|
| Arms straight up (rising edge of `arms_height > 0.85`) | SPACE | Play/pause |
| Sharp energy spike (rising edge of `energy > 0.75`) | Q | Hot cue 1 |

Both gestures are debounced with a 0.6-0.8 s cooldown.
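
Rising-edge detection with a cooldown can be sketched as a small state machine. The class name `RisingEdgeGesture` is hypothetical, and the 0.7 s cooldown in the usage note is an arbitrary pick from inside the stated 0.6-0.8 s range, not a documented per-gesture value.

```python
from typing import Optional


class RisingEdgeGesture:
    """Fire once when a feature crosses above a threshold, then cool down."""

    def __init__(self, threshold: float, cooldown_s: float):
        self.threshold = threshold
        self.cooldown_s = cooldown_s
        self.was_above = False
        self.last_fire: Optional[float] = None

    def update(self, value: float, now: float) -> bool:
        """Return True only on a below-to-above transition outside the cooldown."""
        above = value > self.threshold
        fired = False
        if above and not self.was_above:
            if self.last_fire is None or now - self.last_fire >= self.cooldown_s:
                fired = True
                self.last_fire = now
        # Track the level every sample so a held pose does not re-fire.
        self.was_above = above
        return fired
```

For example, `RisingEdgeGesture(0.85, 0.7)` models the arms-up gesture: holding the arms up fires once, and a second raise inside the cooldown window is ignored.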

---

7. Rekordbox track strategy — DECISION NEEDED

Mohamed needs to pick before the K11 agent runs Prompt 4.

| Option | Setup | Pros | Cons |
|---|---|---|---|
| A. Full tracks + built-in stems | Load original full tracks into Rekordbox library. Use Rekordbox 6.7+ built-in stem feature. MIDI maps to stem volumes. | Simple, Rekordbox-native. One deck = one track. | Requires Rekordbox stem feature license tier. |
| B. 4 stems as 4 decks | Load each of `{bass, drums, other, vocals}.wav` onto deck 1-4. MIDI crossfades stem-by-stem live. | Body literally controls each stem. Maximum expressiveness. | Synchronization (4 tracks in lock-step) is fiddly. Uses all 4 decks per "song." |

Default v1 recommendation: A. B is the long-term aesthetic goal, but it
requires Rekordbox 4-deck mode and tight tempo lock, so it is not in v1.

---

8. v1 vs v2 — what each agent ships when

### v1 (now)
- ✅ Dual-Femto probe done. Bolt depth NOT available on macOS; Mega depth fine.
- ✅ Mac4 publisher (Mega RGB → MediaPipe) emits canonical schema at ~12 fps
(`--pose-every 2`; can raise) to `[ip]:9705`. tcpdump-verified.
- ✅ K11 audio side fully wired and end-to-end verified 2026-05-22:
loopMIDI port `LUME` live, bridge binds MIDI out, firewall rule added,
151/151 synthetic packets through Mac1→Tailscale→K11→bridge, MIDI emitted.
Run command on K11: `python C:\temp\lume_rekordbox_bridge.py --midi-port LUME --gesture-enable`.
- ⏸ Mohamed does Rekordbox MIDI LEARN (CC 1, 7, 10, 11, 74) once Mac4 starts
publishing — or earlier, using `pose_replay.py` with `--motion sweep` for
obvious CC isolation.
- Acceptance: moving on Mac4 changes Rekordbox on K11.
- DYK visuals stay on the current Mega-depth single-camera baseline at
`[ip]:9700`. No tuning until the body signal is solid.

### v2 (after v1 is alive)
- Mac4 runs both Femtos.
- Mac4 fuses to single `"fused"` stream → K11.
- Stage-space calibration (4-corner method, no AprilTag yet).
- DYK visuals consume the fused mask.

### v3 (later, post-bar-stable)
- Streaming back on (OBS audio capture is unchanged; rekordbox.exe still
outputs the audio).
- Bridge + publisher registered as persistent services (NSSM on K11, launchd
on Mac4).
- Network MIDI extension if Mac4 also needs to send keyboard events.

---

9. Per-agent task pointers

### K11 audio agent
- Reads §2 firewall block, §3 schema, §6 MIDI contract, §7 Rekordbox decision.
- Owns: loopMIDI install, Python deps, bridge runtime, Rekordbox MIDI LEARN
walkthrough, persistence after v1.
- Touches no Unity, no NDI, no OBS, no hotspot.

### Mac4 Unity/Femto agent
- Reads §3 schema, §4 source convention, §5 publisher template path.
- Owns: dual-Femto probe, pose publisher per camera, DYK visual baseline, v2
fusion.
- The publisher is a clone of the K11 file (§5), not a rewrite. The
schema is non-negotiable in v1.

### Mac1 dev agent
- Owns: this doc, the bridge code in the repo, git pushes.
- Does not run live workloads.
- If anything in §2-§7 changes, this doc gets edited first.

---

10. Anti-drift rule

When an agent reports back, the first line of its report should reference the
section of this doc it touched. If a report describes a packet field, IP, port,
or convention that isn't in §3 or §2 above, stop and update this doc first.
That is how the fork stays unified.

Source Anchor

Comp-Core/core/audio-media/cc-echelon/docs/LUME_AGENT_CONTRACTS.md