Grand Diomande Research · Full HTML Reader

Stage 4: FORGE — Enhanced CC Architecture (Production-Grade)

Day 1-2: B (single phone works) + D (sound upgrade) + E (training starts) Day 3-5: A (Mocopi + Direct Trigger) Week 2: C (multi-cam content pipeline) Month 2: F (live performance) ```

Language as Infrastructure architecture technical paper candidate score 30 .md

Full Public Reader

Stage 4: FORGE — Enhanced CC Architecture (Production-Grade)

The Order of Operations

B → D → E → A → C → F

Day 1-2:  B (single phone works) + D (sound upgrade) + E (training starts)
Day 3-5:  A (Mocopi + Direct Trigger)
Week 2:   C (multi-cam content pipeline)
Month 2:  F (live performance)

---

Layer 0: Body — ENHANCEMENTS

### Current: Camera crop mask works.
### Enhanced:
- Posture validation: After calibration confirm, run 3-second check — if chest not centered (Vision BodyPose shoulders not within center 60
- Distance estimation: Use shoulder width in pixels to estimate camera distance. If too far (shoulders < 20
- Session memory: Save calibration crop values to UserDefaults. Next session pre-loads your last good position.

Deliverable: Update calibration flow in PerformView.swift (iOS) and index.html (web).

---

Layer 1: Sensors — ENHANCEMENTS

### Current: CoreMotion works. Everything else untested.
### Enhanced:

Priority 1: Verify Mocopi hardware (2 hours)
- Pair 6 sensors with iPhone 3 (Motionlink)
- Verify UDP packets reach Mac1 on :12351
- Check mocopi-bridge.py parses real data correctly
- Document actual bone data format (may differ from spec)

Priority 2: Unified sensor format
- All sensors write to capture server in the SAME JSON schema:

json

{
  "device_id": "string",
  "source": "iphone|watch|mocopi|webcam|ipad",
  "timestamp": 1234567890.123,
  "data": {
    "joints_3d": [[x,y,z], ...],     // if available
    "joints_2d": [[x,y,conf], ...],  // if camera
    "imu": {"accel":[x,y,z], "gyro":[x,y,z], "quat":[w,x,y,z]},
    "skeleton": {"bones": [{"id":0, "pos":[x,y,z], "rot":[w,x,y,z]}, ...]},
    "echelon_latent": [16 floats],
    "echelon_section": 0-6
  }
}

- Capture server normalizes all inputs into this schema.

Priority 3: Watch companion app
- Defer. Watch WCSession code exists in SensorService but no Watch target.
- Low priority — wrist IMU from Mocopi is higher quality.

Deliverable: Mocopi hardware test report. Unified sensor schema in capture_server.py.

---

Layer 2: Brain — ENHANCEMENTS

### Current: Echelon Brain works. CLIP is manual. Only chest-flex classified.
### Enhanced:

Priority 1: Automated CLIP pipeline
- After each session, auto-extract frames (2fps, chest-crop)
- Run CLIP ViT-B/32 on Mac5
- Update cluster assignments
- Save to session directory
- Script: `auto_clip_embed.py` (runs as post-session hook)

Priority 2: BodyInstrumentMapper handles Mocopi
- Current: 14 Vision joints → 5 triggers
- Enhanced: 27 Mocopi bones → 6 instruments (from Dashboard extraction)
- Dual mode: Vision (camera) OR Mocopi (sensors), auto-detected

Priority 3: CLIP → Strudel layer mapping
- Cluster 0 (active groove) → all 4 Strudel layers
- Cluster 1 (transitions) → harmonic + textural only
- Cluster 2 (minimal) → single percussion layer
- `applyClusterTransition()` already built in tone-house.js — wire to real-time CLIP inference

Deliverable: auto_clip_embed.py, BodyInstrumentMapper Mocopi support, CLIP→Strudel wiring.

---

Layer 3: Music — ENHANCEMENTS (PATH D: Sound Design)

### Current: Functional but sounds amateur.
### Enhanced:

Priority 1: Sample-based drums (replace synthesized)
- Download/create: 909 kick, 808 clap, metallic hat, sub bass one-shots
- Load as WAV samples in Tone.js (`Tone.Player` or `Tone.Sampler`)
- Replace the synthesized kick/clap/hat with sample playback
- Result: Immediately sounds professional

Priority 2: Mastering chain

Individual voices → Mixer → EQ (low cut 30Hz, slight 3kHz boost)
                          → Compressor (ratio 4:1, threshold -18dB)
                          → Limiter (ceiling -1dB)
                          → Master output

Tone.js has `Tone.Compressor`, `Tone.EQ3`, `Tone.Limiter`
Add to ToneHouse.init() after masterComp

Priority 3: Direct Trigger Engine
- New class: `DirectTriggerEngine` (JS for web, Swift for iOS)
- Input: Mocopi bone velocities (27 bones) OR Vision joint velocities (14 joints)
- Output: Direct synth parameter control
- Mapping:

  bone.velocity > threshold → trigger(bone.instrument, bone.velocity)
  bone.position.y → pitch (continuous)
  bone.rotation → filter cutoff (continuous)

- Runs ALONGSIDE Strudel (not replacement). Strudel = background pattern. Direct Trigger = foreground accents.

Priority 4: Smooth transitions between energy states
- Current: hard-switch patterns on state change (with crossfade volume dip)
- Enhanced: Pattern morphing — gradually swap individual steps over 2 bars
- Bar 1: 75
- Bar 2: 25
- Bar 3: 100

Deliverable: Sample-based drums, mastering chain, DirectTriggerEngine, pattern morphing.

---

Layer 4: Visualization — ENHANCEMENTS

### Current: Metal orb works. TD bridges built but untested.
### Enhanced:

Priority 1: Add flex direction to orb
- When chest flex detected, spike the orb in the L or R direction
- Pass `flexDirection` (-1.0 left, 0 center, 1.0 right) to Metal shader
- Shader offsets particle spawn position based on direction

Priority 2: Test TouchDesigner on Mac2
- Open TD, paste cc_osc_setup.py, execute
- Run osc_bridge.py on Mac1
- Verify OSC data arrives in TD
- Build basic orb + skeleton view

Priority 3: TD visual themes per genre
- House: warm orange/purple palette, flowing particles
- Techno: cold blue/white, sharp geometric patterns
- Ambient: green/teal, slow aurora waves
- Jazz: gold/brown, smooth ribbon trails

Deliverable: flexDirection in Metal shader, TD basic test, genre-themed visuals.

---

Layer 5: Recording/Production — ENHANCEMENTS

### Current: ReplayKit built but untested. No automation.
### Enhanced:

Priority 1: Test ReplayKit recording
- Open MotionMixApp, start a session, hit record
- Verify the exported .mov includes body + orb + overlays
- Check audio is included (app audio, not mic)

Priority 2: Auto-reel-cutter
- Script: `auto_reel_cutter.py`
- Input: session video + energy timeline from capture server
- Process: find build→peak arcs, extract 15-45s clips at 1.25x speed
- Output: 3-5 clips per session in `Desktop/cc-reels/`
- Runs automatically when session ends

Priority 3: OBS scene configuration
- Install OBS on Mac1 (if not installed)
- Configure 4 scenes: split-screen, TD fullscreen, multi-cam, PiP
- Add obs-websocket for automated scene switching
- Connect: iPhone screen via USB capture, Mac2 TD via NDI, audio via BlackHole

Priority 4: Premiere ExtendScript automation (Mac4)
- Script: `cc_premiere_auto.jsx`
- Auto-import multi-cam + TD render + metadata
- Sync by timecode
- Apply CC LUT (dark, high contrast, orange accent)
- Insert cuts at section transitions
- Export 3 formats

Deliverable: ReplayKit test, auto_reel_cutter.py, OBS 4-scene config, Premiere script.

---

Layer 6: Training — ENHANCEMENTS (PATH E: Training Flywheel)

### Current: Scripts exist. No automation. 0 trained models.
### Enhanced:

The Flywheel:

Session ends
    │
    ├─► auto_clip_embed.py runs on Mac5 (5 min)
    │   └─ Updates cluster vocabulary
    │
    ├─► Session data syncs to Mac4+5 (Syncthing)
    │
    └─► Nightly cron (2 AM):
        ├─ If sessions ≥ 5: retrain chest flex model
        ├─ If sessions ≥ 15: start CC-MotionGen training
        ├─ If sessions ≥ 30: start EchelonDiT training
        └─ Update movement vocabulary from new clusters

Deliverable:
- `cc_training_cron.py` — nightly training orchestrator
- LaunchAgent on Mac5: `com.cc.nightly-training`
- Syncthing sync rule: `[home-path]` → Mac4+Mac5

---

Layer 7: Rehearsal — ENHANCEMENTS

### Current: Linear extrapolation. Not connected to trained models.
### Enhanced:

Phase 1 (Now): Linear extrapolation is FINE for the first month. The rehearsal bridge works, it just predicts using velocity. This is acceptable.

Phase 2 (After MotionGen trains): Drop-in replacement.

python

# In rehearsal-bridge.py:
# Replace: predicted = linear_extrapolate(history)
# With:    predicted = motiongen.predict(history[-20:])

The interface stays the same (`GET /predict`). Only the prediction method changes.

Phase 3 (After EchelonDiT trains): Pre-generate audio.

python

# After predicting latent state 2s ahead:
# audio_tokens = echelondit.generate(predicted_latent)
# Pre-load into Strudel's buffer

Deliverable: No code changes needed now. Document the drop-in replacement path. Build Phase 2 when MotionGen is trained.

---

Implementation Priority (Hydra Execution Order)

#	Task	Layer	Path	Effort	Depends On
1	Test Mocopi hardware	L1	A	2h	Hardware
2	Sample-based drums	L3	D	4h	None
3	Mastering chain (EQ+comp+limiter)	L3	D	1h	None
4	Test ReplayKit recording	L5	B	30m	None
5	auto_clip_embed.py	L6	E	2h	None
6	auto_reel_cutter.py	L5	C	3h	None
7	Direct Trigger Engine	L3	A	6h	#1 (Mocopi test)
8	BodyInstrumentMapper Mocopi support	L2	A	3h	#1
9	Flex direction in Metal shader	L4	B	2h	None
10	Test TouchDesigner on Mac2	L4	C	2h	Mac2 available
11	OBS 4-scene configuration	L5	F	4h	#10 (TD for NDI)
12	cc_training_cron.py + LaunchAgent	L6	E	3h	None
13	Pattern morphing transitions	L3	D	3h	None
14	Premiere ExtendScript	L5	C	6h	Mac4 available
15	Calibration posture validation	L0	B	2h	None

Total estimated effort: ~44 hours across 15 tasks.
Parallelizable: Tasks 2, 3, 5, 6, 9, 12, 15 can all run simultaneously (no dependencies).

Promotion Decision

Promote into a technical note or architecture paper with implementation anchors.

Source Anchor

omega-output/cc-7layer-enhancement-20260323/04-architecture-enhanced.md

Detected Structure

Method · Figures · Code Anchors · Architecture · is Stage Research