Grand Diomande Research · Full HTML Reader

Generative Output Overview

Generative output is the part of the system that turns body state into sound, visuals, camera decisions, DJ control, and inscriptions.

Embodied Trajectory Systems architecture technical paper candidate score 32 .md


The old version of this page overstated SAN training and treated the output layer
as if one trained model controlled everything. The current source shows multiple
lanes with different maturity levels.

Output Lanes

Music:

- Rust Echelon audio handle renders audio.
- MotionMixApp bridges brain state to audio.
- SAN can produce adaptive output parameters.
- `DiffusionService` can generate token grids through the phone hub, CoreML, or
fallback paths.

Visuals:

- Unity/LUME receives skeleton/body/motion data.
- Mac4 currently feeds Unity from mocopi locally.
- DYK/LUME visuals should eventually consume stable BodyTruth, but the direct
local mocopi feed is currently the more stable visual path.

DJ / AirDeck:

- K11 owns Rekordbox command safety.
- Gesture detection can use camera pose even when mocopi is off.
- mocopi improves accuracy but must not be a hard requirement for basic camera
gestures.
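The camera-only requirement above can be sketched as a source selection step. This is a minimal illustration, not the AirDeck implementation: the function name `select_pose_source` and the string labels are hypothetical.

```python
def select_pose_source(camera_ok: bool, mocopi_ok: bool) -> str:
    """Pick a pose source for gesture detection.

    Assumption: mocopi only refines accuracy, so losing it must never
    disable basic camera gestures.
    """
    if camera_ok and mocopi_ok:
        return "camera+mocopi"   # best accuracy
    if camera_ok:
        return "camera"          # camera-only fallback stays usable
    if mocopi_ok:
        return "mocopi"
    return "none"                # no pose input; gestures are unavailable


print(select_pose_source(camera_ok=True, mocopi_ok=False))  # camera
```

The point of the sketch is that the mocopi flag never appears as a precondition for the camera branch.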

Photography / camera:

- Camera nodes and StageView/ShootView manage streams, captures, and sessions.

Inscription:

- ClaimBridge maps Echelon dynamics into controlled N'Ko-style motion claims.
- Full natural motion-to-language phrasing remains future work.

Documents In This Section

- [san-pipeline.md](san-pipeline.md): source-grounded SAN runtime and caveats.
- [audio-engine.md](audio-engine.md): audio rendering and output path.
- [lume-visuals.md](lume-visuals.md): Unity/LUME visual behavior.
- [motion-lexicon.md](motion-lexicon.md): gesture vocabulary and body-state
terms without overstating training.
- [photography.md](photography.md): camera node and shoot workflows.

Generation Decision Boundary

Use this mental model:

    body state
      -> output parameters
      -> audio / visual / camera / DJ / inscription action

Some mappings are neural or model-assisted. Some are rule-based. Some are safety
gated. The docs should identify which is which.
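One way to keep that labeling honest is a small registry that reports which lanes still lack a stated mapping kind. The sketch below is hypothetical: `MappingKind`, `LANES`, and `unlabeled_lanes` are illustrative names, and the example label is not a verified classification of the real lanes.

```python
from enum import Enum


class MappingKind(Enum):
    NEURAL = "neural or model-assisted"
    RULE_BASED = "rule-based"
    SAFETY_GATED = "safety gated"


# Output lanes named on this page.
LANES = ["music", "visuals", "dj", "camera", "inscription"]


def unlabeled_lanes(labels: dict) -> list:
    """Return lanes whose docs do not yet state a mapping kind."""
    return [lane for lane in LANES if lane not in labels]


# Illustrative label only; not a verified classification.
example_labels = {"dj": MappingKind.SAFETY_GATED}
print(unlabeled_lanes(example_labels))
# ['music', 'visuals', 'camera', 'inscription']
```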

SAN Boundary

SAN is a real Rust pipeline and service. However, current local documentation
must not claim:

- an exact current training count;
- an exact validation loss;
- replacement of all heuristics;
- full 128D end-to-end training;
- that the current manifest has 135K parameters.

The verified local manifest count is 164,248 parameters.
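A simple text check can catch the stale parameter figure before it reaches published docs. This is a sketch under assumptions: `find_stale_claims` is a hypothetical helper, and the pattern only covers the `135K` spelling. Pages that quote the old figure on purpose, as this one does, would need an exemption.

```python
import re

# Verified local manifest count stated on this page.
VERIFIED_PARAM_COUNT = 164_248

# Matches the outdated figure written as "135K" or "135 K".
STALE_COUNT = re.compile(r"\b135\s?K\b", re.IGNORECASE)


def find_stale_claims(text: str) -> list[str]:
    """Return the lines that still mention the outdated 135K figure."""
    return [line.strip() for line in text.splitlines() if STALE_COUNT.search(line)]


sample = "SAN has 135K parameters.\nSAN has 164,248 parameters."
print(find_stale_claims(sample))  # ['SAN has 135K parameters.']
```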

Diffusion Boundary

The current app generation service is best described as:

    conditioned one-step flow/token generation with fallback

not:

    full deployed multi-step diffusion

That distinction matters because the current CoreML encoder input is 104D while
the broader dynamics vector is roughly 128D.
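The "one step with fallback" shape can be sketched as an ordered backend chain with an input-size guard. This is not the `DiffusionService` API: `generate_tokens`, the backend callables, and the idea that every path shares the 104D input are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# CoreML encoder input size stated on this page.
ENCODER_INPUT_DIM = 104

# Hypothetical backend signature: features in, token grid out.
Backend = Callable[[Sequence[float]], List[List[int]]]


def generate_tokens(features: Sequence[float],
                    backends: List[Tuple[str, Backend]]):
    """Try each backend once, in order, and return the first token grid."""
    # Assumption: all paths expect the 104D encoder input, so a raw
    # 128D dynamics vector is rejected rather than silently truncated.
    if len(features) != ENCODER_INPUT_DIM:
        raise ValueError(f"expected {ENCODER_INPUT_DIM}D input, got {len(features)}D")
    for name, backend in backends:
        try:
            return name, backend(features)  # single call, no iterative refinement
        except Exception:
            continue  # fall through to the next path
    raise RuntimeError("no backend produced a token grid")
```

A caller would pass the paths in priority order, for example phone hub first, then CoreML, then a local fallback.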

Performance Boundary

For live use:

- K11 should remain the final Rekordbox command gate.
- Mac4 should feed Unity directly where that path is stable.
- MotionMix BodyTruth should become the shared state authority after sustained
stability checks.
- Gesture docs should always include camera-only fallback behavior.
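The "final command gate" idea in the first bullet can be sketched as a single check that every gesture-derived command must pass. The command names, the confidence threshold, and `gate_command` are hypothetical and do not describe K11's actual rules.

```python
# Hypothetical allowlist and threshold; not K11's real configuration.
ALLOWED_COMMANDS = {"play", "pause", "cue"}
MIN_CONFIDENCE = 0.8


def gate_command(command: str, confidence: float) -> bool:
    """Return True only if the command may be forwarded to Rekordbox."""
    return command in ALLOWED_COMMANDS and confidence >= MIN_CONFIDENCE


print(gate_command("play", 0.9))    # True
print(gate_command("eject", 0.99))  # False: not on the allowlist
print(gate_command("play", 0.5))    # False: detection too uncertain
```

Keeping the gate as the last step means upstream gesture sources can change without widening what reaches the deck.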

Promotion Decision

Promote into a technical note or architecture paper with implementation anchors.

Source Anchor

computational-choreography/04-generative-output/overview.md
