Grand Diomande Research · Full HTML Reader

Audio Engine — Real-Time Music Synthesis

Embodied Trajectory Systems research note experiment writeup candidate score 18 .md

Two Audio Paths

The system has two ways to produce sound, and they can be blended:

1. Rust Echelon Audio (primary, when Echelon is available)
- The Rust `MotionSynth` synthesizer rendered into `AVAudioSourceNode`
- Parameters come from `echelon_bridge_brain_to_audio()` which maps z* directly
to synthesizer state
- Lower latency, mathematically coherent with z*

2. Swift AudioEngine (fallback + blending layer)
- Five AVAudioSourceNode voices: kick, clap, hihat, bass, pad
- Pattern sequencer: 16-step patterns per voice, evolving based on SAN output
- `StrudelEngine` integration: SAN pattern params → Tidal Cycles pattern string
→ Web Audio API in an embedded WebView

The `forceStrudelMode` flag on AudioEngine switches between these paths.
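
The `StrudelEngine` hand-off listed above can be pictured as rendering a 16-step pattern into a mini-notation string. This is a minimal sketch, assuming a Boolean step grid and the Strudel convention that `~` marks a rest; `miniNotation(steps:sample:)` and the `bd` sample name are illustrative and not taken from the project.

```swift
/// Hypothetical helper: renders a step pattern as a mini-notation string.
/// `true` becomes the sample name, `false` becomes a rest (`~`).
func miniNotation(steps: [Bool], sample: String) -> String {
    steps.map { $0 ? sample : "~" }.joined(separator: " ")
}

// Four-on-the-floor kick: a hit on every fourth 16th-note step.
let kickSteps = (0..<16).map { $0 % 4 == 0 }
let kickPattern = miniNotation(steps: kickSteps, sample: "bd")
print(kickPattern)
// bd ~ ~ ~ bd ~ ~ ~ bd ~ ~ ~ bd ~ ~ ~
```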

The Synthesizer Voice Architecture

When running in Swift fallback mode, each voice is a real-time synthesizer:

Kick:
- Sine burst with exponential frequency sweep (starting freq → 55 Hz)
- Exponential amplitude decay
- Distortion saturation for punch
- SAN `audioKick[0]` → amplitude, `audioKick[1]` → pitch shift
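
The kick recipe above can be sketched offline as a buffer renderer. This is an illustration under stated assumptions, not the project's render callback: the function name, the 180 Hz sweep start, the time constants, and the drive amount are all hypothetical. Only the 55 Hz sweep target comes from the list.

```swift
import Foundation

/// Hypothetical kick renderer: sine burst with an exponential pitch sweep,
/// exponential amplitude decay, and tanh saturation.
/// All default values except `endFreq` are illustrative assumptions.
func renderKick(sampleRate: Double = 44_100,
                duration: Double = 0.25,
                startFreq: Double = 180,    // assumed sweep start
                endFreq: Double = 55,       // sweep target from the text
                pitchDecay: Double = 0.03,  // seconds, sweep time constant
                ampDecay: Double = 0.12,    // seconds, amplitude time constant
                drive: Double = 2.0,
                amplitude: Double = 1.0) -> [Float] {
    let count = Int((sampleRate * duration).rounded())
    var phase = 0.0
    var out = [Float]()
    out.reserveCapacity(count)
    for n in 0..<count {
        let t = Double(n) / sampleRate
        // Frequency falls exponentially from startFreq toward endFreq.
        let freq = endFreq + (startFreq - endFreq) * exp(-t / pitchDecay)
        phase += 2 * Double.pi * freq / sampleRate
        let env = amplitude * exp(-t / ampDecay)
        // tanh soft-clips the signal, which adds harmonics ("punch").
        out.append(Float(tanh(drive * env * sin(phase))))
    }
    return out
}

let kick = renderKick()
```

In this sketch, `amplitude` is where a value such as `audioKick[0]` would plug in, and a pitch shift could scale `startFreq` and `endFreq`.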

Clap:
- White noise burst + bandpass filter
- Short attack, fast decay
- `audioHihat[0]` drives amplitude (shared bus with hihat)

Hihat (closed + open):
- White noise with steep highpass filter
- Closed: 8 ms decay. Open: 200 ms decay
- `audioHihat[1]` → pitch (filter center frequency shift)
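
The closed/open distinction above comes down to one decay constant applied to filtered noise. A minimal sketch, assuming the quoted decay times are exponential time constants. The text describes a steep highpass; a one-pole filter stands in here only to keep the example short, and `renderHihat` with its coefficient is hypothetical.

```swift
import Foundation

/// Hypothetical hi-hat renderer: white noise through a one-pole highpass,
/// shaped by an exponential decay envelope.
func renderHihat(isOpen: Bool, sampleRate: Double = 44_100) -> [Float] {
    let decay = isOpen ? 0.200 : 0.008                   // seconds, from the text
    let count = Int((sampleRate * decay * 5).rounded())  // five time constants
    let alpha: Float = 0.95                              // assumed filter coefficient
    var prevIn: Float = 0
    var prevOut: Float = 0
    var out = [Float]()
    out.reserveCapacity(count)
    for n in 0..<count {
        let noise = Float.random(in: -1...1)
        // One-pole highpass: y[n] = alpha * (y[n-1] + x[n] - x[n-1])
        let filtered = alpha * (prevOut + noise - prevIn)
        prevIn = noise
        prevOut = filtered
        let env = Float(exp(-Double(n) / (sampleRate * decay)))
        out.append(filtered * env)
    }
    return out
}

let closedHat = renderHihat(isOpen: false)
let openHat = renderHihat(isOpen: true)
```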

Bass:
- Sawtooth oscillator through 2-pole lowpass filter
- Pitch quantized to musical scale based on `z*[phase]`
- `audioBass[2]` → filter cutoff (more tension → brighter bass)
- `audioBass[3]` → filter resonance
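
Pitch quantization as described above can be sketched as a lookup into a scale followed by the equal-temperament frequency formula. The scale (A minor pentatonic), the A1 root, and the function name are illustrative choices; the source does not say which scale the bass uses.

```swift
import Foundation

/// Hypothetical quantizer: maps a control value in 0...1 onto a scale degree
/// and returns its frequency in Hz.
func quantizedBassFrequency(control: Double,
                            rootMidi: Int = 33,                // A1 = 55 Hz
                            scale: [Int] = [0, 3, 5, 7, 10]) -> Double {
    let clamped = min(max(control, 0), 1)
    // Pick a degree; the very top of the range maps to the last degree.
    let index = min(Int(clamped * Double(scale.count)), scale.count - 1)
    let midi = rootMidi + scale[index]
    // Equal temperament: A4 (MIDI 69) = 440 Hz, 12 semitones per octave.
    return 440 * pow(2, Double(midi - 69) / 12)
}

let rootFrequency = quantizedBassFrequency(control: 0.0)  // A1, 55 Hz
let topFrequency = quantizedBassFrequency(control: 1.0)   // G2, about 98 Hz
```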

Pad:
- Detuned oscillator pair (two sawtooths, slight frequency offset)
- Long attack (200 ms), sustain, long release
- Through reverb + delay effects chain
- `audioPad[0]` → amplitude, `audioPad[3]` → filter cutoff (brightness)
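
The detuned pair above can be sketched as two naive sawtooths a few cents apart under a linear attack ramp. The 7-cent detune and the function names are assumptions, and the release stage and effects chain are omitted for brevity.

```swift
import Foundation

/// Naive (non-band-limited) sawtooth in -1..<1 for a phase given in cycles.
func saw(_ phase: Double) -> Double {
    2 * (phase - floor(phase)) - 1
}

/// Hypothetical pad voice: two sawtooths detuned symmetrically around `freq`,
/// shaped by a linear attack ramp.
func renderPad(freq: Double, seconds: Double, sampleRate: Double = 44_100,
               detuneCents: Double = 7, attack: Double = 0.2) -> [Float] {
    // Convert cents to a frequency ratio: 1200 cents per octave.
    let ratio = pow(2, detuneCents / 1200)
    let count = Int((seconds * sampleRate).rounded())
    var out = [Float]()
    out.reserveCapacity(count)
    for n in 0..<count {
        let t = Double(n) / sampleRate
        let mix = 0.5 * (saw(t * freq * ratio) + saw(t * freq / ratio))
        let env = min(t / attack, 1)       // 200 ms attack from the text
        out.append(Float(mix * env))
    }
    return out
}

let pad = renderPad(freq: 220, seconds: 0.5)
```

The slow beating between the two oscillators is what gives the pad its width; a larger detune makes the beating faster.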

BPM and Rhythm

The base BPM is derived from `lexicon.tempo` (the beat frequency estimate from
z[8..15] periodicity). This is modulated by the performance style:

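```swift
// In AudioEngine
let baseBpm: Double = 124       // base tempo
var bpm: Double = baseBpm
// At each EchelonBridge step:
bpm = baseBpm + (lexicon.tempo - 0.5) * tempoRange
// tempoRange is sized so the extremes of lexicon.tempo shift bpm by about ±20 BPM
// (tempoRange = 40 if lexicon.tempo spans 0...1)
```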

The sequencer fires on a CADisplayLink-driven timer that advances the 16-step
position at `bpm/60 × 4` Hz (16th notes at the current BPM). Beat 0 fires
`onBeatZero` → haptic trigger on the Apple Watch.
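
The step-rate arithmetic above can be made concrete with a small accumulator. This is a sketch under assumptions: `StepClock` is a hypothetical type, and it advances from elapsed time rather than counting timer callbacks, so the step rate does not depend on how often the timer fires.

```swift
/// Steps per second for a 16th-note grid: (bpm / 60) beats per second × 4.
func stepsPerSecond(bpm: Double) -> Double {
    bpm / 60 * 4
}

/// Hypothetical accumulator that advances a 16-step position from elapsed time.
struct StepClock {
    var bpm: Double
    private(set) var step = 0
    private var accumulated = 0.0   // seconds since the last step boundary

    init(bpm: Double) { self.bpm = bpm }

    /// Advances by `dt` seconds and returns how many times step 0 was reached.
    mutating func advance(by dt: Double) -> Int {
        accumulated += dt
        let interval = 1 / stepsPerSecond(bpm: bpm)
        var wraps = 0
        while accumulated >= interval {
            accumulated -= interval
            step = (step + 1) % 16
            if step == 0 { wraps += 1 }   // where onBeatZero would fire
        }
        return wraps
    }
}

var clock = StepClock(bpm: 120)       // 8 steps per second
let wraps = clock.advance(by: 2.0)    // 16 steps: one full pattern
```

At 120 BPM the grid runs at 8 Hz, so two seconds cover exactly one 16-step pattern and the clock wraps to step 0 once.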

The Effects Chain

Each voice → AVAudioMixerNode
                  ↓
          AVAudioUnitReverb (room reverb, wet level from z*[grounding])
                  ↓
          AVAudioUnitDelay (feedback delay, delay time from z*[phase])
                  ↓
          Master volume fader
                  ↓
          AVAudioOutputNode (speaker / headphones)

The reverb wet level is modulated by `grounding`: high grounding (stable, floor-
connected) → less reverb (dry, direct). Low grounding (aerial, floating) → more
reverb (spacious, ambient).
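
The inverse relation between grounding and reverb can be sketched as a clamped linear map. `AVAudioUnitReverb.wetDryMix` takes a percentage from 0 to 100; the 10 to 60 output range and the function name are assumed tunings, not values from the source.

```swift
/// Hypothetical mapping from grounding (0 = aerial, 1 = floor-connected)
/// to a reverb wet/dry mix in percent.
func reverbWetMix(grounding: Float,
                  dryEnd: Float = 10, wetEnd: Float = 60) -> Float {
    let g = min(max(grounding, 0), 1)
    // Inverse relation: more grounding, less reverb.
    return wetEnd + (dryEnd - wetEnd) * g
}

let aerialMix = reverbWetMix(grounding: 0)     // wettest setting
let groundedMix = reverbWetMix(grounding: 1)   // driest setting
```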

The delay time is modulated by `phase`: on the downbeat (phase ≈ 0) the delay is
set to a note-synchronized value; off-beat it can be slightly detuned for groove.
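
A note-synchronized delay time follows directly from the tempo. The sketch below assumes a dotted-eighth division, a `phase` in 0..<1 with 0 on the downbeat, and a 2% off-beat offset; all three are illustrative, as are the function names. The result is in seconds, the unit `AVAudioUnitDelay.delayTime` expects.

```swift
/// Hypothetical helper: delay time in seconds for a note division at `bpm`.
/// `beats` is the length in quarter notes (0.75 = dotted eighth).
func syncedDelayTime(bpm: Double, beats: Double = 0.75) -> Double {
    60 / bpm * beats
}

/// On the downbeat, return the synced value; elsewhere, lengthen it slightly.
func delayTime(bpm: Double, phase: Double,
               tolerance: Double = 0.05, offset: Double = 0.02) -> Double {
    let base = syncedDelayTime(bpm: bpm)
    let onDownbeat = phase < tolerance || phase > 1 - tolerance
    return onDownbeat ? base : base * (1 + offset)
}

let downbeatDelay = delayTime(bpm: 120, phase: 0.0)   // 0.375 s
let offbeatDelay = delayTime(bpm: 120, phase: 0.5)    // slightly longer
```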

FFT Visualizer

AudioEngine taps the mixer output and runs a 256-point FFT at ~30fps:

```swift
// 128 magnitude bins published for the visualizer
@Published var fftMagnitudes: [Float] = Array(repeating: 0, count: 128)
```

These drive the orb visualization in PerformView — the visual pulsation of the
orb is synchronized to the actual frequency content of the audio output.
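
A 256-sample real frame yields 128 usable magnitude bins, which is why the published array has 128 entries. The sketch below uses a naive DFT purely to make that relationship concrete; it is not the project's tap code, and a real-time implementation would more likely use an FFT routine such as those in Accelerate.

```swift
import Foundation

/// Naive DFT magnitudes for the first N/2 bins of a real signal. O(N²).
func magnitudes(of samples: [Float]) -> [Float] {
    let n = samples.count
    return (0..<(n / 2)).map { k -> Float in
        var re = 0.0
        var im = 0.0
        for (i, x) in samples.enumerated() {
            let angle = -2 * Double.pi * Double(k) * Double(i) / Double(n)
            re += Double(x) * cos(angle)
            im += Double(x) * sin(angle)
        }
        // Normalize so a full-scale sine on a bin centre reads about 1.
        return Float(2 * (re * re + im * im).squareRoot() / Double(n))
    }
}

// A sine that completes exactly 8 cycles in 256 samples lands in bin 8.
let frame = (0..<256).map { Float(sin(2 * Double.pi * 8 * Double($0) / 256)) }
let bins = magnitudes(of: frame)
```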

SAN Parameter Injection

The SAN outputs (`audioKick`, `audioBass`, etc.) are blended into the AudioEngine
parameters at a rate controlled by `SANService.mixFactor`:

```swift
// In AudioEngine, every tick:
let effectiveAmplitude = lerp(heuristicAmplitude, san.audioKick[0], san.mixFactor)
```

Current verified behavior is more conservative than this snippet suggests: `SANService.mixFactor` defaults
to `0.0`, and the Rust `SANConfig` default also starts at pure heuristic blend.
SAN only becomes audible when a consumer actually blends its output and
`mixFactor` is raised above zero. Calibration confidence is evidence for the SAN
state, not proof that the learned mapping has taken over the audio engine.
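
The blend itself is a plain linear interpolation. The source does not show how `lerp` is defined, so the version below is an assumed definition that also clamps the factor; the amplitude values are illustrative.

```swift
/// Linear interpolation: returns `a` at t = 0 and `b` at t = 1.
/// The factor is clamped to 0...1 (an assumption, not confirmed by the source).
func lerp(_ a: Float, _ b: Float, _ t: Float) -> Float {
    a + (b - a) * min(max(t, 0), 1)
}

let heuristicAmplitude: Float = 0.8   // illustrative values
let sanAmplitude: Float = 0.2
let atDefault = lerp(heuristicAmplitude, sanAmplitude, 0.0)   // 0.8: SAN inaudible
let halfBlend = lerp(heuristicAmplitude, sanAmplitude, 0.5)   // halfway between the two
```

With the default `mixFactor` of `0.0`, the result is exactly the heuristic value, which matches the verified behavior described above.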

Multi-Genre Mode

AudioEngine supports simultaneous two-genre blending:

```swift
var isBlendMode = false
var blendGenreB: Genre = .ambient
var crossfade: Float = 0.5  // 0 = all A, 1 = all B
```

Two independent sequencer states (`state` and `stateB`) run in parallel. Their
outputs are mixed at the master fader by `crossfade`. This enables live transitions
between styles (e.g., fade from House to Ambient as the movement becomes more
sustained and grounded) without any interruption in audio output.
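
Mixing the two sequencer outputs can be sketched as a per-sample linear crossfade that follows the `0 = all A, 1 = all B` convention. The function name and the buffer-based signature are hypothetical; the source does not say which fade law the engine uses.

```swift
/// Hypothetical master-fader mix of two sequencer outputs.
/// Linear law: 0 = all A, 1 = all B.
func mix(_ a: [Float], _ b: [Float], crossfade: Float) -> [Float] {
    let x = min(max(crossfade, 0), 1)
    return zip(a, b).map { $0 * (1 - x) + $1 * x }
}

let genreA: [Float] = [1, 1, 1, 1]
let genreB: [Float] = [0, 0, 0, 0]
let halfway = mix(genreA, genreB, crossfade: 0.5)   // [0.5, 0.5, 0.5, 0.5]
```

A linear law can sound quieter at the midpoint when the two sources are uncorrelated; an equal-power law (cosine and sine gains) is a common alternative.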

Camera-Only Mode

When running on a camera-node iPhone (e.g., iPhone 14 Pro Max in photography-only
mode), AudioEngine sets `cameraOnlyMode = true`, which disables all audio output.
The EchelonBridge's `isEnabled` is also false on camera nodes, so no SAN or audio
processing runs at all on those devices.

Genres

Five built-in styles (each is a preset of voice patterns + BPM range + filter envelope):

| Style | Index | BPM range | Character |
|---|---|---|---|
| House | 0 | 120–130 | 4-on-floor kick, filtered bass, disco-influenced |
| Techno | 1 | 130–145 | 4-on-floor kick, hard clap, minimal |
| Jazz | 2 | 80–120 | Swing hihat, chromatic bass, sparse kick |
| Electro | 3 | 110–125 | Syncopated kick, bass-lead melody, vintage feel |
| Ambient | 4 | 60–80 | Sparse percussion, long pad, wide reverb |

Style is set via `EchelonBridge.setStyle(_:)` → `echelon_audio_set_style(audio, style)`.
The SAN can modulate within a style's parameter space but does not switch styles
autonomously — style switching is an operator decision.
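
The style presets can be mirrored as an enum whose raw values match the style index. The case names, indices, and BPM ranges follow the table above; the type layout and `clampedBpm` are assumptions about how such a preset might be modeled, not the project's `Genre` definition.

```swift
/// Hypothetical mirror of the style table.
enum Genre: Int, CaseIterable {
    case house = 0, techno, jazz, electro, ambient

    /// BPM range for the style, as listed in the table.
    var bpmRange: ClosedRange<Double> {
        switch self {
        case .house:   return 120...130
        case .techno:  return 130...145
        case .jazz:    return 80...120
        case .electro: return 110...125
        case .ambient: return 60...80
        }
    }

    /// Clamps a requested tempo into the style's range.
    func clampedBpm(_ bpm: Double) -> Double {
        min(max(bpm, bpmRange.lowerBound), bpmRange.upperBound)
    }
}

let houseTempo = Genre.house.clampedBpm(144)   // held at the top of the House range
```

Clamping is one way to let a modulated tempo move within a style without leaving its range.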

Promotion Decision

Attach run IDs, datasets, metrics, and reproduction commands.

Source Anchor

computational-choreography/04-generative-output/audio-engine.md