# DELL (Dual-Equilibrium Latent Learning) Architecture Diagrams
## Overview
This document visualizes the DELL neural architecture, its training procedure, and its inference deployment. Training runs in Python; inference runs in Rust.
## Architecture Diagram

```mermaid
graph TB
    subgraph Input["Input Layer"]
        X[Motion State x_t<br/>6-dim:<br/>• Left hand velocity<br/>• Right hand velocity<br/>• Torso orientation]
        C[Context c_t<br/>1-dim:<br/>• Commitment scalar]
    end
    subgraph FastEq["Fast Equilibrium (τ_F = 16.7ms)"]
        HF[Hidden State h^F<br/>128-dim<br/>Updates: 60Hz]
        WF1[MLP Layer 1<br/>W_f1: 135→256<br/>ReLU activation]
        WF2[MLP Layer 2<br/>W_f2: 256→128]
        X --> WF1
        Z -.feedback.-> WF1
        WF1 --> WF2
        WF2 --> |α_F = 1.0| HF
        HF -.-> HF
    end
    subgraph SlowEq["Slow Equilibrium (τ_S = 400ms)"]
        HS[Hidden State h^S<br/>128-dim<br/>Updates: 2.5Hz]
        WS1[MLP Layer 1<br/>W_s1: 135→256<br/>ReLU activation]
        WS2[MLP Layer 2<br/>W_s2: 256→128]
        X --> WS1
        Z -.feedback.-> WS1
        WS1 --> WS2
        WS2 --> |α_S = 0.042| HS
        HS -.-> HS
    end
    subgraph Gating["Gating Network"]
        WG[Gate MLP<br/>W_g: 256→128<br/>Sigmoid output]
        G["Gate Vector g_t<br/>128-dim<br/>Range: [0, 1]"]
        HF --> WG
        HS --> WG
        WG --> G
    end
    subgraph Latent["Latent State"]
        Z["z_t = g_t ⊙ h^F + (1 - g_t) ⊙ h^S<br/>128-dim weighted combination"]
        G --> Z
        HF --> Z
        HS --> Z
    end
    subgraph Output["Output Layer (Audio Parameters)"]
        WY[Linear Projection<br/>W_y: 129→32]
        Y[Audio Parameters y_t<br/>32-dim:<br/>• FM ratios 0-3<br/>• FM indices 4-7<br/>• ADSR 8-15<br/>• Filter params 16-23<br/>• Reverb 24-27<br/>• Master gain 28-31]
        Z --> WY
        C --> WY
        WY --> Y
    end
    style FastEq fill:#e3f2fd
    style SlowEq fill:#fff3e0
    style Gating fill:#f3e5f5
    style Latent fill:#e8f5e9
    style Output fill:#fce4ec
```

## Equilibrium Dynamics

### Fast Equilibrium Update Rule
```mermaid
graph LR
    subgraph t["Time Step t"]
        HF_t[h^F_t]
    end
    subgraph Compute["Compute f_F"]
        Input["(x_t, z_t)"]
        MLP["2-Layer MLP<br/>f_F: ℝ^135 → ℝ^128"]
        Input --> MLP
    end
    subgraph Update["Equilibrium Update"]
        EQ["h^F_{t+1} = (1 - α_F) · h^F_t + α_F · f_F(x_t, z_t)<br/><br/>α_F = dt/τ_F = 0.0167s / 0.0167s = 1.0"]
    end
    subgraph t1["Time Step t+1"]
        HF_t1["h^F_{t+1}"]
    end
    HF_t --> EQ
    MLP --> EQ
    EQ --> HF_t1
    style EQ fill:#bbdefb
```

Interpretation: The fast equilibrium fully replaces its state every frame (α_F = 1.0), making it maximally responsive to immediate motion changes.
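The update rule can be sketched as a small function. This is a minimal illustration, not the Echelon implementation: the name `equilibrium_update` is hypothetical, and `target` stands in for the MLP output f(x_t, z_t), which is not modeled here.

```rust
/// Leaky equilibrium update: h_next = (1 - alpha) * h + alpha * target.
/// `target` is a placeholder for the MLP output f(x_t, z_t) (assumption:
/// the MLP itself is computed elsewhere).
fn equilibrium_update(h: &[f32], target: &[f32], alpha: f32) -> Vec<f32> {
    assert_eq!(h.len(), target.len(), "state and target must have equal length");
    h.iter()
        .zip(target.iter())
        .map(|(&h_i, &f_i)| (1.0 - alpha) * h_i + alpha * f_i)
        .collect()
}
```

With `alpha = 1.0` the previous state drops out and the result equals `target`, which is the "fully replaces its state" behavior; with `alpha = 0.0` the state would never change.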
### Slow Equilibrium Update Rule

```mermaid
graph LR
    subgraph t["Time Step t"]
        HS_t[h^S_t]
    end
    subgraph Compute["Compute f_S"]
        Input["(x_t, z_t)"]
        MLP["2-Layer MLP<br/>f_S: ℝ^135 → ℝ^128"]
        Input --> MLP
    end
    subgraph Update["Equilibrium Update"]
        EQ["h^S_{t+1} = (1 - α_S) · h^S_t + α_S · f_S(x_t, z_t)<br/><br/>α_S = dt/τ_S = 0.0167s / 0.4s ≈ 0.042"]
    end
    subgraph t1["Time Step t+1"]
        HS_t1["h^S_{t+1}"]
    end
    HS_t --> EQ
    MLP --> EQ
    EQ --> HS_t1
    style EQ fill:#ffe0b2
```

Interpretation: The slow equilibrium **retains 95.8%** of its previous state on each update (1 − α_S ≈ 0.958) and blends in only 4.2% of the new target.
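The arithmetic behind α_S can be checked with a short sketch. The function names `blend_factor` and `step_response` are hypothetical, and the step-response formula assumes the update is applied once per 60 Hz frame toward a constant target, starting from zero.

```rust
/// Blend factor alpha = dt / tau for a leaky update
/// (frame period `dt` and time constant `tau`, both in seconds).
fn blend_factor(dt: f64, tau: f64) -> f64 {
    dt / tau
}

/// Fraction of a constant target reached after `steps` leaky updates,
/// starting from a zero state: 1 - (1 - alpha)^steps.
fn step_response(alpha: f64, steps: u32) -> f64 {
    1.0 - (1.0 - alpha).powi(steps as i32)
}
```

For `dt = 1/60 s` and `tau = 0.4 s`, `blend_factor` gives about 0.042, so about 95.8% of the state is retained per update. Under the per-frame assumption, `step_response(alpha, 24)` is roughly 0.64, so the slow state covers about two thirds of a step change within one time constant (24 frames, 400 ms).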
## Training Procedure

```mermaid
flowchart TD
    Start([Start Training]) --> LoadData[Load Training Dataset<br/>15 hours of mocopi + audio]
    LoadData --> Preprocess["Preprocess Data<br/>• Normalize motion to [-1, 1]<br/>• Compute commitment from EKF<br/>• Extract audio params from PM"]
    Preprocess --> InitModel["Initialize DELL Model<br/>• Random weights (Xavier init)<br/>• h^F_0 = 0, h^S_0 = 0"]
    InitModel --> Epoch{Epoch < 100?}
    Epoch -->|Yes| Batch["Sample Mini-Batch<br/>Batch size: 32<br/>Seq length: 300 frames (5s)"]
    Batch --> Forward[Forward Pass<br/>Unroll for T=300 steps]
    Forward --> Loss[Compute Loss<br/>L = L_recon + 0.1·L_smooth + 0.05·L_div]
    Loss --> Backward["Backward Pass<br/>Backprop through time (BPTT)"]
    Backward --> Clip[Gradient Clipping<br/>Max norm: 1.0]
    Clip --> Update["Update Weights<br/>AdamW (lr=1e-3)"]
    Update --> Val{Every 10 batches?}
    Val -->|Yes| Validate[Validate on Held-Out<br/>Compute test loss]
    Val -->|No| Batch
    Validate --> EarlyStop{Test loss<br/>improving?}
    EarlyStop -->|Yes| Batch
    EarlyStop -->|No for 5 epochs| SaveModel[Save Best Model<br/>dell_weights.pt]
    SaveModel --> Export[Export to Rust<br/>Convert to .safetensors]
    Epoch -->|No| Export
    Export --> End([Training Complete])
    style Loss fill:#ffcdd2
    style Export fill:#c8e6c9
```

### Loss Function Breakdown
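The total loss from the training flowchart, L = L_recon + 0.1·L_smooth + 0.05·L_div, can be sketched for a single unrolled sequence as below. This is an illustration in plain Rust rather than the Python training code. The helper names are hypothetical, and the reductions are assumptions: L_recon is averaged over every element, while L_smooth and L_div average the squared Euclidean norm over time steps. Sequences are assumed to have at least two steps.

```rust
/// Mean over time steps of the squared Euclidean distance between
/// corresponding vectors of two sequences.
fn mean_sq_dist(a: &[Vec<f32>], b: &[Vec<f32>]) -> f32 {
    let total: f32 = a
        .iter()
        .zip(b)
        .map(|(u, v)| u.iter().zip(v).map(|(x, y)| (x - y).powi(2)).sum::<f32>())
        .sum();
    total / a.len() as f32
}

/// Weighted sum of the three loss terms for one sequence of T >= 2 steps.
fn total_loss(
    y_pred: &[Vec<f32>],
    y_true: &[Vec<f32>],
    h_fast: &[Vec<f32>],
    h_slow: &[Vec<f32>],
) -> f32 {
    let dim = y_pred[0].len() as f32;
    // Reconstruction: mean squared error over all elements.
    let l_recon = mean_sq_dist(y_pred, y_true) / dim;
    // Smoothness: penalize jumps between consecutive outputs y_{t+1} and y_t.
    let t = y_pred.len();
    let l_smooth = mean_sq_dist(&y_pred[1..], &y_pred[..t - 1]);
    // Divergence: negative sign, so minimizing the loss pushes h^F and h^S apart.
    let l_div = -mean_sq_dist(h_fast, h_slow);
    l_recon + 0.1 * l_smooth + 0.05 * l_div
}
```

Because L_div is unbounded below, the small 0.05 weight matters: the divergence term should discourage collapse onto one equilibrium without dominating reconstruction.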
```mermaid
graph TD
    subgraph Losses["Loss Components"]
        L_total[Total Loss L]
        L_recon["Reconstruction Loss<br/>L_recon = MSE(y_pred, y_true)<br/>Weight: 1.0"]
        L_smooth["Temporal Smoothness<br/>L_smooth = mean ‖y_{t+1} - y_t‖²<br/>Weight: 0.1"]
        L_div[Equilibrium Divergence<br/>L_div = -mean‖h^F - h^S‖²<br/>Weight: 0.05<br/>Negative = maximize divergence]
    end
    L_recon --> L_total
    L_smooth --> L_total
    L_div --> L_total
    subgraph Purpose["Design Rationale"]
        P1[Reconstruction:<br/>Match target audio accurately]
        P2[Smoothness:<br/>Prevent audio pops/clicks]
        P3[Divergence:<br/>Prevent mode collapse<br/>to single equilibrium]
    end
    L_recon -.-> P1
    L_smooth -.-> P2
    L_div -.-> P3
    style L_total fill:#1976d2,color:#fff
    style L_recon fill:#4caf50
    style L_smooth fill:#ff9800
    style L_div fill:#9c27b0
```

## Inference Deployment (Rust)
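The gate and latent computations in the per-callback loop of this section can be sketched with the standard library alone. This is a hedged illustration, not the Echelon code: all function names are hypothetical, weights are stored as one row per output unit (the transpose of the "in × out" shapes used elsewhere in this document), and a bias term is included for the gate.

```rust
/// Logistic sigmoid.
fn sigmoid(v: f32) -> f32 {
    1.0 / (1.0 + (-v).exp())
}

/// Dense layer y = W x + b, with `w` holding one row per output unit.
fn dense(w: &[Vec<f32>], b: &[f32], x: &[f32]) -> Vec<f32> {
    w.iter()
        .zip(b)
        .map(|(row, &bias)| row.iter().zip(x).map(|(w_ij, x_j)| w_ij * x_j).sum::<f32>() + bias)
        .collect()
}

/// g = sigmoid(W_g [h_fast; h_slow] + b_g), then
/// z = g ⊙ h_fast + (1 - g) ⊙ h_slow.
fn gated_latent(w_g: &[Vec<f32>], b_g: &[f32], h_fast: &[f32], h_slow: &[f32]) -> Vec<f32> {
    let concat: Vec<f32> = h_fast.iter().chain(h_slow).copied().collect();
    let gate: Vec<f32> = dense(w_g, b_g, &concat).into_iter().map(sigmoid).collect();
    gate.iter()
        .zip(h_fast.iter().zip(h_slow))
        .map(|(&g, (&f, &s))| g * f + (1.0 - g) * s)
        .collect()
}

/// The slow equilibrium is refreshed on every 24th frame
/// (every 400 ms at 60 frames per second).
fn slow_update_due(frame: u64) -> bool {
    frame % 24 == 0
}
```

With an all-zero gate layer the gate sits at 0.5 and `z` is the average of the two hidden states; a strongly positive gate bias drives `g` toward 1 so that `z` follows `h_fast`.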
```mermaid
sequenceDiagram
    participant Train as Training (Python)
    participant Export as Export Script
    participant Safetensors as .safetensors File
    participant Rust as Echelon (Rust)
    participant Audio as Audio Thread
    Note over Train: Train DELL model<br/>100 epochs (~6 hours)
    Train->>Export: python export_to_rust.py
    activate Export
    Export->>Export: Load dell_weights.pt
    Export->>Export: Convert to f32 (from f64)
    Export->>Export: Validate dimensions
    Export->>Safetensors: Save as dell_v1.safetensors
    deactivate Export
    Note over Safetensors: Model file:<br/>~2.5 MB<br/>Contains all 15 weight matrices
    Rust->>Safetensors: Load at startup
    Safetensors-->>Rust: DELLWeights struct
    Rust->>Rust: Allocate hidden states<br/>h^F, h^S, z all zeros
    Note over Audio: Audio callback starts<br/>48kHz sample rate
    loop Every 512 samples (10.7ms)
        Audio->>Rust: Request audio parameters
        Rust->>Rust: Read latest motion x_t
        Rust->>Rust: Read commitment c_t
        Rust->>Rust: Update fast equilibrium<br/>h^F ← (1-α_F)h^F + α_F·f_F(x, z)
        alt Frame count % 24 == 0 (every 400ms)
            Rust->>Rust: Update slow equilibrium<br/>h^S ← (1-α_S)h^S + α_S·f_S(x, z)
        end
        Rust->>Rust: Compute gate<br/>g ← σ(W_g · [h^F; h^S])
        Rust->>Rust: Compute latent<br/>z ← g ⊙ h^F + (1-g) ⊙ h^S
        Rust->>Rust: Compute output<br/>y ← W_y · [z; c]
        Rust-->>Audio: Audio parameters y_t
        Audio->>Audio: Render 512 samples via PM
    end
```

## Weight Matrix Dimensions
| Layer | Shape | Param Count | Memory (f32) |
|---|---|---|---|
| W_f1 (fast MLP 1) | 135 × 256 | 34,560 | 135 KB |
| b_f1 (fast bias 1) | 256 | 256 | 1 KB |
| W_f2 (fast MLP 2) | 256 × 128 | 32,768 | 128 KB |
| b_f2 (fast bias 2) | 128 | 128 | 0.5 KB |
| W_s1 (slow MLP 1) | 135 × 256 | 34,560 | 135 KB |
| b_s1 (slow bias 1) | 256 | 256 | 1 KB |
| W_s2 (slow MLP 2) | 256 × 128 | 32,768 | 128 KB |
| b_s2 (slow bias 2) | 128 | 128 | 0.5 KB |
| W_g (gate MLP) | 256 × 128 | 32,768 | 128 KB |
| b_g (gate bias) | 128 | 128 | 0.5 KB |
| W_y (output) | 129 × 32 | 4,128 | 16 KB |
| b_y (output bias) | 32 | 32 | 128 B |
| Total | - | 172,480 | ~674 KB |
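The totals in the table can be reproduced from the listed shapes. The sketch below is only an arithmetic check; the `TENSORS` constant and function names are hypothetical and not part of the Echelon code.

```rust
/// (name, rows, cols) for each tensor in the table; biases use cols = 1.
const TENSORS: [(&str, usize, usize); 12] = [
    ("W_f1", 135, 256),
    ("b_f1", 256, 1),
    ("W_f2", 256, 128),
    ("b_f2", 128, 1),
    ("W_s1", 135, 256),
    ("b_s1", 256, 1),
    ("W_s2", 256, 128),
    ("b_s2", 128, 1),
    ("W_g", 256, 128),
    ("b_g", 128, 1),
    ("W_y", 129, 32),
    ("b_y", 32, 1),
];

/// Total number of scalar parameters across all tensors.
fn param_count() -> usize {
    TENSORS.iter().map(|&(_, rows, cols)| rows * cols).sum()
}

/// Storage in bytes when every parameter is an f32.
fn bytes_f32() -> usize {
    param_count() * std::mem::size_of::<f32>()
}
```

`param_count()` comes to 172,480 and `bytes_f32()` to 689,920 bytes, which is 673.75 KiB and matches the ~674 KB total.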
Inference Performance:
- Matrix multiplications: ~172K multiply-accumulates per step when every layer runs (one per weight)
- Measured latency: 120μs on M2 Pro (single core)
- Throughput: 8,333 inferences/sec (far exceeds 60 Hz requirement)
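The throughput figure follows directly from the measured latency, and the same arithmetic shows how much of each real-time deadline one inference consumes. The function names below are hypothetical; the two periods are the 60 Hz frame interval and the 512-sample callback at 48 kHz mentioned in this document.

```rust
/// Inferences per second for a given per-inference latency in microseconds.
fn throughput_per_sec(latency_us: f64) -> f64 {
    1.0e6 / latency_us
}

/// Fraction of a real-time period (in microseconds) consumed by one inference.
fn budget_fraction(latency_us: f64, period_us: f64) -> f64 {
    latency_us / period_us
}
```

For a 120 μs latency, `throughput_per_sec` gives about 8,333 per second. `budget_fraction` is about 0.7% of a 60 Hz frame (16,667 μs) and about 1.1% of a 512-sample audio callback (10,667 μs).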
## State Machine Diagram

```mermaid
stateDiagram-v2
    [*] --> Idle: Model loaded
    Idle --> Warmup: First motion frame
    Warmup --> Active: 24 frames processed<br/>(400ms elapsed)
    note right of Warmup
        Slow equilibrium still initializing
        h^S ramping up from zero
        Gate biased toward fast (g ≈ 1)
    end note
    Active --> Active: Normal operation
    note right of Active
        Both equilibria stable
        Gate dynamically balancing
        Audio responding to motion
    end note
    Active --> Transitioning: Large motion change<br/>(commitment spike)
    note right of Transitioning
        Gate shifts toward fast (g → 1)
        Fast eq tracks new gesture
        Slow eq smoothly follows
    end note
    Transitioning --> Active: Motion stabilizes<br/>(commitment decreases)
    Active --> Idle: No motion for 5 seconds
    note right of Idle
        States persist (not reset)
        Ready to resume instantly
    end note
    Idle --> [*]: Shutdown
```

## Hyperparameter Sensitivity Analysis

### Time Constants (τ_F, τ_S)
```mermaid
graph LR
    subgraph Tested["Tested Configurations"]
        A["τ_F = 16.7ms<br/>τ_S = 200ms"]
        B["τ_F = 16.7ms<br/>τ_S = 400ms<br/>✅ CHOSEN"]
        C["τ_F = 16.7ms<br/>τ_S = 800ms"]
        D["τ_F = 33ms<br/>τ_S = 400ms"]
    end
    subgraph Results["Validation Results"]
        R1["Test Loss: 0.042<br/>Jitter: High"]
        R2["Test Loss: 0.038<br/>Jitter: Low<br/>✅ BEST"]
        R3["Test Loss: 0.041<br/>Jitter: Very Low<br/>⚠️ Too sluggish"]
        R4["Test Loss: 0.045<br/>Jitter: Medium<br/>⚠️ Lag noticeable"]
    end
    A -.-> R1
    B -.-> R2
    C -.-> R3
    D -.-> R4
    style B fill:#4caf50,color:#fff
    style R2 fill:#4caf50,color:#fff
```

Conclusion: Among the tested configurations, τ_S = 400ms (2.5 Hz update) with τ_F = 16.7ms gave the lowest test loss (0.038) with low jitter, the best tradeoff between smoothness and responsiveness.
## Memory Footprint

```rust
use std::sync::Arc;

use nalgebra::DVector; // assumed: DVector is nalgebra's dynamic column vector

pub struct DELLInference {
    // Weights (read-only, never mutated)
    weights: Arc<DELLWeights>,   // 674 KB (shared across threads)
    // Hidden states (mutable, per-instance)
    h_fast: DVector<f32>,        // 128 * 4 = 512 bytes
    h_slow: DVector<f32>,        // 128 * 4 = 512 bytes
    z: DVector<f32>,             // 128 * 4 = 512 bytes
    // Scratch buffers (pre-allocated)
    scratch_256: DVector<f32>,   // 256 * 4 = 1 KB
    scratch_129: DVector<f32>,   // 129 * 4 = 516 bytes
    // Config
    config: DELLConfig,          // 32 bytes
}
// Total per-instance: ~3 KB
// Total with shared weights: ~677 KB
```

Cache Behavior:
- Weights: Resident in L3 cache (shared 32 MB on M2 Pro)
- Hidden states: Fit entirely in L1 cache (192 KB per core)
- Scratch buffers: Fit entirely in L1 cache
## References
- [19-DELL_THEORY.md](../19-DELL_THEORY.md) - Complete mathematical formulation
- [09-ECHELON_RUNTIME.md](../09-ECHELON_RUNTIME.md) - Echelon runtime integration
- [17-PM_SYNTHESIS.md](../17-PM_SYNTHESIS.md) - Physical Modeling synthesis backend
Source Anchor
Comp-Core/docs/architecture/diagrams/dell-architecture.md