Grand Diomande Research · Full HTML Reader

# DELL (Dual-Equilibrium Latent Learning) Architecture Diagrams

## Overview
Diagrams of the DELL neural architecture, its training procedure, and its inference deployment. Training runs in Python; inference runs in Rust.

## Architecture Diagram

```mermaid
graph TB
    subgraph Input["Input Layer"]
        X[Motion State x_t<br/>6-dim:<br/>• Left hand velocity<br/>• Right hand velocity<br/>• Torso orientation]
        C[Context c_t<br/>1-dim:<br/>• Commitment scalar]
    end

    subgraph FastEq["Fast Equilibrium (τ_F = 16.7ms)"]
        HF[Hidden State h^F<br/>128-dim<br/>Updates: 60Hz]
        WF1[MLP Layer 1<br/>W_f1: 135→256<br/>ReLU activation]
        WF2[MLP Layer 2<br/>W_f2: 256→128]

        X --> WF1
        Z -.feedback.-> WF1
        WF1 --> WF2
        WF2 --> |α_F = 1.0| HF
        HF -.-> HF
    end

    subgraph SlowEq["Slow Equilibrium (τ_S = 400ms)"]
        HS[Hidden State h^S<br/>128-dim<br/>Updates: 2.5Hz]
        WS1[MLP Layer 1<br/>W_s1: 135→256<br/>ReLU activation]
        WS2[MLP Layer 2<br/>W_s2: 256→128]

        X --> WS1
        Z -.feedback.-> WS1
        WS1 --> WS2
        WS2 --> |α_S = 0.042| HS
        HS -.-> HS
    end

    subgraph Gating["Gating Network"]
        WG[Gate MLP<br/>W_g: 256→128<br/>Sigmoid output]
        G["Gate Vector g_t<br/>128-dim<br/>Range: [0,1]"]

        HF --> WG
        HS --> WG
        WG --> G
    end

    subgraph Latent["Latent State"]
        Z["z_t = g_t ⊙ h^F + (1-g_t) ⊙ h^S<br/>128-dim weighted combination"]
        G --> Z
        HF --> Z
        HS --> Z
    end

    subgraph Output["Output Layer (Audio Parameters)"]
        WY[Linear Projection<br/>W_y: 129→32]
        Y[Audio Parameters y_t<br/>32-dim:<br/>• FM ratios 0-3<br/>• FM indices 4-7<br/>• ADSR 8-15<br/>• Filter params 16-23<br/>• Reverb 24-27<br/>• Master gain 28-31]

        Z --> WY
        C --> WY
        WY --> Y
    end

    style FastEq fill:#e3f2fd
    style SlowEq fill:#fff3e0
    style Gating fill:#f3e5f5
    style Latent fill:#e8f5e9
    style Output fill:#fce4ec
```
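
The gating and latent blend in the diagram above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the project's implementation: the helper names (`sigmoid`, `gate`, `blend`) and the 2-dimensional toy values are hypothetical, whereas the real model uses 128-dimensional states and a learned `W_g`.

```python
import math

def sigmoid(v):
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def gate(W_g, b_g, h_fast, h_slow):
    """g = sigmoid(W_g · [h^F; h^S] + b_g), with one row of W_g per gate unit."""
    concat = h_fast + h_slow  # list concatenation plays the role of [h^F; h^S]
    return [sigmoid(sum(w * v for w, v in zip(row, concat)) + b)
            for row, b in zip(W_g, b_g)]

def blend(g, h_fast, h_slow):
    """z = g ⊙ h^F + (1 - g) ⊙ h^S, an element-wise convex combination."""
    return [gi * f + (1.0 - gi) * s for gi, f, s in zip(g, h_fast, h_slow)]

# Toy example with 2-dim states (hypothetical values).
h_fast = [1.0, -1.0]
h_slow = [0.0, 0.5]
W_g = [[0.0, 0.0, 0.0, 0.0],   # zero weights, so g depends only on the bias
       [0.0, 0.0, 0.0, 0.0]]
b_g = [0.0, 20.0]              # first unit balanced, second saturated toward fast
g = gate(W_g, b_g, h_fast, h_slow)
z = blend(g, h_fast, h_slow)
```

With a gate value of 0.5 the latent is the midpoint of the two states; as a gate unit saturates toward 1, the corresponding latent component follows the fast state alone.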

## Equilibrium Dynamics

### Fast Equilibrium Update Rule

```mermaid
graph LR
    subgraph t["Time Step t"]
        HF_t[h^F_t]
    end

    subgraph Compute["Compute f_F"]
        Input["[x_t, z_t]"]
        MLP["2-Layer MLP<br/>f_F: ℝ^135 → ℝ^128"]
        Input --> MLP
    end

    subgraph Update["Equilibrium Update"]
        EQ["h^F_{t+1} = (1-α_F) · h^F_t + α_F · f_F(x_t, z_t)<br/><br/>α_F = dt/τ_F = 0.0167s / 0.0167s = 1.0"]
    end

    subgraph t1["Time Step t+1"]
        HF_t1["h^F_{t+1}"]
    end

    HF_t --> EQ
    MLP --> EQ
    EQ --> HF_t1

    style EQ fill:#bbdefb
```

Interpretation: Fast equilibrium fully replaces its state every frame (α=1.0), making it maximally responsive to immediate motion changes.
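
The fast update rule can be checked numerically with a small sketch. The function names and the clamp on α are assumptions added for illustration; `f_out` stands in for the MLP output f_F(x_t, z_t).

```python
def alpha(dt, tau):
    """Blend factor alpha = dt / tau, clamped to 1 so the update stays a convex mix.

    The clamp is an assumption of this sketch, not something the diagram specifies.
    """
    return min(1.0, dt / tau)

def equilibrium_update(h, target, a):
    """h_{t+1} = (1 - a) * h_t + a * target, element-wise."""
    return [(1.0 - a) * hi + a * ti for hi, ti in zip(h, target)]

DT = 1.0 / 60.0                 # frame period at 60 Hz (about 16.7 ms)
TAU_F = 1.0 / 60.0              # fast time constant equals the frame period
a_fast = alpha(DT, TAU_F)       # dt / tau_F = 1.0

h_prev = [0.3, -0.2]            # arbitrary previous state
f_out = [1.0, 1.0]              # stand-in for f_F(x_t, z_t)
h_next = equilibrium_update(h_prev, f_out, a_fast)
```

Because `a_fast` is 1.0, the `(1 - a)` term vanishes and `h_next` equals `f_out` regardless of `h_prev`, which is the "fully replaces its state" behavior described above.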

### Slow Equilibrium Update Rule

```mermaid
graph LR
    subgraph t["Time Step t"]
        HS_t[h^S_t]
    end

    subgraph Compute["Compute f_S"]
        Input["[x_t, z_t]"]
        MLP["2-Layer MLP<br/>f_S: ℝ^135 → ℝ^128"]
        Input --> MLP
    end

    subgraph Update["Equilibrium Update"]
        EQ["h^S_{t+1} = (1-α_S) · h^S_t + α_S · f_S(x_t, z_t)<br/><br/>α_S = dt/τ_S = 0.0167s / 0.4s = 0.042"]
    end

    subgraph t1["Time Step t+1"]
        HS_t1["h^S_{t+1}"]
    end

    HS_t --> EQ
    MLP --> EQ
    EQ --> HS_t1

    style EQ fill:#ffe0b2
```

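Interpretation: Slow equilibrium **retains 95.8%** of its previous state each frame (1 − α_S = 0.958), so it changes gradually and smooths motion over a window on the order of τ_S = 400ms.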
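
A short worked calculation makes the slow time constant concrete. The sketch assumes the slow state is updated once per 60 Hz frame with α_S = dt/τ_S, as in the update rule above; the function names are hypothetical.

```python
DT = 1.0 / 60.0        # frame period, assuming 60 Hz motion frames
TAU_S = 0.4            # slow time constant (400 ms)
alpha_s = DT / TAU_S   # about 0.0417

def retained_fraction(alpha, n_frames):
    """Fraction of the original state left after n leaky updates: (1 - alpha)^n."""
    return (1.0 - alpha) ** n_frames

def step_response(alpha, n_frames):
    """State after n frames when starting at 0 and driving toward a constant 1.0."""
    h = 0.0
    for _ in range(n_frames):
        h = (1.0 - alpha) * h + alpha * 1.0
    return h

per_frame = retained_fraction(alpha_s, 1)    # about 0.958 kept per frame
after_tau = retained_fraction(alpha_s, 24)   # 24 frames = 400 ms, about 0.36 kept
reached = step_response(alpha_s, 24)         # about 0.64 of the way to the target
```

After one time constant (24 frames) roughly 36% of the old state remains, close to the 1/e ≈ 0.37 expected from a continuous first-order filter.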

## Training Procedure

```mermaid
flowchart TD
    Start([Start Training]) --> LoadData[Load Training Dataset<br/>15 hours of mocopi + audio]

    LoadData --> Preprocess["Preprocess Data<br/>• Normalize motion to [-1,1]<br/>• Compute commitment from EKF<br/>• Extract audio params from PM"]

    Preprocess --> InitModel["Initialize DELL Model<br/>• Random weights (Xavier init)<br/>• h^F_0 = 0, h^S_0 = 0"]

    InitModel --> Epoch{"Epoch < 100?"}

    Epoch -->|Yes| Batch["Sample Mini-Batch<br/>Batch size: 32<br/>Seq length: 300 frames (5s)"]

    Batch --> Forward[Forward Pass<br/>Unroll for T=300 steps]

    Forward --> Loss[Compute Loss<br/>L = L_recon + 0.1·L_smooth + 0.05·L_div]

    Loss --> Backward["Backward Pass<br/>Backprop through time (BPTT)"]

    Backward --> Clip[Gradient Clipping<br/>Max norm: 1.0]

    Clip --> Update[Update Weights<br/>AdamW, lr=1e-3]

    Update --> Val{Every 10 batches?}

    Val -->|Yes| Validate[Validate on Held-Out<br/>Compute test loss]
    Val -->|No| Batch

    Validate --> EarlyStop{Test loss<br/>improving?}

    EarlyStop -->|Yes| Batch
    EarlyStop -->|No, for 5 epochs| SaveModel[Save Best Model<br/>dell_weights.pt]

    SaveModel --> Export[Export to Rust<br/>Convert to .safetensors]

    Epoch -->|No| Export

    Export --> End([Training Complete])

    style Loss fill:#ffcdd2
    style Export fill:#c8e6c9
```
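
Two control steps in the flowchart above, gradient clipping at max norm 1.0 and early stopping after 5 checks without improvement, can be sketched framework-free. This is an illustrative sketch only: the class and function names, the loss history, and the choice to count patience in validation checks rather than epochs are all assumptions.

```python
import math

def clip_by_global_norm(grads, max_norm=1.0):
    """Rescale a flat gradient list so its L2 norm does not exceed max_norm."""
    total = math.sqrt(sum(g * g for g in grads))
    if total <= max_norm:
        return list(grads)
    scale = max_norm / total
    return [g * scale for g in grads]

class EarlyStopping:
    """Tracks the best validation loss; signals a stop after `patience` checks without improvement."""

    def __init__(self, patience=5):
        self.patience = patience
        self.best = float("inf")
        self.bad_checks = 0

    def update(self, loss):
        """Record one validation result; return True when training should stop."""
        if loss < self.best:
            self.best = loss
            self.bad_checks = 0
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience

clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)  # norm 5 scaled down to 1

stopper = EarlyStopping(patience=5)
history = [0.050, 0.044, 0.040, 0.041, 0.042, 0.040, 0.043, 0.041, 0.039]  # made-up losses
stopped_at = None
for i, loss in enumerate(history):
    if stopper.update(loss):
        stopped_at = i
        break
```

In this made-up history the best loss (0.040) occurs at index 2, the next five checks fail to beat it, and the loop stops at index 7 before ever seeing the final 0.039. That is the usual tradeoff of a fixed patience window.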

## Loss Function Breakdown

```mermaid
graph TD
    subgraph Losses["Loss Components"]
        L_total[Total Loss L]

        L_recon["Reconstruction Loss<br/>L_recon = MSE(y_pred, y_true)<br/>Weight: 1.0"]

        L_smooth["Temporal Smoothness<br/>L_smooth = mean‖y_{t+1} - y_t‖²<br/>Weight: 0.1"]

        L_div["Equilibrium Divergence<br/>L_div = -mean‖h^F - h^S‖²<br/>Weight: 0.05<br/>Negative sign: maximizes divergence"]
    end

    L_recon --> L_total
    L_smooth --> L_total
    L_div --> L_total

    subgraph Purpose["Design Rationale"]
        P1[Reconstruction:<br/>Match target audio accurately]
        P2[Smoothness:<br/>Prevent audio pops/clicks]
        P3[Divergence:<br/>Prevent mode collapse<br/>to single equilibrium]
    end

    L_recon -.-> P1
    L_smooth -.-> P2
    L_div -.-> P3

    style L_total fill:#1976d2,color:#fff
    style L_recon fill:#4caf50
    style L_smooth fill:#ff9800
    style L_div fill:#9c27b0
```
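
The three loss terms above combine as L = L_recon + 0.1·L_smooth + 0.05·L_div. A minimal plain-Python sketch on a toy sequence follows; the function names and all numeric values are hypothetical, and a real training run would compute the same quantities on batched tensors.

```python
def mse(pred, true):
    """Mean squared error over all time steps and output dimensions."""
    n = sum(len(row) for row in pred)
    return sum((p - t) ** 2
               for pr, tr in zip(pred, true)
               for p, t in zip(pr, tr)) / n

def smoothness(pred):
    """mean ||y_{t+1} - y_t||^2 over consecutive time steps."""
    diffs = [sum((b - a) ** 2 for a, b in zip(y0, y1))
             for y0, y1 in zip(pred, pred[1:])]
    return sum(diffs) / len(diffs)

def divergence(h_fast, h_slow):
    """-mean ||h^F - h^S||^2; the minus sign rewards keeping the two states apart."""
    d = [sum((f - s) ** 2 for f, s in zip(hf, hs))
         for hf, hs in zip(h_fast, h_slow)]
    return -sum(d) / len(d)

def dell_loss(pred, true, h_fast, h_slow, w_smooth=0.1, w_div=0.05):
    """Weighted sum of the three components, using the weights from the diagram."""
    return (mse(pred, true)
            + w_smooth * smoothness(pred)
            + w_div * divergence(h_fast, h_slow))

# Toy sequence: T=3 steps, 2 output dims, 2 hidden dims.
y_pred = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
y_true = [[0.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
h_f = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h_s = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
loss = dell_loss(y_pred, y_true, h_f, h_s)
```

For this toy input the components are 1/6 (reconstruction), 1.0 (smoothness) and −4/3 (divergence), giving 1/6 + 0.1 − 0.0667 = 0.2.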

## Inference Deployment (Rust)

```mermaid
sequenceDiagram
    participant Train as Training (Python)
    participant Export as Export Script
    participant Safetensors as .safetensors File
    participant Rust as Echelon (Rust)
    participant Audio as Audio Thread

    Note over Train: Train DELL model<br/>100 epochs (~6 hours)

    Train->>Export: python export_to_rust.py
    activate Export

    Export->>Export: Load dell_weights.pt
    Export->>Export: Convert to f32 (from f64)
    Export->>Export: Validate dimensions
    Export->>Safetensors: Save as dell_v1.safetensors
    deactivate Export

    Note over Safetensors: Model file:<br/>~2.5 MB<br/>Contains all 15 weight matrices

    Rust->>Safetensors: Load at startup
    Safetensors-->>Rust: DELLWeights struct

    Rust->>Rust: Allocate hidden states<br/>h^F, h^S, z all zeros

    Note over Audio: Audio callback starts<br/>48kHz sample rate

    loop Every 512 samples (10.7ms)
        Audio->>Rust: Request audio parameters
        Rust->>Rust: Read latest motion x_t
        Rust->>Rust: Read commitment c_t

        Rust->>Rust: Update fast equilibrium<br/>h^F ← (1-α_F)h^F + α_F·f_F(x, z)

        alt Frame count % 24 == 0 (every 400ms)
            Rust->>Rust: Update slow equilibrium<br/>h^S ← (1-α_S)h^S + α_S·f_S(x, z)
        end

        Rust->>Rust: Compute gate<br/>g ← σ(W_g · [h^F; h^S])
        Rust->>Rust: Compute latent<br/>z ← g ⊙ h^F + (1-g) ⊙ h^S
        Rust->>Rust: Compute output<br/>y ← W_y · [z; c]

        Rust-->>Audio: Audio parameters y_t
        Audio->>Audio: Render 512 samples via PM
    end
```
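
The per-frame order of operations in the loop above can be mirrored in a compact Python sketch. It is an illustration of the control flow, not the Echelon implementation: the `DellStep` class, the `mix` helper and the lambda stand-ins for the trained networks are hypothetical, and the choice to test the frame counter before incrementing it is an assumption.

```python
def mix(h, target, a):
    """Leaky update: (1 - a) * h + a * target, element-wise."""
    return [(1.0 - a) * hi + a * ti for hi, ti in zip(h, target)]

class DellStep:
    """One inference frame, in the order shown in the sequence diagram.

    f_fast, f_slow, gate_fn and out_fn are stand-ins for the trained MLPs,
    the gate network and the output projection.
    """

    def __init__(self, f_fast, f_slow, gate_fn, out_fn, dim,
                 alpha_f=1.0, alpha_s=0.042, slow_every=24):
        self.f_fast, self.f_slow = f_fast, f_slow
        self.gate_fn, self.out_fn = gate_fn, out_fn
        self.alpha_f, self.alpha_s = alpha_f, alpha_s
        self.slow_every = slow_every
        # Hidden states and latent start at zero, as in the diagram.
        self.h_fast = [0.0] * dim
        self.h_slow = [0.0] * dim
        self.z = [0.0] * dim
        self.frame = 0
        self.slow_updates = 0

    def step(self, x, c):
        # 1. Fast equilibrium: updated every frame.
        self.h_fast = mix(self.h_fast, self.f_fast(x, self.z), self.alpha_f)
        # 2. Slow equilibrium: updated only every `slow_every` frames.
        if self.frame % self.slow_every == 0:
            self.h_slow = mix(self.h_slow, self.f_slow(x, self.z), self.alpha_s)
            self.slow_updates += 1
        # 3. Gate, 4. latent blend, 5. output projection.
        g = self.gate_fn(self.h_fast, self.h_slow)
        self.z = [gi * f + (1.0 - gi) * s
                  for gi, f, s in zip(g, self.h_fast, self.h_slow)]
        self.frame += 1
        return self.out_fn(self.z, c)

# Toy stand-ins with 2-dim states.
model = DellStep(
    f_fast=lambda x, z: [x[0], x[0]],
    f_slow=lambda x, z: [x[0], x[0]],
    gate_fn=lambda hf, hs: [0.5, 0.5],
    out_fn=lambda z, c: [sum(z) + c],
    dim=2,
)
outputs = [model.step([1.0], 0.0) for _ in range(48)]
```

Over 48 frames the fast state is refreshed 48 times while the slow state is touched only twice (frames 0 and 24), which is the scheduling the `alt` branch in the diagram describes.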

## Weight Matrix Dimensions

| Layer | Shape | Param Count | Memory (f32) |
|---|---|---|---|
| W_f1 (fast MLP 1) | 135 × 256 | 34,560 | 135 KB |
| b_f1 (fast bias 1) | 256 | 256 | 1 KB |
| W_f2 (fast MLP 2) | 256 × 128 | 32,768 | 128 KB |
| b_f2 (fast bias 2) | 128 | 128 | 0.5 KB |
| W_s1 (slow MLP 1) | 135 × 256 | 34,560 | 135 KB |
| b_s1 (slow bias 1) | 256 | 256 | 1 KB |
| W_s2 (slow MLP 2) | 256 × 128 | 32,768 | 128 KB |
| b_s2 (slow bias 2) | 128 | 128 | 0.5 KB |
| W_g (gate MLP) | 256 × 128 | 32,768 | 128 KB |
| b_g (gate bias) | 128 | 128 | 0.5 KB |
| W_y (output) | 129 × 32 | 4,128 | 16 KB |
| b_y (output bias) | 32 | 32 | 128 B |
| **Total** | - | 172,480 | ~674 KB |

Inference Performance:
- Matrix multiplications: ~172K multiply-accumulates per step (one per matrix weight, roughly 343K FLOPs)
- Measured latency: 120 μs on M2 Pro (single core)
- Throughput: 8,333 inferences/sec (1 / 120 μs), far above the 60 Hz requirement
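
The totals in the table and the throughput figure can be reproduced with a few lines of arithmetic. The layer list is copied from the table; the variable names are illustrative, and each bias is assumed to have one entry per output column.

```python
# (name, input_dim, output_dim) for each weight matrix in the table.
LAYERS = [
    ("f1", 135, 256), ("f2", 256, 128),   # fast MLP
    ("s1", 135, 256), ("s2", 256, 128),   # slow MLP
    ("g", 256, 128),                      # gate
    ("y", 129, 32),                       # output projection
]
BYTES_PER_F32 = 4

weights = sum(rows * cols for _, rows, cols in LAYERS)  # one multiply-accumulate each
biases = sum(cols for _, _, cols in LAYERS)
total_params = weights + biases
total_kib = total_params * BYTES_PER_F32 / 1024

latency_s = 120e-6                 # measured latency quoted above
throughput = 1.0 / latency_s       # inferences per second
headroom = throughput / 60.0       # multiples of the 60 Hz requirement
```

This yields 171,552 matrix weights plus 928 bias entries, for 172,480 parameters and 673.75 KiB, and about 8,333 inferences per second, or roughly 139 times the 60 Hz frame rate.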

## State Machine Diagram

```mermaid
stateDiagram-v2
    [*] --> Idle: Model loaded

    Idle --> Warmup: First motion frame

    Warmup --> Active: 24 frames processed<br/>(400ms elapsed)
    note right of Warmup
        Slow equilibrium still initializing
        h^S ramping up from zero
        Gate biased toward fast (g ≈ 1)
    end note

    Active --> Active: Normal operation
    note right of Active
        Both equilibria stable
        Gate dynamically balancing
        Audio responding to motion
    end note

    Active --> Transitioning: Large motion change<br/>(commitment spike)
    note right of Transitioning
        Gate shifts toward fast (g → 1)
        Fast eq tracks new gesture
        Slow eq smoothly follows
    end note

    Transitioning --> Active: Motion stabilizes<br/>(commitment decreases)

    Active --> Idle: No motion for 5 seconds
    note right of Idle
        States persist (not reset)
        Ready to resume instantly
    end note

    Idle --> [*]: Shutdown
```
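
The transitions in the state diagram can be expressed as a small frame-driven state machine. This is a sketch under assumptions: the diagram gives no numeric thresholds for a "commitment spike" or for "motion stabilizes", so `SPIKE_THRESHOLD` and `SETTLE_THRESHOLD` are hypothetical, as are the class and method names. The 5-second idle timeout is counted in 60 Hz frames.

```python
from enum import Enum

class State(Enum):
    IDLE = "idle"
    WARMUP = "warmup"
    ACTIVE = "active"
    TRANSITIONING = "transitioning"

WARMUP_FRAMES = 24             # 400 ms at 60 Hz, from the diagram
IDLE_TIMEOUT_FRAMES = 5 * 60   # 5 seconds at 60 Hz
SPIKE_THRESHOLD = 0.8          # hypothetical: commitment above this is a spike
SETTLE_THRESHOLD = 0.3         # hypothetical: commitment below this has settled

class DellStateMachine:
    def __init__(self):
        self.state = State.IDLE
        self.warmup_frames = 0
        self.still_frames = 0

    def on_frame(self, moving, commitment):
        """Advance by one motion frame and return the resulting state."""
        if self.state is State.IDLE:
            if moving:
                self.state = State.WARMUP
                self.warmup_frames = 1          # the first motion frame counts
        elif self.state is State.WARMUP:
            self.warmup_frames += 1
            if self.warmup_frames >= WARMUP_FRAMES:
                self.state = State.ACTIVE
                self.still_frames = 0
        elif self.state is State.ACTIVE:
            self.still_frames = 0 if moving else self.still_frames + 1
            if self.still_frames >= IDLE_TIMEOUT_FRAMES:
                self.state = State.IDLE         # hidden states are kept, not reset
            elif commitment > SPIKE_THRESHOLD:
                self.state = State.TRANSITIONING
        elif self.state is State.TRANSITIONING:
            if commitment < SETTLE_THRESHOLD:
                self.state = State.ACTIVE
                self.still_frames = 0
        return self.state

sm = DellStateMachine()
trace = [sm.on_frame(True, 0.1) for _ in range(24)]   # warm-up period
after_spike = sm.on_frame(True, 0.9)                  # large motion change
after_settle = sm.on_frame(True, 0.2)                 # motion stabilizes
```

Using two separate thresholds gives the transition some hysteresis, so a commitment value hovering near a single cutoff would not flip the state every frame.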

## Hyperparameter Sensitivity Analysis

### Time Constants (τ_F, τ_S)

```mermaid
graph LR
    subgraph Tested["Tested Configurations"]
        A["τ_F = 16.7ms<br/>τ_S = 200ms"]
        B["τ_F = 16.7ms<br/>τ_S = 400ms<br/>✅ CHOSEN"]
        C["τ_F = 16.7ms<br/>τ_S = 800ms"]
        D["τ_F = 33ms<br/>τ_S = 400ms"]
    end

    subgraph Results["Validation Results"]
        R1["Test Loss: 0.042<br/>Jitter: High"]
        R2["Test Loss: 0.038<br/>Jitter: Low<br/>✅ BEST"]
        R3["Test Loss: 0.041<br/>Jitter: Very Low<br/>⚠️ Too sluggish"]
        R4["Test Loss: 0.045<br/>Jitter: Medium<br/>⚠️ Lag noticeable"]
    end

    A -.-> R1
    B -.-> R2
    C -.-> R3
    D -.-> R4

    style B fill:#4caf50,color:#fff
    style R2 fill:#4caf50,color:#fff
```

Conclusion: Of the four tested configurations, τ_F = 16.7ms with τ_S = 400ms (2.5 Hz update) gave the lowest test loss (0.038) with low jitter, the best tradeoff between smoothness and responsiveness.

## Memory Footprint

```rust
use std::sync::Arc;

// Assumption: DVector comes from the nalgebra crate.
// DELLWeights and DELLConfig are defined elsewhere in the runtime.
use nalgebra::DVector;

pub struct DELLInference {
    // Weights (read-only, never mutated)
    weights: Arc<DELLWeights>,  // 674 KB (shared across threads)

    // Hidden states (mutable, per-instance)
    h_fast: DVector<f32>,       // 128 * 4 = 512 bytes
    h_slow: DVector<f32>,       // 128 * 4 = 512 bytes
    z: DVector<f32>,            // 128 * 4 = 512 bytes

    // Scratch buffers (pre-allocated)
    scratch_256: DVector<f32>,  // 256 * 4 = 1 KB
    scratch_129: DVector<f32>,  // 129 * 4 = 516 bytes

    // Config
    config: DELLConfig,         // 32 bytes
}

// Total per-instance: ~3 KB
// Total with shared weights: ~677 KB
```

Cache Behavior:
- Weights: Resident in the shared L2 cache (32 MB across the performance cores on M2 Pro)
- Hidden states: Fit entirely in L1 data cache (128 KB per performance core)
- Scratch buffers: Fit entirely in L1 data cache

## References

- [19-DELL_THEORY.md](../19-DELL_THEORY.md) - Complete mathematical formulation
- [09-ECHELON_RUNTIME.md](../09-ECHELON_RUNTIME.md) - Echelon runtime integration
- [17-PM_SYNTHESIS.md](../17-PM_SYNTHESIS.md) - Physical Modeling synthesis backend

Promotion Decision

Promote into a technical note or architecture paper with implementation anchors.

Source Anchor

Comp-Core/docs/architecture/diagrams/dell-architecture.md
