proposal · experiment writeup candidate · score 24

Modular Voice Control System

An extensible voice control system for DJ software, built on the Gemini Live API, with track analysis and intelligent transition recommendations.

Extracted abstract or opening context

A comprehensive, extensible voice control system for DJ control using Gemini Live API with track analysis and intelligent transition recommendations.

### Core Functionality

- **Real-time Voice Recognition**: Uses Gemini Live API for high-accuracy speech recognition
- **Command Matching**: Fuzzy matching with command buffering for multi-word commands
- **Keyboard Execution**: Sends keyboard shortcuts to control Serato DJ
- **Deck Management**: Tracks current deck and manages state

### Advanced Features

- **Track Analysis**: On-demand audio analysis using librosa
  - BPM detection
  - Beat grid extraction
  - Drop detection (energy spikes)
  - Build-up detection
  - Section detection (breakdowns, builds)
- **Transition Recommendations**: AI-powered suggestions using Gemini
  - Harmonic mixing (key compatibility)
  - Energy matching
  - Beat alignment
  - Optimal timing suggestions

### Higher-Order Commands

- **Play Next**: Loads and plays next track
- **Continuous Mode**: Auto-play with automatic track loading
- **Transitions**: Smooth transitions between tracks
- **Sync & Play**: Beat-matched transitions

Track analysis is automatically triggered when:

- User navigates library ("next track", "move down")
- User loads tracks ("load left", "load right")
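The Command Matching item above mentions fuzzy matching with buffering for multi-word commands. The following is a minimal sketch of how that might work, not the project's actual implementation. It assumes transcript words arrive one at a time and uses the standard-library `difflib.SequenceMatcher` as the similarity measure. The `COMMANDS` table, the key bindings in it, the 0.8 threshold, and the names `best_match` and `CommandBuffer` are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical command table: spoken phrase -> keyboard shortcut.
# The phrases appear in the text above; the key bindings are placeholders.
COMMANDS = {
    "next track": "down",
    "move down": "down",
    "load left": "shift+left",
    "load right": "shift+right",
}


def best_match(phrase, threshold=0.8):
    """Return the closest known command, or None if nothing clears the threshold."""
    best, best_score = None, 0.0
    for command in COMMANDS:
        score = SequenceMatcher(None, phrase, command).ratio()
        if score > best_score:
            best, best_score = command, score
    return best if best_score >= threshold else None


class CommandBuffer:
    """Accumulates transcript words until they form a recognizable command."""

    def __init__(self, max_words=4):
        self.words = []
        self.max_words = max_words

    def feed(self, word):
        """Add one word; return a matched command or None."""
        self.words.append(word.lower().strip())
        # Keep only the most recent words so stale speech cannot block a match.
        self.words = self.words[-self.max_words:]
        # Try the whole buffer first, then progressively shorter suffixes.
        for start in range(len(self.words)):
            match = best_match(" ".join(self.words[start:]))
            if match is not None:
                self.words.clear()
                return match
        return None
```

With this sketch, feeding `"load"` returns `None` because a single word is too far from any command, and feeding a slightly misheard `"lef"` afterwards resolves to `"load left"`. A real system would likely tune the threshold against recorded transcripts.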
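The Track Analysis item describes drop detection as finding energy spikes. One simple way to express that idea is sketched below, under stated assumptions: the audio is already decoded into a list of float samples, energy is measured as per-frame RMS, and a spike is a frame whose energy is at least `ratio` times the mean of the preceding `window` frames. The source uses librosa for analysis; this sketch deliberately uses only the standard library so it runs anywhere. The function names and default parameters are hypothetical.

```python
import math


def frame_rms(samples, frame_size):
    """RMS energy of consecutive, non-overlapping frames of `frame_size` samples."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + frame_size]) / frame_size)
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]


def detect_spikes(energies, window=4, ratio=2.0):
    """Indices of frames at least `ratio` times louder than the trailing mean.

    Assumption: a "drop" shows up as a sudden jump relative to the
    average energy of the `window` frames just before it.
    """
    spikes = []
    for i in range(window, len(energies)):
        baseline = sum(energies[i - window:i]) / window
        # Skip silent baselines to avoid flagging every frame after silence.
        if baseline > 0 and energies[i] >= ratio * baseline:
            spikes.append(i)
    return spikes
```

Several consecutive frames can be flagged while the trailing mean catches up with the new energy level, so a caller that wants one timestamp per drop would keep only the first index of each run.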
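The Transition Recommendations item lists harmonic mixing (key compatibility) without saying how compatibility is decided. A common DJ convention is the Camelot wheel, where keys are written as a number from 1 to 12 plus `A` (minor) or `B` (major), and two keys are treated as compatible when they are identical, adjacent on the same ring, or share a number across rings. The sketch below assumes that convention. Whether the project uses Camelot codes, or delegates the decision entirely to Gemini, is not stated in the source, and the function names are hypothetical.

```python
def parse_camelot(code):
    """Split a Camelot code such as '8A' into (8, 'A')."""
    code = code.strip().upper()
    number, letter = int(code[:-1]), code[-1]
    if not 1 <= number <= 12 or letter not in "AB":
        raise ValueError(f"not a Camelot code: {code!r}")
    return number, letter


def is_compatible(key_a, key_b):
    """True if two Camelot keys mix harmonically under the usual wheel rules."""
    n1, l1 = parse_camelot(key_a)
    n2, l2 = parse_camelot(key_b)
    if l1 == l2:
        # Same ring: identical key, or one step around the 12-position wheel
        # (the modulo makes 12 and 1 neighbours).
        return (n1 - n2) % 12 in (0, 1, 11)
    # Different ring: only the relative major/minor at the same position.
    return n1 == n2
```

For example, `is_compatible("12B", "1B")` is true because the wheel wraps around, while `is_compatible("8A", "10A")` is false because the keys are two steps apart.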

Promotion decision

What has to happen next

Attach run IDs, datasets, metrics, and reproduction commands.

Why this is not always a full paper yet

Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.