architecture, technical paper candidate, score 54

Voice Control Architecture - Technical Overview

You now have **three independent voice control pipelines** for Rekordbox DJ software, each optimized for different use cases.

Extracted abstract or opening context

You now have **three independent voice control pipelines** for Rekordbox DJ software, each optimized for different use cases.

**Files:**
- `dj_agent/scripts/run_rekordbox_voice_gemini.py` - Main entry point
- `dj_agent/voice_control/gemini_live_asr.py` - Gemini Live streaming
- `dj_agent/voice_control/orbiter/` - Command matching & execution
- `START_REKORDBOX_VOICE_GEMINI.sh` - Launcher script

**Pros:**
- Lowest latency (80 ms total)
- Highest out-of-box accuracy (98%)
- Simplest setup

**Cons:**
- Requires internet connection
- API costs (~$0.001 per command)
- Sends audio to cloud

**Files:**
- `dj_agent/scripts/run_rekordbox_voice_whisper.py` - Main entry point
- `dj_agent/voice_control/whisper_asr.py` - Whisper transcription
- `dj_agent/voice_control/core/whisper_listener.py` - VAD + Whisper integration
- `START_REKORDBOX_VOICE_WHISPER.sh` - Launcher script
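
The file lists above pair a speech recognizer with a command matching and execution step (`orbiter/`). The source does not show how that matching works, so the following is only a minimal sketch of one plausible approach: fuzzy-matching a transcript against a table of known phrases with the standard library's `difflib`. The `COMMANDS` table, the action names, the `match_command` function, and the `0.8` cutoff are all assumptions made for illustration and are not taken from the project.

```python
import difflib
from typing import Optional

# Hypothetical command table: spoken phrase -> action name.
# These phrases and action names are illustrative, not from the project.
COMMANDS = {
    "play deck one": "deck1_play",
    "pause deck one": "deck1_pause",
    "play deck two": "deck2_play",
    "sync deck two": "deck2_sync",
}


def match_command(transcript: str, cutoff: float = 0.8) -> Optional[str]:
    """Return the action for the closest known phrase, or None if none is close enough."""
    # Normalize case and collapse repeated whitespace before comparing.
    phrase = " ".join(transcript.lower().split())
    # get_close_matches ranks candidates by SequenceMatcher similarity ratio
    # and drops anything scoring below the cutoff.
    hits = difflib.get_close_matches(phrase, list(COMMANDS), n=1, cutoff=cutoff)
    return COMMANDS[hits[0]] if hits else None
```

With a step like this, any ASR backend can feed the same matcher: the recognizer only has to hand over a text transcript, and phrases that fall below the cutoff are ignored rather than executed.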

Promotion decision

What has to happen next

Promote into a technical note or architecture paper with implementation anchors.

Why this is not always a full paper yet

Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.