Quick Start Guide - Voice Control System
The voice control system now includes an **Auto DJ** feature that automatically mixes tracks with intelligent transitions and effects!
Auto DJ Feature
Quick Start with Auto DJ
1. Start Voice Control:
   ```bash
   python3 dj_agent/run_voice_control_gemini.py
   ```
2. Add Tracks (programmatically or via Serato):
- Tracks are automatically analyzed when added
- Analysis includes BPM, key, energy levels, drops, and breakdowns
3. Start Auto DJ:
Say: "start auto dj"
4. Control Auto DJ:
- "stop auto dj" - Stop automatic mixing
- "pause auto dj" / "resume auto dj" - Pause/resume
- "skip track" - Skip to next track
- "set auto dj mode harmonic" - Change mixing strategy
- "show queue" - View current queue
- "clear queue" - Clear all tracks
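
The phrases above come in two shapes: fixed commands and one parameterized command (`set auto dj mode <strategy>`). As a rough illustration only, the sketch below maps a recognized phrase to an action and an optional argument. The function name `parse_auto_dj_command` and the action names are hypothetical and are not taken from the `dj_agent` code; the strategy names are the ones listed under Mixing Strategies.

```python
from typing import Optional, Tuple

# Strategy names as listed under "Mixing Strategies" in this guide.
MODES = {"harmonic", "bpm", "energy", "composite", "random"}

# Fixed phrases from this guide, mapped to hypothetical action names.
SIMPLE_COMMANDS = {
    "start auto dj": "start",
    "stop auto dj": "stop",
    "pause auto dj": "pause",
    "resume auto dj": "resume",
    "skip track": "skip",
    "show queue": "show_queue",
    "clear queue": "clear_queue",
}

MODE_PREFIX = "set auto dj mode "


def parse_auto_dj_command(phrase: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (action, argument) for a recognized phrase, else None."""
    text = " ".join(phrase.lower().split())  # normalize case and whitespace
    if text in SIMPLE_COMMANDS:
        return SIMPLE_COMMANDS[text], None
    if text.startswith(MODE_PREFIX):
        mode = text[len(MODE_PREFIX):]
        if mode in MODES:
            return "set_mode", mode
    return None  # unknown phrase or unknown strategy
```

Returning `None` for an unknown strategy, rather than falling back to a default, keeps a misheard mode name from silently changing the mix.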
Mixing Strategies
- harmonic: Key-based mixing (Camelot wheel)
- bpm: BPM matching
- energy: Energy level matching
- composite: Balanced combination (default)
- random: Random selection
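
To make the harmonic strategy concrete, here is a minimal sketch of the usual Camelot wheel rule: two keys are treated as compatible when they are identical, one step apart on the same ring (with 12 wrapping to 1), or at the same number on the other ring. The helper names are hypothetical, and the Auto DJ's actual scoring may well differ.

```python
from typing import Tuple


def parse_camelot(code: str) -> Tuple[int, str]:
    """Split a Camelot code such as '8A' into (8, 'A')."""
    code = code.strip().upper()
    number, letter = int(code[:-1]), code[-1]
    if not 1 <= number <= 12 or letter not in ("A", "B"):
        raise ValueError(f"not a Camelot code: {code!r}")
    return number, letter


def camelot_compatible(a: str, b: str) -> bool:
    """True if two Camelot codes are neighbors under the common mixing rule."""
    num_a, let_a = parse_camelot(a)
    num_b, let_b = parse_camelot(b)
    if let_a == let_b:
        # Same ring: identical key or one step around the 12-position wheel.
        return (num_a - num_b) % 12 in (0, 1, 11)
    # Different ring: only the relative major/minor at the same position.
    return num_a == num_b
```

A composite strategy could combine a check like this with BPM and energy distances, but the weighting is not described in this guide.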
See `dj_agent/AUTO_DJ_README.md` for complete documentation.
---
Prerequisites
1. Install Dependencies
   ```bash
   pip install -r requirements.txt
   ```
2. Get Gemini API Key
- Visit: https://ai.google.dev/
- Create an API key
- Add it to `.env` file:
  ```bash
  # Note: ">" overwrites any existing .env file
  echo "GEMINI_API_KEY=your-api-key-here" > .env
  ```

Starting the Program
Method 1: Using the Shell Script (Easiest)
```bash
# From project root
./START_VOICE_CONTROL_GEMINI.sh
```

This script:
- Checks for `.env` file
- Starts voice control automatically
- Handles API key loading
Method 2: Direct Python Command
```bash
# Basic start
python3 dj_agent/run_voice_control_gemini.py
# With API key as argument
python3 dj_agent/run_voice_control_gemini.py --api-key your-api-key-here
# Disable track analysis (faster startup)
python3 dj_agent/run_voice_control_gemini.py --no-track-analysis
# Disable transition recommendations
python3 dj_agent/run_voice_control_gemini.py --no-transitions
# Both disabled (minimal mode)
python3 dj_agent/run_voice_control_gemini.py --no-track-analysis --no-transitions
```

Method 3: List Available Commands
```bash
# See all 328+ voice commands
python3 dj_agent/run_voice_control_gemini.py --commands
```

Usage Examples
Basic Voice Control
1. Start the program:
   ```bash
   ./START_VOICE_CONTROL_GEMINI.sh
   ```
2. Speak commands:
- "play left" - Plays left deck
- "play right" - Plays right deck
- "cue 1 left" - Jump to cue 1 on left deck
- "play next" - Load and play next track
- "continuous mode" - Enable auto-play mode
3. Stop: Press `Ctrl+C`
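
One way to picture how a spoken phrase becomes keyboard input is a lookup table plus a cooldown. The two mappings below are the ones visible in the Example Session output in this guide ("play left" presses `w`; "play next right" runs the chain `down`, `shift+right`, `s`). Everything else is an illustrative guess: the `CommandDispatcher` class, the injected `press` callback, and the assumption that the 1.5 s cooldown applies globally rather than per command.

```python
from typing import Callable, Dict, List, Optional

# Mappings taken from the Example Session in this guide; a real table would
# hold every command. Multi-key entries are pressed in order as a chain.
COMMAND_KEYS: Dict[str, List[str]] = {
    "play left": ["w"],
    "play next right": ["down", "shift+right", "s"],
}


class CommandDispatcher:
    """Hypothetical dispatcher: looks up a phrase and presses its keys."""

    def __init__(self, press: Callable[[str], None], cooldown: float = 1.5):
        self.press = press        # injected, so the sketch needs no keyboard library
        self.cooldown = cooldown  # seconds between accepted commands (assumed global)
        self._last_time: Optional[float] = None

    def handle(self, phrase: str, now: float) -> bool:
        """Press the keys for `phrase`; return False if unknown or too soon."""
        keys = COMMAND_KEYS.get(phrase.strip().lower())
        if keys is None:
            return False
        if self._last_time is not None and now - self._last_time < self.cooldown:
            return False  # still inside the cooldown window
        for key in keys:
            self.press(key)
        self._last_time = now
        return True
```

Passing `press` in as a callback means the same logic can be exercised with a plain list in tests and wired to a real keyboard library in production.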
With Track Analysis
Track analysis runs automatically when enabled. It analyzes tracks when you:
- Navigate library ("next track", "move down")
- Load tracks ("load left", "load right")
Note: Full track path detection requires Serato integration. Currently, the hooks are in place but need Serato to provide actual track paths.
With Transition Recommendations
When enabled, the system can suggest optimal transition points:
- Harmonic mixing (key compatibility)
- Energy matching
- Beat alignment
Note: Requires both tracks to be analyzed first.
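
As a small worked example of the beat alignment idea, the sketch below computes the playback-rate multiplier that brings an incoming track to the current tempo, and the next bar boundary where a transition could start. It assumes constant tempo, 4/4 time, and a known first-beat offset; the function names are hypothetical and this is not the transition advisor's actual algorithm.

```python
import math


def tempo_ratio(current_bpm: float, incoming_bpm: float) -> float:
    """Rate multiplier that makes the incoming track match the current tempo."""
    if current_bpm <= 0 or incoming_bpm <= 0:
        raise ValueError("BPM values must be positive")
    return current_bpm / incoming_bpm


def next_bar_time(position_s: float, bpm: float,
                  first_beat_s: float = 0.0, beats_per_bar: int = 4) -> float:
    """Time in seconds of the first bar boundary at or after position_s."""
    if bpm <= 0:
        raise ValueError("BPM must be positive")
    bar_length = beats_per_bar * 60.0 / bpm  # seconds per bar
    bars_elapsed = math.ceil((position_s - first_beat_s) / bar_length)
    # Before the first beat, the first bar boundary is the first beat itself.
    return first_beat_s + max(bars_elapsed, 0) * bar_length
```

For instance, mixing a 120 BPM track into a 128 BPM track gives a ratio of 128/120, roughly a 6.7% speed-up of the incoming track.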
Command Line Options
```text
--api-key KEY          Gemini API key (or use .env file)
--commands             List all available voice commands
--no-track-analysis    Disable track analysis features
--no-transitions       Disable transition recommendations
```

Example Session
```bash
# 1. Start voice control
$ ./START_VOICE_CONTROL_GEMINI.sh
# Output:
# Starting Gemini Live Voice Control...
# Gemini Live API initialized
# Voice controller initialized with 328 commands
# Track analysis enabled
# Transition advisor enabled
#
# GEMINI LIVE VOICE CONTROL - HIGH ACCURACY MODE
# ======================================================================
#
# Settings:
#   Model: gemini-2.0-flash-exp
#   Sample rate: 16000Hz
#   Cooldown: 1.5s
#
# Connecting to Gemini Live API...
# Connected to Gemini Live API
# Audio streaming started
#
# Listening for voice commands...
# 2. Speak commands:
# You: "play left"
# System: "play left" → Pressed: w
# You: "play next right"
# System: "play next right" → Executing chain: play next right
#   Pressed: down
#   Pressed: shift+right
#   Pressed: s
# 3. Stop:
# Press Ctrl+C
# System: VOICE CONTROL STOPPED
```

Troubleshooting
"API key required"
- Create `.env` file with: `GEMINI_API_KEY=your-key`
- Or pass `--api-key your-key`
- Or set: `export GEMINI_API_KEY=your-key`
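
The three options above suggest a lookup chain for the key. The sketch below shows one plausible version; the precedence order (command-line value, then environment variable, then `.env` file) is an assumption, as are the helper names, and the real script may resolve the key differently.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_api_key(cli_value: Optional[str] = None,
                    environ: Mapping[str, str] = os.environ,
                    env_path: Path = Path(".env")) -> str:
    """Assumed precedence: --api-key, then the environment, then .env."""
    if cli_value:
        return cli_value
    if environ.get("GEMINI_API_KEY"):
        return environ["GEMINI_API_KEY"]
    if env_path.is_file():
        file_values = parse_env_text(env_path.read_text(encoding="utf-8"))
        if file_values.get("GEMINI_API_KEY"):
            return file_values["GEMINI_API_KEY"]
    raise SystemExit("API key required")
```

If all three sources are empty, the sketch exits with the same "API key required" message this section is about.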
"No module named 'pynput'"
```bash
pip install pynput
```

"No module named 'pyaudio'"
```bash
# macOS
brew install portaudio
pip install pyaudio
# Linux (Debian/Ubuntu)
sudo apt-get install portaudio19-dev
pip install pyaudio
```

"No module named 'librosa'"
```bash
pip install librosa scipy
```

Microphone not working
- Check macOS permissions: System Preferences → Security & Privacy → Microphone (on macOS 13 and later: System Settings → Privacy & Security → Microphone)
- Grant Terminal/Python access to the microphone
Quick Reference
Start Commands:
```bash
# Easiest
./START_VOICE_CONTROL_GEMINI.sh
# Direct
python3 dj_agent/run_voice_control_gemini.py
# List commands
python3 dj_agent/run_voice_control_gemini.py --commands
```

Stop:
- Press `Ctrl+C`
Test System:
```bash
python3 dj_agent/voice_control/test_system.py
```