Grand Diomande Research · Full HTML Reader

Real-Time Voice Control for DJ

🎤 Speak into your microphone and control Serato with your voice!

Quick Start

bash
python3 dj_agent/run_voice_control.py

Then just speak commands like:
- "play left deck"
- "censor right"
- "cue 1"
- "load next track left"

The keyboard shortcuts will be executed automatically!

Installation

Install required packages:

bash
# Install speech recognition
pip install SpeechRecognition

# Install audio support
# On macOS:
brew install portaudio
pip install pyaudio

# On Linux:
sudo apt-get install python3-pyaudio

# Install keyboard control
pip install pynput
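
To confirm the installation before launching, a small check along these lines can report which packages Python cannot import. This is a sketch and not part of `dj_agent`: `missing_modules` is a hypothetical helper, and the import names (`speech_recognition`, `pyaudio`, `pynput`) are the ones the packages above expose.

```python
import importlib.util

# Import names exposed by the packages installed above.
REQUIRED = ["speech_recognition", "pyaudio", "pynput"]


def missing_modules(names):
    """Return the module names that cannot be found on the current Python path."""
    # find_spec only locates the module; it does not import it.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Any name it reports corresponds to one of the `pip install` lines above.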

Usage

Run Voice Control

bash
python3 dj_agent/run_voice_control.py

This will:
1. Calibrate your microphone
2. Start listening continuously
3. Execute keyboard shortcuts when you speak commands
4. Keep running until you press Ctrl+C

Show Available Commands

bash
python3 dj_agent/run_voice_control.py --commands

Test Your Microphone

bash
python3 dj_agent/run_voice_control.py --test
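
The three invocations above (no flag, `--commands`, `--test`) could be dispatched with `argparse` roughly as follows. This is an illustrative sketch only: the real `run_voice_control.py` is not shown in this README, and `build_parser` and `choose_mode` are hypothetical names.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the two flags documented above."""
    parser = argparse.ArgumentParser(
        prog="run_voice_control.py",
        description="Control Serato with spoken commands.",
    )
    # The two flags are alternatives, so they are made mutually exclusive.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--commands", action="store_true",
                      help="list the available voice commands and exit")
    mode.add_argument("--test", action="store_true",
                      help="run a microphone check and exit")
    return parser


def choose_mode(argv):
    """Map a list of command-line arguments to one of three modes."""
    args = build_parser().parse_args(argv)
    if args.commands:
        return "commands"
    if args.test:
        return "test"
    return "listen"  # default: calibrate and start listening


print(choose_mode(["--test"]))
```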

Available Commands

### Left Deck (Deck A)
- "play left" / "pause left" / "stop left" → Press W
- "censor left" → Press U
- "load next left" → Alt+W
- "load previous left" → Alt+Q
- "tempo up left" / "speed up left" → Press R
- "tempo down left" / "slow down left" → Press E
- "rewind left" → Alt+E
- "fast forward left" → Alt+R
- "cue 1" to "cue 5" → Press 1-5

### Right Deck (Deck B)
- "play right" / "pause right" / "stop right" → Press S
- "censor right" → Press J
- "load next right" → Alt+S
- "load previous right" → Alt+A
- "tempo up right" / "speed up right" → Press F
- "tempo down right" / "slow down right" → Press D
- "rewind right" → Alt+D
- "fast forward right" → Alt+F
- "cue 6" to "cue 9" and "cue 0" → Press 6-9 and 0
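
One way to hold the two lists above in code is a dictionary from phrase to shortcut. The sketch below is an assumption about the data structure, not the table `dj_agent` actually ships; `COMMANDS` and `lookup` are hypothetical names, and only the phrases listed above are included.

```python
# Phrase -> (modifier, key). A modifier of None means the key is pressed alone.
LEFT_DECK = {
    "play left": (None, "w"), "pause left": (None, "w"), "stop left": (None, "w"),
    "censor left": (None, "u"),
    "load next left": ("alt", "w"),
    "load previous left": ("alt", "q"),
    "tempo up left": (None, "r"), "speed up left": (None, "r"),
    "tempo down left": (None, "e"), "slow down left": (None, "e"),
    "rewind left": ("alt", "e"),
    "fast forward left": ("alt", "r"),
}

RIGHT_DECK = {
    "play right": (None, "s"), "pause right": (None, "s"), "stop right": (None, "s"),
    "censor right": (None, "j"),
    "load next right": ("alt", "s"),
    "load previous right": ("alt", "a"),
    "tempo up right": (None, "f"), "speed up right": (None, "f"),
    "tempo down right": (None, "d"), "slow down right": (None, "d"),
    "rewind right": ("alt", "d"),
    "fast forward right": ("alt", "f"),
}

# "cue 1".."cue 5" belong to the left deck; "cue 6".."cue 9" and "cue 0" to the right.
CUES = {f"cue {n}": (None, str(n)) for n in range(10)}

COMMANDS = {**LEFT_DECK, **RIGHT_DECK, **CUES}


def lookup(phrase):
    """Return the (modifier, key) pair for an exact phrase, or None if unknown."""
    normalized = " ".join(phrase.lower().split())  # lowercase, collapse whitespace
    return COMMANDS.get(normalized)
```

For example, `lookup("Load Next Right")` returns `("alt", "s")`, while a phrase outside the table returns `None`.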

How It Works

1. Microphone Input - Listens to your voice continuously
2. Speech Recognition - Transcribes each phrase with Google Speech Recognition (free, no API key needed, but requires an internet connection)
3. Command Matching - Matches the transcribed phrase to a known command
4. Keyboard Execution - Presses the corresponding keyboard shortcut
5. Serato Responds - Serato receives the keyboard shortcut and executes the action
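
The five steps above can be sketched with the two libraries from the Installation section. This is a minimal illustration under stated assumptions, not the `dj_agent` source: the function names and `DEMO_COMMANDS` are hypothetical, while `Recognizer`, `Microphone`, `adjust_for_ambient_noise`, `listen`, `recognize_google` (SpeechRecognition) and `Controller`, `Key`, `pressed` (pynput) are real calls. The libraries are imported inside the functions so the matching logic can be exercised without a microphone or a desktop session.

```python
DEMO_COMMANDS = {"play left": (None, "w"), "load next left": ("alt", "w")}


def press_shortcut(modifier, key):
    """Step 4: press `key`, optionally while holding Alt."""
    from pynput.keyboard import Controller, Key  # lazy import: needs a desktop session

    keyboard = Controller()
    if modifier == "alt":
        with keyboard.pressed(Key.alt):  # hold Alt for the duration of the block
            keyboard.press(key)
            keyboard.release(key)
    else:
        keyboard.press(key)
        keyboard.release(key)


def handle_utterance(text, commands, press=press_shortcut):
    """Step 3: match the transcript to a command, then trigger step 4."""
    shortcut = commands.get(text.lower().strip())
    if shortcut is None:
        return False  # nothing matched, so no key is pressed
    press(*shortcut)
    return True


def listen_forever(commands):
    """Steps 1-2: capture audio, transcribe it, and hand the text to step 3."""
    import speech_recognition as sr  # lazy import: needs PyAudio and a microphone

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source, duration=1)  # calibration
        while True:
            audio = recognizer.listen(source, phrase_time_limit=4)
            try:
                text = recognizer.recognize_google(audio)
            except sr.UnknownValueError:
                continue  # speech could not be understood
            except sr.RequestError as err:
                print(f"Recognition service unavailable: {err}")
                continue
            handle_utterance(text, commands)
```

Step 5 needs no code on this side: the key press is delivered to whichever window has focus, which is why Serato must be the active window.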

Tips

1. Speak Clearly - Enunciate commands clearly
2. Include Deck - Say "left" or "right" to specify which deck (cue commands are the exception: the cue number selects the deck)
3. Short Commands - Keep commands short and direct
4. Quiet Environment - Works best in quiet environments
5. Serato in Focus - Make sure the Serato window is active/focused

Troubleshooting

### "No module named 'speech_recognition'"
Install it:

bash
pip install SpeechRecognition pyaudio

### "No module named 'pynput'"
Install it:

bash
pip install pynput

### Microphone Not Working
- Check system permissions (macOS: System Preferences → Security & Privacy → Microphone)
- Test with: `python3 dj_agent/run_voice_control.py --test`
- Try a different microphone
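
When trying a different microphone, it can help to see which input devices Python detects. The sketch below assumes SpeechRecognition with PyAudio is installed; `Microphone.list_microphone_names()` and the `device_index` argument are part of SpeechRecognition, while `list_microphones` and `pick_device_index` are hypothetical helpers.

```python
def list_microphones():
    """Return the audio device names reported by SpeechRecognition."""
    import speech_recognition as sr  # lazy import: requires PyAudio

    return sr.Microphone.list_microphone_names()


def pick_device_index(names, preferred):
    """Return the index of the first device whose name contains `preferred`.

    The comparison is case-insensitive; None is returned when nothing matches.
    """
    wanted = preferred.lower()
    for index, name in enumerate(names):
        if name and wanted in name.lower():
            return index
    return None
```

The resulting index can then be passed as `sr.Microphone(device_index=index)` to record from that device instead of the system default.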

### Commands Not Recognized
- Speak more clearly
- Check available commands: `python3 dj_agent/run_voice_control.py --commands`
- Make sure you're saying the exact phrases
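
Near-misses such as "sensor right" for "censor right" are a common reason a phrase is not recognized. As a hedged illustration (the README does not say how `dj_agent` matches phrases), the standard library's `difflib.get_close_matches` can suggest the closest known command. `KNOWN_PHRASES` is a shortened, hypothetical list and `suggest` is a hypothetical helper.

```python
import difflib

# A small subset of the documented phrases, for illustration only.
KNOWN_PHRASES = [
    "play left", "play right",
    "censor left", "censor right",
    "load next left", "load next right",
]


def suggest(heard, phrases=KNOWN_PHRASES, cutoff=0.75):
    """Return the closest known phrase, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(
        heard.lower().strip(), phrases, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


print(suggest("sensor right"))
```

A high cutoff is a deliberate choice here: during a live set, ignoring a misheard phrase is usually safer than triggering the wrong deck.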

### Keyboard Shortcuts Not Working
- Make sure Serato is the active window
- Check that your Serato keyboard shortcuts match
- Verify keyboard control permissions (macOS: System Preferences → Security & Privacy → Accessibility)

Advanced: Using OpenAI Whisper (Optional)

For better accuracy, you can use OpenAI Whisper instead of Google Speech Recognition. This requires more setup but works better with complex commands.
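
A minimal sketch of the Whisper route, assuming the `openai-whisper` package (`pip install openai-whisper`, which also needs `ffmpeg`) and a recorded audio file. How `dj_agent` wires Whisper in is not described in this README, so `transcribe_with_whisper` and `normalize_transcript` are hypothetical helpers. Whisper typically returns cased, punctuated text, so the transcript is normalized before it is compared with command phrases.

```python
import string


def transcribe_with_whisper(audio_path, model_name="base"):
    """Transcribe an audio file locally with Whisper and return the raw text."""
    import whisper  # lazy import: provided by the openai-whisper package

    model = whisper.load_model(model_name)  # downloads the model on first use
    result = model.transcribe(audio_path, language="en")
    return result["text"]


def normalize_transcript(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    without_punctuation = text.lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    return " ".join(without_punctuation.split())
```

For example, a raw transcript of `" Play left deck."` becomes `"play left deck"`, which can be matched like any other recognized phrase.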

Examples

bash
# Start voice control
python3 dj_agent/run_voice_control.py

# You'll see:
🎤 Calibrating microphone for ambient noise...
✓ Microphone calibrated!
✓ Voice controller initialized with 45 commands

🎤 VOICE CONTROL ACTIVE
Speak commands like:
  • 'play left deck'
  • 'censor right'
  • 'cue 1'

# Then speak:
🎤 Listening...
📝 Heard: "play left deck"
✓ Recognized: "play left deck"
  Pressed: w

🎤 Listening...
📝 Heard: "censor right"
✓ Recognized: "censor right"
  Pressed: j

Stopping

Press Ctrl+C to stop voice control. You'll see stats:

VOICE CONTROL STOPPED
Commands executed: 15
Recognition errors: 2
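
The shutdown behaviour above can be sketched as a loop that counts outcomes and prints a summary when Ctrl+C raises `KeyboardInterrupt`. This is an illustration, not the `dj_agent` implementation: `SessionStats`, `run`, and `scripted_step` are hypothetical names, and `scripted_step` only simulates Ctrl+C so the loop can be tried without a microphone.

```python
from dataclasses import dataclass


@dataclass
class SessionStats:
    executed: int = 0
    errors: int = 0

    def summary(self):
        """Format the counters like the shutdown message shown above."""
        return (
            "VOICE CONTROL STOPPED\n"
            f"Commands executed: {self.executed}\n"
            f"Recognition errors: {self.errors}"
        )


def run(step, stats):
    """Call step() until Ctrl+C; True counts as executed, False as an error."""
    try:
        while True:
            if step():
                stats.executed += 1
            else:
                stats.errors += 1
    except KeyboardInterrupt:
        pass  # Ctrl+C ends the session cleanly instead of printing a traceback
    return stats.summary()


def scripted_step(outcomes):
    """Build a step() that replays `outcomes`, then simulates Ctrl+C."""
    remaining = iter(outcomes)

    def step():
        try:
            return next(remaining)
        except StopIteration:
            raise KeyboardInterrupt from None

    return step


print(run(scripted_step([True, True, False]), SessionStats()))
```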

Source Anchor

projects/Documentation/02-projects/dj-agent/studio/VOICE_CONTROL_README.md
