Quick Test Guide - Wav2Vec2 + Rekordbox
**What to expect**: ``` Loading Wav2Vec2 ASR model: facebook/wav2vec2-base-960h โ Rekordbox orbiter initialized ๐ค Wav2Vec2 listener started. Speak now...
Full Public Reader
Quick Test Guide - Wav2Vec2 + Rekordbox
1. Prerequisites Check
System Requirements
# From studio/ directory
cd [home]/Desktop/Computational\ Choreography/computational-studio/studio
# Activate venv
source venv/bin/activate
# Check Python packages
python -c "import torch; print('โ
PyTorch:', torch.__version__)"
python -c "import transformers; print('โ
Transformers:', transformers.__version__)"
python -c "import pynput; print('โ
pynput installed')"
python -c "from huggingface_hub import InferenceClient; print('โ
HF Hub installed')"Environment Variables
# Check .env file exists
ls -la ../.env
# Should contain:
# HF_TOKEN=your_huggingface_token_here### Rekordbox Setup
1. Launch Rekordbox in Performance mode (not Export mode)
2. Load tracks to Deck 1 and Deck 2
3. Verify keyboard shortcuts:
- Manually press `Z` โ Deck 1 should play/pause
- Manually press `N` โ Deck 2 should play/pause
- Manually press `7` โ 4-beat loop on Deck 1
2. Quick Test - Ghost Mode
Test the pipeline without actually controlling Rekordbox:
# From studio/ directory
python dj_agent/scripts/run_rekordbox_voice_wav2vec.pyWhat to expect:
Loading Wav2Vec2 ASR model: facebook/wav2vec2-base-960h
โ
Rekordbox orbiter initialized
๐ค Wav2Vec2 listener started. Speak now...
[Speak: "play left"]
๐ ASR (text): "play left"
๐ Rekordbox: execute 3006 (Z) reason=approved
๐ป [GHOST] Would execute 3006 (Z) score=1.000
โฑ Rekordbox latency: 45.2 ms
[Speak: "loop right"]
๐ ASR (text): "loop right"
๐ Rekordbox: execute 3114 (7 + Shift) reason=approved
๐ป [GHOST] Would execute 3114 (7 + Shift) score=1.000
โฑ Rekordbox latency: 52.1 msVerify:
- โ
ASR transcribes your speech correctly
- โ
Commands map to correct IDs (3006 = Play Left, 3114 = Loop Right)
- โ
Shortcuts match Rekordbox (Z, 7+Shift, etc.)
- โ
Latency < 100ms for mapping
3. Live Test - Real Keyboard Control
โ ๏ธ IMPORTANT: This will send real keyboard commands to Rekordbox!
Step 1: Enable Live Mode
Edit `dj_agent/scripts/run_rekordbox_voice_wav2vec.py`:
Find line:
bridge_cfg = BridgeConfig(ghost=False) # Changed from TrueOr add command-line flag support (recommended).
Step 2: Test Basic Commands
./START_REKORDBOX_VOICE_WAV2VEC.shTest sequence:
1. Play Left
- Say: "play left"
- Expected: Deck 1 plays/pauses
- Verify: Audio output changes, waveform moves
2. Play Right
- Say: "play right"
- Expected: Deck 2 plays/pauses
- Verify: Deck 2 starts playing
3. Sync Left
- Say: "sync left" or "beat sync left"
- Expected: Deck 1 syncs to Deck 2
- Verify: BPMs match, beatgrids align
4. Loop Left
- Say: "loop left"
- Expected: 4-beat loop on Deck 1
- Verify: Loop indicator appears, playback loops
5. Loop Right
- Say: "loop right"
- Expected: 4-beat loop on Deck 2
- Verify: Deck 2 enters loop mode
4. Common Issues & Solutions
Issue: ASR Not Recognizing Speech
Symptoms:
๐ ASR (text): ""
โ ๏ธ No Rekordbox command matchSolutions:
- โ
Check microphone input (run `python -m sounddevice`)
- โ
Speak louder and clearer
- โ
Reduce background music volume
- โ
Use noise-cancelling microphone
- โ
Check sample rate (should be 16kHz for Wav2Vec2)
Issue: Wrong Command Mapped
Symptoms:
๐ ASR (text): "play left"
๐ Rekordbox: execute 3106 (N) reason=approved # Wrong! Should be 3006 (Z)Solutions:
- โ
Check hard-coded overrides in `run_rekordbox_voice_wav2vec.py:102-126`
- โ
Verify `Mapping/commands.yaml` has correct IDs
- โ
Clear any cached embeddings
- โ
Say "left" or "right" explicitly
Issue: Keyboard Shortcuts Not Working
Symptoms:
๐ Rekordbox: execute 3006 (Z) reason=approved
[Keyboard sent: Z]
# But Rekordbox doesn't respondSolutions:
- โ
Ensure Rekordbox window is focused (active)
- โ
Check macOS Accessibility permissions:
- System Preferences โ Security & Privacy โ Accessibility
- Add Terminal or Python to allowed apps
- โ
Test manually: Press `Z` while Rekordbox is focused
- โ
Verify shortcuts in Rekordbox โ Preferences โ Keyboard
- โ
Make sure Rekordbox is in Performance mode, not Export
Issue: High Latency
Symptoms:
โฑ Rekordbox latency: 450.2 ms # Too slow!Solutions:
- โ
Use GPU for Wav2Vec2 (if available)
- โ
Profile stages:
# In run_rekordbox_voice_wav2vec.py
t_asr = time.time()
# ... ASR code ...
print(f"ASR: {(time.time()-t_asr)*1000:.1f}ms")
t_embed = time.time()
# ... Embedding code ...
print(f"Embed: {(time.time()-t_embed)*1000:.1f}ms")- โ Cache embeddings for common commands
- โ Reduce top_k in search (currently 5)
5. Integration with My New Rekordbox Bridge
You can optionally use the new `RekordboxBridge` I created for better control:
# In run_rekordbox_voice_wav2vec.py, replace the orbiter bridge with:
from dj_agent.core.rekordbox_bridge import RekordboxBridge
import yaml
# Load config
with open('configs/rekordbox.yaml') as f:
config = yaml.safe_load(f)
# Initialize my bridge
rekordbox_bridge = RekordboxBridge(config['dj']['rekordbox'])
# In your execute logic:
def execute_command(command_id: str):
# Map command_id to action name
id_to_action = {
"3006": "PLAY_A",
"3106": "PLAY_B",
"3014": "LOOP_4_A",
"3114": "LOOP_4_B",
"3009": "SYNC_A",
"3109": "SYNC_B",
# ... add more mappings from commands.yaml
}
action_name = id_to_action.get(command_id)
if not action_name:
print(f"Unknown command ID: {command_id}")
return
# Get keyboard mapping from config
mapping = config['dj']['rekordbox']['map']
if action_name not in mapping:
print(f"No mapping for action: {action_name}")
return
# Build message
action_config = mapping[action_name]
message = {
'type': 'keyboard',
'key': action_config['key'],
'modifiers': action_config.get('modifiers', [])
}
# Send!
rekordbox_bridge.send(message)6. Next Steps
Once basic commands work:
1. Test more commands:
- Hot cues: "set hot cue A left deck"
- Effects: "effects left", "echo left"
- Recording: "start recording"
2. Run formal evaluation:
python dj_agent/scripts/eval_rekordbox_voice_wav2vec.py tests/rekordbox_manifest.jsonl3. Compare with Gemini:
# Run Gemini path for comparison
python dj_agent/scripts/run_rekordbox_voice_gemini.py4. Tune for your setup:
- Adjust confidence thresholds
- Add custom command aliases
- Fine-tune Wav2Vec2 on your voice (optional)
7. Performance Targets
Good Performance:
- โ
ASR accuracy: > 85
- โ
Command accuracy: > 90
- โ
ASR latency: < 300ms
- โ
Total latency: < 500ms
Excellent Performance:
- โ
ASR accuracy: > 95
- โ
Command accuracy: > 97
- โ
ASR latency: < 200ms
- โ
Total latency: < 300ms
8. Troubleshooting Script
Run this to diagnose issues:
# Test audio input
python -c "
import sounddevice as sd
import numpy as np
print('๐ค Recording 3 seconds...')
audio = sd.rec(int(3 * 16000), samplerate=16000, channels=1, dtype='float32')
sd.wait()
print(f'โ
Recorded {len(audio)} samples')
print(f' Max amplitude: {np.abs(audio).max():.3f}')
print(' If max < 0.01, microphone may be too quiet')
"
# Test Wav2Vec2 ASR
python -c "
from dj_agent.voice_control.wav2vec_asr import transcribe
import numpy as np
print('๐ Testing Wav2Vec2 ASR...')
# Test with silence (should return empty or noise)
silence = np.zeros(16000, dtype=np.float32)
result = transcribe(silence, 16000)
print(f'Silence transcribed as: \"{result}\"')
"
# Test Embedding Gemma
python -c "
from dj_agent.voice_control.orbiter.embedding import EmbeddingGemmaProvider
print('๐ง Testing Embedding Gemma...')
embedder = EmbeddingGemmaProvider()
emb = embedder.embed_text('play left')
print(f'โ
Embedding shape: {emb.shape}')
print(f' Values: [{emb[0]:.3f}, {emb[1]:.3f}, ..., {emb[-1]:.3f}]')
"
# Test Rekordbox Index
python -c "
from dj_agent.voice_control.orbiter import RekordboxOrbiter, OrbiterConfig
from dj_agent.voice_control.orbiter.embedding import EmbeddingGemmaProvider
from pathlib import Path
print('๐ Testing Rekordbox Index...')
commands_path = Path('Mapping/commands.yaml')
embedder = EmbeddingGemmaProvider()
orbiter = RekordboxOrbiter(OrbiterConfig(commands_path=commands_path), embedder)
emb = embedder.embed_text('play left')
hits = orbiter.index.search(emb, top_k=5)
print(f'โ
Found {len(hits)} matches:')
for hit in hits:
print(f' - {hit.command_id}: {hit.metadata.get(\"shortcut\")} (score={hit.score:.3f})')
"---
Ready to test? Run:
./START_REKORDBOX_VOICE_WAV2VEC.shGood luck! ๐ค๐๏ธ
Promotion Decision
Attach run IDs, datasets, metrics, and reproduction commands.
Source Anchor
projects/Documentation/02-projects/dj-agent/studio/docs/TEST_WAV2VEC_QUICK.md
Detected Structure
Evaluation ยท References ยท Code Anchors ยท Architecture