Grand Diomande Research · Full HTML Reader

How to Run the Voice Control System

Quick Start

1. Install Dependencies

```bash
# Make sure you're in the project root (studio/)
cd "[home]/Desktop/Computational Choreography/computational-studio/studio"

# Activate virtual environment
source venv/bin/activate

# Install/update dependencies
pip install -r requirements.txt
```

2. Set Up API Key

Create a `.env` file in the project root (`studio/.env`) with:

```bash
GEMINI_API_KEY=your-api-key-here
```

Get your API key from: https://ai.google.dev/
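
How the run script reads the key is not shown here. As a minimal sketch, assuming the key is taken from the process environment first and from `studio/.env` second, the lookup could be written with the standard library alone. `parse_env_text` and `load_api_key` are hypothetical helper names, not part of `dj_agent`; the real script may rely on a package such as python-dotenv instead.

```python
import os
from pathlib import Path
from typing import Optional


def parse_env_text(text: str) -> dict:
    """Parse KEY=value lines, skipping blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_api_key(env_path: str = ".env", name: str = "GEMINI_API_KEY") -> Optional[str]:
    """Return the key from the environment, else from the .env file, else None."""
    if os.environ.get(name):
        return os.environ[name]
    path = Path(env_path)
    if path.is_file():
        return parse_env_text(path.read_text()).get(name)
    return None
```

Calling `load_api_key()` from the project root returns `None` when neither source provides a key, which makes it easy to fail early with a clear message.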

3. Run the Voice Control System

Option A: Using the shell script (easiest)

```bash
./START_VOICE_CONTROL_GEMINI.sh
```

Option B: Using Python directly

```bash
python3 dj_agent/scripts/run_voice_control_gemini.py
```

Option C: With custom options

```bash
python3 dj_agent/scripts/run_voice_control_gemini.py --api-key YOUR_KEY
python3 dj_agent/scripts/run_voice_control_gemini.py --commands  # List all commands
python3 dj_agent/scripts/run_voice_control_gemini.py --no-track-analysis  # Disable track analysis
```
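
For readers extending the script, the three flags above could be declared with `argparse` roughly as follows. This is a sketch, not the script's actual parser: only the flag names come from the commands above, while `build_parser`, the `track_analysis` destination, and the help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the three documented flags (hypothetical parser layout)."""
    parser = argparse.ArgumentParser(description="Voice control (sketch)")
    parser.add_argument("--api-key", default=None, help="Gemini API key")
    parser.add_argument("--commands", action="store_true",
                        help="List all commands")
    # store_false: the value defaults to True and the flag switches it off.
    parser.add_argument("--no-track-analysis", dest="track_analysis",
                        action="store_false", help="Disable track analysis")
    return parser


# Map the parsed flag onto the matching configuration key.
args = build_parser().parse_args(["--no-track-analysis"])
config = {"enable_track_analysis": args.track_analysis}
print(config)
```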

Features & Configuration

Enable New Features

Each optional feature is switched on or off through a key in the controller's configuration dictionary:

```python
config = {
    # MIDI state awareness (prevents unsafe actions)
    'enable_midi_feedback': True,
    'midi_input_port': None,  # Auto-detect or specify port name

    # LLM-based intent processing (expressive commands)
    'enable_intent_processing': True,
    'intent_model': 'gemini-2.0-flash-exp',

    # Semantic search (natural language music queries)
    'enable_semantic_search': True,
    'library_index_file': './.library_index.json',

    # Track analysis
    'enable_track_analysis': True,
    'analysis_cache_dir': './.track_analysis_cache',

    # Transition recommendations
    'enable_transitions': True,
    'transition_announcements': False,
}
```
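
A misspelled key in a dictionary like this is easy to miss. The sketch below, which assumes nothing about how `VoiceController` itself validates its config, merges a few overrides onto a base dictionary and rejects keys that are not in the list above. `BASE_CONFIG` and `merge_config` are hypothetical names introduced for illustration.

```python
# Values copied from the configuration keys listed above.
BASE_CONFIG = {
    "enable_midi_feedback": True,
    "midi_input_port": None,
    "enable_intent_processing": True,
    "intent_model": "gemini-2.0-flash-exp",
    "enable_semantic_search": True,
    "library_index_file": "./.library_index.json",
    "enable_track_analysis": True,
    "analysis_cache_dir": "./.track_analysis_cache",
    "enable_transitions": True,
    "transition_announcements": False,
}


def merge_config(base: dict, overrides: dict) -> dict:
    """Return base updated with overrides; raise on keys base does not know."""
    unknown = set(overrides) - set(base)
    if unknown:
        raise KeyError(f"Unknown config keys: {sorted(unknown)}")
    return {**base, **overrides}


# Example: run without MIDI feedback, keep everything else as listed.
config = merge_config(BASE_CONFIG, {"enable_midi_feedback": False})
```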

Example: Run with All Features Enabled

Create a custom script or modify `run_voice_control_gemini.py`:

```python
import asyncio

from dj_agent.voice_control import VoiceController

# Turn on every optional feature
config = {
    'enable_midi_feedback': True,
    'enable_intent_processing': True,
    'enable_semantic_search': True,
    'enable_track_analysis': True,
    'enable_transitions': True,
}

controller = VoiceController(config=config)
asyncio.run(controller.start())
```

Usage Examples

### Basic Commands
- `"play left"` - Play left deck
- `"pause right"` - Pause right deck
- `"cue 1"` - Jump to cue point 1
- `"censor left"` - Apply censor effect
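
The basic commands share a simple shape: an action word followed by a deck name or a number. As an illustration only (the system's real command grammar is not documented here), a parser for that shape could look like the sketch below. `Command`, `ACTIONS`, `DECKS`, and `parse_basic_command` are hypothetical names.

```python
from typing import NamedTuple, Optional

# Vocabulary taken from the four example commands above.
ACTIONS = {"play", "pause", "cue", "censor"}
DECKS = {"left", "right"}


class Command(NamedTuple):
    action: str
    deck: Optional[str] = None
    number: Optional[int] = None


def parse_basic_command(text: str) -> Optional[Command]:
    """Split a phrase into action, deck, and number; None if unrecognized."""
    words = text.lower().split()
    if not words or words[0] not in ACTIONS:
        return None
    action, rest = words[0], words[1:]
    deck = next((w for w in rest if w in DECKS), None)
    number = next((int(w) for w in rest if w.isdigit()), None)
    return Command(action, deck, number)
```

Under these assumptions, `parse_basic_command("play left")` yields an action of `play` on the `left` deck, and `parse_basic_command("cue 1")` yields the `cue` action with number 1.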

### Higher-Order Commands
- `"play next"` - Load and play next track
- `"continuous mode"` - Enable auto-play mode
- `"transition left"` - Smooth transition to next track

### New Expressive Commands (with Intent Processor)
- `"loop the vocal and fade it out slowly"` - Complex loop with fade
- `"give me a long buildup"` - Transition with buildup
- `"play something funky"` - Semantic search

### Semantic Search (with Library Indexer)
- `"play some 90s hip hop"` - Find and play matching tracks
- `"find something chill"` - Search by vibe
- `"search for house music"` - Genre-based search

Troubleshooting

macOS Permissions

Microphone Access:
- System Settings → Privacy & Security → Microphone
- Enable access for Terminal (or your Python interpreter)

Accessibility Permissions (for keyboard control):
- System Settings → Privacy & Security → Accessibility
- Enable access for Terminal (or your Python interpreter)

MIDI Setup

If MIDI feedback is enabled but not working:

1. Check available MIDI ports:

```python
import mido

# List the MIDI input ports visible to this machine
print(mido.get_input_names())
```

2. Configure specific port in config:

```python
config['midi_input_port'] = 'IAC Driver Bus 1'  # Example
```

3. Ensure Serato is configured to send MIDI output
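
The steps above can be combined into one port-selection helper. This sketch assumes that "auto-detect" means "use the first available port", which may differ from what the system actually does; `choose_midi_port` is a hypothetical name. It takes the port list as an argument, so it runs without MIDI hardware.

```python
from typing import Optional, Sequence


def choose_midi_port(available: Sequence[str],
                     requested: Optional[str] = None) -> Optional[str]:
    """Return the requested port if present, else the first available one."""
    if requested is not None:
        if requested in available:
            return requested
        raise ValueError(
            f"MIDI port {requested!r} not found; available: {list(available)}"
        )
    # Assumed auto-detect rule: take the first port, or None if there are none.
    return available[0] if available else None


# With mido installed, the real port list would come from:
#   choose_midi_port(mido.get_input_names(), config.get('midi_input_port'))
print(choose_midi_port(["IAC Driver Bus 1"], None))
```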

Library Indexing

To index your music library for semantic search:

```python
from dj_agent.voice_control.core import LibraryIndexer, SeratoLibraryReader

library = SeratoLibraryReader()
indexer = LibraryIndexer(serato_library=library)
indexer.index_tracks(force_reindex=False)  # Index all tracks
```

Common Issues

"ModuleNotFoundError: No module named 'dj_agent'"
- Make sure you're in the project root (`studio/`)
- Check that `sys.path` includes the project root
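
If a custom script lives in `dj_agent/scripts/` and is launched from another directory, one common fix is to add the project root to `sys.path` before importing `dj_agent`. A minimal sketch, assuming the script sits two directories below `studio/`; `ensure_project_root` is a hypothetical helper, not something the package provides.

```python
import sys
from pathlib import Path


def ensure_project_root(script_path: str, levels_up: int = 2) -> Path:
    """Put the directory `levels_up` above the script's folder on sys.path."""
    # For studio/dj_agent/scripts/x.py: parents[0]=scripts, [1]=dj_agent, [2]=studio
    root = Path(script_path).resolve().parents[levels_up]
    if str(root) not in sys.path:
        sys.path.insert(0, str(root))
    return root


# In a real script, call this before `import dj_agent`:
#   ensure_project_root(__file__)
```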

"Gemini API key required"
- Create a `.env` file in the project root containing `GEMINI_API_KEY=your-key`
- Or set the environment variable: `export GEMINI_API_KEY=your-key`

"No MIDI input ports found"
- MIDI feedback is optional; the system works without it
- Set `'enable_midi_feedback': False` in the config to disable it

Commands not affecting Serato
- Check Accessibility permissions
- Ensure Serato DJ is running
- Try focusing Serato manually before running commands

Advanced Usage

Custom Command Processing

```python
from dj_agent.voice_control.core import VoiceController, IntentProcessor

# Create custom intent processor
intent_processor = IntentProcessor(
    [sensitive field redacted],
    model_name='gemini-2.0-flash-exp'
)

# Use in controller
controller = VoiceController(
    config={'enable_intent_processing': True},
    intent_processor=intent_processor
)
```

Library Search

```python
from dj_agent.voice_control.core import LibraryIndexer, SemanticSearch

# Index library
indexer = LibraryIndexer()
indexer.index_tracks()

# Search
search = SemanticSearch(library_indexer=indexer, [sensitive field redacted])
results = search.search("90s hip hop", limit=10)

for track, similarity in results:
    print(f"{track.title} - {track.artist} ({similarity:.2f})")
```
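
How `SemanticSearch` computes the `similarity` value is not described here. For intuition only, one common approach is to embed the query and each track as vectors and rank tracks by cosine similarity. The sketch below uses made-up two-dimensional vectors in place of real embeddings; `cosine_similarity` and `rank_tracks` are hypothetical names, and nothing in it reflects the library's actual implementation.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_tracks(query_vec: Sequence[float],
                track_vecs: Dict[str, Sequence[float]],
                limit: int = 10) -> List[Tuple[str, float]]:
    """Return up to `limit` (title, similarity) pairs, best match first."""
    scored = [(title, cosine_similarity(query_vec, vec))
              for title, vec in track_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]


# Toy vectors for illustration; real embeddings have hundreds of dimensions.
toy_library = {"Track A": [1.0, 0.0], "Track B": [0.0, 1.0]}
for title, similarity in rank_tracks([1.0, 0.0], toy_library, limit=2):
    print(f"{title} ({similarity:.2f})")
```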

Stopping the System

Press `Ctrl+C` to stop the voice control system gracefully.

The system will:
- Close audio streams
- Stop MIDI listeners
- Save any pending state
- Display statistics
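
A custom launcher script can follow the same pattern: catch the interrupt, then run cleanup no matter how the session ended. The sketch below uses a stand-in class rather than the real `VoiceController`, and the `stop()` method is an assumption; check the controller for its actual shutdown hook.

```python
import asyncio


class FakeController:
    """Stand-in for VoiceController; start() simulates the user pressing Ctrl+C."""

    def __init__(self) -> None:
        self.stopped = False

    async def start(self) -> None:
        raise KeyboardInterrupt

    async def stop(self) -> None:  # hypothetical cleanup hook
        self.stopped = True


def run_until_interrupted(controller) -> None:
    """Run the controller and always call its cleanup, even after Ctrl+C."""
    try:
        asyncio.run(controller.start())
    except KeyboardInterrupt:
        print("Stopping voice control...")
    finally:
        # A fresh event loop is used because the first one is already closed.
        asyncio.run(controller.stop())


controller = FakeController()
run_until_interrupted(controller)
```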

Source Anchor

projects/Documentation/02-projects/dj-agent/studio/HOW_TO_RUN.md