Tier 3: Context-Aware Embeddings - Semantic Command Disambiguation Guide
Overview
Context-Aware Embeddings enables the voice control system to understand ambiguous commands by considering the current DJ system state. When you say "play" or "sync" without specifying a deck, the system intelligently infers which deck you mean based on what's currently happening.
Benefits:
- ✅ Natural, conversational commands ("play" instead of "play left")
- ✅ Context-aware disambiguation (knows which deck you mean)
- ✅ Intelligent action suggestions (predicts next likely commands)
- ✅ Fast heuristic matching (<5ms overhead)
- ✅ No ML training required (rule-based)
---
Quick Start
Default Behavior (Enabled)
python run_rekordbox_voice_gemini_enhanced.py
Startup Output:
⚙️ Tier 3 Enhancements:
↩️ State tracking & undo: True
(history size: 20)
💻 Whisper fallback: True
(model: base.en, offline capable)
🎯 Context embeddings: True
(semantic command disambiguation)
✓ Connecting to Gemini Live API...
What Happens:
1. System tracks command history (last deck, last action)
2. When you give an ambiguous command, the system analyzes the current context
3. It infers which deck you mean based on priority rules
4. It resolves the command automatically when confidence is high enough
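Step 1 can be pictured with a small bounded history. This is a minimal sketch, not the project's implementation: `CommandHistory` and `HistoryEntry` are hypothetical names, and the default size of 20 only mirrors the history size shown in the startup output (whether both features share one history is an assumption).

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional

DECKS = ("left", "right")


@dataclass
class HistoryEntry:
    action: str            # first word of the command, e.g. "play"
    deck: Optional[str]    # "left", "right", or None if no deck was named


class CommandHistory:
    """Bounded command history (hypothetical helper for illustration)."""

    def __init__(self, max_size: int = 20):
        # deque(maxlen=...) silently drops the oldest entry once full.
        self._entries = deque(maxlen=max_size)

    def record(self, command: str) -> HistoryEntry:
        words = command.lower().split()
        deck = next((w for w in words if w in DECKS), None)
        entry = HistoryEntry(action=words[0] if words else "", deck=deck)
        self._entries.append(entry)
        return entry

    @property
    def last_deck(self) -> Optional[str]:
        # Walk backwards to the most recent command that named a deck.
        for entry in reversed(self._entries):
            if entry.deck is not None:
                return entry.deck
        return None

    @property
    def last_action(self) -> Optional[str]:
        return self._entries[-1].action if self._entries else None


history = CommandHistory()
history.record("play left")
history.record("cue right")
print(history.last_deck, history.last_action)  # right cue
```

Keeping the deck and the action as separate fields lets a later ambiguous command reuse the last named deck even if commands without a deck were issued in between.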
---
How It Works
Architecture
┌─────────────────────────────────────────┐
│ Voice Command: "sync" │
│ (no deck specified - AMBIGUOUS) │
└──────────────┬──────────────────────────┘
│
▼
┌──────────────────────────────────────────┐
│ Context Encoder │
│ Last action: play │
│ Last deck: left │
│ Left deck: playing │
│ Right deck: stopped │
└──────────────┬───────────────────────────┘
│
▼
┌──────────────────────────────────────────┐
│ Semantic Matcher │
│ Priority: last deck > cued > playing │
│ Inference: "sync" → "sync left" │
│ Confidence: 90% │
└──────────────┬───────────────────────────┘
│
▼
┌──────────────────────────────────────────┐
│ Resolved Command: "sync left" │
│ Reasoning: last action was on left │
└──────────────────────────────────────────┘
Disambiguation Priority
When resolving ambiguous commands, the system uses this priority order:
1. Last Deck (90% confidence)
- Most recent action was on this deck
- Example: After "play left", "sync" → "sync left"
2. Cued Deck (75% confidence)
- Deck is cued and ready to play
- Example: Right cued → "play" → "play right"
3. Playing Deck (75% confidence)
- Deck is currently playing
- Example: Left playing → "loop" → "loop left"
4. Loop Active (70% confidence)
- Deck has an active loop
- Example: Right looping → "exit loop" → "exit loop right"
5. Crossfader Position (65% confidence)
- Crossfader favors one deck
- Example: Crossfader on left → "sync" → "sync left"
6. Default to Left (40% confidence)
- No context available
- Fallback behavior
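The priority order above can be sketched as a chain of checks that returns the first match. This is an illustrative sketch only: `DeckState`, `Context`, and `infer_deck` are hypothetical names, and the rule that a signal counts only when exactly one deck shows it is an assumption, since the guide does not say how ties are handled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DeckState:
    playing: bool = False
    cued: bool = False
    loop_active: bool = False


@dataclass
class Context:
    left: DeckState
    right: DeckState
    last_deck: Optional[str] = None    # "left" / "right"
    crossfader: Optional[str] = None   # "left" / "right" / None when centered


def infer_deck(ctx: Context) -> Tuple[str, float, str]:
    """Return (deck, confidence, reasoning) following the documented order."""
    decks = {"left": ctx.left, "right": ctx.right}

    if ctx.last_deck in decks:
        return ctx.last_deck, 0.90, f"last action was on {ctx.last_deck}"

    # Checked in priority order; a tie between decks is skipped (assumption).
    signals = [
        ("cued", 0.75, "deck is cued and ready"),
        ("playing", 0.75, "deck is currently playing"),
        ("loop_active", 0.70, "deck has an active loop"),
    ]
    for attr, confidence, reason in signals:
        matches = [name for name, state in decks.items() if getattr(state, attr)]
        if len(matches) == 1:
            return matches[0], confidence, f"{matches[0]} {reason}"

    if ctx.crossfader in decks:
        return ctx.crossfader, 0.65, f"crossfader favors {ctx.crossfader}"

    return "left", 0.40, "no context available, defaulting to left deck"


ctx = Context(left=DeckState(playing=True), right=DeckState(cued=True))
print(infer_deck(ctx))  # ('right', 0.75, 'right deck is cued and ready')
```

Because the cued check runs before the playing check, a cued right deck wins over a playing left deck even though both carry 75% confidence.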
---
Usage Examples
Example 1: Sequential Commands
Scenario: You're working with the left deck
You: "play left"
→ ✓ play left
You: "sync" (ambiguous - no deck specified)
→ 🎯 Semantic match: "sync" → "sync left" (90%)
Reasoning: last action was on left
→ ✓ sync left
You: "loop 4 beats" (ambiguous)
→ 🎯 Semantic match: "loop 4 beats" → "loop 4 beats left" (90%)
Reasoning: last action was on left
→ ✓ loop 4 beats left
Impact: Natural flow - you don't need to repeat "left" every time
Example 2: Deck Switching
Scenario: You cue the right deck while left is playing
You: "play left"
→ ✓ play left
You: "cue right"
→ ✓ cue right
You: "play" (ambiguous)
→ 🎯 Semantic match: "play" → "play right" (90%)
Reasoning: last action was on right
→ ✓ play right
Impact: System remembers your last action, even when switching decks
Example 3: Track References
Scenario: You want to play a track
You: "load track ABC left"
→ ✓ load track ABC left
You: "play that" (ambiguous reference)
→ 🎯 Semantic match: "play that" → "play left" (80%)
Reasoning: Reference resolves to left deck based on context
→ ✓ play left
Impact: Conversational references like "that" work naturally
Example 4: No Context
Scenario: First command after startup
You: "sync" (ambiguous, no context)
→ 🎯 Semantic match: "sync" → "sync left" (40%)
Reasoning: No context available, defaulting to left deck
→ ✓ sync left
Impact: Even without context, the system provides a reasonable default
---
Ambiguous Command Patterns
The system detects these ambiguous patterns:
Deck Actions (No Deck Specified)
| Command | Resolves To | Example |
|---|---|---|
| `play` | `play {inferred_deck}` | "play" → "play left" |
| `stop` | `stop {inferred_deck}` | "stop" → "stop right" |
| `pause` | `pause {inferred_deck}` | "pause" → "pause left" |
| `sync` | `sync {inferred_deck}` | "sync" → "sync right" |
| `loop` | `loop {inferred_deck}` | "loop" → "loop left" |
| `cue` | `cue {inferred_deck}` | "cue" → "cue right" |
| `halve loop` | `halve loop {inferred_deck}` | "halve loop" → "halve loop left" |
| `double loop` | `double loop {inferred_deck}` | "double loop" → "double loop right" |
| `exit loop` | `exit loop {inferred_deck}` | "exit loop" → "exit loop left" |
Track References
| Command | Resolves To | Example |
|---|---|---|
| `play that` | `play {inferred_deck}` | "play that" → "play left" |
| `load this` | `load {inferred_deck}` | "load this" → "load right" |
| `eject it` | `eject {inferred_deck}` | "eject it" → "eject left" |
Relative Commands
| Command | Resolves To | Example |
|---|---|---|
| `next track` | `next track {inferred_deck}` | "next track" → "next track left" |
| `previous track` | `previous track {inferred_deck}` | "previous track" → "previous track right" |
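The tables above suggest a simple detection rule: a command is ambiguous when it starts with a known action and names no deck. The sketch below is one way to express that, under assumptions: `rewrite_if_ambiguous` is a hypothetical helper, and treating any command containing "left" or "right" as already resolved is a simplification.

```python
import re
from typing import Optional

# Actions and references taken from the tables above.
DECK_ACTIONS = (
    "halve loop", "double loop", "exit loop", "next track", "previous track",
    "play", "stop", "pause", "sync", "loop", "cue",
)
REFERENCES = {"play that": "play", "load this": "load", "eject it": "eject"}
DECK_WORD = re.compile(r"\b(left|right)\b")


def rewrite_if_ambiguous(command: str, inferred_deck: str) -> Optional[str]:
    """Return the resolved command, or None when nothing needs resolving."""
    text = " ".join(command.lower().split())  # normalize case and spacing
    if DECK_WORD.search(text):
        return None  # a deck is already named
    if text in REFERENCES:
        # Pronoun-style reference: drop the pronoun, append the deck.
        return f"{REFERENCES[text]} {inferred_deck}"
    for action in DECK_ACTIONS:
        if text == action or text.startswith(action + " "):
            return f"{text} {inferred_deck}"
    return None  # not a recognized ambiguous pattern


print(rewrite_if_ambiguous("loop 4 beats", "left"))  # loop 4 beats left
print(rewrite_if_ambiguous("play that", "left"))     # play left
print(rewrite_if_ambiguous("sync left", "right"))    # None
```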
---
Context Encoding
The system encodes DJ state as natural language:
Example Context Encoding
State:
- Left deck: playing, track loaded
- Right deck: cued, track loaded
- Crossfader: on left
- Last action: "cue" on right
Encoded Description:
"left deck is playing. right deck is cued and ready. crossfader is on left. last action was 'cue' on right."
This description is used for semantic matching and debugging.
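A minimal sketch of how such a description could be assembled from state fields. `SimpleContext` and `encode_context` are hypothetical names chosen for this example; they are not the project's `SystemContext`, whose full field list is not shown in this guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SimpleContext:
    left_status: Optional[str] = None    # "playing", "cued", or None
    right_status: Optional[str] = None
    crossfader: Optional[str] = None     # "left", "right", or None
    last_action: Optional[str] = None
    last_deck: Optional[str] = None


STATUS_PHRASES = {"playing": "is playing", "cued": "is cued and ready"}


def encode_context(ctx: SimpleContext) -> str:
    """Join one short sentence per known fact; unknown fields are skipped."""
    parts = []
    for deck, status in (("left", ctx.left_status), ("right", ctx.right_status)):
        if status in STATUS_PHRASES:
            parts.append(f"{deck} deck {STATUS_PHRASES[status]}.")
    if ctx.crossfader:
        parts.append(f"crossfader is on {ctx.crossfader}.")
    if ctx.last_action and ctx.last_deck:
        parts.append(f"last action was '{ctx.last_action}' on {ctx.last_deck}.")
    return " ".join(parts)


ctx = SimpleContext(
    left_status="playing",
    right_status="cued",
    crossfader="left",
    last_action="cue",
    last_deck="right",
)
print(encode_context(ctx))
```

With the state from the example above, this prints the same encoded description shown there.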
---
Command Suggestions
The system can suggest likely next actions based on context:
Example Suggestions
Scenario 1: Left deck cued
Context: Left deck is cued and ready
Suggested actions:
- play left deck
- sync left deck
Scenario 2: Left deck playing, right deck cued
Context: Both decks ready, left playing
Suggested actions:
- play right deck
- crossfade between decks
- sync right deck
Scenario 3: Left deck has active loop
Context: Left deck looping (4 beats)
Suggested actions:
- halve left loop
- double left loop
- exit left loop
---
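The three scenarios in the Command Suggestions section can be reproduced with a few rules keyed on deck status. This is a hedged sketch: `suggest_actions` and the status strings `"playing"`, `"cued"`, `"looping"`, and `"idle"` are assumptions made for illustration.

```python
from typing import Dict, List


def suggest_actions(decks: Dict[str, str]) -> List[str]:
    """Map deck statuses ('playing', 'cued', 'looping', 'idle') to suggestions."""
    suggestions: List[str] = []
    any_playing = any(status == "playing" for status in decks.values())
    for deck, status in decks.items():
        if status == "looping":
            suggestions += [
                f"halve {deck} loop",
                f"double {deck} loop",
                f"exit {deck} loop",
            ]
        elif status == "cued":
            suggestions.append(f"play {deck} deck")
            if any_playing:
                # Only meaningful when another deck is already audible.
                suggestions.append("crossfade between decks")
            suggestions.append(f"sync {deck} deck")
    return suggestions


print(suggest_actions({"left": "playing", "right": "cued"}))
# ['play right deck', 'crossfade between decks', 'sync right deck']
```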
Performance
Latency
| Operation | Time | Notes |
|---|---|---|
| Context encoding | <1ms | Very fast |
| Semantic matching | <5ms | Heuristic-based |
| Command resolution | <2ms | Simple pattern matching |
| Total overhead | <10ms | Negligible impact |
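To check figures like these on your own machine, a small timing helper is enough. The sketch below uses only the standard library; `measure_ms` is a hypothetical helper, and the lambda is a placeholder for whatever resolution call you want to time.

```python
import time
from typing import Callable


def measure_ms(fn: Callable[[], object], repeats: int = 1000) -> float:
    """Average wall-clock time of fn() in milliseconds over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) * 1000 / repeats


# Placeholder workload; swap in the real matching call when available.
average = measure_ms(lambda: "sync".split())
print(f"average: {average:.4f} ms")
```

Averaging over many calls smooths out timer resolution and scheduling noise, which matters when the operation under test takes well under a millisecond.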
Accuracy
| Confidence | Accuracy | Decision |
|---|---|---|
| 90% | | |
| 75% | | |
| 65% | | |
| 40% | | |
Threshold: System only applies resolution if confidence ≥ 70%
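The threshold rule amounts to a single comparison. A minimal sketch, assuming confidence is expressed as a fraction (as in `result.confidence` elsewhere in this guide); `apply_resolution` is a hypothetical helper, and what happens to a sub-threshold command afterwards (for example, handing it to Tier 1 defaults) is outside this sketch.

```python
CONFIDENCE_THRESHOLD = 0.70  # from the documented threshold of 70%


def apply_resolution(original: str, resolved: str, confidence: float,
                     threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Use the resolved command only when confidence meets the threshold."""
    return resolved if confidence >= threshold else original


print(apply_resolution("sync", "sync left", 0.90))  # sync left
print(apply_resolution("sync", "sync left", 0.65))  # sync
```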
---
Configuration
Enable/Disable
Enable (default):
python run_rekordbox_voice_gemini_enhanced.py
Disable:
python run_rekordbox_voice_gemini_enhanced.py --no-embeddings
When Disabled:
- Ambiguous commands are NOT resolved
- User must specify deck explicitly
- Falls back to Tier 1 intelligent defaults
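For readers wiring a similar switch into their own launcher, an opt-out flag like `--no-embeddings` maps naturally onto `argparse` with `store_true`. This is a sketch of the pattern, not the script's actual argument parser; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Voice control launcher (illustrative)")
    # The feature is on by default; the flag switches it off.
    parser.add_argument(
        "--no-embeddings",
        action="store_true",
        help="Disable context-aware embeddings",
    )
    return parser


# argparse exposes "--no-embeddings" as the attribute "no_embeddings".
args = build_parser().parse_args(["--no-embeddings"])
enable_context_embeddings = not args.no_embeddings
print(enable_context_embeddings)  # False
```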
---
Integration with Other Features
Works With Tier 1 Intelligent Defaults
Context embeddings runs BEFORE intelligent defaults:
1. Tier 2 Contextual Disambiguation - Resolve pronouns ("that" → "left")
2. Tier 3 Context Embeddings - Resolve ambiguous actions ("sync" → "sync left")
3. Tier 1 Intelligent Defaults - Apply final defaults if still ambiguous
This layered approach ensures maximum accuracy.
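The ordering can be pictured as a chain of stages that stops as soon as the command names a deck. The stage functions below are stand-ins with hard-coded behavior, purely to show the control flow; none of them reflects the real tier implementations.

```python
import re
from typing import Callable, List

DECK = re.compile(r"\b(left|right)\b")
Stage = Callable[[str], str]


def run_pipeline(command: str, stages: List[Stage]) -> str:
    """Apply stages in order, stopping once the command names a deck."""
    for stage in stages:
        if DECK.search(command):
            break
        command = stage(command)
    return command


# Stand-in stages (hypothetical); each returns the input when it cannot help.
def tier2_pronouns(cmd: str) -> str:
    return cmd[: -len("that")] + "left" if cmd.endswith(" that") else cmd


def tier3_embeddings(cmd: str) -> str:
    return f"{cmd} right" if cmd == "sync" else cmd


def tier1_defaults(cmd: str) -> str:
    return f"{cmd} left"


stages = [tier2_pronouns, tier3_embeddings, tier1_defaults]
print(run_pipeline("play that", stages))  # play left
print(run_pipeline("sync", stages))       # sync right
print(run_pipeline("pause", stages))      # pause left
```

Putting the defaults last means they only fire when neither earlier stage produced a deck, which matches the "apply final defaults if still ambiguous" description.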
Works With Tier 3 State Tracking
When state tracking is enabled, context is richer:
- Command history is more accurate
- Undo/redo commands preserve context
- Better predictions for next actions
Works With Tier 2 Macros
Macros benefit from context:
# Macro: transition
transition:
  description: "Transition from current deck to other deck"
  commands:
    - "sync {other_deck}"  # Context resolves {other_deck}
    - "play {other_deck}"
    - "crossfade to {other_deck}"
---
Troubleshooting
Issue: Wrong Deck Inferred
Symptoms:
You: "sync"
→ 🎯 Semantic match: "sync" → "sync right" (75%)
→ (Expected: sync left)
Causes:
1. Recent action was on right deck
2. Right deck has stronger context signal (cued, looping, etc.)
Solutions:
1. Be more explicit: "sync left"
2. Check last action (might have been on different deck)
3. Verify crossfader position (might favor other deck)
Issue: Low Confidence Warning
Symptoms:
You: "play"
→ 🎯 Semantic match: "play" → "play left" (40%)
Reasoning: No context available, defaulting to left deck
Cause: No recent context (first command after startup)
Solution:
1. This is expected behavior
2. System falls back to left deck default
3. Subsequent commands will have higher confidence
Issue: Commands Not Being Disambiguated
Symptoms:
You: "sync"
→ ✓ sync (not resolved)
Causes:
1. Context embeddings disabled (`--no-embeddings`)
2. Command is not ambiguous (already specifies deck)
3. Confidence below 70%
Solutions:
1. Check startup output: `🎯 Context embeddings: True`
2. Verify command pattern is in ambiguous list
3. Provide more context (execute deck-specific commands first)
---
API Reference
CLI Arguments
--no-embeddings # Disable context-aware embeddings
Python API
from dj_agent.voice_control.core.gemini_listener_enhanced import EnhancedGeminiVoiceListener
listener = EnhancedGeminiVoiceListener(
    enable_context_embeddings=True,  # Enable feature (default)
)
# Check status
print(listener.enable_context_embeddings)  # True/False
# Get stats
print(f"Commands disambiguated: {listener.commands_disambiguated}")
Programmatic Access
from dj_agent.voice_control.embeddings import (
    SemanticCommandMatcher,
    SystemContext,
)
# Create matcher
matcher = SemanticCommandMatcher()
# Build context
context = SystemContext(
    left_playing=True,
    last_action="play",
    last_deck="left",
)
# Match ambiguous command
result = matcher.match("sync", context)
print(f"Resolved: {result.command}")  # "sync left"
print(f"Confidence: {result.confidence}")  # 0.9
print(f"Reasoning: {result.reasoning}")  # "last action was on left"
---
Best Practices
1. Build Context Gradually
Good:
You: "load track ABC left"
You: "play" # Infers left
You: "sync" # Infers left
Why: Each command builds context for the next
2. Be Explicit When Switching Decks
Good:
You: "play left"
You: "cue right" # Explicit deck switch
You: "play" # Infers right (recent action)
Why: Explicit deck specification creates clear context
3. Check Reasoning When Learning
Good:
You: "sync"
→ 🎯 Semantic match: "sync" → "sync left" (90%)
Reasoning: last action was on left ← Read this!
Why: Understanding the reasoning helps you predict behavior
4. Use With State Tracking
Good:
python run_...enhanced.py
# Both embeddings and state tracking enabled
Why: State tracking enriches context for better inference
---
Examples by Use Case
Use Case 1: Quick Deck Operations
Goal: Rapid-fire commands without repeating deck name
You: "load track XYZ left"
You: "play" → "play left"
You: "loop 4 beats" → "loop 4 beats left"
You: "sync" → "sync left"
You: "halve loop" → "halve loop left"
Benefit: 5 commands, only 1 explicit deck specification
Use Case 2: Transitioning Between Decks
Goal: Smooth transition with minimal verbosity
You: "play left"
You: "cue right"
You: "sync" → "sync right" (last action)
You: "play" → "play right"
You: "crossfade to right"
Benefit: Natural flow, system follows your intent
Use Case 3: Loop Manipulation
Goal: Adjust loop on active deck
You: "loop 8 beats left"
You: "halve loop" → "halve loop left" (has active loop)
You: "halve loop" → "halve loop left" (4 beats now)
You: "double loop" → "double loop left" (8 beats again)
You: "exit loop" → "exit loop left"
Benefit: Natural loop workflow, no deck repetition
---
Summary
Context-Aware Embeddings = Natural Voice Control
- ✅ Say "play" instead of "play left" (when context is clear)
- ✅ System infers deck from recent actions
- ✅ 6 priority levels for disambiguation (90% down to 40% confidence)
- ✅ <10ms overhead (negligible)
- ✅ Works with Tier 1, 2, 3 features
- ✅ No ML training required
Enable (default):
python run_rekordbox_voice_gemini_enhanced.py
Disable:
python run_rekordbox_voice_gemini_enhanced.py --no-embeddings
Check Status:
⚙️ Tier 3 Enhancements:
🎯 Context embeddings: True
(semantic command disambiguation)
---
Say less, do more! 🎯🎧
Generated: 2025-11-22
System: Computational Choreography - Tier 3 Context Embeddings
Version: 3.0 - Feature #11
Source Anchor
projects/Documentation/02-projects/dj-agent/studio/TIER3_CONTEXT_EMBEDDINGS_GUIDE.md