# Stage 3: EXPAND + MASTER PLAN
## LUME Commerce -- Experiential Commerce Infrastructure

---

## 3a. RISK AUDIT

### R1: GPU Contention During Voice Processing [CRITICAL]
- Failure scenario: When Whisper.cpp fallback activates (WiFi down), the CUDA inference burst (~2-3 seconds) steals GPU cycles from the visual pipeline. Frame rate drops from 60fps to 20-30fps. All customers see the stutter, not just the person ordering. The entertainment experience degrades.
- Probability: MEDIUM (35
- Impact: MEDIUM-HIGH. Visual stutter during ordering creates a jarring experience. If WiFi is frequently unreliable, this becomes a persistent problem.
- Mitigation: (a) Pre-warm Whisper model at boot but don't run inference until needed. (b) During Whisper inference, reduce visual pipeline to 30fps (skip every other frame, doubling GPU headroom). A steady 30fps is far less noticeable to customers than an unstable 45fps. (c) Batch audio: accumulate 5-10s of audio, process in one burst, return to 60fps. (d) Long-term: Whisper-TensorRT optimization gives 3x speedup, reducing contention window.
- Validation criteria: Visual pipeline maintains >=30fps during Whisper inference burst. All 3 test observers report no visible stutter in a blind test.

### R2: Voice Ordering Accuracy in Coffee Shop Noise [CRITICAL]
- Failure scenario: Background noise (espresso machine 70-80 dB, conversations 55-65 dB, music 60-70 dB) degrades STT accuracy below 70
- Probability: LOW with Deepgram (15
- Impact: CRITICAL. If voice ordering doesn't work, the entire Commerce tier value proposition fails.
- Mitigation: (a) Audio preprocessing: 3-mic beamforming (directional pickup toward customer, null toward espresso machine), noise gate, band-pass 200 Hz-4 kHz, AGC. (b) Deepgram Nova-3 primary (trained on millions of hours of noisy real-world audio). (c) Domain-specific vocabulary hints sent to Deepgram (coffee menu terms as keywords parameter). (d) Local Whisper as degraded fallback only. (e) Touch fallback: LUME display shows order suggestions from partial transcript, customer taps to confirm. Never force voice-only.
- Validation criteria: >85

### R3: Content Flywheel Adoption Rate [MEDIUM]
- Failure scenario: Customers don't interact with LUME visuals during queue time. They stare at their phones. QR scan rate is <5
- Probability: MEDIUM (30
- Impact: MEDIUM. Commerce and analytics still work without the flywheel. Revenue is not dependent on content sharing. But the marketing moat doesn't build.
- Mitigation: (a) Prominent placement: LUME must face the queue, not be behind the counter. (b) Staff encouragement: "Have you tried the wall display while you wait?" (c) Group dynamics: one person interacting draws others. (d) Attract mode tuned to catch peripheral attention. (e) Content incentive: "Show your LUME clip for 10
- Validation criteria: >15

### R4: Deepgram API Reliability and Latency [MEDIUM]
- Failure scenario: Deepgram WebSocket connection drops mid-order. Or latency spikes to >1 second during peak hours (their server load). Customer says "large latte" and waits 3 seconds for response. Feels broken.
- Probability: LOW (15
- Impact: MEDIUM. Local Whisper fallback activates, degrading to slower but functional voice ordering.
- Mitigation: (a) WebSocket heartbeat monitoring: if ping >500ms, switch to local Whisper within 1 second. (b) Pre-buffer: start processing audio locally on Whisper while waiting for Deepgram response. Use whichever returns first. (c) Deepgram SLA: enterprise tier includes uptime guarantee. (d) Alternative cloud STT providers (AssemblyAI, Google Chirp) as second cloud fallback.
- Validation criteria: Total STT response time (cloud or fallback) <3 seconds for 99

### R5: BWB NLU Pattern Port Accuracy [MEDIUM]
- Failure scenario: Porting 80K lines of Swift NLU patterns to Rust introduces subtle regex/matching bugs. "Large" matches differently. Modifier detection misses "oat" in "oat milk latte." The ported code is 90
- Probability: MEDIUM (30
- Impact: MEDIUM. Incorrect parsing means wrong orders, customer frustration, barista overrides.
- Mitigation: (a) Port with test suite: extract 200+ test cases from BWB_KioskTests (202 existing tests + 217 voice ordering type tests). Every test must pass in Rust before shipping. (b) Run parallel validation: send same transcripts through both Swift (on iPad) and Rust (on LUME) NLU pipelines, compare outputs. Flag divergences. (c) Start with the 50 most common coffee orders only. Expand vocabulary incrementally.
- Validation criteria: 100

### R6: Kitchen Display Adoption [LOW-MEDIUM]
- Failure scenario: Baristas already have a workflow with their existing POS kitchen display. Adding LUME's web-based kitchen display means monitoring two screens. They ignore LUME orders.
- Probability: MEDIUM (35
- Impact: MEDIUM. Voice orders not fulfilled = customers waiting longer = bad experience.
- Mitigation: (a) Default: Route LUME orders to the EXISTING POS kitchen display via API (Square Orders API, Toast Kitchen API). LUME adds orders to the barista's existing workflow. (b) Fallback: LUME announces orders via speakers ("New order: large oat milk latte!"). Audio alert is impossible to miss. (c) Web kitchen display is for shops that DON'T have an existing KDS.
- Validation criteria: 95

### R7: Privacy Perception with Cameras in Commerce Space [LOW-MEDIUM]
- Failure scenario: Customers feel surveilled by a camera-equipped device in a payment/ordering area. Social media backlash: "This coffee shop is filming you to sell your data." Venue owner removes LUME.
- Probability: LOW (15
- Impact: MEDIUM-HIGH. One viral negative post could damage the brand.
- Mitigation: (a) "Privacy Mode" signage: "This device uses depth sensors only. No photos, no facial recognition, no personal data." (b) LED indicator: green light when analytics active (no images). Red light only during content clip capture (customer-initiated). (c) Depth-only mode: toggle that disables all RGB cameras, using depth for visuals and analytics. (d) Published privacy policy specific to LUME Commerce.
- Validation criteria: Zero unresolved privacy complaints in 90-day pilot. Customer survey: >80

### R8: Menu Configuration Complexity [LOW-MEDIUM]
- Failure scenario: Setting up a new venue requires manually entering every menu item, size, modifier, and price. For a coffee shop with 40 drinks and 15 modifiers, this takes 2+ hours. Shop owner gives up.
- Probability: MEDIUM (30
- Impact: LOW-MEDIUM. Bad setup = bad NLU accuracy = bad orders.
- Mitigation: (a) Square/Toast menu sync: auto-import menu from existing POS via API. One click. (b) CSV/JSON import for other POS systems. (c) Template menus: pre-built "Coffee Shop Standard" menu with 50+ items, shop owner enables/disables. (d) Companion app menu editor with search + categorize.
- Validation criteria: New venue menu configured in <30 minutes. Template menu covers >80

### R9: Zone Configuration for Analytics [LOW]
- Failure scenario: Shop owner defines queue zone incorrectly (too large, overlapping with seating). Analytics data is meaningless. "Your queue length is 47" when actual queue is 4 people.
- Probability: LOW (15
- Impact: LOW. Bad zones = bad analytics, but voice ordering and entertainment still work.
- Mitigation: (a) Companion app with live depth view. Shop owner draws zones on the depth camera view with finger. Instant visual feedback: "LUME sees 3 people in your queue zone." (b) Auto-suggest: LUME's depth map identifies the most common body positions during first 24 hours and suggests zones. (c) Manual override always available.
- Validation criteria: Zone setup completes in <10 minutes with visual validation. Auto-suggest accuracy >70

### R10: Single-Person Scaling for Commerce Support [MEDIUM]
- Failure scenario: Commerce introduces new failure modes: failed orders, incorrect charges, payment disputes, kitchen sync issues. Each venue generates 2-3 support tickets/month. At 50 venues, that's 100-150 tickets/month. Mohamed drowns in support instead of building.
- Probability: MEDIUM-HIGH (40
- Impact: HIGH. Support backlog kills customer retention. Venues churn because issues aren't resolved.
- Mitigation: (a) Self-diagnosing system: LUME logs every order attempt, STT confidence, NLU result, and payment status. Support dashboard shows full order trace. (b) Auto-recovery: failed STT -> retry with Whisper. Failed payment -> retry or route to counter. Failed kitchen push -> audio announcement fallback. (c) FAQ bot: LLM-powered chat support on lume.dance/support trained on LUME knowledge base. (d) Hire first support contractor at 30 commerce venues ($1.5K/month).
- Validation criteria: <5

---

## 3b. EXPANDED SPECIFICATIONS

### Spec 1: On-Device Voice Model Decision

**Primary STT: Deepgram Nova-3 (Cloud Streaming)**
- Protocol: WebSocket to wss://api.deepgram.com/v1/listen
- Parameters: encoding=linear16, sample_rate=16000, channels=1, model=nova-3, language=en, smart_format=true, keywords=["latte:2","cappuccino:2","espresso:2","oat:1.5","almond:1.5","large:1","medium:1","small:1"]
- Partial results: interim_results=true (for live preview)
- Latency: ~300ms for first partial, ~500ms for stable transcript
- Cost: $0.0043/minute. At 500 orders/month, 30s average = $1.08/month.
- Accuracy: >95
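
The cost figure above follows from per-minute billing. A minimal sketch of the arithmetic, assuming audio is billed pro rata by the second; `monthly_stt_cost` is a hypothetical helper for illustration, not part of any Deepgram SDK:

```rust
/// Monthly streaming-STT cost in dollars.
/// Assumption: billing is proportional to audio duration (no per-request minimum).
fn monthly_stt_cost(orders_per_month: u32, avg_seconds_per_order: f64, rate_per_minute: f64) -> f64 {
    // Total audio streamed per month, in minutes.
    let minutes = f64::from(orders_per_month) * avg_seconds_per_order / 60.0;
    minutes * rate_per_minute
}
```

With the numbers from the spec, `monthly_stt_cost(500, 30.0, 0.0043)` covers 250 minutes of audio and comes to about $1.075, which rounds to the $1.08 quoted above.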

**Fallback STT: Whisper.cpp base.en (Local CUDA)**
- Model: ggml-base.en.bin (200MB, 74M params)
- Inference: whisper.cpp with CUDA backend on Jetson Orin Nano Super
- Expected: ~2-3 seconds for 10 seconds of audio
- Accuracy: ~85
- Trigger: WiFi unreachable OR Deepgram latency >500ms for 3 consecutive requests
- GPU impact: ~3ms/frame for 5-10 frames during inference. Visual pipeline drops to 30fps temporarily.
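
The fallback trigger above can be sketched as a small selector that counts consecutive slow responses. This is a sketch under stated assumptions: `SttSelector`, `SttTier`, and the method names are hypothetical, and recovery back to Deepgram is left out because the spec does not define it.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SttTier {
    Deepgram,
    WhisperLocal,
}

/// Hypothetical tier selector implementing the trigger rule from the spec.
struct SttSelector {
    latency_limit_ms: u64,  // 500ms per the spec
    max_slow_in_a_row: u32, // 3 consecutive requests per the spec
    slow_in_a_row: u32,
    tier: SttTier,
}

impl SttSelector {
    fn new() -> Self {
        Self { latency_limit_ms: 500, max_slow_in_a_row: 3, slow_in_a_row: 0, tier: SttTier::Deepgram }
    }

    /// Record one Deepgram round trip and return the tier to use next.
    fn record_latency(&mut self, latency_ms: u64) -> SttTier {
        if latency_ms > self.latency_limit_ms {
            self.slow_in_a_row += 1;
        } else {
            // A fast response breaks the streak.
            self.slow_in_a_row = 0;
        }
        if self.slow_in_a_row >= self.max_slow_in_a_row {
            self.tier = SttTier::WhisperLocal;
        }
        self.tier
    }

    /// WiFi unreachable switches to the local model immediately.
    fn record_wifi_unreachable(&mut self) -> SttTier {
        self.tier = SttTier::WhisperLocal;
        self.tier
    }
}
```

Feeding latencies of 600, 700, and 300 ms keeps the selector on Deepgram because the fast third response resets the streak; three slow responses in a row switch it to `WhisperLocal`.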

**NLU: BWB Pattern Engine (Local, CPU Only)**
- No LLM. The ported BWB patterns handle:
  - Drink names: 50+ items with 3-5 aliases each
  - Sizes: small/medium/large/tall/grande/venti + numeric (12oz, 16oz, 20oz)
  - Modifiers: milks (oat, almond, soy, coconut, whole, skim), temperatures (hot, iced, blended), extras (extra shot, light ice, no whip)
  - Quantities: "two," "a couple," "three," explicit numbers
- Confidence: 0.0-1.0 per parsed item based on fuzzy match score
- Processing time: <50ms on Jetson CPU for typical order transcript
- Memory: ~10MB (compiled Rust + vocabulary data)
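
A toy illustration of the pattern-based parsing described above, assuming whitespace tokenization and exact keyword matches. The real BWB engine uses aliases and fuzzy matching with confidence scores, which this sketch omits; the `ParsedItem` fields and `parse_order` are illustrative names only, and the vocabulary is a small subset of the lists in the spec.

```rust
#[derive(Debug, Default, PartialEq)]
struct ParsedItem {
    quantity: u32,
    size: Option<String>,
    milk: Option<String>,
    temperature: Option<String>,
    drink: Option<String>,
}

/// Parse a single-item order transcript using exact keyword matches.
fn parse_order(transcript: &str) -> ParsedItem {
    let mut item = ParsedItem { quantity: 1, ..Default::default() };
    let lowered = transcript.to_lowercase();
    for raw in lowered.split_whitespace() {
        // Strip surrounding punctuation such as commas and periods.
        let word = raw.trim_matches(|c: char| !c.is_alphanumeric());
        // Crude singularization so "lattes" matches "latte".
        let singular = word.strip_suffix('s').unwrap_or(word);
        match word {
            "two" | "couple" => item.quantity = 2,
            "three" => item.quantity = 3,
            "small" | "medium" | "large" | "tall" | "grande" | "venti" => {
                item.size = Some(word.to_string())
            }
            "oat" | "almond" | "soy" | "coconut" | "whole" | "skim" => {
                item.milk = Some(word.to_string())
            }
            "hot" | "iced" | "blended" => item.temperature = Some(word.to_string()),
            _ => {
                if let Ok(n) = word.parse::<u32>() {
                    // Explicit number, e.g. "4 espressos".
                    item.quantity = n;
                } else if matches!(singular, "latte" | "cappuccino" | "espresso") {
                    item.drink = Some(singular.to_string());
                }
            }
        }
    }
    item
}
```

For example, `parse_order("Two large iced oat milk lattes, please")` yields quantity 2, size `large`, temperature `iced`, milk `oat`, and drink `latte`. Words outside the vocabulary are ignored rather than scored.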

**TTS: Piper (Local, CPU Only)**
- Model: en_US-lessac-medium (50MB, VITS architecture)
- Latency: <100ms per sentence
- Quality: natural-sounding, suitable for confirmations and announcements
- Output: 22050Hz WAV -> LUME speakers or HDMI audio

Benchmark Summary:

| Metric | Cloud (Deepgram) | Local (Whisper) | Target |
|---|---|---|---|
| STT latency | ~500ms | ~2500ms | <3000ms |
| STT accuracy (70dB noise) | ~95 | | |
| NLU processing | 50ms | 50ms | <100ms |
| TTS latency | 100ms | 100ms | <200ms |
| Total order time | ~2s | ~4s | <5s |
| GPU impact | 0 | | |
| Monthly cost (500 orders) | $1.08 | $0 | <$5 |

### Spec 2: Queue Analytics Data Model

```sql
-- Supabase tables for LUME Commerce analytics

CREATE TABLE lume_venues (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name TEXT NOT NULL,
  slug TEXT UNIQUE NOT NULL,
  lume_device_id TEXT UNIQUE,
  timezone TEXT DEFAULT 'America/New_York',
  menu_config JSONB,          -- menu items, prices, aliases
  zone_config JSONB,          -- zone definitions (queue, service, exit, etc.)
  branding JSONB,             -- logo URL, watermark config, colors
  payment_config JSONB,       -- payment method, Square/Stripe creds
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE lume_analytics_snapshots (
  id BIGSERIAL PRIMARY KEY,
  venue_id UUID REFERENCES lume_venues(id),
  captured_at TIMESTAMPTZ NOT NULL,
  current_occupancy INT,
  queue_length INT,
  service_count INT,
  estimated_wait_seconds INT,
  avg_service_time_seconds FLOAT,
  throughput_per_hour FLOAT,
  conversion_rate FLOAT,      -- reached service / entered
  abandonment_rate FLOAT,     -- exited without service
  heatmap JSONB               -- 20x20 grid of density values
);
CREATE INDEX idx_analytics_venue_time ON lume_analytics_snapshots(venue_id, captured_at);

CREATE TABLE lume_orders (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  venue_id UUID REFERENCES lume_venues(id),
  order_number INT,           -- venue-specific sequential number
  items JSONB NOT NULL,       -- [{name, quantity, modifiers, price}]
  total_amount DECIMAL(10,2),
  status TEXT DEFAULT 'pending',  -- pending, preparing, ready, completed, cancelled
  payment_status TEXT DEFAULT 'unpaid',  -- unpaid, processing, paid, refunded
  payment_method TEXT,        -- counter, square, stripe, apple_pay
  stt_method TEXT,            -- deepgram, whisper_local
  stt_confidence FLOAT,
  nlu_confidence FLOAT,
  customer_name TEXT,         -- optional, from voice (e.g., "for Sarah")
  created_at TIMESTAMPTZ DEFAULT NOW(),
  confirmed_at TIMESTAMPTZ,
  ready_at TIMESTAMPTZ,
  completed_at TIMESTAMPTZ
);
CREATE INDEX idx_orders_venue_status ON lume_orders(venue_id, status);

CREATE TABLE lume_content_clips (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  venue_id UUID REFERENCES lume_venues(id),
  clip_url TEXT,              -- CDN URL for web access
  thumbnail_url TEXT,
  duration_seconds FLOAT,
  visual_preset TEXT,
  audio_mode TEXT,            -- listen, generate
  qr_scanned BOOLEAN DEFAULT false,
  downloaded BOOLEAN DEFAULT false,
  shared_platform TEXT,       -- tiktok, instagram, etc. (if tracked)
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE lume_track_events (
  id BIGSERIAL PRIMARY KEY,
  venue_id UUID REFERENCES lume_venues(id),
  track_id INT,               -- local centroid tracker ID (not persistent)
  event_type TEXT NOT NULL,   -- enter, zone_change, dwell, exit
  zone TEXT,                  -- queue, service, entrance, exit, seating
  position_x FLOAT,
  position_y FLOAT,
  dwell_seconds FLOAT,       -- for dwell events
  captured_at TIMESTAMPTZ NOT NULL
);
-- Partitioned by date for efficient cleanup
CREATE INDEX idx_track_events_venue_time ON lume_track_events(venue_id, captured_at);
```
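
The `heatmap` column stores a 20x20 grid of density values. One way such a grid could be filled from track positions is sketched below, assuming `position_x` and `position_y` are normalized to the range 0.0-1.0 (the schema does not state their units) and using raw counts as the density value. `build_heatmap` is a hypothetical helper.

```rust
const GRID: usize = 20;

/// Bin normalized (x, y) positions into a GRID x GRID grid of counts.
/// Assumption: coordinates are normalized to 0.0..=1.0; anything else is skipped.
fn build_heatmap(positions: &[(f64, f64)]) -> [[u32; GRID]; GRID] {
    let mut grid = [[0u32; GRID]; GRID];
    for &(x, y) in positions {
        // Out-of-range and NaN values fail the range check and are ignored.
        if !(0.0..=1.0).contains(&x) || !(0.0..=1.0).contains(&y) {
            continue;
        }
        // Clamp so that exactly 1.0 lands in the last cell instead of overflowing.
        let col = ((x * GRID as f64) as usize).min(GRID - 1);
        let row = ((y * GRID as f64) as usize).min(GRID - 1);
        grid[row][col] += 1;
    }
    grid
}
```

The grid is indexed as `grid[row][col]`, so a position of (0.5, 0.5) lands in cell `[10][10]`. Serializing it to JSONB and normalizing counts to densities are left out of the sketch.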

### Spec 3: BWB Code Port Map (Swift -> Rust)

Principle: Port the LOGIC, not the syntax. The BWB Swift code is organized around iOS patterns (@MainActor, ObservableObject, Combine). The Rust code uses async channels (`tokio::sync::mpsc`) and message passing instead of observation.

```
BWB SWIFT                              LUME RUST
---------                              ---------

VoiceOrderingOrchestrator              voice::orchestrator::VoiceOrchestrator
  @Published phase                       -> enum Phase { Idle, Listening, Processing, ... }
  ObservableObject                       -> struct with tokio::sync::mpsc::Sender
  Singleton shared                       -> owned by CommerceEngine, not global

TranscriptPipeline                     voice::orchestrator (inline)
  processIncoming()                      -> process_transcript(raw: &str, is_final: bool)
  TranscriptNormalizer                   -> nlu::normalize(text: &str) -> String
  TranscriptStabilityTracker             -> stability_count field in orchestrator

LiveOrderPreviewGenerator              nlu::menu_matcher::MenuMatcher
  MenuAliasMatcher                       -> HashMap<String, MenuItem> + fuzzy match
  ModifierDetector                       -> regex patterns for milk/temp/extras
  QuantityExtractor                      -> regex patterns for numbers

CartCoordinator                        cart::coordinator::CartCoordinator
  pendingOrders / confirmedCart           -> Vec<ParsedItem> / Vec<ConfirmedItem>
  addPendingOrder()                      -> add_pending(item: ParsedItem)
  confirmAll()                           -> confirm_all() -> Vec<ConfirmedItem>

ConfirmationCoordinator                cart::confirmation::ConfirmationFlow
  AutoConfirmConfig (3s, 0.8 conf)       -> ConfirmConfig { delay_ms: 3000, min_confidence: 0.8 }
  confirmationTimer                      -> tokio::time::sleep(Duration::from_secs(3))

SessionManager                         cart::session::SessionManager
  120s timeout                           -> tokio::time::timeout(Duration::from_secs(120))
  recordActivity()                       -> reset_timeout()

FeedbackCoordinator                    voice::piper_bridge::PiperTTS
  speak() / speakAndWait()               -> speak(text: &str) -> impl Future
  playAudio() / playHaptic()             -> play_sound(sound: SoundType)
  AVSpeechSynthesizer                    -> Piper subprocess via stdin/stdout

QueueService                           analytics::exporter::DataExporter
  pendingOrders / inProgressOrders        -> in lume_orders table, not in-memory
  Supabase realtime subscription         -> Supabase REST + WebSocket (same)
  refreshInterval: 30s                   -> export_interval: 5s

KitchenDisplayView                     kitchen::web_display (Axum + htmx)
  SwiftUI View                           -> HTML template with htmx SSE updates
  Dark background                        -> CSS dark theme
  Order cards grid                       -> CSS Grid layout
  Status buttons                         -> htmx hx-post for status changes
```
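
To make the `@Published phase` -> `enum Phase` mapping concrete, here is a minimal sketch of a phase state machine plus the auto-confirm rule from `ConfirmConfig`. The `Phase` variants beyond the three listed in the port map follow the Idle -> Listening -> Processing -> Confirming -> Complete sequence used in the Wave 0 checklist; the `Event` enum, `next_phase`, and `should_auto_confirm` are hypothetical names, and the async channel wiring is omitted.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Phase {
    Idle,
    Listening,
    Processing,
    Confirming,
    Complete,
}

/// Hypothetical events that drive the orchestrator.
#[derive(Debug, Clone, Copy)]
enum Event {
    SpeechStarted,
    FinalTranscript,
    ItemsParsed,
    Confirmed,
    Reset,
}

/// Pure transition function: unknown (phase, event) pairs leave the phase unchanged.
fn next_phase(current: Phase, event: Event) -> Phase {
    match (current, event) {
        (Phase::Idle, Event::SpeechStarted) => Phase::Listening,
        (Phase::Listening, Event::FinalTranscript) => Phase::Processing,
        (Phase::Processing, Event::ItemsParsed) => Phase::Confirming,
        (Phase::Confirming, Event::Confirmed) => Phase::Complete,
        (_, Event::Reset) => Phase::Idle,
        (phase, _) => phase,
    }
}

struct ConfirmConfig {
    delay_ms: u64,
    min_confidence: f64,
}

/// Auto-confirm only when confidence is high enough AND the delay has elapsed.
fn should_auto_confirm(cfg: &ConfirmConfig, confidence: f64, elapsed_ms: u64) -> bool {
    confidence >= cfg.min_confidence && elapsed_ms >= cfg.delay_ms
}
```

Keeping the transition logic as a pure function makes it straightforward to port the Swift phase tests one for one, independent of how events arrive over channels at runtime.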

Test Parity Strategy:

Extract test cases from these BWB test files:
- `BWB_KioskTests/VoiceOrchestratorTests.swift` (112 lines)
- `BWB_KioskTests/VoiceOrderingTypesTests.swift` (217 lines)
- `BWB_KioskTests/CartManagerTests.swift` (190 lines)
- `BWB_KioskTests/BWB_KioskTests.swift` (202 lines)
- `BWB_KioskTests/TTSIntegrationTests.swift` (94 lines)
- `BWBCore` 434 passing tests (extract NLU-relevant subset)

Target: 150+ Rust tests covering the same cases before Commerce goes live.

### Spec 4: Content Flywheel Technical Pipeline

```
CLIP CAPTURE PIPELINE:

1. Engagement Detection (runs every frame, ~0.1ms)
   Input: centroid tracks in QUEUE zone from analytics engine
   Logic:
     IF any track in QUEUE zone for >15 seconds
     AND track motion_energy > 0.3 (normalized 0-1)
     AND no clip captured for this track in last 60 seconds
     THEN trigger clip capture for this track

2. Clip Capture (triggered, 15-second duration)
   - Start recording from circular buffer (last 5s already buffered)
   - Record 10 more seconds (total 15s)
   - Sources: tight camera (portrait 9:16) + visual overlay + audio
   - Compositor: blend visual frame over camera feed (same as existing)
   - Encode: NVENC H.265, 1080x1920 @ 30fps, ~15MB per clip
   - Audio: room audio (LISTEN mode) or generated (GENERATE mode)
   - Metadata: venue_slug, timestamp, visual_preset, duration

3. QR Code Generation (immediate after capture)
   - URL: https://lume.dance/c/{venue_slug}/{clip_id_short}
   - QR code rendered as overlay on LUME display (corner, non-intrusive)
   - QR visible for 30 seconds, then fades
   - Same QR also available via NFC tap (if V2 hardware)

4. Content Upload (background, WiFi)
   - Upload clip to CDN (Cloudflare R2 or Supabase Storage)
   - ~15MB per clip; ~5 seconds to upload over WiFi 6E
   - Generate thumbnail (best frame by visual energy)
   - Create entry in lume_content_clips table

5. Web Clip Page (served at lume.dance/c/...)
   - Mobile-optimized page
   - Auto-play preview (HLS, muted, 720p for quick load)
   - Download button (full 1080p file to camera roll)
   - Share buttons: TikTok, Instagram Stories, X, WhatsApp
   - Venue credit: logo + name + location
   - LUME credit: small "Made with LUME" badge

6. Analytics (tracked per clip)
   - QR scanned: boolean (link opened)
   - Preview watched: boolean (video loaded)
   - Downloaded: boolean (download button tapped)
   - Shared: platform (if detected via share intent)
   - Impressions: estimated from platform APIs (future)
```
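
The engagement-detection rule in step 1 translates almost directly into a predicate. A minimal sketch, assuming the analytics engine exposes per-track dwell time and motion energy; `TrackState` and `should_capture_clip` are hypothetical names, not part of the existing codebase.

```rust
/// Hypothetical per-track summary supplied by the analytics engine.
#[derive(Debug, Clone, Copy)]
struct TrackState {
    in_queue_zone: bool,
    dwell_seconds: f64,
    motion_energy: f64,                   // normalized 0-1
    seconds_since_last_clip: Option<f64>, // None = no clip captured yet for this track
}

/// Trigger rule from step 1: dwell >15s, motion energy >0.3, 60s cooldown per track.
fn should_capture_clip(track: &TrackState) -> bool {
    // A track with no previous clip has trivially cooled down.
    let cooled_down = track.seconds_since_last_clip.map_or(true, |s| s > 60.0);
    track.in_queue_zone
        && track.dwell_seconds > 15.0
        && track.motion_energy > 0.3
        && cooled_down
}
```

A track that has been moving in the queue zone for 20 seconds with motion energy 0.5 and no prior clip triggers a capture; the same track with a clip captured 30 seconds earlier does not.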

### Spec 5: Enterprise Fleet Dashboard

```
DASHBOARD: https://dashboard.lume.dance (web app, Supabase + Nexus-style)

PAGES:

1. Fleet Overview
   - Map view: all locations with real-time status (green/yellow/red)
   - Summary cards: total orders today, total queue time saved, total clips shared
   - Alerts: devices offline, unusual queue lengths, low order accuracy

2. Location Detail (per venue)
   - Real-time metrics: occupancy, queue length, wait time
   - Heatmap overlay on venue floor plan
   - Orders: list with status, time, items, voice accuracy
   - Content: clips generated today, QR scans, downloads, shares
   - Analytics: throughput chart (orders/hour), wait time trend, peak hours

3. Analytics Comparison
   - Side-by-side: Location A vs Location B vs Fleet Average
   - Metrics: wait time, throughput, conversion rate, content engagement
   - Time range selector: today, 7 days, 30 days, custom

4. Voice Performance
   - STT accuracy breakdown: Deepgram vs Whisper fallback rate
   - Most misunderstood items (NLU confidence <0.7)
   - Order correction rate (manual override by barista)
   - Top 20 ordered items with voice accuracy per item

5. Content Performance
   - Total clips: captured, scanned, downloaded, shared
   - Top-performing clips (most shares)
   - Engagement funnel: captured -> scanned -> downloaded -> shared
   - Estimated social impressions per location

6. Settings
   - Venue management: add/edit/deactivate locations
   - Menu management: bulk update across fleet
   - Zone configuration: templates by venue type
   - User management: roles (owner, manager, support)
   - Billing: subscription tier, invoices, payment method

TECH:
  - Next.js on Vercel (or Nexus-style self-hosted)
  - Supabase for data + auth + realtime
  - Charts: Recharts or D3
  - Map: Mapbox GL JS
  - Responsive: works on desktop and iPad
```

---

## 3c. MASTER EXECUTION CHECKLIST

### Wave 0: FOUNDATION (Days 1-5)
Set up the Rust Commerce Engine skeleton and validate STT on Jetson.

| # | Task | Input | Output | Validation | Auto | Status |
|---|---|---|---|---|---|---|
| 0.1 | Create `lume-commerce` Rust crate with module structure | Architecture from Stage 2 | Cargo.toml + src/ with all module files (empty stubs) | `cargo check` passes | yes | TODO |
| 0.2 | Integrate whisper.cpp as Rust FFI dependency | whisper.cpp repo + Jetson CUDA | `whisper_bridge.rs` with `transcribe(audio: &[f32]) -> String` | Unit test: transcribe a WAV file, get text | yes | TODO |
| 0.3 | Integrate Piper TTS as Rust subprocess bridge | Piper binary for aarch64 | `piper_bridge.rs` with `speak(text: &str) -> AudioBuffer` | Unit test: speak "hello", get WAV output | yes | TODO |
| 0.4 | Benchmark Whisper base.en on Jetson (CUDA) | 10 test audio files (coffee orders in noise) | Benchmark report: latency + accuracy per file | <3s for 10s audio, >80 | | |
| 0.5 | Set up Deepgram WebSocket streaming client | Deepgram API key | `deepgram_client.rs` with streaming + partial results | Unit test: stream audio, receive partial + final transcript | yes | TODO |
| 0.6 | Create `VoiceOrchestrator` state machine (empty transitions) | BWB VoiceOrderingOrchestrator.swift | `orchestrator.rs` with Phase enum + transition logic | Unit test: Idle -> Listening -> Processing -> Confirming -> Complete | yes | TODO |
| 0.7 | Port domain models from BWB (Order, MenuItem, Customization) | BWBCore/Models/*.swift | `cart/models.rs` with Rust structs + serde | Unit test: serialize/deserialize order JSON | yes | TODO |

Wave 0 Gate: Whisper.cpp runs on Jetson with CUDA. Deepgram streaming works. State machine compiles. Domain models serialize.

---

### Wave 1: NLU ENGINE PORT (Days 5-10)
Port the BWB voice NLU patterns from Swift to Rust.

| # | Task | Input | Output | Validation | Auto | Status |
|---|---|---|---|---|---|---|
| 1.1 | Extract coffee menu vocabulary from BWB `VoiceNLUEngine+Vocabulary.swift` | 17K lines Swift vocabulary | `nlu/vocabulary.json` (50+ items, 200+ aliases, modifiers) | Manual review: all major items present | no | TODO |
| 1.2 | Port `MenuAliasMatcher` to Rust | BWB MenuAliasMatcher logic | `nlu/menu_matcher.rs` with fuzzy string matching | Port 50 test cases from BWB, all pass | yes | TODO |
| 1.3 | Port `ModifierDetector` to Rust | BWB ModifierDetector patterns | `nlu/modifier_detector.rs` with regex patterns | Port 30 test cases: milk types, temp, extras all detected | yes | TODO |
| 1.4 | Port `QuantityExtractor` to Rust | BWB QuantityExtractor logic | `nlu/quantity_extractor.rs` | Port 20 test cases: "two," "a couple," "three" all extract correctly | yes | TODO |
| 1.5 | Port `ConfidenceScorer` to Rust | BWB OrderParsingPipeline confidence | `nlu/confidence_scorer.rs` | Confidence scores match BWB within 0.05 for 50 test transcripts | yes | TODO |
| 1.6 | Build `OrderResultMerger` (combines menu match + modifiers + qty) | BWB OrderResultMerger | `nlu/order_merger.rs` | End-to-end: "large iced oat milk latte" -> correct ParsedItem | yes | TODO |
| 1.7 | Create test corpus: 100 real coffee order transcripts with expected output | Manual creation + BWB logs | `tests/corpus/orders.json` | Reviewed for correctness | no | TODO |
| 1.8 | Run NLU pipeline against test corpus | Corpus + NLU pipeline | Accuracy report | >90 | | |

Wave 1 Gate: NLU pipeline parses "large iced oat milk latte with an extra shot" correctly. >90

---

### Wave 2: VOICE ORDERING FLOW (Days 10-16)
Wire STT -> NLU -> Cart -> Confirmation -> TTS into the complete ordering flow.

| # | Task | Input | Output | Validation | Auto | Status |
|---|---|---|---|---|---|---|
| 2.1 | Build audio preprocessor (noise gate + bandpass + AGC) | SpeakFlow AudioProcessingService as reference | `voice/audio_preprocessor.rs` | Test: 70dB pink noise input -> voice frequencies isolated | yes | TODO |
| 2.2 | Build STT tier selection (Deepgram primary, Whisper fallback) | deepgram_client + whisper_bridge | `voice/stt_bridge.rs` with auto-fallback | Test: kill WiFi -> switches to Whisper within 2s | yes | TODO |
| 2.3 | Wire VoiceOrchestrator: mic -> preprocessor -> STT -> NLU -> cart | All voice + NLU components | Full voice ordering pipeline | End-to-end test: speak order -> cart populated correctly | no | TODO |
| 2.4 | Port CartCoordinator from BWB | BWB CartCoordinator logic | `cart/coordinator.rs` | Test: add/remove/modify items, calculate total | yes | TODO |
| 2.5 | Port ConfirmationCoordinator from BWB | BWB ConfirmationCoordinator | `cart/confirmation.rs` with auto-confirm | Test: auto-confirm at 0.8 confidence after 3s | yes | TODO |
| 2.6 | Port SessionManager with 120s timeout | BWB SessionManager | `cart/session.rs` | Test: session times out after 120s of inactivity | yes | TODO |
| 2.7 | Build TTS feedback flow (confirmations, order-ready) | Piper bridge | TTS integrated into orchestrator | Speaks "Large oat milk latte, $5.50. Anything else?" | yes | TODO |
| 2.8 | Build order overlay renderer for Unity visual engine | Visual engine API | Text cards in particle field (from Stage 2, Step 5 design) | Order text visible on LUME display during voice ordering | no | TODO |
| 2.9 | Integration test: 20 consecutive voice orders on Jetson | Jetson dev kit + mic | 20 orders processed without crash or hang | All 20 complete, >85 | | |

Wave 2 Gate: A person can walk up to LUME, say a coffee order, see it confirmed on the visual display, and hear TTS confirmation. 20 consecutive orders without failure.

---

### Wave 3: QUEUE ANALYTICS (Days 14-19)
Build the depth-based queue analytics engine.

| # | Task | Input | Output | Validation | Auto | Status |
|---|---|---|---|---|---|---|
| 3.1 | Build BodyDetector from depth silhouette mask | Depth pipeline silhouette output | `analytics/body_detector.rs` | Correctly counts 1-5 people in test depth frames | yes | TODO |
| 3.2 | Build CentroidTracker with Kalman filter | BodyDetector output | `analytics/centroid_tracker.rs` | Tracks 3 people moving for 30s with <2 ID swaps | yes | TODO |
| 3.3 | Build ZoneClassifier with configurable regions | Zone config JSON + centroids | `analytics/zone_classifier.rs` | Correctly assigns centroids to queue/service/exit zones | yes | TODO |
| 3.4 | Build MetricsAggregator with rolling windows | Track events | `analytics/metrics.rs` | Calculates wait time, throughput, conversion for test data | yes | TODO |
| 3.5 | Build HeatmapGenerator (20x20 grid) | Track events | `analytics/heatmap.rs` | Generates plausible 2D density map from 5 minutes of track data | yes | TODO |
| 3.6 | Build DataExporter (SQLite local + Supabase cloud) | Metrics + track events | `analytics/exporter.rs` | Data visible in SQLite after 60s of operation | yes | TODO |
| 3.7 | Create Supabase schema for analytics (Spec 2) | SQL from Stage 3 | Supabase tables live | All tables created, test insert succeeds | yes | TODO |
| 3.8 | Integration test: analytics running alongside visual pipeline | Jetson + Femto Bolt + 2-3 people | Real-time queue count on LUME display + Supabase data | Count matches actual people. No visual stuttering. | no | TODO |

Wave 3 Gate: LUME correctly counts 1-5 people, tracks dwell time, and exports analytics to Supabase. Zero impact on visual pipeline performance.
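
As an illustration of the zone classification step in this wave, the sketch below assigns centroids to named zones and counts the people in one zone. It assumes axis-aligned rectangular zones in normalized coordinates, which is a simplification: zones drawn by hand in the companion app would more likely be polygons. `Zone`, `classify`, and `count_in_zone` are hypothetical names.

```rust
/// Hypothetical rectangular zone in normalized (0.0-1.0) coordinates.
struct Zone {
    name: String,
    x_min: f64,
    y_min: f64,
    x_max: f64,
    y_max: f64,
}

/// Return the name of the first zone containing the point, if any.
/// With overlapping zones, the earliest entry in the list wins.
fn classify(zones: &[Zone], x: f64, y: f64) -> Option<&str> {
    zones
        .iter()
        .find(|z| x >= z.x_min && x < z.x_max && y >= z.y_min && y < z.y_max)
        .map(|z| z.name.as_str())
}

/// Count how many centroids fall inside the named zone.
fn count_in_zone(zones: &[Zone], centroids: &[(f64, f64)], zone_name: &str) -> usize {
    centroids
        .iter()
        .filter(|&&(x, y)| classify(zones, x, y) == Some(zone_name))
        .count()
}
```

Because the first matching zone wins, an oversized queue zone listed first would absorb people standing in an overlapping seating zone, which is the kind of misconfiguration described in risk R9.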

---

### Wave 4: PAYMENT + KITCHEN ROUTING (Days 18-23)
Connect the ordering flow to payment and kitchen display.

| # | Task | Input | Output | Validation | Auto | Status |
|---|---|---|---|---|---|---|
| 4.1 | Build PaymentBridge (Square Terminal REST API) | Square Terminal API docs | `payment/square_terminal.rs` | Test: create order -> Terminal displays -> mock payment -> confirmation callback | yes | TODO |
| 4.2 | Build PaymentBridge (Stripe Terminal server-driven) | Stripe Terminal API docs | `payment/stripe_terminal.rs` | Same as 4.1 but with Stripe | yes | TODO |
| 4.3 | Build pay-at-counter flow (order push only, no payment on LUME) | Order confirmed event | WebSocket push to POS with order details | Barista receives order on their POS display | yes | TODO |
| 4.4 | Build KitchenRouter (WebSocket push to kitchen) | Order confirmed event | `kitchen/router.rs` | Kitchen display receives order within 2s of confirmation | yes | TODO |
| 4.5 | Build web-based Kitchen Display (htmx + Axum) | Kitchen display spec | `kitchen/web_display.rs` + HTML/CSS | Browser-based KDS shows orders, status buttons work | yes | TODO |
| 4.6 | Build "Order Ready" TTS announcement | Kitchen status update | LUME speakers announce customer name + order | "Sarah, your oat milk latte is ready!" plays clearly | yes | TODO |
| 4.7 | Wire order status from kitchen -> LUME -> visual celebration | Kitchen "ready" event | Visual celebration effect on LUME display | Particle burst when order marked ready | no | TODO |
| 4.8 | Integration test: full order lifecycle | Jetson + Square Terminal (or mock) | Voice order -> kitchen -> payment -> ready announcement | End-to-end in <60 seconds total | no | TODO |

Wave 4 Gate: Full order lifecycle works: voice order -> kitchen display -> payment (or counter pay) -> order-ready announcement with visual celebration.

---

### Wave 5: CONTENT FLYWHEEL (Days 22-27)
Build the auto-capture, QR, and content sharing pipeline.

| # | Task | Input | Output | Validation | Auto | Status |
|---|---|---|---|---|---|---|
| 5.1 | Build InteractionDetector (motion energy threshold in queue zone) | Analytics track data | `content/interaction_detector.rs` | Triggers clip capture when person interacts >15s with motion | yes | TODO |
| 5.2 | Build ClipCapture (15s, camera + visual overlay + audio) | Content compositor (existing) | `content/clip_capture.rs` | 15s H.265 clip on NVMe, visual overlay blended | yes | TODO |
| 5.3 | Build QRCodeGenerator for clip URLs | Clip metadata | `content/qr_generator.rs` | QR code rendered to framebuffer overlay | yes | TODO |
| 5.4 | Build ContentServer web pages for clip download | Clip files + CDN | `content/content_server.rs` + web pages | Mobile web page: preview + download + share buttons | yes | TODO |
| 5.5 | Build clip upload to CDN (Cloudflare R2) | Clip files on NVMe | `content/content_server.rs` upload path | Clip accessible at lume.dance/c/{slug}/{id} within 30s | yes | TODO |
| 5.6 | Add venue branding to clips (watermark, location tag) | Venue branding config | Branded clips | Venue logo visible in clip corner, location in metadata | yes | TODO |
| 5.7 | Build content analytics tracking (scans, downloads, shares) | Web page events | Supabase lume_content_clips updates | Dashboard shows clip funnel metrics | yes | TODO |
| 5.8 | Integration test: full content flywheel | Jetson + person interacting with LUME | Interaction -> clip -> QR -> scan -> download on phone | Complete flow in <60 seconds including upload | no | TODO |

Wave 5 Gate: A customer interacts with LUME visuals, a 15-second clip is captured, QR code appears, customer scans and downloads the clip on their phone with venue branding.

---

### Wave 6: DASHBOARD + PILOT DEPLOYMENT (Days 26-35)
Build the analytics dashboard and deploy to first coffee shop.

| # | Task | Input | Output | Validation | Auto | Status |
|---|---|---|---|---|---|---|
| 6.1 | Build analytics web dashboard (overview + location detail) | Supabase data + dashboard spec | Next.js app at dashboard.lume.dance | Real-time metrics visible for test venue | yes | TODO |
| 6.2 | Build voice performance dashboard page | Order data with STT/NLU confidence | Dashboard page | STT accuracy, misunderstood items, correction rate visible | yes | TODO |
| 6.3 | Build content performance dashboard page | Content clip analytics | Dashboard page | Clip funnel (captured -> scanned -> downloaded -> shared) visible | yes | TODO |
| 6.4 | Build venue setup wizard in companion app | Zone config + menu config | Companion app screens (or web wizard) | Venue configured in <30 minutes | no | TODO |
| 6.5 | Build menu import from Square POS API | Square API credentials | Auto-populated menu in LUME | Menu items match Square catalog within 95 | | |
| 6.6 | Create operations playbook (install guide, troubleshooting, FAQ) | All system knowledge | PDF/web document | Non-technical person can install LUME following the guide | no | TODO |
| 6.7 | Deploy LUME Commerce to pilot coffee shop (BWB venue) | LUME device + configured venue | Live deployment | System running in real environment | no | TODO |
| 6.8 | Pilot monitoring: Day 1-7 (supervised) | Live deployment | Daily metrics report | No critical failures, >80 | | |
| 6.9 | Pilot monitoring: Day 8-14 (remote) | Live deployment | Daily metrics report | Auto-recovery handles >80 | | |
| 6.10 | Pilot review: Day 14 (kill criteria check) | 14 days of metrics | Go/no-go decision | Voice accuracy >80 | | |

Wave 6 Gate: LUME Commerce running in a real coffee shop for 14 days. Voice ordering, queue analytics, content flywheel, and dashboard all operational. Kill criteria met.
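
The clip funnel shown on the content dashboard (captured -> scanned -> downloaded -> shared) reduces to three stage-to-stage ratios. The sketch below shows one way to compute them. The `ClipFunnel` struct and its method names are hypothetical and not taken from the Supabase schema. Returning `None` for an empty stage is a design assumption, so that a venue with no clips yet does not appear as a 0% conversion rate.

```rust
/// Hypothetical per-venue counters for the clip funnel.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ClipFunnel {
    pub captured: u32,
    pub scanned: u32,
    pub downloaded: u32,
    pub shared: u32,
}

/// Ratio of two counters, or None when the denominator is zero.
fn ratio(num: u32, den: u32) -> Option<f64> {
    if den == 0 {
        None
    } else {
        Some(f64::from(num) / f64::from(den))
    }
}

impl ClipFunnel {
    /// Share of captured clips whose QR code was scanned.
    pub fn scan_rate(&self) -> Option<f64> {
        ratio(self.scanned, self.captured)
    }

    /// Share of scans that led to a download.
    pub fn download_rate(&self) -> Option<f64> {
        ratio(self.downloaded, self.scanned)
    }

    /// Share of downloads that were shared onward.
    pub fn share_rate(&self) -> Option<f64> {
        ratio(self.shared, self.downloaded)
    }
}
```

With 200 captured clips and 20 scans, `scan_rate()` yields `Some(0.1)`. This is the QR scan rate that the Day 42 kill criterion refers to.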

---

### Wave 7: ENTERPRISE PREPARATION (Days 32-42)
Prepare for multi-location scaling.

| # | Task | Input | Output | Validation | Auto | Status |
|---|------|-------|--------|------------|------|--------|
| 7.1 | Build fleet management in dashboard (add/remove/monitor locations) | Dashboard + Supabase | Multi-venue dashboard | 3 test venues visible on map with independent metrics | yes | TODO |
| 7.2 | Build OTA menu update (push menu changes from dashboard to LUME) | Dashboard menu editor | LUME receives updated menu via HTTPS pull | Menu update reflects on LUME within 60 seconds | yes | TODO |
| 7.3 | Build multi-venue analytics comparison | Supabase aggregation queries | Dashboard comparison page | Side-by-side metrics for 2+ venues | yes | TODO |
| 7.4 | Build Stripe billing integration for Commerce tier | Stripe Billing API | Subscription management in dashboard | $149/month charges, upgrade/downgrade, invoices | yes | TODO |
| 7.5 | Create enterprise sales deck | Pilot metrics + product info | PDF/Keynote presentation | Compelling 10-slide deck with real pilot data | no | TODO |
| 7.6 | Create pricing page at lume.dance/commerce | Pricing tiers + CTA | Live web page | CTAs for demo booking + pilot signup | yes | TODO |
| 7.7 | Prepare 3 LUME Commerce units for enterprise pilot | Hardware + firmware | 3 configured devices ready to ship | All 3 boot, run full stack, connect to dashboard | no | TODO |

Wave 7 Gate: Ready to pitch multi-location chains with real pilot data, professional sales materials, and 3 ready-to-ship demo units.
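
Task 7.2 delivers menu updates via HTTPS pull with a 60-second validation target. With a pull model, the worst-case delay is roughly the poll interval plus the fetch time. The sketch below models only that budget check and the apply-if-newer step, with no networking. `MenuSnapshot`, `MenuStore`, the monotonically increasing `version` field, and `meets_budget` are all hypothetical names introduced for illustration.

```rust
use std::time::Duration;

/// Hypothetical menu payload, assumed to carry an increasing version number.
#[derive(Debug, Clone, PartialEq)]
pub struct MenuSnapshot {
    pub version: u64,
    pub items: Vec<String>,
}

/// Holds the menu currently active on the device.
pub struct MenuStore {
    current: MenuSnapshot,
}

impl MenuStore {
    pub fn new(initial: MenuSnapshot) -> Self {
        Self { current: initial }
    }

    pub fn current(&self) -> &MenuSnapshot {
        &self.current
    }

    /// Replace the active menu only if the fetched snapshot is strictly
    /// newer, so stale or repeated responses are ignored.
    /// Returns true when the menu was replaced.
    pub fn apply_if_newer(&mut self, fetched: MenuSnapshot) -> bool {
        if fetched.version > self.current.version {
            self.current = fetched;
            true
        } else {
            false
        }
    }
}

/// Worst case for a pull model: a change lands just after a poll, so the
/// device waits one full interval and then spends `fetch_time` downloading.
pub fn meets_budget(poll_interval: Duration, fetch_time: Duration, budget: Duration) -> bool {
    poll_interval + fetch_time <= budget
}
```

Under these assumptions, a 30s poll interval with a 5s fetch fits the 60-second target, while a 60s interval does not once fetch time is added.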

---

EXECUTION SUMMARY

| Wave | Days | Tasks | Key Deliverable |
|------|------|-------|-----------------|
| 0: Foundation | 1-5 | 7 | Whisper + Deepgram + state machine + models |
| 1: NLU Port | 5-10 | 8 | BWB NLU patterns in Rust, >90 |
| 2: Voice Flow | 10-16 | 9 | End-to-end voice ordering on Jetson |
| 3: Analytics | 14-19 | 8 | Depth queue analytics, Supabase export |
| 4: Payment + Kitchen | 18-23 | 8 | Full order lifecycle with payment + kitchen |
| 5: Content Flywheel | 22-27 | 8 | Auto-capture + QR + download + share |
| 6: Dashboard + Pilot | 26-35 | 10 | Live pilot in real coffee shop |
| 7: Enterprise Prep | 32-42 | 7 | Sales materials + multi-location ready |
| TOTAL | 42 days | 65 tasks | Production Commerce in real venue |

Critical Path: Wave 0 -> Wave 1 -> Wave 2 -> Wave 4 -> Wave 6 (voice ordering is the critical dependency). Waves 3 and 5 can run in parallel with Waves 2 and 4.
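
The totals and the parallelism claim can be checked mechanically from the wave table. The sketch below encodes the day ranges and task counts from the execution summary. The `Wave` struct and helper names are illustrative only, and day ranges are treated as inclusive.

```rust
/// One row of the execution summary (day range inclusive).
pub struct Wave {
    pub name: &'static str,
    pub start_day: u32,
    pub end_day: u32,
    pub tasks: u32,
}

/// Day ranges and task counts as listed in the execution summary.
pub const WAVES: [Wave; 8] = [
    Wave { name: "0: Foundation", start_day: 1, end_day: 5, tasks: 7 },
    Wave { name: "1: NLU Port", start_day: 5, end_day: 10, tasks: 8 },
    Wave { name: "2: Voice Flow", start_day: 10, end_day: 16, tasks: 9 },
    Wave { name: "3: Analytics", start_day: 14, end_day: 19, tasks: 8 },
    Wave { name: "4: Payment + Kitchen", start_day: 18, end_day: 23, tasks: 8 },
    Wave { name: "5: Content Flywheel", start_day: 22, end_day: 27, tasks: 8 },
    Wave { name: "6: Dashboard + Pilot", start_day: 26, end_day: 35, tasks: 10 },
    Wave { name: "7: Enterprise Prep", start_day: 32, end_day: 42, tasks: 7 },
];

/// Sum of task counts across all waves.
pub fn total_tasks(waves: &[Wave]) -> u32 {
    waves.iter().map(|w| w.tasks).sum()
}

/// Final day of the plan, or 0 for an empty schedule.
pub fn last_day(waves: &[Wave]) -> u32 {
    waves.iter().map(|w| w.end_day).max().unwrap_or(0)
}

/// True when two waves share at least one calendar day.
pub fn overlaps(a: &Wave, b: &Wave) -> bool {
    a.start_day <= b.end_day && b.start_day <= a.end_day
}
```

The task counts sum to 65 and the last wave ends on day 42, matching the TOTAL row. Wave 3 (days 14-19) overlaps Wave 2 (days 10-16), and Wave 5 (days 22-27) overlaps Wave 4 (days 18-23), which is what allows them to run alongside the critical path.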

Agent-Dispatchable Tasks: 48 of 65 (74%)

Kill Criteria:
- Day 5: If Whisper on Jetson <70
- Day 10: If NLU port <85
- Day 14 (pilot): If voice accuracy <80
- Day 35 (post-pilot): If barista satisfaction <7/10, rethink kitchen routing approach
- Day 42: If content QR scan rate <5

Promotion Decision

Attach run IDs, datasets, metrics, and reproduction commands.

Source Anchor

evo-cube-output/lume-commerce-pos/stage3-expand-master-plan.md
