Grand Diomande Research · Full HTML Reader

Stage 3: EXPAND + MASTER PLAN

**Primary STT: Deepgram Nova-3 (Cloud Streaming)** - Protocol: WebSocket to wss://api.deepgram.com/v1/listen - Parameters: encoding=linear16, sample_rate=16000, channels=1, model=nova-3, language=en, smart_format=true, keywords=["latte:2","cappuccino:2","espresso:2","oat:1.5","almond:1.5","large:1","medium:1","small:1"] - Partial results: interim_results=true (for live preview) - Latency: ~300ms for first partial, ~500ms for stable transcript - Cost: $0.0043/minute. At 500 orders/month, 30s average = $1.08/month. -

Embodied Trajectory Systems proposal experiment writeup candidate score 34 .md

Full Public Reader

# Stage 3: EXPAND + MASTER PLAN
## LUME Commerce -- Experiential Commerce Infrastructure

---

3a. RISK AUDIT

### R1: GPU Contention During Voice Processing [CRITICAL]
- Failure scenario: When Whisper.cpp fallback activates (WiFi down), the CUDA inference burst (~2-3 seconds) steals GPU cycles from the visual pipeline. Frame rate drops from 60fps to 20-30fps. All customers see the stutter, not just the person ordering. The entertainment experience degrades.
- Probability: MEDIUM (35
- Impact: MEDIUM-HIGH. Visual stutter during ordering creates a jarring experience. If WiFi is frequently unreliable, this becomes a persistent problem.
- Mitigation: (a) Pre-warm Whisper model at boot but don't run inference until needed. (b) During Whisper inference, reduce visual pipeline to 30fps (skip every other frame, doubling GPU headroom). This is imperceptible to customers vs stuttery 45fps. (c) Batch audio: accumulate 5-10s of audio, process in one burst, return to 60fps. (d) Long-term: Whisper-TensorRT optimization gives 3x speedup, reducing contention window.
- Validation criteria: Visual pipeline maintains >30fps during Whisper inference burst. No visible stutter reported by 3 out of 3 test observers in blind test.

### R2: Voice Ordering Accuracy in Coffee Shop Noise [CRITICAL]
- Failure scenario: Background noise (espresso machine 70-80 dB, conversations 55-65 dB, music 60-70 dB) degrades STT accuracy below 70
- Probability: LOW with Deepgram (15
- Impact: CRITICAL. If voice ordering doesn't work, the entire Commerce tier value proposition fails.
- Mitigation: (a) Audio preprocessing: 3-mic beamforming (directional pickup toward customer, null toward espresso machine), noise gate, band-pass 200-4kHz, AGC. (b) Deepgram Nova-3 primary (trained on millions of hours of noisy real-world audio). (c) Domain-specific vocabulary hints sent to Deepgram (coffee menu terms as keywords parameter). (d) Local Whisper as degraded fallback only. (e) Touch fallback: LUME display shows order suggestions from partial transcript, customer taps to confirm. Never force voice-only.
- Validation criteria: >85

### R3: Content Flywheel Adoption Rate [MEDIUM]
- Failure scenario: Customers don't interact with LUME visuals during queue time. They stare at their phones. QR scan rate is <5
- Probability: MEDIUM (30
- Impact: MEDIUM. Commerce and analytics still work without the flywheel. Revenue is not dependent on content sharing. But the marketing moat doesn't build.
- Mitigation: (a) Prominent placement: LUME must face the queue, not be behind the counter. (b) Staff encouragement: "Have you tried the wall display while you wait?" (c) Group dynamics: one person interacting draws others. (d) Attract mode tuned to catch peripheral attention. (e) Content incentive: "Show your LUME clip for 10
- Validation criteria: >15

### R4: Deepgram API Reliability and Latency [MEDIUM]
- Failure scenario: Deepgram WebSocket connection drops mid-order. Or latency spikes to >1 second during peak hours (their server load). Customer says "large latte" and waits 3 seconds for response. Feels broken.
- Probability: LOW (15
- Impact: MEDIUM. Local Whisper fallback activates, degrading to slower but functional voice ordering.
- Mitigation: (a) WebSocket heartbeat monitoring: if ping >500ms, switch to local Whisper within 1 second. (b) Pre-buffer: start processing audio locally on Whisper while waiting for Deepgram response. Use whichever returns first. (c) Deepgram SLA: enterprise tier includes uptime guarantee. (d) Alternative cloud STT providers (AssemblyAI, Google Chirp) as second cloud fallback.
- Validation criteria: Total STT response time (cloud or fallback) <3 seconds for 99

### R5: BWB NLU Pattern Port Accuracy [MEDIUM]
- Failure scenario: Porting 80K lines of Swift NLU patterns to Rust introduces subtle regex/matching bugs. "Large" matches differently. Modifier detection misses "oat" in "oat milk latte." The ported code is 90
- Probability: MEDIUM (30
- Impact: MEDIUM. Incorrect parsing means wrong orders, customer frustration, barista overrides.
- Mitigation: (a) Port with test suite: extract 200+ test cases from BWB_KioskTests (202 existing tests + 217 voice ordering type tests). Every test must pass in Rust before shipping. (b) Run parallel validation: send same transcripts through both Swift (on iPad) and Rust (on LUME) NLU pipelines, compare outputs. Flag divergences. (c) Start with the 50 most common coffee orders only. Expand vocabulary incrementally.
- Validation criteria: 100

### R6: Kitchen Display Adoption [LOW-MEDIUM]
- Failure scenario: Baristas already have a workflow with their existing POS kitchen display. Adding LUME's web-based kitchen display means monitoring two screens. They ignore LUME orders.
- Probability: MEDIUM (35
- Impact: MEDIUM. Voice orders not fulfilled = customers waiting longer = bad experience.
- Mitigation: (a) Default: Route LUME orders to the EXISTING POS kitchen display via API (Square Orders API, Toast Kitchen API). LUME adds orders to the barista's existing workflow. (b) Fallback: LUME announces orders via speakers ("New order: large oat milk latte!"). Audio alert is impossible to miss. (c) Web kitchen display is for shops that DON'T have an existing KDS.
- Validation criteria: 95

### R7: Privacy Perception with Cameras in Commerce Space [LOW-MEDIUM]
- Failure scenario: Customers feel surveilled by a camera-equipped device in a payment/ordering area. Social media backlash: "This coffee shop is filming you to sell your data." Venue owner removes LUME.
- Probability: LOW (15
- Impact: MEDIUM-HIGH. One viral negative post could damage the brand.
- Mitigation: (a) "Privacy Mode" signage: "This device uses depth sensors only. No photos, no facial recognition, no personal data." (b) LED indicator: green light when analytics active (no images). Red light only during content clip capture (customer-initiated). (c) Depth-only mode: toggle that disables all RGB cameras, using depth for visuals and analytics. (d) Published privacy policy specific to LUME Commerce.
- Validation criteria: Zero unresolved privacy complaints in 90-day pilot. Customer survey: >80

### R8: Menu Configuration Complexity [LOW-MEDIUM]
- Failure scenario: Setting up a new venue requires manually entering every menu item, size, modifier, and price. For a coffee shop with 40 drinks and 15 modifiers, this takes 2+ hours. Shop owner gives up.
- Probability: MEDIUM (30
- Impact: LOW-MEDIUM. Bad setup = bad NLU accuracy = bad orders.
- Mitigation: (a) Square/Toast menu sync: auto-import menu from existing POS via API. One click. (b) CSV/JSON import for other POS systems. (c) Template menus: pre-built "Coffee Shop Standard" menu with 50+ items, shop owner enables/disables. (d) Companion app menu editor with search + categorize.
- Validation criteria: New venue menu configured in <30 minutes. Template menu covers >80

### R9: Zone Configuration for Analytics [LOW]
- Failure scenario: Shop owner defines queue zone incorrectly (too large, overlapping with seating). Analytics data is meaningless. "Your queue length is 47" when actual queue is 4 people.
- Probability: LOW (15
- Impact: LOW. Bad zones = bad analytics, but voice ordering and entertainment still work.
- Mitigation: (a) Companion app with live depth view. Shop owner draws zones on the depth camera view with finger. Instant visual feedback: "LUME sees 3 people in your queue zone." (b) Auto-suggest: LUME's depth map identifies the most common body positions during first 24 hours and suggests zones. (c) Manual override always available.
- Validation criteria: Zone setup completes in <10 minutes with visual validation. Auto-suggest accuracy >70

### R10: Single-Person Scaling for Commerce Support [MEDIUM]
- Failure scenario: Commerce introduces new failure modes: failed orders, incorrect charges, payment disputes, kitchen sync issues. Each venue generates 2-3 support tickets/month. At 50 venues, that's 100-150 tickets/month. Mohamed drowns in support instead of building.
- Probability: MEDIUM-HIGH (40
- Impact: HIGH. Support backlog kills customer retention. Venues churn because issues aren't resolved.
- Mitigation: (a) Self-diagnosing system: LUME logs every order attempt, STT confidence, NLU result, and payment status. Support dashboard shows full order trace. (b) Auto-recovery: failed STT -> retry with Whisper. Failed payment -> retry or route to counter. Failed kitchen push -> audio announcement fallback. (c) FAQ bot: LLM-powered chat support on lume.dance/support trained on LUME knowledge base. (d) Hire first support contractor at 30 commerce venues ($1.5K/month).
- Validation criteria: <5

---

3b. EXPANDED SPECIFICATIONS

Spec 1: On-Device Voice Model Decision

Primary STT: Deepgram Nova-3 (Cloud Streaming)
- Protocol: WebSocket to wss://api.deepgram.com/v1/listen
- Parameters: encoding=linear16, sample_rate=16000, channels=1, model=nova-3, language=en, smart_format=true, keywords=["latte:2","cappuccino:2","espresso:2","oat:1.5","almond:1.5","large:1","medium:1","small:1"]
- Partial results: interim_results=true (for live preview)
- Latency: ~300ms for first partial, ~500ms for stable transcript
- Cost: $0.0043/minute. At 500 orders/month, 30s average = $1.08/month.
- Accuracy: >95

Fallback STT: Whisper.cpp base.en (Local CUDA)
- Model: ggml-base.en.bin (200MB, 74M params)
- Inference: whisper.cpp with CUDA backend on Jetson Orin Nano Super
- Expected: ~2-3 seconds for 10 seconds of audio
- Accuracy: ~85
- Trigger: WiFi unreachable OR Deepgram latency >500ms for 3 consecutive requests
- GPU impact: ~3ms/frame for 5-10 frames during inference. Visual pipeline drops to 30fps temporarily.

NLU: BWB Pattern Engine (Local, CPU Only)
- No LLM. The ported BWB patterns handle:
- Drink names: 50+ items with 3-5 aliases each
- Sizes: small/medium/large/tall/grande/venti + numeric (12oz, 16oz, 20oz)
- Modifiers: milks (oat, almond, soy, coconut, whole, skim), temperatures (hot, iced, blended), extras (extra shot, light ice, no whip)
- Quantities: "two," "a couple," "three," explicit numbers
- Confidence: 0.0-1.0 per parsed item based on fuzzy match score
- Processing time: <50ms on Jetson CPU for typical order transcript
- Memory: ~10MB (compiled Rust + vocabulary data)

TTS: Piper (Local, CPU Only)
- Model: en_US-lessac-medium (50MB, VITS architecture)
- Latency: <100ms per sentence
- Quality: natural-sounding, suitable for confirmations and announcements
- Output: 22050Hz WAV -> LUME speakers or HDMI audio

Benchmark Summary:

Metric	Cloud (Deepgram)	Local (Whisper)	Target
STT latency	~500ms	~2500ms	<3000ms
STT accuracy (70dB noise)	~95
NLU processing	50ms	50ms	<100ms
TTS latency	100ms	100ms	<200ms
Total order time	~2s	~4s	<5s
GPU impact	0
Monthly cost (500 orders)	$1.08 \| $0	<$5

Spec 2: Queue Analytics Data Model

sql

-- Supabase tables for LUME Commerce analytics

CREATE TABLE lume_venues (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name TEXT NOT NULL,
  slug TEXT UNIQUE NOT NULL,
  lume_device_id TEXT UNIQUE,
  timezone TEXT DEFAULT 'America/New_York',
  menu_config JSONB,          -- menu items, prices, aliases
  zone_config JSONB,          -- zone definitions (queue, service, exit, etc.)
  branding JSONB,             -- logo URL, watermark config, colors
  payment_config JSONB,       -- payment method, Square/Stripe creds
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE lume_analytics_snapshots (
  id BIGSERIAL PRIMARY KEY,
  venue_id UUID REFERENCES lume_venues(id),
  captured_at TIMESTAMPTZ NOT NULL,
  current_occupancy INT,
  queue_length INT,
  service_count INT,
  estimated_wait_seconds INT,
  avg_service_time_seconds FLOAT,
  throughput_per_hour FLOAT,
  conversion_rate FLOAT,      -- reached service / entered
  abandonment_rate FLOAT,     -- exited without service
  heatmap JSONB               -- 20x20 grid of density values
);
CREATE INDEX idx_analytics_venue_time ON lume_analytics_snapshots(venue_id, captured_at);

CREATE TABLE lume_orders (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  venue_id UUID REFERENCES lume_venues(id),
  order_number INT,           -- venue-specific sequential number
  items JSONB NOT NULL,       -- [{name, quantity, modifiers, price}]
  total_amount DECIMAL(10,2),
  status TEXT DEFAULT 'pending',  -- pending, preparing, ready, completed, cancelled
  payment_status TEXT DEFAULT 'unpaid',  -- unpaid, processing, paid, refunded
  payment_method TEXT,        -- counter, square, stripe, apple_pay
  stt_method TEXT,            -- deepgram, whisper_local
  stt_confidence FLOAT,
  nlu_confidence FLOAT,
  customer_name TEXT,         -- optional, from voice (e.g., "for Sarah")
  created_at TIMESTAMPTZ DEFAULT NOW(),
  confirmed_at TIMESTAMPTZ,
  ready_at TIMESTAMPTZ,
  completed_at TIMESTAMPTZ
);
CREATE INDEX idx_orders_venue_status ON lume_orders(venue_id, status);

CREATE TABLE lume_content_clips (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  venue_id UUID REFERENCES lume_venues(id),
  clip_url TEXT,              -- CDN URL for web access
  thumbnail_url TEXT,
  duration_seconds FLOAT,
  visual_preset TEXT,
  audio_mode TEXT,            -- listen, generate
  qr_scanned BOOLEAN DEFAULT false,
  downloaded BOOLEAN DEFAULT false,
  shared_platform TEXT,       -- tiktok, instagram, etc. (if tracked)
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE lume_track_events (
  id BIGSERIAL PRIMARY KEY,
  venue_id UUID REFERENCES lume_venues(id),
  track_id INT,               -- local centroid tracker ID (not persistent)
  event_type TEXT NOT NULL,   -- enter, zone_change, dwell, exit
  zone TEXT,                  -- queue, service, entrance, exit, seating
  position_x FLOAT,
  position_y FLOAT,
  dwell_seconds FLOAT,       -- for dwell events
  captured_at TIMESTAMPTZ NOT NULL
);
-- Partitioned by date for efficient cleanup
CREATE INDEX idx_track_events_venue_time ON lume_track_events(venue_id, captured_at);

Spec 3: BWB Code Port Map (Swift -> Rust)

Principle: Port the LOGIC, not the syntax. The BWB Swift code is organized around iOS patterns (@MainActor, ObservableObject, Combine). The Rust code uses async channels (tokio::mpsc) and message-passing instead of observation.

BWB SWIFT                              LUME RUST
---------                              ---------

VoiceOrderingOrchestrator              voice::orchestrator::VoiceOrchestrator
  @Published phase                       -> enum Phase { Idle, Listening, Processing, ... }
  ObservableObject                       -> struct with tokio::mpsc::Sender
  Singleton shared                       -> owned by CommerceEngine, not global

TranscriptPipeline                     voice::orchestrator (inline)
  processIncoming()                      -> process_transcript(raw: &str, is_final: bool)
  TranscriptNormalizer                   -> nlu::normalize(text: &str) -> String
  TranscriptStabilityTracker             -> stability_count field in orchestrator

LiveOrderPreviewGenerator              nlu::menu_matcher::MenuMatcher
  MenuAliasMatcher                       -> HashMap<String, MenuItem> + fuzzy match
  ModifierDetector                       -> regex patterns for milk/temp/extras
  QuantityExtractor                      -> regex patterns for numbers

CartCoordinator                        cart::coordinator::CartCoordinator
  pendingOrders / confirmedCart           -> Vec<ParsedItem> / Vec<ConfirmedItem>
  addPendingOrder()                      -> add_pending(item: ParsedItem)
  confirmAll()                           -> confirm_all() -> Vec<ConfirmedItem>

ConfirmationCoordinator                cart::confirmation::ConfirmationFlow
  AutoConfirmConfig (3s, 0.8 conf)       -> ConfirmConfig { delay_ms: 3000, min_confidence: 0.8 }
  confirmationTimer                      -> tokio::time::sleep(Duration::from_secs(3))

SessionManager                         cart::session::SessionManager
  120s timeout                           -> tokio::time::timeout(Duration::from_secs(120))
  recordActivity()                       -> reset_timeout()

FeedbackCoordinator                    voice::piper_bridge::PiperTTS
  speak() / speakAndWait()               -> speak(text: &str) -> impl Future
  playAudio() / playHaptic()             -> play_sound(sound: SoundType)
  AVSpeechSynthesizer                    -> Piper subprocess via stdin/stdout

QueueService                           analytics::exporter::DataExporter
  pendingOrders / inProgressOrders        -> in lume_orders table, not in-memory
  Supabase realtime subscription         -> Supabase REST + WebSocket (same)
  refreshInterval: 30s                   -> export_interval: 5s

KitchenDisplayView                     kitchen::web_display (Axum + htmx)
  SwiftUI View                           -> HTML template with htmx SSE updates
  Dark background                        -> CSS dark theme
  Order cards grid                       -> CSS Grid layout
  Status buttons                         -> htmx hx-post for status changes

Test Parity Strategy:

Extract test cases from these BWB test files:
- `BWB_KioskTests/VoiceOrchestratorTests.swift` (112 lines)
- `BWB_KioskTests/VoiceOrderingTypesTests.swift` (217 lines)
- `BWB_KioskTests/CartManagerTests.swift` (190 lines)
- `BWB_KioskTests/BWB_KioskTests.swift` (202 lines)
- `BWB_KioskTests/TTSIntegrationTests.swift` (94 lines)
- `BWBCore` 434 passing tests (extract NLU-relevant subset)

Target: 150+ Rust tests covering the same cases before Commerce goes live.

Spec 4: Content Flywheel Technical Pipeline

CLIP CAPTURE PIPELINE:

1. Engagement Detection (runs every frame, ~0.1ms)
   Input: centroid tracks in QUEUE zone from analytics engine
   Logic:
     IF any track in QUEUE zone for >15 seconds
     AND track motion_energy > 0.3 (normalized 0-1)
     AND no clip captured for this track in last 60 seconds
     THEN trigger clip capture for this track

2. Clip Capture (triggered, 15-second duration)
   - Start recording from circular buffer (last 5s already buffered)
   - Record 10 more seconds (total 15s)
   - Sources: tight camera (portrait 9:16) + visual overlay + audio
   - Compositor: blend visual frame over camera feed (same as existing)
   - Encode: NVENC H.265, 1080x1920 @ 30fps, ~15MB per clip
   - Audio: room audio (LISTEN mode) or generated (GENERATE mode)
   - Metadata: venue_slug, timestamp, visual_preset, duration

3. QR Code Generation (immediate after capture)
   - URL: https://lume.dance/c/{venue_slug}/{clip_id_short}
   - QR code rendered as overlay on LUME display (corner, non-intrusive)
   - QR visible for 30 seconds, then fades
   - Same QR also available via NFC tap (if V2 hardware)

4. Content Upload (background, WiFi)
   - Upload clip to CDN (Cloudflare R2 or Supabase Storage)
   - ~15MB per clip, WiFi 6E throughput: ~5 seconds
   - Generate thumbnail (best frame by visual energy)
   - Create entry in lume_content_clips table

5. Web Clip Page (served at lume.dance/c/...)
   - Mobile-optimized page
   - Auto-play preview (HLS, muted, 720p for quick load)
   - Download button (full 1080p file to camera roll)
   - Share buttons: TikTok, Instagram Stories, X, WhatsApp
   - Venue credit: logo + name + location
   - LUME credit: small "Made with LUME" badge

6. Analytics (tracked per clip)
   - QR scanned: boolean (link opened)
   - Preview watched: boolean (video loaded)
   - Downloaded: boolean (download button tapped)
   - Shared: platform (if detected via share intent)
   - Impressions: estimated from platform APIs (future)

Spec 5: Enterprise Fleet Dashboard

DASHBOARD: https://dashboard.lume.dance (web app, Supabase + Nexus-style)

PAGES:

1. Fleet Overview
   - Map view: all locations with real-time status (green/yellow/red)
   - Summary cards: total orders today, total queue time saved, total clips shared
   - Alerts: devices offline, unusual queue lengths, low order accuracy

2. Location Detail (per venue)
   - Real-time metrics: occupancy, queue length, wait time
   - Heatmap overlay on venue floor plan
   - Orders: list with status, time, items, voice accuracy
   - Content: clips generated today, QR scans, downloads, shares
   - Analytics: throughput chart (orders/hour), wait time trend, peak hours

3. Analytics Comparison
   - Side-by-side: Location A vs Location B vs Fleet Average
   - Metrics: wait time, throughput, conversion rate, content engagement
   - Time range selector: today, 7 days, 30 days, custom

4. Voice Performance
   - STT accuracy breakdown: Deepgram vs Whisper fallback rate
   - Most misunderstood items (NLU confidence <0.7)
   - Order correction rate (manual override by barista)
   - Top 20 ordered items with voice accuracy per item

5. Content Performance
   - Total clips: captured, scanned, downloaded, shared
   - Top-performing clips (most shares)
   - Engagement funnel: captured -> scanned -> downloaded -> shared
   - Estimated social impressions per location

6. Settings
   - Venue management: add/edit/deactivate locations
   - Menu management: bulk update across fleet
   - Zone configuration: templates by venue type
   - User management: roles (owner, manager, support)
   - Billing: subscription tier, invoices, payment method

TECH:
  - Next.js on Vercel (or Nexus-style self-hosted)
  - Supabase for data + auth + realtime
  - Charts: Recharts or D3
  - Map: Mapbox GL JS
  - Responsive: works on desktop and iPad

---

3c. MASTER EXECUTION CHECKLIST

### Wave 0: FOUNDATION (Days 1-5)
Set up the Rust Commerce Engine skeleton and validate STT on Jetson.

#	Task	Input	Output	Validation	Auto	Status
0.1	Create `lume-commerce` Rust crate with module structure	Architecture from Stage 2	Cargo.toml + src/ with all module files (empty stubs)	`cargo check` passes	yes	TODO
0.2	Integrate whisper.cpp as Rust FFI dependency	whisper.cpp repo + Jetson CUDA	`whisper_bridge.rs` with `transcribe(audio: &[f32]) -> String`	Unit test: transcribe a WAV file, get text	yes	TODO
0.3	Integrate Piper TTS as Rust subprocess bridge	Piper binary for aarch64	`piper_bridge.rs` with `speak(text: &str) -> AudioBuffer`	Unit test: speak "hello", get WAV output	yes	TODO
0.4	Benchmark Whisper base.en on Jetson (CUDA)	10 test audio files (coffee orders in noise)	Benchmark report: latency + accuracy per file	<3s for 10s audio, >80
0.5	Set up Deepgram WebSocket streaming client	Deepgram API key	`deepgram_client.rs` with streaming + partial results	Unit test: stream audio, receive partial + final transcript	yes	TODO
0.6	Create `VoiceOrchestrator` state machine (empty transitions)	BWB VoiceOrderingOrchestrator.swift	`orchestrator.rs` with Phase enum + transition logic	Unit test: Idle -> Listening -> Processing -> Confirming -> Complete	yes	TODO
0.7	Port domain models from BWB (Order, MenuItem, Customization)	BWBCore/Models/*.swift	`cart/models.rs` with Rust structs + serde	Unit test: serialize/deserialize order JSON	yes	TODO

Wave 0 Gate: Whisper.cpp runs on Jetson with CUDA. Deepgram streaming works. State machine compiles. Domain models serialize.

---

### Wave 1: NLU ENGINE PORT (Days 5-10)
Port the BWB voice NLU patterns from Swift to Rust.

#	Task	Input	Output	Validation	Auto	Status
1.1	Extract coffee menu vocabulary from BWB `VoiceNLUEngine+Vocabulary.swift`	17K lines Swift vocabulary	`nlu/vocabulary.json` (50+ items, 200+ aliases, modifiers)	Manual review: all major items present	no	TODO
1.2	Port `MenuAliasMatcher` to Rust	BWB MenuAliasMatcher logic	`nlu/menu_matcher.rs` with fuzzy string matching	Port 50 test cases from BWB, all pass	yes	TODO
1.3	Port `ModifierDetector` to Rust	BWB ModifierDetector patterns	`nlu/modifier_detector.rs` with regex patterns	Port 30 test cases: milk types, temp, extras all detected	yes	TODO
1.4	Port `QuantityExtractor` to Rust	BWB QuantityExtractor logic	`nlu/quantity_extractor.rs`	Port 20 test cases: "two," "a couple," "three" all extract correctly	yes	TODO
1.5	Port `ConfidenceScorer` to Rust	BWB OrderParsingPipeline confidence	`nlu/confidence_scorer.rs`	Confidence scores match BWB within 0.05 for 50 test transcripts	yes	TODO
1.6	Build `OrderResultMerger` (combines menu match + modifiers + qty)	BWB OrderResultMerger	`nlu/order_merger.rs`	End-to-end: "large iced oat milk latte" -> correct ParsedItem	yes	TODO
1.7	Create test corpus: 100 real coffee order transcripts with expected output	Manual creation + BWB logs	`tests/corpus/orders.json`	Reviewed for correctness	no	TODO
1.8	Run NLU pipeline against test corpus	Corpus + NLU pipeline	Accuracy report	>90

Wave 1 Gate: NLU pipeline parses "large iced oat milk latte with an extra shot" correctly. >90

---

### Wave 2: VOICE ORDERING FLOW (Days 10-16)
Wire STT -> NLU -> Cart -> Confirmation -> TTS into the complete ordering flow.

#	Task	Input	Output	Validation	Auto	Status
2.1	Build audio preprocessor (noise gate + bandpass + AGC)	SpeakFlow AudioProcessingService as reference	`voice/audio_preprocessor.rs`	Test: 70dB pink noise input -> voice frequencies isolated	yes	TODO
2.2	Build STT tier selection (Deepgram primary, Whisper fallback)	deepgram_client + whisper_bridge	`voice/stt_bridge.rs` with auto-fallback	Test: kill WiFi -> switches to Whisper within 2s	yes	TODO
2.3	Wire VoiceOrchestrator: mic -> preprocessor -> STT -> NLU -> cart	All voice + NLU components	Full voice ordering pipeline	End-to-end test: speak order -> cart populated correctly	no	TODO
2.4	Port CartCoordinator from BWB	BWB CartCoordinator logic	`cart/coordinator.rs`	Test: add/remove/modify items, calculate total	yes	TODO
2.5	Port ConfirmationCoordinator from BWB	BWB ConfirmationCoordinator	`cart/confirmation.rs` with auto-confirm	Test: auto-confirm at 0.8 confidence after 3s	yes	TODO
2.6	Port SessionManager with 120s timeout	BWB SessionManager	`cart/session.rs`	Test: session times out after 120s of inactivity	yes	TODO
2.7	Build TTS feedback flow (confirmations, order-ready)	Piper bridge	TTS integrated into orchestrator	Speaks "Large oat milk latte, $5.50. Anything else?"	yes	TODO
2.8	Build order overlay renderer for Unity visual engine	Visual engine API	Text cards in particle field (from Stage 2, Step 5 design)	Order text visible on LUME display during voice ordering	no	TODO
2.9	Integration test: 20 consecutive voice orders on Jetson	Jetson dev kit + mic	20 orders processed without crash or hang	All 20 complete, >85

Wave 2 Gate: A person can walk up to LUME, say a coffee order, see it confirmed on the visual display, and hear TTS confirmation. 20 consecutive orders without failure.

---

### Wave 3: QUEUE ANALYTICS (Days 14-19)
Build the depth-based queue analytics engine.

#	Task	Input	Output	Validation	Auto	Status
3.1	Build BodyDetector from depth silhouette mask	Depth pipeline silhouette output	`analytics/body_detector.rs`	Correctly counts 1-5 people in test depth frames	yes	TODO
3.2	Build CentroidTracker with Kalman filter	BodyDetector output	`analytics/centroid_tracker.rs`	Tracks 3 people moving for 30s with <2 ID swaps	yes	TODO
3.3	Build ZoneClassifier with configurable regions	Zone config JSON + centroids	`analytics/zone_classifier.rs`	Correctly assigns centroids to queue/service/exit zones	yes	TODO
3.4	Build MetricsAggregator with rolling windows	Track events	`analytics/metrics.rs`	Calculates wait time, throughput, conversion for test data	yes	TODO
3.5	Build HeatmapGenerator (20x20 grid)	Track events	`analytics/heatmap.rs`	Generates plausible 2D density map from 5 minutes of track data	yes	TODO
3.6	Build DataExporter (SQLite local + Supabase cloud)	Metrics + track events	`analytics/exporter.rs`	Data visible in SQLite after 60s of operation	yes	TODO
3.7	Create Supabase schema for analytics (Spec 2)	SQL from Stage 3	Supabase tables live	All tables created, test insert succeeds	yes	TODO
3.8	Integration test: analytics running alongside visual pipeline	Jetson + Femto Bolt + 2-3 people	Real-time queue count on LUME display + Supabase data	Count matches actual people. No visual stuttering.	no	TODO

Wave 3 Gate: LUME correctly counts 1-5 people, tracks dwell time, and exports analytics to Supabase. Zero impact on visual pipeline performance.

---

### Wave 4: PAYMENT + KITCHEN ROUTING (Days 18-23)
Connect the ordering flow to payment and kitchen display.

#	Task	Input	Output	Validation	Auto	Status
4.1	Build PaymentBridge (Square Terminal REST API)	Square Terminal API docs	`payment/square_terminal.rs`	Test: create order -> Terminal displays -> mock payment -> confirmation callback	yes	TODO
4.2	Build PaymentBridge (Stripe Terminal server-driven)	Stripe Terminal API docs	`payment/stripe_terminal.rs`	Same as 4.1 but with Stripe	yes	TODO
4.3	Build pay-at-counter flow (order push only, no payment on LUME)	Order confirmed event	WebSocket push to POS with order details	Barista receives order on their POS display	yes	TODO
4.4	Build KitchenRouter (WebSocket push to kitchen)	Order confirmed event	`kitchen/router.rs`	Kitchen display receives order within 2s of confirmation	yes	TODO
4.5	Build web-based Kitchen Display (htmx + Axum)	Kitchen display spec	`kitchen/web_display.rs` + HTML/CSS	Browser-based KDS shows orders, status buttons work	yes	TODO
4.6	Build "Order Ready" TTS announcement	Kitchen status update	LUME speakers announce customer name + order	"Sarah, your oat milk latte is ready!" plays clearly	yes	TODO
4.7	Wire order status from kitchen -> LUME -> visual celebration	Kitchen "ready" event	Visual celebration effect on LUME display	Particle burst when order marked ready	no	TODO
4.8	Integration test: full order lifecycle	Jetson + Square Terminal (or mock)	Voice order -> kitchen -> payment -> ready announcement	End-to-end in <60 seconds total	no	TODO

Wave 4 Gate: Full order lifecycle works: voice order -> kitchen display -> payment (or counter pay) -> order-ready announcement with visual celebration.

---

### Wave 5: CONTENT FLYWHEEL (Days 22-27)
Build the auto-capture, QR, and content sharing pipeline.

#	Task	Input	Output	Validation	Auto	Status
5.1	Build InteractionDetector (motion energy threshold in queue zone)	Analytics track data	`content/interaction_detector.rs`	Triggers clip capture when person interacts >15s with motion	yes	TODO
5.2	Build ClipCapture (15s, camera + visual overlay + audio)	Content compositor (existing)	`content/clip_capture.rs`	15s H.265 clip on NVMe, visual overlay blended	yes	TODO
5.3	Build QRCodeGenerator for clip URLs	Clip metadata	`content/qr_generator.rs`	QR code rendered to framebuffer overlay	yes	TODO
5.4	Build ContentServer web pages for clip download	Clip files + CDN	`content/content_server.rs` + web pages	Mobile web page: preview + download + share buttons	yes	TODO
5.5	Build clip upload to CDN (Cloudflare R2)	Clip files on NVMe	`content/content_server.rs` upload path	Clip accessible at lume.dance/c/{slug}/{id} within 30s	yes	TODO
5.6	Add venue branding to clips (watermark, location tag)	Venue branding config	Branded clips	Venue logo visible in clip corner, location in metadata	yes	TODO
5.7	Build content analytics tracking (scans, downloads, shares)	Web page events	Supabase lume_content_clips updates	Dashboard shows clip funnel metrics	yes	TODO
5.8	Integration test: full content flywheel	Jetson + person interacting with LUME	Interaction -> clip -> QR -> scan -> download on phone	Complete flow in <60 seconds including upload	no	TODO

Wave 5 Gate: A customer interacts with LUME visuals, a 15-second clip is captured, QR code appears, customer scans and downloads the clip on their phone with venue branding.

---

### Wave 6: DASHBOARD + PILOT DEPLOYMENT (Days 26-35)
Build the analytics dashboard and deploy to first coffee shop.

#	Task	Input	Output	Validation	Auto	Status
6.1	Build analytics web dashboard (overview + location detail)	Supabase data + dashboard spec	Next.js app at dashboard.lume.dance	Real-time metrics visible for test venue	yes	TODO
6.2	Build voice performance dashboard page	Order data with STT/NLU confidence	Dashboard page	STT accuracy, misunderstood items, correction rate visible	yes	TODO
6.3	Build content performance dashboard page	Content clip analytics	Dashboard page	Clip funnel (captured -> scanned -> downloaded -> shared) visible	yes	TODO
6.4	Build venue setup wizard in companion app	Zone config + menu config	Companion app screens (or web wizard)	Venue configured in <30 minutes	no	TODO
6.5	Build menu import from Square POS API	Square API credentials	Auto-populated menu in LUME	Menu items match Square catalog within 95
6.6	Create operations playbook (install guide, troubleshooting, FAQ)	All system knowledge	PDF/web document	Non-technical person can install LUME following the guide	no	TODO
6.7	Deploy LUME Commerce to pilot coffee shop (BWB venue)	LUME device + configured venue	Live deployment	System running in real environment	no	TODO
6.8	Pilot monitoring: Day 1-7 (supervised)	Live deployment	Daily metrics report	No critical failures, >80
6.9	Pilot monitoring: Day 8-14 (remote)	Live deployment	Daily metrics report	Auto-recovery handles >80
6.10	Pilot review: Day 14 (kill criteria check)	14 days of metrics	Go/no-go decision	Voice accuracy >80

Wave 6 Gate: LUME Commerce running in a real coffee shop for 14 days. Voice ordering, queue analytics, content flywheel, and dashboard all operational. Kill criteria met.

---

### Wave 7: ENTERPRISE PREPARATION (Days 32-42)
Prepare for multi-location scaling.

#	Task	Input	Output	Validation	Auto	Status
7.1	Build fleet management in dashboard (add/remove/monitor locations)	Dashboard + Supabase	Multi-venue dashboard	3 test venues visible on map with independent metrics	yes	TODO
7.2	Build OTA menu update (push menu changes from dashboard to LUME)	Dashboard menu editor	LUME receives updated menu via HTTPS pull	Menu update reflects on LUME within 60 seconds	yes	TODO
7.3	Build multi-venue analytics comparison	Supabase aggregation queries	Dashboard comparison page	Side-by-side metrics for 2+ venues	yes	TODO
7.4	Build Stripe billing integration for Commerce tier	Stripe Billing API	Subscription management in dashboard	$149/month charges, upgrade/downgrade, invoices	yes	TODO
7.5	Create enterprise sales deck	Pilot metrics + product info	PDF/Keynote presentation	Compelling 10-slide deck with real pilot data	no	TODO
7.6	Create pricing page at lume.dance/commerce	Pricing tiers + CTA	Live web page	CTAs for demo booking + pilot signup	yes	TODO
7.7	Prepare 3 LUME Commerce units for enterprise pilot	Hardware + firmware	3 configured devices ready to ship	All 3 boot, run full stack, connect to dashboard	no	TODO

Wave 7 Gate: Ready to pitch multi-location chains with real pilot data, professional sales materials, and 3 ready-to-ship demo units.

---

EXECUTION SUMMARY

Wave	Days	Tasks	Key Deliverable
0: Foundation	1-5	7	Whisper + Deepgram + state machine + models
1: NLU Port	5-10	8	BWB NLU patterns in Rust, >90
2: Voice Flow	10-16	9	End-to-end voice ordering on Jetson
3: Analytics	14-19	8	Depth queue analytics, Supabase export
4: Payment + Kitchen	18-23	8	Full order lifecycle with payment + kitchen
5: Content Flywheel	22-27	8	Auto-capture + QR + download + share
6: Dashboard + Pilot	26-35	10	Live pilot in real coffee shop
7: Enterprise Prep	32-42	7	Sales materials + multi-location ready
TOTAL	42 days	65 tasks	Production Commerce in real venue

Critical Path: Wave 0 -> Wave 1 -> Wave 2 -> Wave 4 -> Wave 6 (voice ordering is the critical dependency). Waves 3 and 5 can run in parallel with Waves 2 and 4.

Agent-Dispatchable Tasks: 48 of 65 (74

Kill Criteria:
- Day 5: If Whisper on Jetson <70
- Day 10: If NLU port <85
- Day 14 (pilot): If voice accuracy <80
- Day 35 (post-pilot): If barista satisfaction <7/10, rethink kitchen routing approach
- Day 42: If content QR scan rate <5

Promotion Decision

Attach run IDs, datasets, metrics, and reproduction commands.

Source Anchor

evo-cube-output/lume-commerce-pos/stage3-expand-master-plan.md

Detected Structure

Method · Evaluation · References · Figures · Code Anchors · Architecture · is Stage Research