Grand Diomande Research · Full HTML Reader

Voice Ordering Pipeline Architecture

> **Purpose**: Comprehensive technical documentation for the voice ordering system refactoring. > **Last Updated**: December 26, 2025 > **Status**: ✅ Complete (10/10 Steps Done)

Business Systems architecture technical paper candidate score 66 .md

Full Public Reader

Voice Ordering Pipeline Architecture

> Purpose: Comprehensive technical documentation for the voice ordering system refactoring.
> Last Updated: December 26, 2025
> Status: ✅ Complete (10/10 Steps Done)

---

Table of Contents

1. [Executive Summary](#executive-summary)
2. [Problem Statement](#problem-statement)
3. [Architecture Overview](#architecture-overview)
4. [Component Details](#component-details)
5. [Data Flow Pipeline](#data-flow-pipeline)
6. [App Routes & Views](#app-routes--views)
7. [File Structure](#file-structure)
8. [Implementation Checklist](#implementation-checklist)
9. [Integration Points](#integration-points)
10. [Technical Decisions](#technical-decisions)
11. [Testing Strategy](#testing-strategy)
12. [Rollback Plan](#rollback-plan)

---

Executive Summary

What We're Building

A modular voice ordering pipeline that decomposes the monolithic `VoiceOrderingService.swift` (2,464 lines) into 10 focused components:

#ComponentLinesResponsibilityStatus
1FeedbackCoordinator337TTS + haptics + audio✅ DONE
2SessionManager270Session lifecycle✅ DONE
3UtteranceCompletionDetector407Silence/completion detection✅ DONE
4LiveOrderPreviewGenerator950Real-time item preview✅ DONE
5TranscriptPipeline552Transcript state management✅ DONE
6CartCoordinator440Cart state separation✅ DONE
7ConfirmationCoordinator626Confirmation flow✅ DONE
8OrderParsingPipeline1,016Hybrid AI+NLU parsing✅ DONE
9VoiceOrderingOrchestrator714Thin coordinator✅ DONE
10Testing & Cleanup600+Unit tests (207 tests, 181 passing)✅ DONE

User-Chosen Strategies

1. Parsing Strategy: Hybrid Merge - Run AI and NLU parsers in parallel, merge results
2. Live Preview: Pattern Matching - Fast regex-based matching (~50ms latency)
3. Implementation Approach: Incremental - Extract one component at a time

---

Problem Statement

The Monolith Issue

`VoiceOrderingService.swift` violates Single Responsibility Principle by handling:

1. Audio session management
2. Speech recognition control (iOS 17 legacy + iOS 26 enhanced)
3. Transcription handling (volatile + stable)
4. Live preview generation
5. Utterance completion detection (6+ state variables)
6. AI/NLU parsing orchestration
7. Confirmation flow management
8. TTS/audio feedback
9. Wake word detection
10. Error recovery (circuit breaker)
11. Session management (timeout)
12. Cart coordination

Pain Points

IssueImpactSolution
2,464 linesUnmaintainableSplit into 10 components
30+ state variablesBug-proneConsolidate into structs
7 concurrent timersRace conditionsCentralized timer management
Cyclomatic complexity ~30UntestableStrategy pattern
Hard-coded singletonsCan't mockDependency injection

---

Architecture Overview

High-Level Pipeline

┌─────────────────────────────────────────────────────────────────────────────┐
│                        VoiceOrderingOrchestrator                            │
│  (Thin coordinator - manages pipeline execution and state transitions)      │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
        ┌─────────────────────────────┼─────────────────────────────┐
        ▼                             ▼                             ▼
┌───────────────┐           ┌─────────────────┐           ┌─────────────────┐
│  Audio Layer  │           │ Processing Layer │           │   UI Layer      │
│               │           │                 │           │                 │
│ AudioCapture  │           │ TranscriptPipeline│         │ FeedbackCoord   │
│ Controller    │           │ OrderParsing     │           │ SessionManager  │
│               │           │ Pipeline         │           │                 │
└───────────────┘           └─────────────────┘           └─────────────────┘

Component Dependency Graph

VoiceOrderingOrchestrator
├── AudioCaptureController (wraps VoiceServiceProtocol)
│   ├── SpeechAnalyzerService (iOS 26+)
│   └── LegacyVoiceService (iOS 17-25)
├── TranscriptPipeline
│   ├── TranscriptNormalizer
│   └── TranscriptStabilityDetector
├── UtteranceCompletionDetector
│   └── TranscriptStabilityTracker
├── LiveOrderPreviewGenerator
│   ├── MenuAliasMatcher
│   ├── ModifierDetector
│   └── QuantityExtractor
├── OrderParsingPipeline (HYBRID STRATEGY)
│   ├── AIOrderParser (wraps AITranscriptParser)
│   ├── NLUOrderParser (wraps VoiceNLUEngine)
│   └── OrderResultMerger
├── CartCoordinator
│   └── ConstraintEngine
├── ConfirmationCoordinator
│   ├── ConfirmationResponseDetector
│   └── ConfirmationMessageGenerator
├── FeedbackCoordinator ✅ DONE
│   ├── AVSpeechSynthesizer
│   └── Haptics
└── SessionManager ✅ DONE
    └── Timer management

---

Component Details

1. FeedbackCoordinator ✅ COMPLETE

File: `BWBCore/Sources/BWBCore/Voice/Coordination/FeedbackCoordinator.swift`

Purpose: Centralize all TTS, haptic, and audio feedback

Public API:

swift
@MainActor
public final class FeedbackCoordinator: NSObject, ObservableObject {
    // State
    @Published public private(set) var isSpeaking: Bool

    // TTS
    public func speak(_ text: String, thenListen: Bool, onComplete: (() -> Void)?)
    public func speakAndWait(_ text: String, thenListen: Bool) async
    public func stopSpeaking()

    // Audio Feedback
    public func playAudio(_ feedback: VoiceAudioFeedback)
    public func playStartListening()
    public func playStopListening()
    public func playWakeWordDetected()
    public func playItemAdded()
    public func playOrderConfirmed()
    public func playClarificationNeeded()
    public func playClarificationReceived()
    public func playHelpProvided()
    public func playError()
    public func playErrorRecovery()

    // Haptics
    public func playHaptic(_ feedback: VoiceHapticFeedback)
    public func playSelection()
    public func playSuccess()
    public func playMedium()
    public func playLight()
}

Enums:

swift
public enum VoiceAudioFeedback {
    case wakeWordDetected   // System sound 1057
    case itemAdded          // System sound 1104
    case orderConfirmed     // System sound 1025
    case clarificationNeeded // System sound 1315
    case clarificationReceived // System sound 1104
    case helpProvided       // System sound 1114
    case error              // System sound 1053
    case errorRecovery      // System sound 1007
    case listening          // System sound 1306
    case stopped            // System sound 1306
}

public enum VoiceHapticFeedback {
    case selection, itemAdded, orderConfirmed, error, warning, wakeWord
}

Integration in VoiceOrderingService:

swift
// OLD (removed)
private let speechSynthesizer = AVSpeechSynthesizer()
@Published private(set) var isSpeaking = false
private var pendingListenAfterSpeech = false

// NEW
private let feedbackCoordinator = FeedbackCoordinator()
var isSpeaking: Bool { feedbackCoordinator.isSpeaking }

// Usage
feedbackCoordinator.speak(text, thenListen: true) { [weak self] in
    self?.startListening()
}

---

2. SessionManager ✅ COMPLETE

File: `BWBCore/Sources/BWBCore/Voice/Coordination/SessionManager.swift`

Purpose: Manage voice ordering session lifecycle

Public API:

swift
public enum VoiceSessionState: String, Sendable {
    case inactive, active, timedOut
}

@MainActor
public protocol SessionManagerDelegate: AnyObject {
    func sessionDidTimeout()
    func sessionStateDidChange(to state: VoiceSessionState)
}

@MainActor
public final class SessionManager: ObservableObject {
    // State
    @Published public private(set) var isSessionActive: Bool
    @Published public private(set) var sessionState: VoiceSessionState
    @Published public private(set) var sessionDuration: TimeInterval
    @Published public private(set) var timeUntilTimeout: TimeInterval?

    // Session Info
    public private(set) var sessionId: UUID?
    public private(set) var sessionStartTime: Date?
    public private(set) var lastActivityTime: Date?

    // Delegate
    public weak var delegate: SessionManagerDelegate?

    // Configuration
    public var sessionTimeoutInterval: TimeInterval // Default: 120s
    public var autoTimeoutEnabled: Bool

    // Control
    @discardableResult public func startSession() -> UUID
    public func endSession(reason: SessionEndReason)
    public func recordActivity()
    public func resetTimeout()
    public func cancelTimeout()

    // Helpers
    public var formattedDuration: String
    public var formattedTimeUntilTimeout: String?
}

public enum SessionEndReason: String, Sendable {
    case userEnded, timeout, completed, error, newSession
}

Integration in VoiceOrderingService:

swift
// OLD (removed)
private var sessionTimeoutTimer: Timer?
private let sessionTimeoutInterval: TimeInterval = 120

// NEW
private let sessionManager = SessionManager()

// In startSession():
sessionManager.delegate = self
sessionManager.startSession()

// In endSession():
sessionManager.endSession()

// Reset timeout:
private func resetSessionTimeout() {
    sessionManager.recordActivity()
}

// Delegate conformance:
extension VoiceOrderingService: SessionManagerDelegate {
    func sessionDidTimeout() {
        if dialogueManager.cart.isEmpty {
            endSession()
        } else {
            sessionManager.resetTimeout()
        }
    }
    func sessionStateDidChange(to state: VoiceSessionState) {
        Logger.voice.debug("Session state: \(state.rawValue)")
    }
}

---

3. UtteranceCompletionDetector ⏳ PENDING

File: `BWBCore/Sources/BWBCore/Voice/Detection/UtteranceCompletionDetector.swift`

Purpose: Determine when user has finished speaking

Current Code Location: `VoiceOrderingService.swift` lines 826-939

Variables to Extract:

swift
// Currently in VoiceOrderingService
private var lastTranscriptUpdate: Date = Date()
private var stableTranscript: String = ""
private var transcriptStabilityCount: Int = 0
private let requiredStabilityCount: Int = 1
private let transcriptStabilityInterval: TimeInterval = 0.4
private var lastSpeechTime: Date?

Proposed API:

swift
public struct UtteranceAnalysis {
    public let isComplete: Bool
    public let confidence: Double
    public let reason: CompletionReason
    public let processingCountdown: TimeInterval?

    public enum CompletionReason {
        case silenceTimeout
        case stableTranscript
        case orderEndingPhrase
        case confirmationKeyword
        case explicitStop
    }
}

public final class UtteranceCompletionDetector {
    // Configuration
    public var silenceThreshold: TimeInterval = 1.5
    public var stabilityInterval: TimeInterval = 0.4
    public var requiredStabilityCount: Int = 1

    // Order-ending patterns
    public var orderEndingPatterns: [String] = [
        "latte", "coffee", "mocha", "cappuccino", "espresso",
        "please", "thanks", "thank you", "that's it", "that's all"
    ]

    // Analysis
    public func analyze(
        transcript: String,
        isSpeechDetected: Bool,
        timeSinceLastUpdate: TimeInterval,
        timeSinceLastSpeech: TimeInterval
    ) -> UtteranceAnalysis

    public func reset()
}

---

4. LiveOrderPreviewGenerator ⏳ PENDING

File: `BWBCore/Sources/BWBCore/Voice/Detection/LiveOrderPreviewGenerator.swift`

Purpose: Generate real-time order previews as user speaks (no server calls)

Current Code Location: `VoiceOrderingService.swift` lines 766-824, 680-751

Proposed API:

swift
public struct LiveOrderItem: Identifiable, Equatable {
    public let id: UUID
    public let label: String
    public let menuItemId: String?
    public let quantity: Int
    public let modifiers: [LiveModifier]
    public let confidence: Double
    public let detectedAt: Date
    public let matchIndex: Int

    public struct LiveModifier: Identifiable, Equatable {
        public let id: UUID
        public let type: ModifierType
        public let label: String
        public let icon: ModifierIcon

        public enum ModifierType {
            case size, temperature, milk, syrup, extra
        }
        public enum ModifierIcon {
            case ice, coffee, milk, sparkle, custom
        }
    }
}

public final class LiveOrderPreviewGenerator {
    private var itemMetadata: [String: Date] = [:]
    private var modifierMetadata: [String: Date] = [:]

    private let menuMatcher: MenuAliasMatcher
    private let modifierDetector: ModifierDetector
    private let quantityExtractor: QuantityExtractor

    public func generatePreview(from transcript: String) -> [LiveOrderItem]
    public func reset()
}

// Supporting components
public final class MenuAliasMatcher {
    public func findMatches(in text: String) -> [MenuMatch]
}

public final class ModifierDetector {
    public func detectModifiers(in text: String) -> [DetectedModifier]
}

public final class QuantityExtractor {
    public func extractQuantity(from text: String, near itemName: String) -> Int
}

---

5. TranscriptPipeline ⏳ PENDING

File: `BWBCore/Sources/BWBCore/Voice/Pipeline/TranscriptPipeline.swift`

Purpose: Process raw transcription into normalized, structured form

Current Variables to Consolidate:

swift
// Currently scattered in VoiceOrderingService
@Published var transcript: String = ""           // Line 37
@Published var liveTranscript: String = ""       // Line 38
private var stableTranscript: String = ""        // Line 140
private var sessionTranscript: String = ""       // Line 189
private var lastProcessedTranscript: String = "" // Line 134

Proposed API:

swift
public struct TranscriptState: Equatable {
    public var raw: String = ""              // Volatile from recognizer
    public var normalized: String = ""       // Cleaned
    public var stable: String = ""           // Stable for processing
    public var sessionAccumulated: String = "" // Full session history
    public var lastProcessed: String = ""    // For deduplication

    public mutating func updateRaw(_ text: String)
    public mutating func stabilize()
    public mutating func appendToSession()
    public mutating func reset()
}

public final class TranscriptPipeline: ObservableObject {
    @Published public private(set) var state: TranscriptState = .init()

    private let normalizer: TranscriptNormalizer

    public func processIncoming(_ rawTranscript: String, isFinal: Bool)
    public func getFullSessionTranscript() -> String
    public func reset()
}

public struct TranscriptNormalizer {
    public func normalize(_ text: String) -> String
    public func applySpeechCorrections(_ text: String) -> String
}

---

6. CartCoordinator ⏳ PENDING

File: `BWBCore/Sources/BWBCore/Voice/Coordination/CartCoordinator.swift`

Purpose: Manage pending orders and confirmed cart separately

Proposed API:

swift
public final class CartCoordinator: ObservableObject {
    @Published public private(set) var pendingOrders: [VoiceParsedOrder] = []
    @Published public private(set) var confirmedCart: [VoiceParsedOrder] = []
    @Published public private(set) var pendingClarification: ClarificationRequest?

    private let constraintEngine: ConstraintEngine

    // Pending operations
    public func addToPending(_ orders: [VoiceParsedOrder])
    public func confirmPending()
    public func rejectPending()
    public func clearPending()

    // Cart operations
    public func clearAll()
    public func updateQuantity(itemId: String, quantity: Int)
    public func removeItem(itemId: String)
    public func applyModifier(itemId: String, modifier: VoiceModifier)

    // Validation
    public func validatePending() -> [ValidationViolation]

    // Export
    public func exportToOrderItems() -> [OrderItem]

    // State
    public var totalItemCount: Int
    public var isEmpty: Bool
}

---

7. ConfirmationCoordinator ⏳ PENDING

File: `BWBCore/Sources/BWBCore/Voice/Coordination/ConfirmationCoordinator.swift`

Purpose: Handle order confirmation flow

Current Code Location: `VoiceOrderingService.swift` lines 1468-1503, 1766-1829

Proposed API:

swift
public enum ConfirmationResult {
    case confirmed
    case rejected
    case modified(String)
    case unclear
}

public enum ConfirmationState: Equatable {
    case idle
    case awaitingConfirmation(orders: [VoiceParsedOrder])
    case processing
    case confirmed
    case rejected
}

public final class ConfirmationCoordinator: ObservableObject {
    @Published public var confirmationState: ConfirmationState = .idle
    @Published public var confirmationMessage: String = ""
    @Published public var autoConfirmCountdown: TimeInterval?

    // Configuration
    public var autoConfirmEnabled: Bool = false
    public var autoConfirmDuration: TimeInterval = 5.0
    public var minConfidenceForAutoConfirm: Double = 0.98

    // Keywords (localized)
    public var yesKeywords: [String]
    public var noKeywords: [String]

    // Control
    public func startConfirmation(for orders: [VoiceParsedOrder])
    public func processResponse(_ transcript: String) -> ConfirmationResult
    public func startAutoConfirmCountdown()
    public func cancelAutoConfirm()
    public func generateConfirmationMessage(for orders: [VoiceParsedOrder]) -> String
}

---

8. OrderParsingPipeline ⏳ PENDING

File: `BWBCore/Sources/BWBCore/Voice/Pipeline/OrderParsingPipeline.swift`

Purpose: Parse transcript into structured orders using HYBRID strategy

Proposed API:

swift
public struct OrderParseResult: Sendable {
    public let items: [ParsedOrderItem]
    public let intent: OrderIntent
    public let confidence: Double
    public let source: ParseSource
    public let warnings: [String]

    public enum ParseSource: String, Sendable {
        case ai, nlu, fallback, hybrid
    }

    public enum OrderIntent: String, Sendable {
        case order, clearOrder, readCart, help, confirm, decline
    }
}

public struct ParsedOrderItem: Sendable {
    public let itemId: String?
    public let itemName: String
    public let quantity: Int
    public let size: DrinkSize?
    public let temperature: DrinkTemperature?
    public let milk: MilkType?
    public let syrups: [String]
    public let modifiers: [String]
    public let confidence: Double
}

public final class OrderParsingPipeline {
    private let aiParser: AIOrderParser
    private let nluParser: NLUOrderParser
    private let resultMerger: OrderResultMerger

    public var strategy: ParsingStrategy = .hybrid

    public enum ParsingStrategy {
        case aiFirst       // AI → NLU fallback
        case nluFirst      // NLU → AI enhancement
        case hybrid        // Run both, merge (DEFAULT)
        case aiOnly
        case nluOnly
    }

    public func parse(_ transcript: String) async -> OrderParseResult
}

// Hybrid merge logic
public struct OrderResultMerger {
    public func merge(ai: OrderParseResult, nlu: OrderParseResult) -> OrderParseResult
    // Strategy:
    // 1. AI confidence > 0.8 → use AI
    // 2. AI fails → use NLU
    // 3. Both succeed → AI with NLU validation
}

---

9. VoiceOrderingOrchestrator ⏳ PENDING

File: `BWB_POS/BWB_POS/Services/VoiceOrderingOrchestrator.swift`

Purpose: Thin coordinator that wires all components together

Proposed API:

swift
@MainActor
public final class VoiceOrderingOrchestrator: ObservableObject {
    // Published state for UI
    @Published public var orderingState: VoiceOrderingState = .idle
    @Published public var currentTranscript: String = ""
    @Published public var liveItems: [LiveOrderItem] = []
    @Published public var cart: [VoiceParsedOrder] = []
    @Published public var confirmationMessage: String = ""
    @Published public var isSpeechDetected: Bool = false
    @Published public var isSpeaking: Bool = false
    @Published public var errorMessage: String?

    // Components (injected)
    private let audioCaptureController: AudioCaptureController
    private let transcriptPipeline: TranscriptPipeline
    private let utteranceDetector: UtteranceCompletionDetector
    private let livePreviewGenerator: LiveOrderPreviewGenerator
    private let orderParsingPipeline: OrderParsingPipeline
    private let cartCoordinator: CartCoordinator
    private let confirmationCoordinator: ConfirmationCoordinator
    private let feedbackCoordinator: FeedbackCoordinator
    private let sessionManager: SessionManager

    // Session control
    public func startSession()
    public func endSession()

    // Voice control
    public func startListening()
    public func stopListening()

    // Order actions
    public func confirmOrder()
    public func rejectOrder()
    public func clearCart()

    // Cart access
    public func finalizeOrder() -> [OrderItem]
}

---

Data Flow Pipeline

Complete Pipeline Diagram

┌─────────────────────────────────────────────────────────────────────────────┐
│                             USER SPEAKS                                      │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ STAGE 1: AUDIO CAPTURE                                                       │
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
│ │ AudioCaptureController                                                   │ │
│ │ • Manages VoiceServiceProtocol (iOS 26 SpeechAnalyzer / iOS 17 Legacy)  │ │
│ │ • Handles permissions, audio session                                     │ │
│ │ • Emits: VoiceTranscriptionResult (text, confidence, isVolatile)        │ │
│ │ • Emits: VoiceActivityResult (speechDetected, timestamp)                │ │
│ └─────────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                    ┌─────────────────┼─────────────────┐
                    ▼                 ▼                 ▼
┌───────────────────────┐  ┌──────────────────┐  ┌──────────────────┐
│ STAGE 2a: TRANSCRIPT  │  │ STAGE 2b: VAD    │  │ STAGE 2c: LIVE   │
│ PIPELINE              │  │ TRACKING         │  │ PREVIEW          │
│ ┌───────────────────┐ │  │                  │  │ ┌──────────────┐ │
│ │TranscriptPipeline │ │  │ • isSpeechDetected│ │ │LiveOrderPrev │ │
│ │• Normalize        │ │  │ • lastSpeechTime │  │ │• Pattern match│ │
│ │• Track stability  │ │  │                  │  │ │• Menu aliases │ │
│ │• Accumulate       │ │  └──────────────────┘  │ │• Modifiers   │ │
│ └───────────────────┘ │                        │ └──────────────┘ │
└───────────────────────┘                        └──────────────────┘
          │                         │                      │
          ▼                         ▼                      │
┌─────────────────────────────────────────────┐           │
│ STAGE 3: UTTERANCE COMPLETION               │           │
│ ┌─────────────────────────────────────────┐ │           │
│ │ UtteranceCompletionDetector             │ │  ────────▶│ UI updates
│ │ • Check: stable + silent + content      │ │           │ liveItems[]
│ │ • Detect order-ending patterns          │ │           │
│ │ • Calculate processing countdown        │ │           │
│ └─────────────────────────────────────────┘ │           │
└─────────────────────────────────────────────┘
          │
          ▼ (when isComplete = true)
┌─────────────────────────────────────────────────────────────────────────────┐
│ STAGE 4: ORDER PARSING (HYBRID)                                              │
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
│ │ OrderParsingPipeline                                                     │ │
│ │                                                                          │ │
│ │    ┌──────────────┐         ┌──────────────┐                            │ │
│ │    │ AIOrderParser│    ║    │ NLUOrderParser│     (run in parallel)     │ │
│ │    │ (Gemini 3)   │    ║    │ (rule-based) │                            │ │
│ │    └──────┬───────┘    ║    └──────┬───────┘                            │ │
│ │           │            ║           │                                     │ │
│ │           └────────────╬───────────┘                                     │ │
│ │                        ║                                                  │ │
│ │                        ▼                                                  │ │
│ │               ┌────────────────┐                                         │ │
│ │               │OrderResultMerger│                                        │ │
│ │               │• AI > 0.8 → AI │                                         │ │
│ │               │• AI fails → NLU│                                         │ │
│ │               │• Both → validate│                                        │ │
│ │               └────────────────┘                                         │ │
│ └─────────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
          │
          ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ STAGE 5: CART MANAGEMENT                                                     │
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
│ │ CartCoordinator                                                          │ │
│ │ • Add to pendingOrders[]                                                 │ │
│ │ • Validate with ConstraintEngine                                         │ │
│ │ • Check for clarifications needed                                        │ │
│ │ • On confirm: move to confirmedCart[]                                    │ │
│ └─────────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
          │
          ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ STAGE 6: CONFIRMATION                                                        │
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
│ │ ConfirmationCoordinator                                                  │ │
│ │ • Generate confirmation message                                          │ │
│ │ • Await user response (voice or tap)                                     │ │
│ │ • Optional: auto-confirm countdown (disabled by default)                 │ │
│ │ • On "yes" → confirmPending() → clear liveItems, reset transcript       │ │
│ │ • On "no" → rejectPending() → clear pending, keep listening             │ │
│ └─────────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
          │
          ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ STAGE 7: FEEDBACK                                                            │
│ ┌─────────────────────────────────────────────────────────────────────────┐ │
│ │ FeedbackCoordinator ✅ DONE                                              │ │
│ │ • TTS: Speak confirmation message                                        │ │
│ │ • Haptics: Success/error feedback                                        │ │
│ │ • Audio: System sounds for events                                        │ │
│ └─────────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
          │
          ▼
     FINAL CART
     (ready for checkout)

---

App Routes & Views

BWB_POS (iPad Point-of-Sale App)

Route/ViewFileUses Voice?Description
Main Tab: Queue`Views/Queue/QueueManagementView.swift`NoOrder queue management
Main Tab: Kiosk`Views/Kiosk/KioskOrderingView.swift`YESVoice ordering kiosk
Main Tab: Walk-In`Views/Orders/NewOrderView.swift`NoManual order entry
Main Tab: Analytics`Views/Analytics/AnalyticsDashboardView.swift`NoSales/performance
Main Tab: Waste`Views/Waste/WasteTrackingView.swift`NoWaste tracking
Kiosk Payment`Views/Kiosk/KioskPaymentView.swift`NoPayment after voice order

Voice Ordering Flow in KioskOrderingView

KioskOrderingView
├── VoiceOrderingService.shared (current - will become VoiceOrderingOrchestrator)
├── UI Components:
│   ├── Microphone button (start/stop session)
│   ├── Transcript display (liveTranscript)
│   ├── Live items preview (liveItems[])
│   ├── Confirmation overlay (confirmationMessage)
│   ├── Cart summary (cart[])
│   └── Checkout prompt
├── State Bindings:
│   ├── @StateObject voiceService
│   ├── @State showingPayment
│   ├── @State showCheckoutPrompt
│   └── @State checkoutPromptDismissed
└── Navigation:
    └── .sheet(isPresented: $showingPayment) → KioskPaymentView

BWB_Customer (iPhone Customer App)

Route/ViewFileUses Voice?Description
Home`Views/Home/CustomerHomeView.swift`NoMain landing
Menu`Views/Menu/MenuView.swift`NoBrowse menu
Menu Detail`Views/Menu/MenuItemDetailView.swift`NoItem customization
Cart`Views/Cart/CartView.swift`NoReview cart
Checkout`Views/Cart/CheckoutView.swift`NoPayment
Orders`Views/Orders/OrdersView.swift`NoOrder history
Events`Views/Events/EventsView.swift`NoEvents listing
Rewards`Views/Rewards/RewardsView.swift`NoLoyalty program

> Note: Customer app currently does NOT use voice ordering. Voice is iPad kiosk only.

---

File Structure

Current Structure

BWBCore/Sources/BWBCore/
├── Voice/
│   ├── Coordination/                    # ✅ NEW FOLDER
│   │   ├── FeedbackCoordinator.swift   # ✅ DONE
│   │   └── SessionManager.swift        # ✅ DONE
│   ├── AITranscriptParser.swift        # Existing - AI parsing
│   ├── VoiceNLUEngine.swift            # Existing - Rule-based NLU
│   ├── VoiceDialogueManager.swift      # Existing - State machine
│   ├── VoiceTypes.swift                # Existing - Data types
│   ├── VoiceServiceProtocol.swift      # Existing - Abstraction
│   ├── SpeechAnalyzerService.swift     # Existing - iOS 26
│   ├── LegacyVoiceService.swift        # Existing - iOS 17
│   ├── WakeWordDetector.swift          # Existing
│   └── ConstraintEngine.swift          # Existing

BWB_POS/BWB_POS/
├── Services/
│   └── VoiceOrderingService.swift      # MODIFYING (was 2464 lines)
└── Views/
    └── Kiosk/
        ├── KioskOrderingView.swift
        └── KioskPaymentView.swift

Target Structure (After Refactoring)

BWBCore/Sources/BWBCore/
├── Voice/
│   ├── Coordination/
│   │   ├── FeedbackCoordinator.swift       # ✅ DONE
│   │   ├── SessionManager.swift            # ✅ DONE
│   │   ├── CartCoordinator.swift           # ✅ Step 6
│   │   ├── ConfirmationCoordinator.swift   # ✅ Step 7
│   │   └── AudioCaptureController.swift    # ⏳ Step 9
│   ├── Detection/
│   │   ├── UtteranceCompletionDetector.swift # ✅ Step 3
│   │   ├── TranscriptStabilityTracker.swift  # ✅ Step 3
│   │   ├── LiveOrderPreviewGenerator.swift   # ✅ Step 4
│   │   ├── MenuAliasMatcher.swift            # ✅ Step 4
│   │   ├── ModifierDetector.swift            # ✅ Step 4
│   │   └── QuantityExtractor.swift           # ✅ Step 4
│   ├── Pipeline/
│   │   ├── TranscriptPipeline.swift          # ✅ Step 5
│   │   ├── TranscriptState.swift             # ✅ Step 5
│   │   ├── TranscriptNormalizer.swift        # ✅ Step 5
│   │   ├── OrderParsingPipeline.swift        # ✅ Step 8
│   │   ├── AIOrderParser.swift               # ✅ Step 8
│   │   ├── NLUOrderParser.swift              # ✅ Step 8
│   │   └── OrderResultMerger.swift           # ✅ Step 8
│   ├── AITranscriptParser.swift              # Keep
│   ├── VoiceNLUEngine.swift                  # Keep
│   ├── VoiceDialogueManager.swift            # Keep (may simplify later)
│   ├── VoiceTypes.swift                      # Keep
│   ├── VoiceServiceProtocol.swift            # Keep
│   ├── SpeechAnalyzerService.swift           # Keep
│   ├── LegacyVoiceService.swift              # Keep
│   ├── WakeWordDetector.swift                # Keep
│   └── ConstraintEngine.swift                # Keep

BWB_POS/BWB_POS/
├── Services/
│   ├── VoiceOrderingService.swift            # DEPRECATED after Step 9
│   └── VoiceOrderingOrchestrator.swift       # ⏳ Step 9 (NEW)
└── Views/
    └── Kiosk/
        ├── KioskOrderingView.swift           # Update for orchestrator
        └── KioskPaymentView.swift

---

Implementation Checklist

Step 1: FeedbackCoordinator ✅ COMPLETE

  • [x] Create `BWBCore/Sources/BWBCore/Voice/Coordination/` directory
  • [x] Create `FeedbackCoordinator.swift`
  • [x] Define `VoiceAudioFeedback` enum
  • [x] Define `VoiceHapticFeedback` enum
  • [x] Implement TTS methods (`speak`, `speakAndWait`, `stopSpeaking`)
  • [x] Implement audio feedback methods
  • [x] Implement haptic feedback methods
  • [x] Conform to `AVSpeechSynthesizerDelegate`
  • [x] Build BWBCore successfully
  • [x] Update `VoiceOrderingService` to use `feedbackCoordinator`
  • [x] Remove old `speechSynthesizer`, `isSpeaking`, `pendingListenAfterSpeech`
  • [x] Remove `AVSpeechSynthesizerDelegate` from `VoiceOrderingService`
  • [x] Update `speakConfirmation()` to use coordinator
  • [x] Update `endSession()` to use `feedbackCoordinator.stopSpeaking()`
  • [x] Update `reset()` to use coordinator
  • [x] Build BWB_POS successfully
  • [x] Test TTS works in kiosk mode

Step 2: SessionManager ✅ COMPLETE

  • [x] Create `SessionManager.swift`
  • [x] Define `VoiceSessionState` enum
  • [x] Define `SessionManagerDelegate` protocol
  • [x] Define `SessionEndReason` enum
  • [x] Implement session lifecycle (`startSession`, `endSession`)
  • [x] Implement timeout tracking (`recordActivity`, `resetTimeout`)
  • [x] Implement duration tracking
  • [x] Build BWBCore successfully
  • [x] Update `VoiceOrderingService` to use `sessionManager`
  • [x] Remove old `sessionTimeoutTimer`, `sessionTimeoutInterval`
  • [x] Update `startSession()` to call `sessionManager.startSession()`
  • [x] Update `endSession()` to call `sessionManager.endSession()`
  • [x] Update `resetSessionTimeout()` to call `sessionManager.recordActivity()`
  • [x] Add `SessionManagerDelegate` conformance
  • [x] Build BWB_POS successfully

Step 3: UtteranceCompletionDetector ✅ COMPLETE

  • [x] Create `BWBCore/Sources/BWBCore/Voice/Detection/` directory
  • [x] Create `UtteranceCompletionDetector.swift`
  • [x] Create `TranscriptStabilityTracker.swift`
  • [x] Define `UtteranceAnalysis` struct with `isComplete`, `confidence`, `reason`, `processingCountdown`
  • [x] Define `CompletionReason` enum (notReady, silenceTimeout, stableTranscript, orderEndingPhrase, endOfItemKeyword, hardTimeout, explicitStop, empty, alreadyProcessed)
  • [x] Extract `checkUtteranceComplete()` logic into `analyze()` method
  • [x] Extract stability tracking variables to `TranscriptStabilityTracker`
  • [x] Implement `analyze()` method with linguistic pattern detection
  • [x] Implement `reset()` and `fullReset()` methods
  • [x] Implement `configureWithLocale()` for localized keywords
  • [x] Build BWBCore successfully
  • [x] Update `VoiceOrderingService` to use `utteranceDetector`
  • [x] Remove old `stableTranscript`, `transcriptStabilityCount`, `lastTranscriptUpdate` variables
  • [x] Update `handleEnhancedTranscription()` to use detector
  • [x] Update `checkUtteranceComplete()` to use detector's `analyze()` method
  • [x] Update `resetUtteranceTracking()` to use detector
  • [x] Update `processCurrentTranscript()` to use detector
  • [x] Update `reset()` to call `utteranceDetector.fullReset()`
  • [x] Add detector configuration in initialization
  • [x] Build BWB_POS successfully

Step 4: LiveOrderPreviewGenerator ✅ COMPLETE

  • [x] Create `LiveOrderPreviewGenerator.swift`
  • [x] Create `MenuAliasMatcher.swift` with `LiveMenuMatch` type
  • [x] Create `ModifierDetector.swift` with `DetectedModifier` and `ModifierType`
  • [x] Create `QuantityExtractor.swift`
  • [x] Define `LiveOrderItem` struct with display formatting
  • [x] Define `LiveModifier` struct with icon support
  • [x] Define `OrderPreview` struct (moved from VoiceOrderingService)
  • [x] Implement drink alias matching (40+ aliases)
  • [x] Implement size/temperature/milk detection via ModifierDetector
  • [x] Implement syrup and shot detection
  • [x] Implement quantity extraction (words + digits)
  • [x] Implement `generateItems()` method
  • [x] Implement `generatePreview()` method
  • [x] Implement deduplication via itemMetadata
  • [x] Build BWBCore successfully
  • [x] Build BWB_POS successfully
  • [ ] Full integration deferred to Step 9 (VoiceOrderingOrchestrator)

Step 5: TranscriptPipeline ✅ COMPLETE

  • [x] Create `BWBCore/Sources/BWBCore/Voice/Pipeline/` directory
  • [x] Create `TranscriptPipeline.swift`
  • [x] Create `TranscriptState.swift`
  • [x] Create `TranscriptNormalizer.swift`
  • [x] Define `TranscriptState` struct (raw, normalized, stable, sessionAccumulated, lastProcessed)
  • [x] Consolidate transcript variables into state struct
  • [x] Implement normalization (50+ corrections)
  • [x] Implement speech corrections ("ice" → "iced", "expresso" → "espresso", etc.)
  • [x] Implement filler word removal ("um", "uh", "like", etc.)
  • [x] Implement session accumulation via `appendToSession()`
  • [x] Implement `processIncoming()` with stability tracking
  • [x] Build BWBCore successfully
  • [ ] Full integration deferred to Step 9 (VoiceOrderingOrchestrator)

Step 6: CartCoordinator ✅ COMPLETE

  • [x] Create `CartCoordinator.swift`
  • [x] Define `CartState` enum (empty, hasPending, hasConfirmed, mixed)
  • [x] Define `CartValidation` struct
  • [x] Define `CartCoordinatorDelegate` protocol
  • [x] Implement pending vs confirmed separation
  • [x] Implement `addToPending()`, `confirmPending()`, `rejectPending()`, `clearPending()`
  • [x] Implement `requestClarification()`, `applyClarification()`
  • [x] Implement `updateQuantity()`, `removeItem()` (index-based)
  • [x] Implement modifier methods (size, temperature, milk, shots, caffeine)
  • [x] Implement `exportToOrderItems()` for checkout
  • [x] Implement `pendingSummary()`, `confirmedSummary()` for TTS
  • [x] Basic validation (detailed constraint engine integration deferred)
  • [x] Build BWBCore successfully
  • [ ] Full integration deferred to Step 9 (VoiceOrderingOrchestrator)

Step 7: ConfirmationCoordinator ✅ COMPLETE

  • [x] Create `ConfirmationCoordinator.swift` (626 lines)
  • [x] Define `ConfirmationState` enum
  • [x] Define `ConfirmationResult` enum
  • [x] Extract confirmation keyword detection
  • [x] Extract auto-confirm countdown
  • [x] Extract message generation
  • [x] Implement `processResponse()` method
  • [x] Build BWBCore successfully
  • [ ] Update `VoiceOrderingService` to use coordinator (deferred to Step 9)
  • [ ] Remove old confirmation variables (deferred to Step 9)
  • [ ] Build BWB_POS successfully (deferred to Step 9)

Step 8: OrderParsingPipeline ✅ COMPLETE

  • [x] Create `OrderParsingPipeline.swift` (426 lines)
  • [x] Create `AIOrderParser.swift` (110 lines)
  • [x] Create `NLUOrderParser.swift` (90 lines)
  • [x] Create `OrderResultMerger.swift` (390 lines)
  • [x] Define `OrderParseResult` struct
  • [x] Define `ParsedOrderItem` struct
  • [x] Define `ParsingStrategy` enum
  • [x] Implement hybrid parallel parsing
  • [x] Implement merge logic
  • [x] Build BWBCore successfully
  • [ ] Update `VoiceOrderingService` to use pipeline (deferred to Step 9)
  • [ ] Replace `processWithAI()` with pipeline (deferred to Step 9)
  • [ ] Build BWB_POS successfully (deferred to Step 9)

Step 9: VoiceOrderingOrchestrator ✅ COMPLETE

  • [x] Create `VoiceOrderingOrchestrator.swift` in BWB_POS (714 lines)
  • [x] Wire all 8 components together
  • [x] Implement thin coordination logic
  • [x] Publish all needed state for UI
  • [x] Add compatibility properties for VoiceOrderingService migration
  • [x] Add wake word support
  • [x] Build BWBCore successfully
  • [ ] Update `KioskOrderingView` to use orchestrator (deferred to integration)
  • [ ] Verify all functionality works (deferred to testing)
  • [x] Mark `VoiceOrderingService` as deprecated (DEPRECATED comment added)

Step 10: Testing & Cleanup ⏳ PENDING

  • [ ] Create unit tests for `FeedbackCoordinator`
  • [ ] Create unit tests for `SessionManager`
  • [ ] Create unit tests for `UtteranceCompletionDetector`
  • [ ] Create unit tests for `LiveOrderPreviewGenerator`
  • [ ] Create unit tests for `TranscriptPipeline`
  • [ ] Create unit tests for `CartCoordinator`
  • [ ] Create unit tests for `ConfirmationCoordinator`
  • [ ] Create unit tests for `OrderParsingPipeline`
  • [ ] Create integration tests for full pipeline
  • [ ] Performance profiling (live preview < 50ms)
  • [ ] Remove deprecated `VoiceOrderingService`
  • [ ] Update all documentation

---

Integration Points

BWBCore Services Used

ServiceUsageComponent Using It
`VoiceServiceProtocol`Audio capture`AudioCaptureController`
`SpeechAnalyzerService`iOS 26 speech`AudioCaptureController`
`LegacyVoiceService`iOS 17 fallback`AudioCaptureController`
`AITranscriptParser`AI parsing`AIOrderParser`
`VoiceNLUEngine`Rule-based NLU`NLUOrderParser`, `LiveOrderPreviewGenerator`
`VoiceDialogueManager`State machineMay be replaced by orchestrator
`ConstraintEngine`Validation`CartCoordinator`
`WakeWordDetector`"Hey Brews"`VoiceOrderingOrchestrator`
`Logger.voice`LoggingAll components
`Haptics`Haptic feedback`FeedbackCoordinator`

External APIs

APIUsageComponent
Gemini 3AI transcript parsing`AITranscriptParser` → `AIOrderParser`
OpenAIFallback AI parsing`AITranscriptParser` → `AIOrderParser`
Apple SpeechTranscription`SpeechAnalyzerService` / `LegacyVoiceService`
AVSpeechSynthesizerTTS`FeedbackCoordinator`

---

Technical Decisions

Why Hybrid Parsing?

ApproachProsCons
AI OnlyBest NLU, handles edge casesLatency, API cost, failure risk
NLU OnlyFast, no API, offlineLimited to patterns, can't handle "no wait, change that"
HybridBest of both, fallbackComplexity, needs merge logic

Decision: Hybrid with AI preference when confidence > 0.8

Why Pattern Matching for Live Preview?

ApproachLatencyAccuracyCost
Pattern Matching<10msGoodFree
Lightweight NLU~50-100msBetterFree
AI Streaming~200-500msBestAPI calls

Decision: Pattern matching for instant feedback, AI for final parsing

Why Incremental Extraction?

ApproachRiskTestingRollback
Big Bang RewriteHighAll-or-nothingDifficult
IncrementalLowPer-componentEasy

Decision: Extract one component at a time, verify after each step

---

Testing Strategy

Unit Tests (Per Component)

swift
// Example: SessionManagerTests.swift
class SessionManagerTests: XCTestCase {
    var sut: SessionManager!
    var mockDelegate: MockSessionManagerDelegate!

    override func setUp() {
        sut = SessionManager()
        mockDelegate = MockSessionManagerDelegate()
        sut.delegate = mockDelegate
    }

    func testStartSession_createsSessionId() async {
        let sessionId = await sut.startSession()
        XCTAssertNotNil(sessionId)
        XCTAssertTrue(sut.isSessionActive)
    }

    func testSessionTimeout_callsDelegate() async throws {
        sut.sessionTimeoutInterval = 0.1 // 100ms for test
        await sut.startSession()
        try await Task.sleep(nanoseconds: 200_000_000) // 200ms
        XCTAssertTrue(mockDelegate.timeoutCalled)
    }
}

Integration Tests

swift
// Example: VoicePipelineIntegrationTests.swift
class VoicePipelineIntegrationTests: XCTestCase {
    func testFullPipeline_transcriptToCart() async {
        let orchestrator = VoiceOrderingOrchestrator()

        // Simulate transcript
        orchestrator.simulateTranscript("I'd like an iced oat milk latte please")

        // Wait for processing
        try await Task.sleep(nanoseconds: 500_000_000)

        // Verify cart
        XCTAssertEqual(orchestrator.cart.count, 1)
        XCTAssertEqual(orchestrator.cart[0].itemName, "Latte")
        XCTAssertEqual(orchestrator.cart[0].temperature, .iced)
        XCTAssertEqual(orchestrator.cart[0].milk, .oat)
    }
}

---

Rollback Plan

Per-Component Rollback

Each component is extracted with the old code still present (commented or guarded). To rollback:

1. FeedbackCoordinator: Uncomment `speechSynthesizer` code, remove `feedbackCoordinator`
2. SessionManager: Uncomment `sessionTimeoutTimer` code, remove `sessionManager`
3. etc.

Full Rollback

If entire refactoring needs to be reverted:

1. `git stash` current changes
2. Checkout previous commit before refactoring
3. Or: remove all new files in `Coordination/`, `Detection/`, `Pipeline/`
4. Revert `VoiceOrderingService.swift` to original

Version Control Strategy

  • Each step should be committed separately
  • Use descriptive commit messages: `refactor(voice): extract FeedbackCoordinator`
  • Tag milestones: `voice-pipeline-v0.1` after Step 2, etc.

---

Appendix: Code Snippets for Quick Reference

Creating a New Component

swift
// 1. Create file in BWBCore
// BWBCore/Sources/BWBCore/Voice/[Folder]/[ComponentName].swift

import Foundation

/// [Description of what this component does]
@MainActor
public final class [ComponentName]: ObservableObject {
    // MARK: - Published State
    @Published public private(set) var someState: Type

    // MARK: - Dependencies
    private let dependency: DependencyType

    // MARK: - Initialization
    public init(dependency: DependencyType = .shared) {
        self.dependency = dependency
    }

    // MARK: - Public Methods
    public func doSomething() {
        // ...
    }
}

Integrating in VoiceOrderingService

swift
// In VoiceOrderingService.swift

// 1. Add property
private let componentName = ComponentName()

// 2. Use in methods
func someMethod() {
    componentName.doSomething()
}

// 3. Remove old code (after verifying)
// private var oldVariable: Type  // REMOVED - now in ComponentName

---

Document History

DateAuthorChange
Dec 2024ClaudeInitial document creation
Dec 2024ClaudeSteps 1-2 completed

Promotion Decision

Promote into a technical note or architecture paper with implementation anchors.

Source Anchor

BWB/docs/VOICE_PIPELINE_ARCHITECTURE.md

Detected Structure

Method · Evaluation · Figures · Code Anchors · Architecture