Grand Diomande Research ยท Full HTML Reader

SpeakFlow ๐ŸŽ™๏ธ

Hold **โŒฅSpace** anywhere on macOS โ†’ speak โ†’ text appears in whatever app is focused. No cloud, no subscription, no latency.

Language as Infrastructure research note experiment writeup candidate score 24 .md

Full Public Reader

SpeakFlow ๐ŸŽ™๏ธ

> Wispr Flow competitor โ€” free, offline, private voice-to-text for macOS + iOS

What It Does

Hold โŒฅSpace anywhere on macOS โ†’ speak โ†’ text appears in whatever app is focused. No cloud, no subscription, no latency.

Project Status

WaveTasksStatus
Wave 0: FoundationSF-0.1 โ†’ SF-0.4โœ… Done โ€” builds, pipeline wired
Wave 1: Mac InjectionSF-1.1 โ†’ SF-1.4โฌœ Queued
Wave 2: AI CleanupSF-2.1 โ†’ SF-2.3โฌœ Queued
Wave 3: iOS KeyboardSF-3.1 โ†’ SF-3.5โฌœ Queued
Wave 4: ShipSF-4.1 โ†’ SF-4.4โฌœ Queued

Building

bash
# Build macOS app (no code signing needed for dev)
cd Desktop/SpeakFlow
xcodebuild -scheme SpeakFlow -destination 'platform=macOS' build \
  CODE_SIGN_IDENTITY="" CODE_SIGNING_REQUIRED=NO CODE_SIGNING_ALLOWED=NO GENERATE_INFOPLIST_FILE=YES

Architecture

SpeakFlow (macOS menu bar)
  โ””โ”€โ”€ AppDelegate         โ€” status bar icon, menu
  โ””โ”€โ”€ AppCoordinator      โ€” orchestrates the pipeline
        โ”œโ”€โ”€ HotKeyManager  โ€” โŒฅSpace global hotkey (Carbon API + CGEvent tap)
        โ””โ”€โ”€ SpeechService  โ€” SFSpeechRecognizer, on-device, offline

SpeakFlowCore (shared framework)
  โ”œโ”€โ”€ SpeechService.swift      โ€” SF-0.2: STT engine
  โ”œโ”€โ”€ HotKeyManager.swift      โ€” SF-0.3: global hotkey (hold pattern)
  โ”œโ”€โ”€ PermissionsManager.swift โ€” accessibility + speech permissions
  โ””โ”€โ”€ TranscriptionResult.swift โ€” shared models

SpeakFlowMobile (iOS host)
  โ””โ”€โ”€ Minimal app for keyboard extension distribution

SpeakFlowKeyboard (Wave 3)
  โ””โ”€โ”€ TBD โ€” iOS custom keyboard extension

Wave 0 Pipeline

โŒฅSpace held โ†’ HotKeyManager.onHotKeyDown
           โ†’ SpeechService.startRecording()
              โ””โ”€โ”€ AVAudioEngine โ†’ SFSpeechAudioBufferRecognitionRequest
                  โ””โ”€โ”€ SFSpeechRecognizer (on-device, en-US)
                      โ””โ”€โ”€ AsyncStream<TranscriptionResult>
                          โ””โ”€โ”€ console (Wave 1: will inject into focused app)
โŒฅSpace released โ†’ SpeechService.stopRecording() โ†’ finalText

Existing Code Merged

  • `Desktop/Speak/SpeakReader/` โ€” HotKeyManager, PermissionsManager ported
  • `Desktop/Speak/services/macos_speech_service.py` โ€” API design reference
  • `Desktop/cross-script-bridge/ios-keyboard/` โ€” KeyboardViewController (Wave 3 source)
  • `Desktop/voice-corpus/` โ€” 382 voice notes (Wave 2 training data)

Discord Ops Channel

Track progress: [#speakflow-ops](https://discord.com/channels/1470100197941051623/1473426349849972739)

Promotion Decision

Attach run IDs, datasets, metrics, and reproduction commands.

Source Anchor

NKo/macos/SpeakFlow/README.md

Detected Structure

Method ยท Evaluation ยท Code Anchors ยท Architecture