architecture · technical paper candidate · score 50

speakd v1.0 — Architecture

```
┌─────────────────────────────────┐
│         Fn Key Pressed          │
│   (CGEvent tap + poll detect)   │
└────────────────┬────────────────┘
                 │
┌────────────────▼────────────────┐
│   96kHz Audio Capture (cpal)    │
│   Built-in mic preferred over   │
│        Continuity/iPhone        │
└────────────────┬────────────────┘
                 │
┌────────────────▼────────────────┐
│         Fn Key Released         │
│   (poll detects within 100ms)   │
└────────────────┬────────────────┘
                 │
┌────────────────▼────────────────┐
│   WAV Encoding (hound crate)    │
│    Native rate, 16-bit mono     │
└─────────
```
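The WAV encoding stage in the diagram (native rate, 16-bit mono) can be illustrated with a small dependency-free sketch. The source uses the `hound` crate for this step; the version below writes the 44-byte RIFF/WAVE header by hand using only the standard library, so it is a stand-in and not the project's code. The function name `encode_wav_mono_16` and the assumption that captured samples arrive as `f32` values in the range -1.0 to 1.0 are hypothetical.

```rust
/// Hypothetical helper: convert f32 samples (assumed to lie in [-1.0, 1.0])
/// to 16-bit PCM and wrap them in a minimal 44-byte RIFF/WAVE header (mono).
/// The sample rate is passed through unchanged, i.e. the "native rate".
fn encode_wav_mono_16(samples: &[f32], sample_rate: u32) -> Vec<u8> {
    let channels: u16 = 1;
    let bits_per_sample: u16 = 16;
    let block_align: u16 = channels * bits_per_sample / 8; // bytes per frame
    let byte_rate: u32 = sample_rate * u32::from(block_align);
    let data_len: u32 = (samples.len() * usize::from(block_align)) as u32;

    let mut out = Vec::with_capacity(44 + data_len as usize);

    // RIFF container header
    out.extend_from_slice(b"RIFF");
    out.extend_from_slice(&(36 + data_len).to_le_bytes());
    out.extend_from_slice(b"WAVE");

    // "fmt " chunk describing uncompressed PCM
    out.extend_from_slice(b"fmt ");
    out.extend_from_slice(&16u32.to_le_bytes()); // fmt chunk size
    out.extend_from_slice(&1u16.to_le_bytes()); // format tag 1 = PCM
    out.extend_from_slice(&channels.to_le_bytes());
    out.extend_from_slice(&sample_rate.to_le_bytes());
    out.extend_from_slice(&byte_rate.to_le_bytes());
    out.extend_from_slice(&block_align.to_le_bytes());
    out.extend_from_slice(&bits_per_sample.to_le_bytes());

    // "data" chunk with little-endian i16 samples
    out.extend_from_slice(b"data");
    out.extend_from_slice(&data_len.to_le_bytes());
    for &s in samples {
        // Clamp first so out-of-range input saturates instead of wrapping.
        let v = (s.clamp(-1.0, 1.0) * f32::from(i16::MAX)) as i16;
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}
```

Calling `encode_wav_mono_16(&buffer, 96_000)` on a capture buffer would yield bytes that can be written to disk or sent to a transcription endpoint. Keeping the native rate avoids a resampling step on the client.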


Extracted abstract or opening context

| Mode | Activation | Stop | Use Case |
|------|-----------|------|----------|
| Hold | Hold Fn | Release Fn | Quick dictation (<30s) |
| Toggle | Fn + Space | Press Fn again | Long dictation, meetings |

| Event | Sound | Timing |
|-------|-------|--------|
| Recording starts | Tink.aiff | Immediate on Fn press |
| Mode switch (hold→toggle) | Morse.aiff | On Space press during hold |
| Recording stops | Pop.aiff | On Fn release / toggle stop |
| Transcription complete | macOS notification | After paste |

| Failure | Detection | Recovery |
|---------|-----------|----------|
| Mac4 relay down | 3s connect timeout | Skip to next tier |
| Mac4 transcription crash | HTTP connection reset | Skip to cloud |
| OpenAI API key invalid | 401 response | Skip to local |
| OpenAI rate limited | 429 response | Skip to local |
| No internet | All cloud timeouts | Local on-device |
| Tailscale down | All mesh timeouts | Cloud → local |
| Mic disconnected | Empty audio buffer | "Too short" message |
| Audio device change | cpal error callback | Needs restart (future: auto-reconnect) |
| Local binary missing | File check | Auto-compile from embedded Swift source |

| Step | Hold Mode | Toggle Mode |
|------|-----------|-------------|
| Fn detection | <1ms | <1ms |
| Audio capture | real-time | real-time |
| Fn release detection | ~100ms (poll) | ~100ms |
| WAV encoding | ~10ms | ~10ms |
| **Mac4 transcription** | **~700ms** | **~700ms** |
| Punctuation (nano) | ~200ms | ~200ms |
| Paste (pbcopy+Cmd+V) | ~50ms | ~50ms |
| **Total (Mac4 path)** | **~1.1s** | **~1.1s** |
| **Total (OpenAI path)** | **~3.5s** | **~3.5s** |
| **Total (local path)** | **~2.0s** | **~2.0s** |

| Component | Status | Notes |
|-----------|--------|-------|
| Fn hold-to-record | LIVE | Polling fallback for Apple Silicon |
| Fn+Space toggle | LIVE | For long dictation |
| Mac4 SFSpeechRecognizer relay | LIVE | 0.7s, free, LaunchAgent |
| Mac4 SpeechTranscriber relay | WIRED | Needs model download on Mac4 |
| Mac2 relay | NOT BUILT | Same architecture, deploy when needed |
| Mac5 MLX Whisper | NOT CONFIGURED | Need whisper endpoint at :8100 |
| OpenAI Whisper | LIVE | Fallback, $0.006/min |
| Local on-device | LIVE | Auto-compiles Swift binary |
| GPT-5.4-nano punctuation | LIVE | For unpunctuated sources |
| History DB | LIVE | SQLite, CLI search |
| Post-processing | LIVE | Knowledge chain + smart notes |
| MenuBar UI | PENDING | Next major feature |
| Voice isolation (noise cancel) | PENDING | Needs AVAudioEngine in Swift layer |
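The failure and recovery rows describe a tiered fallback: try the Mac4 relay, then the cloud, then the on-device binary, moving on whenever a tier reports a failure. A minimal sketch of that control flow follows, assuming the tier order Mac4 relay → OpenAI → local that the recovery column implies. The names `Tier`, `Failure`, and `transcribe_with_fallback` are hypothetical, and the real network calls are replaced by a caller-supplied closure so the example stays self-contained.

```rust
/// Hypothetical tier labels, ordered as the recovery column suggests.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tier {
    Mac4Relay,
    OpenAiCloud,
    LocalOnDevice,
}

/// Hypothetical failure signals, named after the detection column.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Failure {
    ConnectTimeout,  // e.g. the 3s connect timeout
    ConnectionReset, // e.g. a transcription crash mid-request
    Unauthorized,    // HTTP 401
    RateLimited,     // HTTP 429
}

/// Try each tier in order and return the first successful transcript
/// together with the tier that produced it. If every tier fails, return
/// the list of (tier, failure) pairs so the caller can log or report them.
fn transcribe_with_fallback<F>(
    tiers: &[Tier],
    mut attempt: F,
) -> Result<(Tier, String), Vec<(Tier, Failure)>>
where
    F: FnMut(Tier) -> Result<String, Failure>,
{
    let mut errors = Vec::new();
    for &tier in tiers {
        match attempt(tier) {
            Ok(text) => return Ok((tier, text)),
            // Any failure means "skip to the next tier".
            Err(failure) => errors.push((tier, failure)),
        }
    }
    Err(errors)
}
```

In this sketch every failure kind leads to the same action (advance to the next tier), which matches the table except for the rows that need a different recovery, such as auto-compiling a missing local binary or restarting after an audio device change. Those would be handled outside this loop.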

Promotion decision

What has to happen next

Promote into a technical note or architecture paper with implementation anchors.

Why this is not always a full paper yet

Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.