Grand Diomande Research · Full HTML Reader

Cross-App Text Injection Test Plan — SF-1.4

> Documents injection behavior across target applications. > Two methods tested: **AX API direct** (AXUIElementSetAttributeValue) and **CGEvent paste** (⌘V via clipboard).

Language as Infrastructure proposal experiment writeup candidate score 18 .md

Full Public Reader

Cross-App Text Injection Test Plan — SF-1.4

> Documents injection behavior across target applications.
> Two methods tested: AX API direct (AXUIElementSetAttributeValue) and CGEvent paste (⌘V via clipboard).

Overview

SpeakFlow must inject transcribed text into whatever app the user is focused on.
macOS apps fall into distinct rendering categories — each behaves differently with accessibility APIs.

MethodMechanismLatencyReliabilityLimitations
AX API Direct`AXUIElementSetAttributeValue(el, kAXValueAttribute, text)`~1msHigh for native text fieldsFails on non-AX apps, some Electron
CGEvent PasteWrite to `NSPasteboard` → synthesize ⌘V via `CGEvent`~50msUniversal fallbackClobbers clipboard, timing-sensitive

---

1. Cursor (Electron / Chromium)

App Type: Electron (Chromium + Node.js)
Text Rendering: Chromium's Blink engine, CodeMirror/Monaco editor

### AX API Direct
- Status: ⚠️ Partial
- Behavior: Electron exposes AX tree, but CodeMirror/Monaco editors use a hidden `<textarea>` for input that does NOT expose `kAXValueAttribute` as settable. The visible editor is a custom canvas/DOM render.
- Focused element: AX returns `AXWebArea` or `AXTextArea` — but `AXUIElementSetAttributeValue` with `kAXValueAttribute` is silently ignored on the editor area.
- Quirk: Standard Electron text inputs (settings, search bars) DO support AX value setting.

### CGEvent Paste
- Status: ✅ Works
- Behavior: Write text to `NSPasteboard.general`, then post ⌘V via `CGEvent`. Cursor processes the paste through its normal input pipeline.
- Quirk: Cursor may reformat pasted text (auto-indent, bracket matching). A small delay (~50ms) after setting pasteboard is needed before posting ⌘V.
- Workaround: Use `CGEvent` paste as primary method. Consider typing simulation (`CGEvent` key-down/key-up per character) for small text to avoid clipboard clobber — but this is slow for long text.

### Recommended Strategy
1. Attempt AX API — check if focused element has settable `kAXValueAttribute`
2. Fallback to CGEvent paste (primary path for Cursor)

---

2. Discord (Electron / Chromium)

App Type: Electron (Chromium + Node.js)
Text Rendering: React DOM, `<div contenteditable>` message composer

### AX API Direct
- Status: ❌ Fails
- Behavior: Discord's message input is a `contenteditable` div. AX tree exposes it as `AXTextArea` but `AXUIElementSetAttributeValue` does not work — the React state doesn't update, so the text appears visually but vanishes on send (React reconciliation overwrites it).
- Quirk: Even when AX "succeeds" (no error returned), the underlying React state is stale. The message field shows the injected text but Discord's internal model doesn't contain it.

### CGEvent Paste
- Status: ✅ Works
- Behavior: Clipboard paste triggers Discord's `onPaste` handler, which properly updates React state. Text is retained and sends correctly.
- Quirk: Discord strips formatting from pasted text (expected). Paste of very long text (>2000 chars) triggers Discord's character limit UI, but that's Discord's own behavior.
- Timing: ~30ms delay after pasteboard write is sufficient.

### Recommended Strategy
- CGEvent paste only. AX API is unreliable for contenteditable React apps.

---

3. Safari (WebKit)

App Type: Native macOS (WebKit rendering engine)
Text Rendering: WebKit's native text input handling

### AX API Direct
- Status: ⚠️ Depends on element type
- Behavior:
- `<input type="text">` and `<textarea>`: AX API works. `kAXValueAttribute` is settable and the DOM value updates correctly.
- `contenteditable` divs: Same issue as Discord — AX may set visual text but JavaScript state (e.g., React, Vue bindings) won't update.
- Google Docs, Notion, etc.: Custom rendering — AX API fails entirely (they use canvas or custom DOM).
- Quirk: Safari's AX tree is well-structured compared to Chromium. `AXWebArea` → `AXGroup` → `AXTextField` hierarchy is navigable. But only standard `<input>` and `<textarea>` elements are reliably settable.

### CGEvent Paste
- Status: ✅ Works
- Behavior: Universal paste works for all element types in Safari. WebKit processes ⌘V through its standard input pipeline, triggering `paste` DOM events.
- Quirk: Some web apps intercept `paste` events and modify content (e.g., Notion adds blocks). This is app-specific, not Safari-specific.

### Recommended Strategy
1. Attempt AX API — works for standard HTML form inputs
2. Fallback to CGEvent paste for contenteditable and custom editors

---

4. Notes (NSTextView)

App Type: Native macOS (AppKit, NSTextView)
Text Rendering: NSTextView (native Cocoa text system)

### AX API Direct
- Status: ✅ Works perfectly
- Behavior: NSTextView fully supports AX APIs. `kAXValueAttribute` is readable and settable. Text is injected at the correct position when combined with `kAXSelectedTextRangeAttribute`.
- Approach for insertion at cursor:
1. Get `kAXSelectedTextRangeAttribute` to find cursor position
2. Set `kAXSelectedTextAttribute` to replace selection (or insert at cursor with empty selection)
- Quirk: Notes uses iCloud sync — injected text is synced normally. No special handling needed.

### CGEvent Paste
- Status: ✅ Works
- Behavior: Standard paste through ⌘V. NSTextView handles it natively.
- Quirk: Preserves rich text formatting from clipboard if `NSPasteboard` contains RTF. For plain text injection, set `NSPasteboard` with `.string` type only.

### Recommended Strategy
- AX API direct is preferred. NSTextView is the gold standard for AX compatibility. Falls back to paste if needed but shouldn't be necessary.

---

5. Terminal (Terminal.app)

App Type: Native macOS (custom terminal emulator view)
Text Rendering: Custom NSView subclass (not NSTextView)

### AX API Direct
- Status: ❌ Fails
- Behavior: Terminal.app's input area is NOT an AX-settable text field. The terminal emulator renders its own content and does not expose a writable `kAXValueAttribute`. The AX tree shows the terminal content as read-only `AXStaticText`.
- Quirk: The shell (bash/zsh) running inside Terminal owns the input line — Terminal.app itself is just a display layer. There's no AX element representing "the current shell prompt input."

### CGEvent Paste
- Status: ✅ Works
- Behavior: ⌘V paste sends the clipboard content to the terminal's input stream. The shell receives it as if typed.
- Quirk:
- Multiline text: Pasting multiline text into a shell can execute intermediate lines as commands. This is dangerous. SpeakFlow should strip newlines or warn when injecting into Terminal.
- Bracketed paste mode: Modern shells (zsh 5.1+, bash with readline) support bracketed paste (`\e[200~...\e[201~`), which prevents execution. Terminal.app supports this. However, not all shells/programs enable it.
- Special characters: Shell metacharacters (`$`, `!`, `\`, `` ` ``, `|`, `;`, `&`) in transcribed text could cause unintended command execution. Consider escaping or quoting.

### Recommended Strategy
- CGEvent paste only (AX doesn't work)
- Safety: Strip trailing newlines. Optionally wrap in single quotes for shell safety.
- Future: Detect if focused app is a terminal emulator and apply safety rules.

---

Decision Matrix

AppAX API DirectCGEvent PastePrimary MethodNotes
Cursor⚠️ PartialCGEvent pasteEditor areas need paste; standard inputs work with AX
DiscordCGEvent pasteReact contenteditable breaks AX state
Safari⚠️ DependsAX first, paste fallbackStandard inputs: AX. Contenteditable: paste
NotesAX API directNSTextView = best AX support
TerminalCGEvent pasteNot a text field; safety concerns with shell injection

Injection Algorithm

1. Get focused application (NSWorkspace.shared.frontmostApplication)
2. Get focused AX element (AXUIElementCopyAttributeValue → kAXFocusedUIElementAttribute)
3. Check if element has settable kAXValueAttribute:
   a. AXUIElementIsAttributeSettable(el, kAXValueAttribute)
4. If settable → AX API direct injection:
   a. Get kAXSelectedTextRangeAttribute (cursor position)
   b. Set kAXSelectedTextAttribute = transcribed text (insert at cursor)
5. If NOT settable → CGEvent paste fallback:
   a. Save current NSPasteboard contents
   b. Set NSPasteboard.general with transcribed text (.string type)
   c. Post CGEvent for ⌘V (keyDown + keyUp, keycode 9 + .maskCommand)
   d. After ~50ms delay, restore original pasteboard contents
6. Special handling:
   - Terminal apps: strip trailing newlines, consider escaping
   - Electron apps: may need extra delay for paste processing

Test Environment

  • macOS: 14.0+ (Sonoma)
  • Hardware: Apple Silicon (M-series) or Intel
  • Permissions: Accessibility must be granted to SpeakFlow.app
  • Test text: "Hello from SpeakFlow" (simple), "café résumé naïve" (Unicode), "line1\nline2" (multiline)

Promotion Decision

Attach run IDs, datasets, metrics, and reproduction commands.

Source Anchor

NKo/macos/SpeakFlow/Tests/CrossAppTestPlan.md

Detected Structure

Method · Evaluation · Architecture