
NKO-2.1 Complete — Swift Phonetics Port

- **Task:** Port NKoPhonetics (IPA mappings, tone marks, N'Ko Unicode ranges) to Swift
- **Status:** ✅ COMPLETE
- **Date:** 2026-02-19
- **Build:** `swift build` ✅ | `swift test` ✅ (106 tests, 0 failures)


---

What Was Built

This is a native Swift port of `nko/phonetics.py`: 820 lines of Python became 1,995 lines of Swift across 9 source files and 1 test file (the Swift total includes the test file). Every class, method, constant, and enum from the Python module has a Swift-idiomatic equivalent.

Source Files

| File | Lines | Purpose |
|---|---|---|
| `NKoPhonetics.swift` | 386 | Main public API: unified phonetics engine |
| `CharacterTables.swift` | 344 | Canonical character data (singleton, O(1) lookup) |
| `IPAMapper.swift` | 272 | IPA conversion, reverse mapping, syllabification |
| `NKoDataset.swift` | 180 | Codable model for nko-unified.json |
| `UnicodeRange.swift` | 137 | Unicode block constants, range checks, script detection |
| `CharInfo.swift` | 100 | Immutable character info struct (frozen dataclass equivalent) |
| `ToneType.swift` | 76 | Tone type enum with IPA diacritics |
| `Phoneme.swift` | 53 | Phoneme struct for TTS pipeline |
| `CharCategory.swift` | 38 | Character classification enum |
| `PhoneticTests.swift` | 409 | 71 unit tests covering all public API |

Location: `shared/SwiftCore/Sources/NKoPhonetics/`
Tests: `shared/SwiftCore/Tests/PhoneticTests.swift`
Resource: `shared/SwiftCore/Sources/NKoPhonetics/Resources/nko-unified.json`

---

API Surface

`NKoPhonetics` (main class)

```swift
let ph = NKoPhonetics()                     // loads bundled JSON
let phLite = NKoPhonetics(loadJSON: false)  // tables only, lightweight

// Unicode range
ph.isNKoChar("ߞ")           // true
ph.isNKoText("ߒߞߏ")         // true
ph.nkoPurity("ߞabc")        // 0.25

// Classification
ph.classify("ߞ")            // .consonant
ph.isVowel("ߊ")             // true
ph.isConsonant("ߞ")         // true
ph.isLetter("ߊ")            // true
ph.isToneMark("߫")          // true (as standalone Character)
ph.isDigit("߃")             // true
ph.isPunctuation("߹")       // true

// Character info
ph.charInfo(for: "ߞ")       // CharInfo(ߞ U+07DE "Ka" [consonant] ipa="k")
ph.allChars()                // [Character: CharInfo], 56 entries
ph.charCount                 // 56

// IPA conversion
ph.toIPA("ߒߞߏ")             // "nkɔ"
ph.toIPA("ߞ߫", includeTones: true)  // "ḱ"
ph.charToIPA("ߖ")           // "dʒ"
ph.ipaToNKo("nkɔ")          // "ߣߞߏ"

// Phoneme generation
ph.toPhonemes("ߒߞߏ")        // [Phoneme(n), Phoneme(k), Phoneme(ɔ)]

// Tone handling (scalar-level for correctness)
ph.tone(for: Unicode.Scalar(0x07EB)!)  // .high
ph.stripTones("ߞ߫")          // "ߞ"
ph.extractTones("ߊ߫ߞ")       // [(1, .high)]
ph.hasToneMarks("ߞ߫")        // true

// Script detection
ph.detectScript("ߒߞߏ")       // .nko

// Digit utilities
ph.nkoDigitValue("߃")        // 3
try ph.intToNKoDigits(42)    // "߄߂"

// Syllabification
ph.syllabifyIPA("bama")      // ["ba", "ma"]

// Pronunciation guide
ph.pronunciationGuide("ߞߏ")  // [PronunciationEntry...]
```
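
To make the range checks and the purity ratio concrete, here is a minimal standalone sketch. `isNKoScalar` and `purity` are hypothetical names, not the module's API, and the choice to count every scalar (including spaces and combining marks) is an assumption; the real `nkoPurity` may weigh characters differently.

```swift
// Hypothetical helpers, not the module's actual implementation.
// The N'Ko block spans U+07C0 through U+07FF.
func isNKoScalar(_ scalar: Unicode.Scalar) -> Bool {
    return (0x07C0...0x07FF).contains(scalar.value)
}

// Fraction of scalars in `text` that fall inside the N'Ko block.
// Assumption: every scalar counts, including spaces and combining marks.
func purity(_ text: String) -> Double {
    var total = 0
    var nkoCount = 0
    for scalar in text.unicodeScalars {
        total += 1
        if isNKoScalar(scalar) { nkoCount += 1 }
    }
    // An empty string is treated as containing no N'Ko at all.
    return total == 0 ? 0.0 : Double(nkoCount) / Double(total)
}

print(purity("\u{07DE}abc")) // 0.25: one N'Ko scalar out of four
```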

Supporting Types

  • `CharInfo` — Immutable struct: `char`, `codepoint`, `code`, `name`, `category`, `ipa`, `latin`, `toneType?`, `digitValue?`, `punctuationEquivalent?`
  • `Phoneme` — Struct: `symbol`, `sourceChar`, `audioHint`, `durationMs`, `tone?`
  • `ToneType` — Enum: `.high`, `.low`, `.rising`, `.long`, `.veryLong`, `.nasal`, `.nasalAlt`, `.mid`, `.unknown`
  • `CharCategory` — Enum: `.vowel`, `.consonant`, `.digit`, `.toneMark`, `.combining`, `.punctuation`, `.other`
  • `NKoUnicode` — Static utilities: `blockStart/End`, `isNKo(_:)`, `detectScript(_:)`, etc.
  • `IPAMapper` — Static methods: `toIPA(_:)`, `toPhonemes(_:)`, `ipaToNKo(_:)`, `syllabifyIPA(_:)`
  • `NKoDataset` — Codable model for nko-unified.json (vocabulary, proverbs, characters, meta)

---

Swift-Specific Design Decisions

### 1. Unicode Scalar Iteration
N'Ko combining marks (U+07EB–U+07F3) merge with preceding base characters in Swift's `Character` grapheme cluster view. All tone-sensitive operations (`stripTones`, `extractTones`, `hasToneMarks`, `toIPA`, `toPhonemes`) iterate over `unicodeScalars` instead of `Character` to correctly handle this. Dual-keyed lookup tables (both `Character` and `Unicode.Scalar`) ensure O(1) performance regardless of access pattern.
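
The effect is easy to reproduce with the standard library alone. The sketch below shows the grapheme-cluster merge and a scalar-level filter; `stripMarks` and `markOffsets` are hypothetical stand-ins, not the module's `stripTones` or `extractTones`, and they treat the whole U+07EB–U+07F3 range as strippable, which is an assumption.

```swift
// ߞ (U+07DE, Ka) followed by the combining high-tone mark U+07EB.
let marked = "\u{07DE}\u{07EB}"

print(marked.count)                // 1: the mark joins the base letter's grapheme cluster
print(marked.unicodeScalars.count) // 2: the scalar view keeps them separate

// Combining-mark range taken from the paragraph above (U+07EB–U+07F3).
let combiningMarks: ClosedRange<UInt32> = 0x07EB...0x07F3

// Hypothetical scalar-level stripping: keep every scalar outside the range.
func stripMarks(_ text: String) -> String {
    var kept = String.UnicodeScalarView()
    for scalar in text.unicodeScalars where !combiningMarks.contains(scalar.value) {
        kept.append(scalar)
    }
    return String(kept)
}

// Scalar offsets at which a combining mark occurs.
func markOffsets(_ text: String) -> [Int] {
    return text.unicodeScalars.enumerated()
        .filter { combiningMarks.contains($0.element.value) }
        .map { $0.offset }
}
```

Iterating `marked` as `Character` values would yield a single element, so a per-`Character` lookup keyed on the bare letter or the bare mark would miss both.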

### 2. Sendable & Concurrency
All public types conform to `Sendable`. `CharacterTables` is `@unchecked Sendable` (immutable singleton). `NKoPhonetics` is `@unchecked Sendable` (class with immutable state after init).
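
A minimal sketch of the immutable-singleton pattern described here follows. `ToneNameTable` is a hypothetical type for illustration, not `CharacterTables` itself, and its three entries are only sample data.

```swift
// Hypothetical stand-in for an immutable singleton lookup table.
final class ToneNameTable: @unchecked Sendable {
    // `static let` is initialized lazily and exactly once by the runtime.
    static let shared = ToneNameTable()

    // Written once in init and never mutated afterwards.
    private let names: [UInt32: String]

    private init() {
        // Sample entries only: the first three N'Ko tone marks.
        names = [0x07EB: "high", 0x07EC: "low", 0x07ED: "rising"]
    }

    // Read-only lookup, so concurrent callers never race on shared state.
    func name(for scalar: Unicode.Scalar) -> String? {
        return names[scalar.value]
    }
}

print(ToneNameTable.shared.name(for: "\u{07EB}") ?? "unknown")
```

With `@unchecked Sendable` the compiler takes the author's word that the type is thread-safe. For a `final` class whose stored properties are all `let` constants of `Sendable` types, plain `Sendable` conformance would also compile and keeps the check enforced.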

### 3. Strong Typing
- `ToneType` enum instead of string constants
- `CharCategory` enum with computed properties (`isLetter`, `isCombining`)
- `CharInfo` is a proper struct with `Identifiable`, `Hashable`, `CustomStringConvertible`
- Throwing API for `intToNKoDigits` with typed `NKoError`
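
The shape of these choices can be sketched as follows. All names (`CategorySketch`, `DigitErrorSketch`, `nkoDigits(for:)`) are hypothetical, the error case is an assumed example rather than the actual `NKoError` definition, and which categories count as combining is an assumption.

```swift
// Hypothetical names throughout; only the shape mirrors the list above.
enum CategorySketch {
    case vowel, consonant, digit, toneMark, combining, punctuation, other

    // Vowels and consonants are the letter categories.
    var isLetter: Bool { self == .vowel || self == .consonant }

    // Assumption: tone marks and nasalization marks both count as combining.
    var isCombining: Bool { self == .toneMark || self == .combining }
}

// Assumed error case for illustration; the real NKoError is not shown in this note.
enum DigitErrorSketch: Error, Equatable {
    case negativeValue(Int)
}

// Maps each decimal digit of `value` onto the N'Ko digits U+07C0–U+07C9.
func nkoDigits(for value: Int) throws -> String {
    guard value >= 0 else { throw DigitErrorSketch.negativeValue(value) }
    var result = String.UnicodeScalarView()
    for ascii in String(value).unicodeScalars {
        // "0" is U+0030, so the offset from it is the digit's numeric value.
        let codepoint = 0x07C0 + (ascii.value - 0x30)
        if let scalar = Unicode.Scalar(codepoint) { result.append(scalar) }
    }
    return String(result)
}

do {
    print(try nkoDigits(for: 42)) // N'Ko digits four, then two
    _ = try nkoDigits(for: -1)    // throws before producing output
} catch DigitErrorSketch.negativeValue(let rejected) {
    print("rejected negative input: \(rejected)")
} catch {
    print("unexpected error: \(error)")
}
```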

### 4. Resource Bundling
`nko-unified.json` is bundled as an SPM process resource via `Bundle.module`, decoded with `JSONDecoder` into the strongly-typed `NKoDataset` model.
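
As an illustration of the decode step, the sketch below uses a heavily reduced, hypothetical model. The field names `version`, `char`, and `ipa` are assumptions, since the schema of `nko-unified.json` is not reproduced in this note, and inline JSON stands in for the bundled file so the example runs outside a package.

```swift
import Foundation

// Hypothetical, reduced model; the real NKoDataset schema is not shown in this note.
struct DatasetSketch: Codable {
    struct Meta: Codable { let version: String }
    struct Entry: Codable {
        let char: String
        let ipa: String
    }
    let meta: Meta
    let characters: [Entry]
}

// Decodes a dataset from raw JSON bytes into the typed model.
func decodeDataset(from data: Data) throws -> DatasetSketch {
    return try JSONDecoder().decode(DatasetSketch.self, from: data)
}

// Inside an SPM target that declares the resource, the bytes would typically come from:
//   let url = Bundle.module.url(forResource: "nko-unified", withExtension: "json")
//   let data = try Data(contentsOf: url!)
// Inline JSON stands in for the bundled file so this sketch runs on its own.
let sample = Data(#"{"meta":{"version":"0"},"characters":[{"char":"ߞ","ipa":"k"}]}"#.utf8)

do {
    let dataset = try decodeDataset(from: sample)
    print(dataset.characters.count) // 1
} catch {
    print("decode failed: \(error)")
}
```

Decoding into a typed model means a missing or mistyped key surfaces as a `DecodingError` at load time instead of a silent `nil` later.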

---

Test Results

```
Test Suite 'PhoneticTests' passed
  Executed 71 tests, with 0 failures (0 unexpected) in 0.004 seconds

Test Suite 'All tests' passed
  Executed 106 tests, with 0 failures (0 unexpected) in 0.020 seconds
```

Test Coverage (71 phonetic tests)

| Category | Tests | Description |
|---|---|---|
| Unicode range | 9 | isNKoChar, isNKoText, nkoPurity |
| Classification | 10 | classify, isVowel, isConsonant, isLetter |
| IPA conversion | 9 | toIPA, charToIPA, tones, spaces, multi-char |
| Reverse IPA | 3 | ipaToNKo with multi-char, spaces |
| Phoneme gen | 5 | count, symbols, tone attachment, durations |
| Tone handling | 7 | getTone, stripTones, hasToneMarks, extractTones |
| CharInfo | 5 | lookup, categories, digit values, nil for unknown |
| Digit utils | 5 | digitValue, intToNKo, zero, negative |
| Script detect | 3 | nko, latin, mixed |
| Syllabification | 2 | simple, multi-syllable |
| Enums | 4 | ToneType properties, CharCategory properties |
| NKoUnicode | 3 | block range, isDigit, isLetter |
| Pronunciation | 1 | guide generation |
| Debug/Edge | 5 | debugDescription, empty strings, classifyText |

---

Build Output

```
swift build
Build complete! (0.10s)

swift test
106 tests, 0 failures
```

---

Character Coverage

  • 7 vowels (ߊ–ߐ) with IPA, Latin transliteration
  • 26 consonants (ߑ–ߪ) including labial-velar (/gb/), affricates (/dʒ/, /tʃ/), variants
  • 10 digits (߀–߉) with numeric values and N'Ko names
  • 5 tone marks (߫–߯) — high, low, rising, long, very long
  • 2 combining marks (߲–߳) — nasalization
  • 6 punctuation (ߴ–ߺ) — exclamation, full stop, ellipsis, word joiner, apostrophes
  • Total: 56 classified characters from the N'Ko Unicode block (U+07C0–U+07FF)
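
The ranges in this list translate directly into a scalar-value switch. The sketch below is one possible shape, with hypothetical names (`BlockCategory`, `category(of:)`); the endpoints are read off the bullets above, and any code point outside them falls through to `.other`, which may differ from how the module's tables treat those characters.

```swift
// Hypothetical classification by code point range; not the module's CharCategory.
enum BlockCategory {
    case vowel, consonant, digit, toneMark, combining, punctuation, other
}

func category(of scalar: Unicode.Scalar) -> BlockCategory {
    switch scalar.value {
    case 0x07C0...0x07C9: return .digit        // ߀–߉
    case 0x07CA...0x07D0: return .vowel        // ߊ–ߐ
    case 0x07D1...0x07EA: return .consonant    // ߑ–ߪ
    case 0x07EB...0x07EF: return .toneMark     // ߫–߯
    case 0x07F2...0x07F3: return .combining    // ߲–߳ (nasalization)
    case 0x07F4...0x07FA: return .punctuation  // ߴ–ߺ
    default:              return .other        // everything else, in or out of the block
    }
}

print(category(of: "\u{07DE}")) // consonant
```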

Source Anchor

NKo/NKO-2.1-COMPLETE.md
