research note | experiment writeup candidate | score 24

NKO-1.2 Complete — `nko.phonetics` Module

Replaced the 4-line stub at `nko/phonetics.py` with a comprehensive **820-line** unified phonetics module that consolidates IPA mappings, tone handling, character classification, and Unicode utilities from 13+ scattered implementations across the codebase.


Extracted abstract or opening context

### Source Files Analyzed

| File | What it contributed |
|------|---------------------|
| `core/audio/phoneme.py` | PhonemeMapper, NKO_CONSONANTS, NKO_VOWELS, NKO_TONES, IPA mappings |
| `core/transliteration/nko.py` | NkoHandler: full NKO_TO_IPA map, IPA_TO_NKO reverse, LATIN_TO_NKO, validation |
| `core/transliteration/bridge.py` | Script detection logic (Unicode ranges) |
| `core/audio/pronunciation.py` | PronunciationEngine: syllabification, difficulty estimation |
| `core/prediction/prosody_engine.py` | NKO_VOWELS/NKO_CONSONANTS sets, ToneType enum, NKO_HIGH_TONE/LOW_TONE constants |
| `core/prediction/tts_engine.py` | PhonemeMapping dataclass, TonePattern enum, dialect awareness |
| `tools/sound-sigils/definitions.py` | SigilDefinition data (N'Ko char → semantic/audio mappings) |
| `data/nko-unified.json` | 232 canonical records: characters (vowels/consonants/digits/tone_marks/punctuation), vocabulary, morphology, cognates, proverbs |

### Unicode Range Utilities

| Method | Returns | Example |
|--------|---------|---------|
| `is_nko_char(ch)` | `bool` | `is_nko_char('ߞ') → True` |
| `is_nko_text(text)` | `bool` | `is_nko_text('ߒߞߏ') → True` |
| `nko_purity(text)` | `float` | `nko_purity('ߒabc') → 0.25` |

### Character Classification

| Method | Returns | Example |
|--------|---------|---------|
| `classify(ch)` | `CharCategory` | `classify('ߞ') → CONSONANT` |
| `is_vowel(ch)` | `bool` | `is_vowel('ߊ') → True` |
| `is_consonant(ch)` | `bool` | `is_consonant('ߞ') → True` |
| `is_letter(ch)` | `bool` | `is_letter('ߊ') → True` |
| `is_tone_mark(ch)` | `bool` | `is_tone_mark('߫') → True` |
| `is_combining(ch)` | `bool` | `is_combining('߲') → True` |
| `is_digit(ch)` | `bool` | `is_digit('߁') → True` |
| `is_punctuation(ch)` | `bool` | `is_punctuation('߹') → True` |

### IPA Conversion

| Method | Returns | Example |
|--------|---------|---------|
| `to_ipa(text)` | `str` | `to_ipa('ߒߞߏ') → 'nkɔ'` |
| `to_ipa(text, include_tones=True)` | `str` | Includes IPA diacritics for tones |
| `char_to_ipa(ch)` | `str` | `char_to_ipa('ߞ') → 'k'` |
| `to_phonemes(text)` | `List[Phoneme]` | List with symbol, source_char, tone, duration |
| `ipa_to_nko(ipa)` | `str` | `ipa_to_nko('nkɔ') → 'ߣߞߏ'` |
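
To make the Unicode range and classification helpers listed above more concrete, here is a minimal Python sketch of how such functions might be written. It is not the module's actual implementation. The N'Ko block boundaries (U+07C0 to U+07FF) come from the Unicode standard; everything else is an assumption made for illustration: the `CharCategory` members other than `CONSONANT`, the `threshold` parameter of `is_nko_text`, the choice to ignore whitespace in `nko_purity`, and the rule that every non-vowel letter counts as a consonant.

```python
import unicodedata
from enum import Enum

# Code point ranges taken from the Unicode N'Ko block chart.
NKO_BLOCK = range(0x07C0, 0x0800)       # U+07C0..U+07FF, the whole block
NKO_VOWELS = range(0x07CA, 0x07D1)      # U+07CA..U+07D0, the seven vowel letters
NKO_TONE_MARKS = range(0x07EB, 0x07F2)  # U+07EB..U+07F1, combining tone marks


class CharCategory(Enum):
    """Hypothetical category set; only CONSONANT is named in the writeup."""
    VOWEL = "vowel"
    CONSONANT = "consonant"
    DIGIT = "digit"
    TONE_MARK = "tone_mark"
    COMBINING = "combining"
    PUNCTUATION = "punctuation"
    OTHER = "other"


def is_nko_char(ch: str) -> bool:
    """True if ch is a single character inside the N'Ko block."""
    return len(ch) == 1 and ord(ch) in NKO_BLOCK


def nko_purity(text: str) -> float:
    """Fraction of non-whitespace characters that are N'Ko (assumed definition)."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(is_nko_char(c) for c in chars) / len(chars)


def is_nko_text(text: str, threshold: float = 0.5) -> bool:
    """True if purity reaches the threshold (the threshold is an assumption)."""
    return nko_purity(text) >= threshold


def classify(ch: str) -> CharCategory:
    """Classify one character using its Unicode general category and code point."""
    if not is_nko_char(ch):
        return CharCategory.OTHER
    cp = ord(ch)
    general = unicodedata.category(ch)
    if general == "Nd":                  # decimal digits
        return CharCategory.DIGIT
    if cp in NKO_TONE_MARKS:             # checked before the generic Mn case
        return CharCategory.TONE_MARK
    if general == "Mn":                  # other combining marks, e.g. nasalization
        return CharCategory.COMBINING
    if general == "Po":                  # punctuation
        return CharCategory.PUNCTUATION
    if general == "Lo":                  # letters: vowels first, the rest as consonants
        return CharCategory.VOWEL if cp in NKO_VOWELS else CharCategory.CONSONANT
    return CharCategory.OTHER
```

Under these assumptions, `nko_purity('\u07d2abc')` evaluates to `0.25`, matching the example in the table, and `classify('\u07de')` returns `CharCategory.CONSONANT`. Predicates such as `is_vowel` or `is_digit` could then be thin wrappers that compare the result of `classify` against a single category.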

Promotion decision

What has to happen next

Attach run IDs, datasets, metrics, and reproduction commands.

Why this is not always a full paper yet

Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.