# NKO-1.5 Complete: Merge Morphology ✅
**Task:** Merge morphology (combine `cross-script-bridge/morphology` and `keyboard-ai/morphological_engine`)

**Status:** DONE

**Date:** 2025-07-19
---
## What Was Merged
### Source 1: `core/morphology/morphology/` (cross-script-bridge)
| File | What It Does | Lines | Merged |
|------|-------------|-------|--------|
| `analyzer.py` | MorphologicalAnalyzer — word decomposition, verb/noun matching, cross-script normalization, sentence analysis, interlinear glossing | ~350 | ✅ Full |
| `conjugator.py` | VerbConjugator — 9 TAM particles × 6 person-numbers, tri-script output, full paradigm generation | ~250 | ✅ Full |
| `compound_splitter.py` | CompoundSplitter — 14 known compounds, greedy longest-match decomposition, compound type classification | ~300 | ✅ Full |
| `tone_engine.py` | ToneEngine — tone extraction, minimal pairs, cross-script tone preservation | ~300 | Referenced (TonePattern enum exposed) |
| `cli.py` | CLI interface for all morphology commands | ~200 | N/A (CLI stays in core) |
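
The greedy longest-match decomposition attributed to `CompoundSplitter` above can be sketched as follows. This is a minimal illustration, not the merged implementation: the lexicon, the known-compound table, and the function name `split_longest_match` are hypothetical stand-ins, since the report does not list the splitter's actual data.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy data; the real splitter's 14 known compounds are not listed here.
KNOWN_COMPOUNDS: Dict[str, List[str]] = {"kalanden": ["kalan", "den"]}
LEXICON: Set[str] = {"ji", "so", "kalan", "den"}


def split_longest_match(word: str) -> Optional[List[str]]:
    """Split `word` into lexicon entries, always taking the longest match.

    Known compounds are answered from the lookup table first. The greedy
    pass does not backtrack, so it returns None as soon as no lexicon
    entry matches at the current position.
    """
    if word in KNOWN_COMPOUNDS:
        return list(KNOWN_COMPOUNDS[word])

    parts: List[str] = []
    pos = 0
    while pos < len(word):
        # Try the longest candidate substring first, then shrink it.
        for end in range(len(word), pos, -1):
            if word[pos:end] in LEXICON:
                parts.append(word[pos:end])
                pos = end
                break
        else:
            return None  # nothing in the lexicon matches at `pos`
    return parts


def is_compound(word: str) -> bool:
    """A word counts as a compound when it splits into two or more parts."""
    parts = split_longest_match(word)
    return parts is not None and len(parts) >= 2
```

Because the pass is greedy, a long early match can block a valid split that a backtracking search would find; the lookup table of known compounds is one way to cover such cases.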
### Source 2: `core/prediction/morphological_engine.py` (keyboard-ai)
| Component | What It Does | Merged |
|-----------|-------------|--------|
| `MandingMorphologyAnalyzer` | Prefix/suffix/root tables (VERBAL_SUFFIXES, NOMINAL_SUFFIXES, PREFIXES, ROOT_DICTIONARY), morpheme prediction | ✅ Affix tables merged into AffixInventory; root dictionary merged into analyzer |
| `CodeSwitchingDetector` | French/English/N'Ko detection | Not merged (prediction-layer concern, not morphology) |
| `CulturalCalendarEngine` | Islamic calendar + agricultural seasons | Not merged (belongs in nko.culture) |
| `Gen63MorphologicalEngine` | Unified wrapper | Superseded by nko.morphology API |
---
## Unified API: `nko/morphology.py`
### Classes
| Class | Purpose |
|-------|---------|
| `MorphologicalAnalyzer` | Word/sentence analysis, script detection, root extraction, noun class detection |
| `VerbConjugator` | Manding verb phrase generation (9 tenses × 6 persons × 3 scripts) |
| `CompoundDetector` | Compound word recognition, splitting, hypothetical generation |
| `AffixInventory` | Static catalogue of all known Manding prefixes, suffixes, postpositions |
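
To make the 9 × 6 paradigm shape concrete, here is a minimal sketch of how a full paradigm could be laid out. Everything in it is an assumption: the enum member names are placeholders (the report gives only the counts), the surface string uses bracketed tags instead of real pronouns and TAM particles, and tri-script output is omitted.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List

# Placeholder member names: the report states only that there are 9 and 6 values.
TenseAspect = Enum("TenseAspect", [f"TAM_{i}" for i in range(1, 10)])
PersonNumber = Enum("PersonNumber", ["SG1", "SG2", "SG3", "PL1", "PL2", "PL3"])


@dataclass(frozen=True)
class ConjugatedForm:
    verb: str
    tense: str
    person: str
    surface: str  # subject + TAM particle + verb, with tags as placeholders


def conjugate(verb: str, tense: TenseAspect, person: PersonNumber) -> ConjugatedForm:
    # Bracketed tags stand in for the actual pronoun and particle forms.
    surface = f"<{person.name}> <{tense.name}> {verb}"
    return ConjugatedForm(verb, tense.name, person.name, surface)


def full_paradigm(verb: str) -> Dict[str, List[ConjugatedForm]]:
    # One list of six person-number forms per tense-aspect value.
    return {
        tense.name: [conjugate(verb, tense, person) for person in PersonNumber]
        for tense in TenseAspect
    }
```

Under these assumptions, `full_paradigm("taa")` returns a dictionary with 9 keys and 6 forms per key, 54 forms in total, which matches the `Dict[str, List[ConjugatedForm]]` return type listed for `full_paradigm`.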
### Enums
`MorphemeType` · `NounClass` (11 classes) · `TenseAspect` (9 values) · `PersonNumber` (6 values) · `CompoundType` (8 types) · `TonePattern`
### Convenience Functions

```python
# Word analysis
analyze(text, script=None) → List[WordAnalysis]
analyze_word(word, script=None) → WordAnalysis
decompose(word, script=None) → {"prefixes": [], "root": "", "suffixes": []}
extract_root(word, script=None) → str

# Noun classes
detect_noun_class(word, script=None) → NounClass | None

# Verb conjugation
conjugate(verb, tense, person) → ConjugatedForm
full_paradigm(verb) → Dict[str, List[ConjugatedForm]]

# Compounds
is_compound(word, script="latin") → bool
split_compound(word, script="latin") → CompoundWord

# Affix inventory
get_affix(form) → Dict | None
list_prefixes() → Dict
list_suffixes() → Dict
list_postpositions() → Dict
```

---
## Test Results

`52 passed in 0.05s`

| Test Class | Tests | Focus |
|---|---|---|
| TestWordDecomposition | 5 | prefix + root + suffix splitting |
| TestRootExtraction | 5 | Root/stem identification |
| TestNounClassDetection | 8 | 11-class Manding noun classification |
| TestVerbConjugation | 7 | TAM particle system, paradigm coverage |
| TestAffixInventory | 6 | Prefix/suffix/postposition catalogue |
| TestCompoundDetection | 7 | Known & novel compound splitting |
| TestScriptDetection | 4 | N'Ko, Latin, Arabic auto-detection |
| TestSentenceAnalysis | 3 | Multi-word STV structure detection |
| TestEdgeCases | 7 | Empty strings, serialization, reconstruction |
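
Script auto-detection of the kind exercised by `TestScriptDetection` can be approximated by counting characters per Unicode block. The sketch below is an assumption about the approach, not the analyzer's actual logic; the function name `detect_script` and the `"unknown"` fallback are hypothetical. The block ranges themselves come from the Unicode Standard: N'Ko is U+07C0 to U+07FF, Arabic is U+0600 to U+06FF with a supplement at U+0750 to U+077F.

```python
def detect_script(text: str) -> str:
    """Return "nko", "arabic", "latin", or "unknown" by majority character count."""
    counts = {"nko": 0, "arabic": 0, "latin": 0}
    for ch in text:
        cp = ord(ch)
        if 0x07C0 <= cp <= 0x07FF:
            counts["nko"] += 1
        elif 0x0600 <= cp <= 0x06FF or 0x0750 <= cp <= 0x077F:
            counts["arabic"] += 1
        elif ch.isalpha() and cp <= 0x02AF:
            # Basic Latin through IPA Extensions, which covers letters
            # such as ɛ, ɔ, ɲ and ŋ used in Latin-script Manding.
            counts["latin"] += 1
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "unknown"
```

Digits, punctuation, and whitespace in the ASCII range are ignored, so a string with no letters from any of the three scripts falls through to `"unknown"`.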
---
## What's New Beyond Merging
1. NounClass enum — 11 semantic noun classes (PERSON, ANIMAL, THING, LIQUID, MASS, ABSTRACT, PLACE, PLANT, BODY_PART, KINSHIP, TIME) with both root-lookup and suffix-hint detection
2. AffixInventory — unified static catalogue merging both sources' affix tables into one queryable class
3. Prefix stripping — analyzer now handles causative/benefactive/ventive prefixes before root matching
4. `decompose()` function — clean `{"prefixes": [], "root": "", "suffixes": []}` return
5. Module-level convenience functions — lazy singletons, zero-config usage
6. `detect_noun_class()` with plural awareness — strips `-lu` before lookup
---
## Architecture Decision

The keyboard-ai `CodeSwitchingDetector` and `CulturalCalendarEngine` were not merged into `nko/morphology.py` because neither performs morphological analysis: code-switching detection is a prediction-layer concern, and the calendar engine belongs with cultural data. Both remain available at `core/prediction/morphological_engine.py` and should be wired into `nko/predict.py` and `nko/culture.py`, respectively.
The `ToneEngine` from `core/morphology/morphology/tone_engine.py` is referenced (TonePattern enum exposed) but the full engine stays in core — tone analysis warrants its own `nko/tone.py` module in a future pulse.
Source Anchor
NKo/NKO-1.5-COMPLETE.md