
NKO-1.5 Complete — Merge Morphology ✅

**Task:** Merge morphology: combine cross-script-bridge/morphology + keyboard-ai/morphological_engine
**Status:** DONE
**Date:** 2025-07-19


---

## What Was Merged

### Source 1: `core/morphology/morphology/` (cross-script-bridge)
| File | What It Does | Lines | Merged |
|------|-------------|-------|--------|
| `analyzer.py` | MorphologicalAnalyzer — word decomposition, verb/noun matching, cross-script normalization, sentence analysis, interlinear glossing | ~350 | ✅ Full |
| `conjugator.py` | VerbConjugator — 9 TAM particles × 6 person-numbers, tri-script output, full paradigm generation | ~250 | ✅ Full |
| `compound_splitter.py` | CompoundSplitter — 14 known compounds, greedy longest-match decomposition, compound type classification | ~300 | ✅ Full |
| `tone_engine.py` | ToneEngine — tone extraction, minimal pairs, cross-script tone preservation | ~300 | Referenced (TonePattern enum exposed) |
| `cli.py` | CLI interface for all morphology commands | ~200 | N/A (CLI stays in core) |
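
The greedy longest-match decomposition attributed to `CompoundSplitter` can be illustrated with a minimal sketch. This is not the module's code: the root set below is a small hypothetical lexicon of Latin-script Manding roots, and `split_greedy` is a name invented for this example. The sketch does no backtracking, so it gives up as soon as a remainder cannot be matched.

```python
from typing import List, Optional, Set

# Hypothetical mini-lexicon; the real CompoundSplitter tables are not shown in this note.
KNOWN_ROOTS: Set[str] = {"so", "tigi", "dugu", "mansa"}


def split_greedy(word: str, roots: Set[str] = KNOWN_ROOTS) -> Optional[List[str]]:
    """Split `word` left to right, always taking the longest known root.

    Returns None when some remainder matches no root (no backtracking).
    """
    parts: List[str] = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, shrinking until a root matches.
        for j in range(len(word), i, -1):
            if word[i:j] in roots:
                parts.append(word[i:j])
                i = j
                break
        else:
            return None  # nothing starting at position i is a known root
    return parts


print(split_greedy("dugutigi"))  # ['dugu', 'tigi']
print(split_greedy("sotigi"))    # ['so', 'tigi']
print(split_greedy("xyz"))       # None
```

Greedy matching is fast and predictable, but it can miss splits where a shorter first root would let the rest of the word match. A production splitter would likely need a fallback for that case.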

### Source 2: `core/prediction/morphological_engine.py` (keyboard-ai)
| Component | What It Does | Merged |
|-----------|-------------|--------|
| `MandingMorphologyAnalyzer` | Prefix/suffix/root tables (VERBAL_SUFFIXES, NOMINAL_SUFFIXES, PREFIXES, ROOT_DICTIONARY), morpheme prediction | ✅ Affix tables merged into AffixInventory; root dictionary merged into analyzer |
| `CodeSwitchingDetector` | French/English/N'Ko detection | Not merged (prediction-layer concern, not morphology) |
| `CulturalCalendarEngine` | Islamic calendar + agricultural seasons | Not merged (belongs in nko.culture) |
| `Gen63MorphologicalEngine` | Unified wrapper | Superseded by nko.morphology API |

---

## Unified API: `nko/morphology.py`

### Classes
| Class | Purpose |
|-------|---------|
| `MorphologicalAnalyzer` | Word/sentence analysis, script detection, root extraction, noun class detection |
| `VerbConjugator` | Manding verb phrase generation (9 tenses × 6 persons × 3 scripts) |
| `CompoundDetector` | Compound word recognition, splitting, hypothetical generation |
| `AffixInventory` | Static catalogue of all known Manding prefixes, suffixes, postpositions |

### Enums
`MorphemeType` · `NounClass` (11 classes) · `TenseAspect` (9 values) · `PersonNumber` (6 values) · `CompoundType` (8 types) · `TonePattern`
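
To show how a tense-by-person paradigm falls out of two enums, here is a small sketch under stated assumptions. The member names, the Bambara-style pronouns and particles, and the `sketch_*` functions are illustrative stand-ins and are not taken from `nko/morphology.py`. Only three tense values are modelled, whereas the module's `TenseAspect` has nine.

```python
from enum import Enum
from typing import Dict, List


class PersonNumber(Enum):
    # Bambara-style subject pronouns, used only as illustrative values.
    FIRST_SG = "n"
    SECOND_SG = "i"
    THIRD_SG = "a"
    FIRST_PL = "an"
    SECOND_PL = "aw"
    THIRD_PL = "u"


class TenseAspect(Enum):
    # Three illustrative TAM particles; the real enum has nine values.
    IMPERFECTIVE = "bɛ"
    IMPERFECTIVE_NEG = "tɛ"
    FUTURE = "bɛna"


def sketch_conjugate(verb: str, tense: TenseAspect, person: PersonNumber) -> str:
    """Build a subject + TAM particle + verb phrase (intransitive case only)."""
    return f"{person.value} {tense.value} {verb}"


def sketch_paradigm(verb: str) -> Dict[str, List[str]]:
    """One list of six person-number forms per tense."""
    return {
        tense.name: [sketch_conjugate(verb, tense, p) for p in PersonNumber]
        for tense in TenseAspect
    }


paradigm = sketch_paradigm("taa")
print(paradigm["FUTURE"][0])                    # n bɛna taa
print(sum(len(v) for v in paradigm.values()))   # 18
```

With the full nine tense values the same structure would yield 9 × 6 = 54 forms per script.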

### Convenience Functions

```python
# Word analysis
analyze(text, script=None) → List[WordAnalysis]
analyze_word(word, script=None) → WordAnalysis
decompose(word, script=None) → {"prefixes": [], "root": "", "suffixes": []}
extract_root(word, script=None) → str

# Noun classes
detect_noun_class(word, script=None) → NounClass | None

# Verb conjugation
conjugate(verb, tense, person) → ConjugatedForm
full_paradigm(verb) → Dict[str, List[ConjugatedForm]]

# Compounds
is_compound(word, script="latin") → bool
split_compound(word, script="latin") → CompoundWord

# Affix inventory
get_affix(form) → Dict | None
list_prefixes() → Dict
list_suffixes() → Dict
list_postpositions() → Dict
```
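
The return shape of `decompose()` can be illustrated with a self-contained sketch. Everything here is an assumption made for the example: the affix tuples and root set are tiny illustrative tables, not the contents of `AffixInventory`, and `decompose_sketch` strips at most one prefix and one suffix, accepting a split only when the remaining stem is a known root.

```python
from typing import Dict, List, Tuple, Union

# Illustrative tables only; NOT the module's actual affix inventory.
PREFIXES: Tuple[str, ...] = ("la", "ma")
SUFFIXES: Tuple[str, ...] = ("lu", "li", "la")
ROOTS = {"bɔ", "taa", "muso"}

Decomposition = Dict[str, Union[str, List[str]]]


def decompose_sketch(word: str) -> Decomposition:
    """Return {"prefixes": [...], "root": "...", "suffixes": [...]} for `word`."""
    for prefix in ("",) + PREFIXES:
        for suffix in ("",) + SUFFIXES:
            if not (word.startswith(prefix) and word.endswith(suffix)):
                continue
            stem = word[len(prefix): len(word) - len(suffix)]
            if stem in ROOTS:
                return {
                    "prefixes": [prefix] if prefix else [],
                    "root": stem,
                    "suffixes": [suffix] if suffix else [],
                }
    # No known root found: treat the whole word as an unanalysed root.
    return {"prefixes": [], "root": word, "suffixes": []}


print(decompose_sketch("labɔli"))  # {'prefixes': ['la'], 'root': 'bɔ', 'suffixes': ['li']}
print(decompose_sketch("musolu"))  # {'prefixes': [], 'root': 'muso', 'suffixes': ['lu']}
```

Requiring the stem to be a known root keeps the sketch from stripping affix-like syllables off unrelated words, at the cost of leaving unknown words unanalysed.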

---

## Test Results

52 passed in 0.05s

| Test Class | Tests | Focus |
|------------|-------|-------|
| `TestWordDecomposition` | 5 | prefix + root + suffix splitting |
| `TestRootExtraction` | 5 | Root/stem identification |
| `TestNounClassDetection` | 8 | 11-class Manding noun classification |
| `TestVerbConjugation` | 7 | TAM particle system, paradigm coverage |
| `TestAffixInventory` | 6 | Prefix/suffix/postposition catalogue |
| `TestCompoundDetection` | 7 | Known & novel compound splitting |
| `TestScriptDetection` | 4 | N'Ko, Latin, Arabic auto-detection |
| `TestSentenceAnalysis` | 3 | Multi-word STV structure detection |
| `TestEdgeCases` | 7 | Empty strings, serialization, reconstruction |
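
The script auto-detection that `TestScriptDetection` covers could be approached as in the sketch below. It is an assumption about one reasonable method, not the analyzer's actual logic: it counts characters in the Unicode N'Ko block (U+07C0 to U+07FF) and the basic Arabic block (U+0600 to U+06FF), treats any character whose Unicode name starts with `LATIN` as Latin, and returns the majority script. The function name and return labels are hypothetical.

```python
import unicodedata


def detect_script(text: str) -> str:
    """Return 'nko', 'arabic', 'latin', or 'unknown' by majority of letters."""
    counts = {"nko": 0, "arabic": 0, "latin": 0}
    for ch in text:
        cp = ord(ch)
        if 0x07C0 <= cp <= 0x07FF:        # Unicode N'Ko block
            counts["nko"] += 1
        elif 0x0600 <= cp <= 0x06FF:      # basic Arabic block
            counts["arabic"] += 1
        elif unicodedata.name(ch, "").startswith("LATIN"):
            # Covers ASCII letters plus extended letters such as ɛ, ɔ, ŋ.
            counts["latin"] += 1
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "unknown"


print(detect_script("ߒߞߏ"))    # nko
print(detect_script("bɔ"))     # latin
print(detect_script("سلام"))   # arabic
print(detect_script("123"))    # unknown
```

Using Unicode names for Latin avoids hard-coding every extended block that Manding orthographies draw on.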

---

## What's New Beyond Merging

1. NounClass enum — 11 semantic noun classes (PERSON, ANIMAL, THING, LIQUID, MASS, ABSTRACT, PLACE, PLANT, BODY_PART, KINSHIP, TIME) with both root-lookup and suffix-hint detection
2. AffixInventory — unified static catalogue merging both sources' affix tables into one queryable class
3. Prefix stripping — analyzer now handles causative/benefactive/ventive prefixes before root matching
4. `decompose()` function — clean `{"prefixes": [], "root": "", "suffixes": []}` return
5. Module-level convenience functions — lazy singletons, zero-config usage
6. `detect_noun_class()` with plural awareness — strips `-lu` before lookup

---

## Architecture Decision

The keyboard-ai's `CodeSwitchingDetector` and `CulturalCalendarEngine` were not merged into `nko/morphology.py` because neither performs morphological analysis: code-switching detection is a prediction-layer concern, and the calendar engine is cultural data. Both remain available at `core/prediction/morphological_engine.py` and should be wired into `nko/predict.py` and `nko/culture.py` respectively.

The `ToneEngine` from `core/morphology/morphology/tone_engine.py` is referenced (TonePattern enum exposed) but the full engine stays in core — tone analysis warrants its own `nko/tone.py` module in a future pulse.


Source Anchor

NKo/NKO-1.5-COMPLETE.md
