N'Ko Research Papers
Final Four-Paper Series
The public closeout series now lives under `final/`. Each paper has its own folder
with a local `paper.tex`, compiled `paper.pdf`, `references.bib`, `paper.bbl`, and
relative `figures/` assets, so each manuscript can compile from its own directory.
| # | Paper | Folder | PDF pages | Role |
|---|---|---|---|---|
| 1 | Dead Circuits: Script Invisibility and Representation Failure for N'Ko in Large Language Models | `final/01-script-invisibility/` | 11 | Establishes the LLM representation failure with research questions, falsification criteria, evidence ladder, tokenizer-burden formalism, evidence artifact contract, remediation agenda, reviewer checklist, and validity threats. |
| 2 | Against WER: Phonemic Evaluation, Orthographic Transparency, and the Script Advantage for Manding ASR | `final/02-phonemic-evaluation/` | 10 | Formalizes the metric problem, N'Ko-vs-Latin script advantage, transparent-script edit preservation, normalization protocol, CER/PER proxy boundary, metric failure taxonomy, and matched-evaluation requirements. |
| 3 | Script-Native ASR for N'Ko: Anticipatory Transformer CTC Decoding and the 20.57 | |||
| 4 | Anticipation Geometry Partition: Row-Level Governance for Script-Native N'Ko ASR Deployment | `final/04-agp-deployment/` | 12 | Defines AGP as the post-ASR correction/provenance/deployment governance layer with system boundaries, pipeline formalization, row contracts, partition scoring, correction benchmark design, failure taxonomy, human review, data lifecycle, Djoko substrate, and ExpF/ExpH evidence. |
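The per-folder file contract described above (a local `paper.tex`, `paper.pdf`, `references.bib`, `paper.bbl`, and a `figures/` directory) can be checked mechanically before publishing. The following is a minimal sketch, not part of the repository: the helper names `missing_assets` and `check_final_bundle` are hypothetical, and it assumes every subfolder of `final/` is a paper folder.

```python
from pathlib import Path

# File contract taken from the README text above.
# Treating every item as strictly required is an assumption of this sketch.
REQUIRED_FILES = ("paper.tex", "paper.pdf", "references.bib", "paper.bbl")
REQUIRED_DIRS = ("figures",)


def missing_assets(paper_dir: Path) -> list[str]:
    """Return the expected local assets that are absent from one paper folder."""
    missing = [name for name in REQUIRED_FILES if not (paper_dir / name).is_file()]
    missing += [name + "/" for name in REQUIRED_DIRS if not (paper_dir / name).is_dir()]
    return missing


def check_final_bundle(final_root: Path) -> dict[str, list[str]]:
    """Map each subfolder of final_root to its missing assets (empty list = complete)."""
    return {
        folder.name: missing_assets(folder)
        for folder in sorted(final_root.iterdir())
        if folder.is_dir()
    }


if __name__ == "__main__":
    root = Path("final")  # run from the paper/ directory
    if root.is_dir():
        for name, missing in check_final_bundle(root).items():
            status = "ok" if not missing else "missing " + ", ".join(missing)
            print(f"{name}: {status}")
```

Folders are discovered rather than hard-coded, so the check does not depend on any particular folder name.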
Recommended public narrative: use the four papers as the final publishable bundle
and treat `current/paper_canonical_nko_agp_20cer.tex` as the synthesis manuscript
that explains how the four papers connect. The 20.57% CER figure describes an
archived N'Ko trajectory ASR checkpoint under recorded settings; it is not a universal
matched proof against Latin.
Public Blog Series
The readable public companion lives in `blog-series/`. It inherits the stronger
voice from the original `blog/posts/` drafts: historical opening, experiment
chronology, concrete numbers, and plain-English explanations before acronyms. Start
with `blog-series/00-field-guide-to-the-claim.md`, then publish the four essays in
order.
Current Papers
| # | Paper | File | Status | Pages |
|---|---|---|---|---|
| 1 | Dead Circuits: Activation Profiling and Script Invisibility in LLMs | `current/paper1_dead_circuits.tex` | Draft complete | ~20 |
| 2 | Living Speech: Script-Native ASR for N'Ko | `current/paper2_living_speech.tex` | Draft complete | ~20 |
| 3 | Script Invisibility Is Structural: Across Three LLM Families | `current/paper3_cross_model.tex` | Draft complete | ~12 |
| 4 | Does Script Design Matter? Phonetic Transparency and CTC Decoding | `current/paper4_script_advantage.tex` | Draft complete | ~14 |
| 5 | Script as Thought: Indigenous Script in Personalized Models | TBD | Experiment C scaffolded | — |
| 6 | Inscribing Knowledge: Blockchain Provenance | TBD | Experiments D+E scaffolded | — |
Canonical Synthesis
The current synthesis manuscript is `current/paper_canonical_nko_agp_20cer.tex`.
It consolidates Papers 1-5, the April 2026 paper proposals, and the archived
anchor record into one research-grade public narrative around the 20.57% CER
result and AGP. The current compiled PDF is 29 pages and expands the methods-through-conclusion
arc with an explicit evidence ladder, artifact protocol, data-risk taxonomy,
trajectory-channel formulation, AGP row contract, deployment constraints, and publication
claim boundaries. It also includes a dedicated script-advantage section explaining
N'Ko versus Latin mapping, vowel/consonant codepoint examples, tone/nasalization
marks, and why those properties matter for CTC alignment and CER interpretation.
The related-work frame now expands the WER/CER metric problem and the orthographic
transparency label-unit problem, including scorer-unit requirements and a mapping
ambiguity formulation.
It now places an "Initial Hypothesis Stack" immediately after the introduction,
reconstructing the original project hypotheses and their current evidence status.
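To make the WER/CER metric problem and the scorer-unit question concrete, here is a minimal sketch using the standard Levenshtein-based definitions of the two metrics. It is not the papers' scorer. The function names are hypothetical, the example strings are made-up Latin placeholders rather than project data, and counting every Unicode codepoint (including spaces and combining marks) as one unit is an assumption that a real protocol would need to pin down.

```python
def edit_distance(ref, hyp) -> int:
    """Levenshtein distance between two sequences (characters or words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edits over reference length in codepoints."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: edits over reference length in whitespace-split words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    ref, hyp = "a ba la", "a bo la"  # placeholder strings, one wrong character
    print(f"CER = {cer(ref, hyp):.3f}, WER = {wer(ref, hyp):.3f}")
```

In this toy pair a single wrong character costs one unit out of seven under CER but one word out of three under WER, which is why the choice of scoring unit has to be fixed before error rates are compared across scripts.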
See `PAPER_ROADMAP.md` for the full plan, timeline, and dependencies.
Structure
paper/
├── README.md ← You are here
├── PAPER_ROADMAP.md ← 6-paper plan with timeline and costs
├── blog-series/ ← Public narrative essays for the final bundle
├── final/ ← Four standalone final papers with local assets
├── current/ ← Active paper drafts
│ ├── paper1_dead_circuits.tex
│ ├── paper2_living_speech.tex
│ ├── paper3_cross_model.tex
│ └── paper4_script_advantage.tex
├── archive/ ← Superseded drafts (kept for reference)
│ ├── main.tex, main_v2.tex, ...
│ └── *.pdf (compiled versions)
├── acl.sty ← ACL LaTeX style
└── references.bib ← Shared bibliography