architecture, technical paper candidate, score 38

SEA Re-Embedding Complete — Topic-Augmented Embeddings

Re-embedded all 13 skill entities with topic-augmented text for improved Tier 1 matching. The core problem was that embeddings were generated from raw SKILL.md content (technique descriptions, markdown formatting) which sits in a different semantic space than user queries. Fix: restructure embedding text to be query-dominant.

Extracted abstract or opening context

Re-embedded all 13 skill entities with topic-augmented text for improved Tier 1 matching. The core problem was that embeddings were generated from raw SKILL.md content (technique descriptions, markdown formatting) which sits in a different semantic space than user queries. Fix: restructure embedding text to be query-dominant.

**Before:** `"{skill_name}: {yaml_description}\n{skill_md_body}"` truncated to 800 chars. The body was SKILL.md markdown: technique instructions, not user query language.

**After:** Query-dominant structure (~70% example queries, ~20% hot topics, ~10% description):

New components:

- `_load_skill_topics()`: reads `hot_topics` and `cold_topics` from each skill's `state.json`
- `SKILL_EXAMPLE_QUERIES`: 7-10 natural-language user queries per skill that should trigger that skill
- SKILL.md body stripped: only the YAML `description:` field is kept (body was semantic noise)
- Budget increased from 800 to 1200 chars to accommodate richer text

**Before:** `get_sea_skill_patterns()` (lines 397-411, used by spawner.py) was missing patterns that existed in `_keyword_fallback()` (lines 153-167):
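The query-dominant structure described above could be assembled roughly as follows. This is a minimal sketch, not the actual implementation: `load_skill_topics`, `build_embedding_text`, `_fit`, and `SHARES` are hypothetical names, and splitting the 1200-character budget by fixed character shares (queries first, then hot topics, then the description) is an assumption about how the ~70/20/10 split might be enforced.

```python
import json
from pathlib import Path

BUDGET_CHARS = 1200  # from the note: budget raised from 800 to 1200 chars
# Approximate shares from the note; allocating by character count is an assumption.
SHARES = {"queries": 0.70, "topics": 0.20, "description": 0.10}


def load_skill_topics(state_path: Path) -> dict:
    """Hypothetical stand-in for _load_skill_topics(): read topics from state.json."""
    try:
        state = json.loads(state_path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        state = {}
    if not isinstance(state, dict):
        state = {}
    return {
        "hot_topics": list(state.get("hot_topics", [])),
        "cold_topics": list(state.get("cold_topics", [])),
    }


def _fit(items, limit, sep):
    """Join whole items with sep, stopping before the text would exceed limit chars."""
    kept, used = [], 0
    for item in items:
        cost = len(item) + (len(sep) if kept else 0)
        if used + cost > limit:
            break
        kept.append(item)
        used += cost
    return sep.join(kept)


def build_embedding_text(skill_name, description, example_queries, hot_topics,
                         budget=BUDGET_CHARS):
    """Build query-dominant text: example queries, then hot topics, then description."""
    queries = _fit(example_queries, int(budget * SHARES["queries"]), "\n")
    topics = _fit(hot_topics, int(budget * SHARES["topics"]), ", ")
    # Only the short YAML description is kept; the SKILL.md body is not included.
    desc = f"{skill_name}: {description}"[: int(budget * SHARES["description"])]
    parts = [p for p in (queries, topics, desc) if p]
    # Final slice guards the overall budget once separators are added.
    return "\n".join(parts)[:budget]
```

Keeping whole queries and whole topics, rather than cutting mid-string, means each retained item still reads like something a user would type; only the description is truncated by character count.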

Promotion decision

What has to happen next

Promote into a technical note or architecture paper with implementation anchors.

Why this is not always a full paper yet

Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.