Skill Entity Migration Guide
How to convert legacy `SKILL.md` files into SEA skill entities with typed inputs/outputs, scoring hooks, and hot-reload support.
Overview
The Skill Entity Architecture (SEA) transforms static SKILL.md files into autonomous daemon entities with:
- Typed inputs/outputs — structured scoring via Tier 1 (embedding) and Tier 2 (MiniMax LLM)
- Persistent memory — activation history, hot/cold topic evolution, mute state
- Versioning — SHA-256 content hashing, semantic versions, rollback snapshots
- Hot-reload — edit SKILL.md, watcher detects changes, caches update without restart
This guide walks through migrating any skill from the legacy format to a fully operational SEA entity.
---
Prerequisites
Before starting, verify these services are available:
| Service | Location | Required For |
|---|---|---|
| Ollama (all-minilm) | `http://[ip]:11434` (Mac4) | Tier 1 embedding generation |
| MiniMax-M2.5 | `http://localhost:18080` | Tier 2 LLM scoring |
| Python 3.8+ | Local | All SEA scripts |
| NumPy | `pip install numpy` | Embedding cache |
---
Step 1: Audit Your Legacy SKILL.md
Legacy skills live at `[home-path]` with this format:
---
name: pwr:power
description: Meaning Full Power - self-empowerment and motivational technique
---
# Meaning Full Power - The Beacon of Self-Empowerment
## Purpose
- Instill confidence
- Encourage exploration of personal limitations
...
## When to Use
Invoke `/power` when exploring:
- Self-empowerment and motivation
- Personal development
...
What to extract
You need four things from each SKILL.md:
| Data | Where It Goes | How to Extract |
|---|---|---|
| Skill name (`name:` in YAML header) | `state.json`, `versions.json`, scorer profile | Direct copy from YAML front matter |
| Description (`description:` in YAML header) | `SKILL_DESCRIPTIONS` dict in `tier2_scorer.py` | Direct copy, expand if too terse |
| Hot topics (from "When to Use", "Purpose", key concepts) | `state.json` `hot_topics` array | Extract 3-5 core domain keywords |
| Cold topics (things the skill should NOT activate for) | `state.json` `cold_topics` array | Identify 2-3 adjacent-but-wrong domains |
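The two front-matter fields in the table can be pulled out mechanically. The sketch below shows one way to do that with only the standard library. `split_front_matter` is a hypothetical helper, not part of the SEA scripts, and it assumes the header holds only flat `key: value` lines like the legacy example above. Hot and cold topics still need human judgment.

```python
from typing import Dict, Tuple


def split_front_matter(text: str) -> Tuple[Dict[str, str], str]:
    """Split a SKILL.md string into (front-matter fields, body).

    Hypothetical helper: handles only flat `key: value` lines.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter present
    fields: Dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Closing delimiter found: everything after it is the body
            return fields, "\n".join(lines[i + 1:])
        # Split on the first colon only, so values like "pwr:power" survive
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}, text  # header never closed; treat the input as plain body


sample = """---
name: pwr:power
description: Meaning Full Power - self-empowerment and motivational technique
---
# Meaning Full Power - The Beacon of Self-Empowerment
"""
fields, body = split_front_matter(sample)
print(fields["name"])
print(fields["description"])
```

Splitting on the first colon only matters here because skill names themselves contain a colon.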
Choosing hot vs cold topics
Hot topics are the skill's strongest activation signals — concepts where the skill is exactly the right tool.
Cold topics are concepts that seem related but where activation would be noise. They often come from the skill's own vocabulary but represent secondary or tangential ideas.
Example for `phi:veritas`:
- Hot: `truth-seeking`, `fact verification`, `discernment`, `intellectual integrity`, `enlightenment`
- Cold: `intuition`, `revelation` (mentioned in the skill body, but not its core domain)
If your SKILL.md doesn't have obvious cold topics, consider: what queries would mention this skill's vocabulary but actually need a different skill? Those are your cold topics.
---
Step 2: Create the Skill Memory Directory
Each SEA entity lives in `[home-path]`:
mkdir -p [home-path]
2a. Create `state.json`
This is the skill's mutable runtime state:
{
"skill": "your:skill",
"total_activations": 0,
"useful_activations": 0,
"suppressed_count": 0,
"hot_topics": [
"topic-1",
"topic-2",
"topic-3"
],
"cold_topics": [
"anti-topic-1",
"anti-topic-2"
],
"context_window": 20,
"confidence_calibration": 0.7,
"last_activated": null
}
Field reference:
| Field | Type | Default | Purpose |
|---|---|---|---|
| `skill` | string | — | Skill identifier (e.g., `phi:veritas`) |
| `total_activations` | int | `0` | Lifetime activation count |
| `useful_activations` | int | `0` | Activations marked helpful by user |
| `suppressed_count` | int | `0` | Times scoring was skipped (muted) |
| `hot_topics` | string[] | — | 3-5 keywords for strongest activation domains |
| `cold_topics` | string[] | — | 2-3 keywords where skill should suppress |
| `context_window` | int | `20` | Number of recent exchanges to consider |
| `confidence_calibration` | float | `0.7` | Base confidence threshold (0.0-1.0) |
| `last_activated` | string\|null | `null` | ISO 8601 timestamp of last activation |
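To illustrate the field reference, the following sketch builds a fresh state with the documented defaults and checks that all nine fields are present. `build_initial_state` and `missing_fields` are hypothetical names chosen for this example; the SEA scripts may create the file differently.

```python
import json
from typing import Any, Dict, List

# The nine fields listed in the field reference table
REQUIRED_FIELDS = (
    "skill", "total_activations", "useful_activations", "suppressed_count",
    "hot_topics", "cold_topics", "context_window",
    "confidence_calibration", "last_activated",
)


def build_initial_state(skill: str, hot: List[str], cold: List[str]) -> Dict[str, Any]:
    """Return a new state dict using the defaults from the table."""
    return {
        "skill": skill,
        "total_activations": 0,
        "useful_activations": 0,
        "suppressed_count": 0,
        "hot_topics": list(hot),
        "cold_topics": list(cold),
        "context_window": 20,
        "confidence_calibration": 0.7,
        "last_activated": None,  # serialized as JSON null
    }


def missing_fields(state: Dict[str, Any]) -> List[str]:
    """List any required field that is absent from a state dict."""
    return [name for name in REQUIRED_FIELDS if name not in state]


state = build_initial_state("your:skill", ["topic-1", "topic-2", "topic-3"],
                            ["anti-topic-1", "anti-topic-2"])
print(json.dumps(state, indent=2))
```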
2b. Create empty `activation-log.jsonl`
touch [home-path]
This append-only log records every activation event:
{"timestamp": "2026-02-18T04:30:15.123456+00:00", "event_type": "activation", "message_hash": "sha256_of_user_message", "score": 0.85, "was_injected": true, "hot_topics_matched": ["fact-checking"], "cold_topics_hit": 0}
Event types: `activation`, `injection`, `mute`, `unmute`, `reset`.
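A record with this shape could be produced and appended as in the sketch below. The function names are hypothetical, and the assumption that `message_hash` is the full hex SHA-256 digest of the UTF-8 message is mine; the guide only says the field holds a SHA-256 of the user message.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Any, Dict, List


def make_activation_event(message: str, score: float, was_injected: bool,
                          hot_topics_matched: List[str],
                          cold_topics_hit: int = 0) -> Dict[str, Any]:
    """Build one `activation` record matching the example line above."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event_type": "activation",
        # Assumption: full hex digest of the UTF-8 encoded message
        "message_hash": hashlib.sha256(message.encode("utf-8")).hexdigest(),
        "score": score,
        "was_injected": was_injected,
        "hot_topics_matched": hot_topics_matched,
        "cold_topics_hit": cold_topics_hit,
    }


def append_event(log_path: str, event: Dict[str, Any]) -> None:
    """Append one JSON object per line; existing lines are never rewritten."""
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")
```

Opening the file in append mode keeps the log append-only: earlier events are never read or modified when a new one is written.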
---
Step 3: Register in the Tier 2 Scorer
The Tier 2 scorer (`[home-path]`) needs a skill description profile.
3a. Add to `SKILL_DESCRIPTIONS`
Open `tier2_scorer.py` and add your skill to the `SKILL_DESCRIPTIONS` dict:
SKILL_DESCRIPTIONS = {
# ... existing skills ...
"your:skill": {
"name": "YourSkill",
"purpose": "One-line summary of what this skill does",
"activates_for": "comma-separated activation triggers from hot_topics",
"suppresses_for": "comma-separated suppression signals from cold_topics and beyond",
},
}
How the scorer uses this
The Tier 2 scorer builds a prompt for MiniMax using your profile:
You are {name}'s activation judge. Your only job is to decide
if this skill should contribute to the current conversation.
SKILL IDENTITY:
{purpose}
SKILL STRONGEST DOMAINS (from history):
{hot_topics from state.json}
SKILL WEAKEST FIT (suppress here):
{cold_topics from state.json}
CURRENT MESSAGE:
"{user_message}"
CONVERSATION CONTEXT (last 3 exchanges):
{recent_history}
SCORE: Output ONLY a JSON object:
{"score": 0.0-1.0, "reason": "one sentence", "inject": true/false}
Typed output schema
The scorer returns a typed `ScoreResult`:
{
"score": float, # 0.0 to 1.0 — activation confidence
"reason": str, # One-sentence explanation
"inject": bool # True if score >= 0.7 (INJECT_THRESHOLD)
}
Score interpretation:
- `1.0` — exactly what this skill was built for
- `0.7+` — inject threshold, skill contributes its perspective
- `0.5` — relevant but peripheral, no injection
- `0.0` — completely off-domain
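The prompt template and the typed result described in this step can be sketched together. This is an illustration, not the code in `tier2_scorer.py`: `build_prompt` and `normalize_score` are hypothetical names, and deriving `inject` from the score (rather than trusting the model's own flag) is an assumption based on the `INJECT_THRESHOLD` comment in the schema.

```python
from typing import Any, Dict, List

INJECT_THRESHOLD = 0.7  # value stated in the schema comment

PROMPT_TEMPLATE = """You are {name}'s activation judge. Your only job is to decide
if this skill should contribute to the current conversation.
SKILL IDENTITY:
{purpose}
SKILL STRONGEST DOMAINS (from history):
{hot_topics}
SKILL WEAKEST FIT (suppress here):
{cold_topics}
CURRENT MESSAGE:
"{user_message}"
CONVERSATION CONTEXT (last 3 exchanges):
{recent_history}
SCORE: Output ONLY a JSON object:
{{"score": 0.0-1.0, "reason": "one sentence", "inject": true/false}}"""


def build_prompt(profile: Dict[str, str], state: Dict[str, Any],
                 user_message: str, history: List[str]) -> str:
    """Fill the template from a SKILL_DESCRIPTIONS entry and state.json."""
    return PROMPT_TEMPLATE.format(
        name=profile["name"],
        purpose=profile["purpose"],
        hot_topics=", ".join(state["hot_topics"]),
        cold_topics=", ".join(state["cold_topics"]),
        user_message=user_message,
        recent_history="\n".join(history[-3:]),  # last 3 exchanges only
    )


def normalize_score(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Clamp the score to [0.0, 1.0] and recompute `inject` from the threshold."""
    score = min(1.0, max(0.0, float(raw["score"])))
    return {
        "score": score,
        "reason": str(raw.get("reason", "")),
        "inject": score >= INJECT_THRESHOLD,
    }


profile = {"name": "Power", "purpose": "Self-empowerment and motivation"}
state = {"hot_topics": ["motivation", "resilience"], "cold_topics": ["therapy"]}
prompt = build_prompt(profile, state, "I need a push", ["earlier exchange"])
print(prompt)
```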
---
Step 4: Register in the Embedding Indexer
The Tier 1 router uses embeddings for fast pre-screening (~50ms for all skills). Your skill needs to be added to the embedding cache.
4a. Add to `SEA_SKILLS` list
Open `[home-path]` and add your skill ID:
SEA_SKILLS = [
"phi:veritas", "phi:paradox", "phi:metaphysical",
"art:creative", "art:convergent", "art:divergent",
"art:synthesis", "art:snark", "art:movement", "art:dj",
"nav:nonlinear", "nav:organic", "nav:perspective",
"your:skill", # ← add here
]
4b. Rebuild the embedding cache
cd [home-path]
python embedding_indexer.py --build
This:
1. Reads each skill's SKILL.md (full text, 800+ chars typical)
2. Sends text to Ollama `all-minilm` model (384-dim embeddings)
3. Saves `Nx384` matrix to `[home-path]`
Typed input/output for Tier 1
Input: `route_message(user_message: str, threshold: float = 0.6, top_k: int = 5)`
Output:
{
"candidates": [
{
"skill": str, # e.g., "phi:veritas"
"similarity": float, # Cosine similarity (0.0-1.0)
"tier": str # "fast_pass" (≥0.6) or "tier2_review"
}
],
"message": str # Original user message
}
Routing tiers:
- `similarity ≥ 0.6` → `fast_pass` — high confidence, can skip Tier 2
- `similarity < 0.6` but in top_k → `tier2_review` — needs MiniMax confirmation
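The tier assignment can be sketched as follows. This is a standard-library illustration with toy 3-dimensional vectors, not the code in `tier1_router.py`, which presumably works on the 384-dimensional NumPy cache. The `route` function and its arguments are hypothetical; embedding the message is assumed to have happened already.

```python
import math
from typing import Any, Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(query_vec: Sequence[float], skill_vecs: Dict[str, Sequence[float]],
          message: str, threshold: float = 0.6, top_k: int = 5) -> Dict[str, Any]:
    """Rank skills by similarity, keep the top_k, and label each with a tier."""
    ranked = sorted(
        ((cosine(query_vec, vec), skill) for skill, vec in skill_vecs.items()),
        reverse=True,
    )
    candidates: List[Dict[str, Any]] = [
        {
            "skill": skill,
            "similarity": sim,
            "tier": "fast_pass" if sim >= threshold else "tier2_review",
        }
        for sim, skill in ranked[:top_k]
    ]
    return {"candidates": candidates, "message": message}


toy_vectors = {
    "phi:veritas": [1.0, 0.0, 0.0],
    "art:snark": [1.0, 2.0, 0.0],
    "nav:organic": [0.0, 1.0, 0.0],
}
result = route([1.0, 0.0, 0.0], toy_vectors, "is this claim true?", top_k=2)
for candidate in result["candidates"]:
    print(candidate)
```

With these toy vectors, `phi:veritas` matches the query exactly and is labeled `fast_pass`, while `art:snark` falls below the threshold but still makes the top 2, so it is labeled `tier2_review`.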
SKILL.md requirements for good embeddings
Your SKILL.md content directly affects embedding quality. Ensure it:
- Is 800+ characters total (YAML header + body)
- Contains domain-specific vocabulary (not generic filler)
- Has a clear "When to Use" section with concrete triggers
- Includes key concepts as standalone terms (helps the embedding model)
Short or generic SKILL.md files produce embeddings that cluster with everything and discriminate nothing.
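Two of these requirements are easy to check before rebuilding the cache. The sketch below is a hypothetical pre-flight lint, not an SEA script; it assumes the "When to Use" section is written as a level-2 Markdown heading, as in the legacy example in Step 1. Vocabulary quality still has to be judged by hand.

```python
from typing import List

MIN_CHARS = 800  # recommended minimum from the list above


def lint_skill_md(text: str) -> List[str]:
    """Return a list of human-readable problems; empty means both checks passed."""
    problems: List[str] = []
    if len(text) < MIN_CHARS:
        problems.append(f"only {len(text)} characters; {MIN_CHARS}+ recommended")
    # Assumption: the section is a level-2 heading, matched case-insensitively
    if "## when to use" not in text.lower():
        problems.append('no "When to Use" section found')
    return problems


print(lint_skill_md("---\nname: your:skill\n---\n# Too short"))
```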
---
Step 5: Initialize Versioning
The versioning system (SEA-1.4) tracks SKILL.md changes via SHA-256 content hashing.
5a. Initialize the skill version
If the `skill_versioner.py` script is available:
cd [home-path]
python skill_versioner.py init your:skill
This creates:
- `[home-path]`
- `[home-path]`
5b. Manual initialization (if versioner not available)
Create `versions.json`:
{
"skill": "your:skill",
"current_version": "1.0.0",
"current_hash": "sha256:<first-16-chars-of-sha256>",
"versions": [
{
"version": "1.0.0",
"hash": "sha256:<first-16-chars-of-sha256>",
"timestamp": "2026-02-18T12:00:00+00:00",
"source": "initial",
"changes": "Initial version tracking"
}
]
}
Compute the hash:
shasum -a 256 [home-path] | cut -c1-16
Create the snapshot directory and copy the initial version:
mkdir -p [home-path]
cp [home-path] \
[home-path]
---
Step 6: Enable Hot-Reload
The skill watcher (`skill_watcher.py`) monitors SKILL.md files for changes and automatically:
1. Detects content changes via mtime + SHA-256 comparison
2. Bumps the version (patch by default)
3. Re-embeds the single changed skill (surgical row update in `embedding-cache.npz`)
4. Writes `reload-signal.json` to trigger downstream cache invalidation
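The first two steps (change detection and the default patch bump) can be sketched as below. This is not the logic of `skill_watcher.py` itself: the function names are hypothetical, and the hash format (the `sha256:` prefix plus the first 16 hex characters) is taken from the `versions.json` example in Step 5.

```python
import hashlib
from typing import Optional, Tuple


def content_hash(data: bytes) -> str:
    """Truncated SHA-256 in the format used by versions.json."""
    return "sha256:" + hashlib.sha256(data).hexdigest()[:16]


def bump_patch(version: str) -> str:
    """Increment the patch component of a MAJOR.MINOR.PATCH version."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


def check_for_change(data: bytes, mtime: float, last_mtime: float,
                     current_hash: str,
                     current_version: str) -> Optional[Tuple[str, str]]:
    """Return (new_version, new_hash) if the content changed, else None."""
    if mtime == last_mtime:
        return None  # file not touched since the last check
    new_hash = content_hash(data)
    if new_hash == current_hash:
        return None  # mtime moved but content is identical
    return bump_patch(current_version), new_hash


old = content_hash(b"# Skill\noriginal text\n")
print(check_for_change(b"# Skill\noriginal text\n", 2.0, 1.0, old, "1.0.0"))
print(check_for_change(b"# Skill\nedited text\n", 2.0, 1.0, old, "1.0.0"))
```

Checking the modification time first is a cheap filter; the hash comparison is what decides whether a version bump happens.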
How it works
You edit SKILL.md
│
▼
skill_watcher.py detects mtime change
│
├─ SHA-256 differs? → bump version, save snapshot
│ re-embed single row in cache
│ write reload-signal.json
│
└─ SHA-256 same? → skip (cosmetic mtime change)
▼
tier1_router.py polls reload-signal.json before route_message()
└─ signal found → force-reload embedding cache from disk
tier2_scorer.py polls reload-signal.json before score_skill()
└─ signal found → drop cached profiles for changed skills
Requirements for hot-reload to work
Your skill needs:
- SKILL.md at `[home-path]` (the watcher monitors this path)
- Versioning initialized (Step 5)
- Embedding cache includes your skill (Step 4)
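On the consumer side, polling the signal file before each call might look like the sketch below. The guide does not document the contents of `reload-signal.json`, so the `changed_skills` key, the `ReloadSignalPoller` class, and the use of the file's modification time to detect a new signal are all assumptions made for illustration.

```python
import json
import os
from typing import List, Optional


class ReloadSignalPoller:
    """Reports a reload signal once per write, without deleting the file.

    Hypothetical design: each consumer keeps its own poller, so the router
    and the scorer can both observe the same signal.
    """

    def __init__(self, signal_path: str) -> None:
        self.signal_path = signal_path
        self._last_seen_mtime_ns: Optional[int] = None

    def poll(self) -> List[str]:
        """Return changed skill IDs if the signal is new, else an empty list."""
        try:
            mtime_ns = os.stat(self.signal_path).st_mtime_ns
        except FileNotFoundError:
            return []  # no signal has been written
        if mtime_ns == self._last_seen_mtime_ns:
            return []  # this signal was already handled
        self._last_seen_mtime_ns = mtime_ns
        with open(self.signal_path, encoding="utf-8") as fh:
            payload = json.load(fh)
        # Assumed schema: {"changed_skills": ["your:skill", ...]}
        return list(payload.get("changed_skills", []))
```

A caller would invoke `poll()` at the top of its routing or scoring function and refresh its caches for any skill IDs returned.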
The watcher runs either as a daemon or a one-shot check:
# Daemon mode (continuous polling)
python skill_watcher.py --daemon --interval 10
# One-shot check
python skill_watcher.py --check
---
Step 7: Verify the Migration
7a. Check state files
# Verify state.json is valid JSON
python3 -c "import json; json.load(open('$HOME/.clawdbot/skill-memory/your:skill/state.json'))"
# Verify versions.json
python3 -c "import json; json.load(open('$HOME/.clawdbot/skill-memory/your:skill/versions.json'))"
7b. Test Tier 1 routing
cd [home-path]
python tier1_router.py --message "a message your skill should match" --json
Your skill should appear in the candidates list with similarity > 0.3. If it doesn't appear at all, your SKILL.md may be too short or too generic.
7c. Test Tier 2 scoring
python tier2_scorer.py "a message your skill should match" '["your:skill"]' --json
Expected output:
{"score": 0.85, "reason": "...", "inject": true}
If score is consistently low for on-domain messages, refine your `SKILL_DESCRIPTIONS` entry and `hot_topics`.
7d. Test mute/unmute (optional)
python skill_controller.py mute your:skill --reason "testing"
python skill_controller.py status your:skill
python skill_controller.py unmute your:skill
7e. Test hot-reload
# Edit your SKILL.md
echo "## New Section" >> [home-path]
# Run watcher check
python skill_watcher.py --check
# Verify version bumped
python skill_versioner.py history your:skill
---
Complete Migration Checklist
Use this checklist for each skill you migrate:
Migration: {skill_id}
─────────────────────────────────
Preparation
[ ] SKILL.md exists at [home-path]
[ ] SKILL.md is 800+ characters with domain-specific vocabulary
[ ] Hot topics extracted (3-5 keywords)
[ ] Cold topics identified (2-3 keywords)
Skill Memory (Step 2)
[ ] Directory created: [home-path]
[ ] state.json created with all 9 fields
[ ] activation-log.jsonl created (empty)
Tier 2 Scorer (Step 3)
[ ] Entry added to SKILL_DESCRIPTIONS in tier2_scorer.py
[ ] Dry-run validates: python tier2_scorer.py --dry-run
Tier 1 Router (Step 4)
[ ] Skill added to SEA_SKILLS list in embedding_indexer.py
[ ] Embedding cache rebuilt: python embedding_indexer.py --build
[ ] Routing test passes: python tier1_router.py --message "..." --json
Versioning (Step 5)
[ ] versions.json created with v1.0.0
[ ] Snapshot saved: snapshots/v1.0.0.md
Hot-Reload (Step 6)
[ ] Watcher detects changes to SKILL.md
[ ] reload-signal.json is consumed by router and scorer
Validation (Step 7)
[ ] Tier 1 returns skill as candidate for on-domain messages
[ ] Tier 2 scores ≥ 0.7 for on-domain messages
[ ] Tier 2 scores ≤ 0.3 for off-domain messages
[ ] Mute/unmute cycle works
[ ] Hot-reload cycle works
---
Migration Example: `pwr:power`
Full worked example converting the `pwr:power` skill.
1. Audit SKILL.md
From `[home-path]`:
- Name: `pwr:power`
- Description: "Meaning Full Power - self-empowerment and motivational technique"
- Hot topics: `self-empowerment`, `motivation`, `personal development`, `resilience`, `confidence`
- Cold topics: `fitness routines`, `career advice`, `therapy`
2. Create skill memory
mkdir -p [home-path]
state.json:
{
"skill": "pwr:power",
"total_activations": 0,
"useful_activations": 0,
"suppressed_count": 0,
"hot_topics": [
"self-empowerment",
"motivation",
"personal development",
"resilience",
"confidence"
],
"cold_topics": [
"fitness routines",
"career advice",
"therapy"
],
"context_window": 20,
"confidence_calibration": 0.7,
"last_activated": null
}
3. Add scorer profile
"pwr:power": {
"name": "Power",
"purpose": "Self-empowerment, motivation, resilience, and personal growth",
"activates_for": "motivation, self-empowerment, confidence building, overcoming challenges, personal growth",
"suppresses_for": "fitness plans, career planning, therapy, debugging, data queries",
},
4. Add to embedding indexer
SEA_SKILLS = [
# ... existing 13 ...
"pwr:power",
]
Then rebuild: `python embedding_indexer.py --build`
5. Initialize versioning
python skill_versioner.py init pwr:power
6. Validate
python tier1_router.py --message "I need motivation to push through this challenge" --json
# Expected: pwr:power appears with similarity > 0.5
python tier2_scorer.py "I need motivation to push through this challenge" '["pwr:power"]' --json
# Expected: score > 0.7, inject: true
python tier1_router.py --message "fix the null pointer exception in main.py" --json
# Expected: pwr:power does NOT appear (off-domain)
---
Troubleshooting
Skill never activates (Tier 1 similarity too low)
- SKILL.md too short — Embedding quality degrades below ~500 characters. Add more domain-specific content.
- SKILL.md too generic — Terms like "help", "improve", "better" appear in every skill. Use precise domain vocabulary.
- Wrong embedding model — Verify Ollama is serving `all-minilm`. Run: `curl http://[ip]:11434/api/tags`
Skill activates for everything (Tier 1 similarity too high)
- SKILL.md overlaps with other skills — Check the similarity matrix: `python embedding_indexer.py --matrix`. Related skills should be 0.5-0.8, unrelated should be < 0.3.
- Hot topics too broad — Narrow from "creativity" to "brainstorming SCAMPER techniques".
Tier 2 scorer returns invalid JSON
- MiniMax occasionally wraps JSON in markdown fences. The `_parse_score_json()` function handles this. If you're getting parse errors, check MiniMax is responding at all: `curl http://localhost:18080/v1/models`
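For reference, fence-tolerant parsing can be done as in the sketch below. This is an independent illustration of the idea, not the project's `_parse_score_json()`; `parse_score_reply` is a hypothetical name, and the fence marker is built with string multiplication only to keep literal backticks out of the example.

```python
import json
from typing import Any, Dict

FENCE = "`" * 3  # the three-backtick Markdown fence marker


def parse_score_reply(reply: str) -> Dict[str, Any]:
    """Parse a scorer reply that may or may not be wrapped in a code fence."""
    text = reply.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line, including an optional language tag
        first_newline = text.find("\n")
        text = text[first_newline + 1:] if first_newline != -1 else ""
        # Drop the closing fence if present
        text = text.rstrip()
        if text.endswith(FENCE):
            text = text[:-len(FENCE)]
    return json.loads(text)  # raises json.JSONDecodeError on invalid JSON


plain = '{"score": 0.85, "reason": "on-domain", "inject": true}'
fenced = FENCE + "json\n" + plain + "\n" + FENCE
print(parse_score_reply(plain) == parse_score_reply(fenced))
```

If a reply still fails to parse after unwrapping, the raised `json.JSONDecodeError` points to a malformed reply rather than a formatting wrapper.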
Hot-reload not detecting changes
- Watcher not running — Start it: `python skill_watcher.py --daemon`
- Only mtime changed, not content — The watcher uses SHA-256 hashing, so touching a file without changing content is a no-op (by design).
- Skill not in SEA_SKILLS — The watcher only monitors skills listed in `embedding_indexer.py`'s `SEA_SKILLS`.
Muted skill still appearing
- In the default flow, mute state is checked at scoring time, not at routing time, so a muted skill can still be listed as a Tier 1 candidate. Verify that `is_muted()` is imported in both `tier1_router.py` and `tier2_scorer.py`.
---
Architecture Reference
File Layout After Migration
[home-path]
├── skills/
│ └── your:skill/
│ └── SKILL.md ← Source of truth (edit this)
│
├── skill-memory/
│ ├── your:skill/
│ │ ├── state.json ← Mutable runtime state
│ │ ├── activation-log.jsonl ← Append-only event log
│ │ ├── versions.json ← Version history
│ │ └── snapshots/
│ │ └── v1.0.0.md ← Rollback snapshot
│ │
│ ├── embedding-cache.npz ← Nx384 embedding matrix
│ └── reload-signal.json ← Hot-reload trigger (transient)
│
└── scripts/sea/
├── embedding_indexer.py ← Embedding generation & cache
├── tier1_router.py ← Fast cosine similarity routing
├── tier2_scorer.py ← MiniMax LLM scoring
├── skill_versioner.py ← Version tracking & rollback
├── skill_watcher.py ← File change detection & reload
└── skill_controller.py ← Mute/unmute/reset commands
Data Flow
User Message
│
▼
┌──────────────────┐
│ Tier 1 Router │ Embed message → cosine similarity vs all skills
│ (~50ms) │ Output: ranked candidates with tier labels
└────────┬─────────┘
│ candidates where similarity ≥ threshold
▼
┌──────────────────┐
│ Tier 2 Scorer │ Build scoring prompt → call MiniMax → parse JSON
│ (~3.4s async) │ Output: {score, reason, inject} per candidate
└────────┬─────────┘
│ skills where inject=true (score ≥ 0.7)
▼
┌──────────────────┐
│ Injection │ Activated skill perspectives appended to response
│ (pending spec) │ Format TBD: prepend / footnote / sidebar
└──────────────────┘
Version Lifecycle
v1.0.0 (init) → v1.0.1 (patch: wording tweak) → v1.1.0 (minor: new section)
│ │
└── snapshots/v1.0.0.md └── snapshots/v1.1.0.md
│
rollback possible
via skill_versioner.py
Scoring Hook Integration Points
The SEA router integrates with the Clawdbot gateway via hooks at `[home-path]`:
Gateway receives message
│
├─→ Existing hooks run (message-logger, context-injector, etc.)
│
├─→ SEA Router Hook (proposed, async fire-and-forget)
│ ├─ Reads: embedding-cache.npz, skill state.json files
│ ├─ Writes: activation-log.jsonl, state.json counters
│ └─ Triggers: on response:sent event
│
└─→ Response sent to user
The hook is async and non-blocking, so it does not add latency to the main response path. Tier 2 scoring happens after the response is already sent, building up activation history for future routing improvements.
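Since the hook is still a proposal, the following is only a sketch of the fire-and-forget pattern using `asyncio`. The function names are hypothetical and the sleep stands in for the Tier 1 and Tier 2 work; how the real gateway schedules hooks is not specified in this guide.

```python
import asyncio
from typing import List

events: List[str] = []  # records the order in which things happen


async def sea_router_hook(message: str) -> None:
    """Stand-in for routing and scoring; runs after the response is sent."""
    await asyncio.sleep(0.01)  # placeholder for the real scoring latency
    events.append(f"scored: {message}")


async def handle_message(message: str) -> asyncio.Task:
    """Send the response, then schedule the hook without awaiting it."""
    events.append(f"response sent: {message}")
    # create_task schedules the coroutine and returns immediately.
    # Returning the task keeps a reference so it is not garbage collected.
    return asyncio.create_task(sea_router_hook(message))


async def main() -> None:
    task = await handle_message("hello")
    await task  # only so this demo exits cleanly; a gateway would keep running


asyncio.run(main())
print(events)  # ['response sent: hello', 'scored: hello']
```

The response is recorded before scoring starts, so slow Tier 2 calls affect only the activation history and never the user-facing path.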
Source Anchor
skill-entity-architecture/MIGRATION-GUIDE.md