Grand Diomande Research · Full HTML Reader

Skill Entity Migration Guide

How to convert legacy `SKILL.md` files into SEA skill entities with typed inputs/outputs, scoring hooks, and hot-reload support.

Agents That Account for Themselves architecture technical paper candidate score 54 .md

Overview

The Skill Entity Architecture (SEA) transforms static SKILL.md files into autonomous daemon entities with:

- Typed inputs/outputs: structured scoring via Tier 1 (embedding) and Tier 2 (MiniMax LLM)
- Persistent memory: activation history, hot/cold topic evolution, mute state
- Versioning: SHA-256 content hashing, semantic versions, rollback snapshots
- Hot-reload: edit SKILL.md, and the watcher detects the change and updates caches without a restart

This guide walks through migrating any skill from the legacy format to a fully operational SEA entity.

---

Prerequisites

Before starting, verify these services are available:

Service	Location	Required For
Ollama (all-minilm)	`http://[ip]:11434` (Mac4)	Tier 1 embedding generation
MiniMax-M2.5	`http://localhost:18080`	Tier 2 LLM scoring
Python 3.8+	Local	All SEA scripts
NumPy	`pip install numpy`	Embedding cache

---

Step 1: Audit Your Legacy SKILL.md

Legacy skills live at `[home-path]` with this format:

markdown

---
name: pwr:power
description: Meaning Full Power - self-empowerment and motivational technique
---

# Meaning Full Power - The Beacon of Self-Empowerment

## Purpose
- Instill confidence
- Encourage exploration of personal limitations
...

## When to Use
Invoke `/power` when exploring:
- Self-empowerment and motivation
- Personal development
...

What to extract

You need four things from each SKILL.md:

Data	Where It Goes	How to Extract
Skill name (`name:` in YAML header)	`state.json`, `versions.json`, scorer profile	Direct copy from YAML front matter
Description (`description:` in YAML header)	`SKILL_DESCRIPTIONS` dict in `tier2_scorer.py`	Direct copy, expand if too terse
Hot topics (from "When to Use", "Purpose", key concepts)	`state.json` `hot_topics` array	Extract 3-5 core domain keywords
Cold topics (things the skill should NOT activate for)	`state.json` `cold_topics` array	Identify 2-3 adjacent-but-wrong domains
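
The first two rows of the table can be pulled out mechanically. The following is a minimal sketch, assuming the header is a simple `key: value` block between two `---` lines as in the sample above; `parse_front_matter` is a hypothetical helper, not part of the SEA scripts, and it does not handle nested or multi-line YAML.

```python
from typing import Dict


def parse_front_matter(text: str) -> Dict[str, str]:
    """Return key/value pairs from a leading '---' delimited header."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter present
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter ends the header
            break
        # Split on the first colon only, so values like "pwr:power" survive.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


sample = """---
name: pwr:power
description: Meaning Full Power - self-empowerment and motivational technique
---

# Meaning Full Power - The Beacon of Self-Empowerment
"""

meta = parse_front_matter(sample)
print(meta["name"])         # pwr:power
print(meta["description"])
```

Hot and cold topics still need human judgment; only the name and description are safe to copy automatically.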

Choosing hot vs cold topics

Hot topics are the skill's strongest activation signals — concepts where the skill is exactly the right tool.

Cold topics are concepts that seem related but where activation would be noise. They often come from the skill's own vocabulary but represent secondary or tangential ideas.

Example for `phi:veritas`:
- Hot: `truth-seeking`, `fact verification`, `discernment`, `intellectual integrity`, `enlightenment`
- Cold: `intuition`, `revelation` (mentioned in the skill body, but not its core domain)

If your SKILL.md doesn't have obvious cold topics, consider: what queries would mention this skill's vocabulary but actually need a different skill? Those are your cold topics.

---

Step 2: Create the Skill Memory Directory

Each SEA entity lives in `[home-path]`:

bash

mkdir -p [home-path]

2a. Create `state.json`

This is the skill's mutable runtime state:

json

{
  "skill": "your:skill",
  "total_activations": 0,
  "useful_activations": 0,
  "suppressed_count": 0,
  "hot_topics": [
    "topic-1",
    "topic-2",
    "topic-3"
  ],
  "cold_topics": [
    "anti-topic-1",
    "anti-topic-2"
  ],
  "context_window": 20,
  "confidence_calibration": 0.7,
  "last_activated": null
}

Field reference:

Field	Type	Default	Purpose
`skill`	string	—	Skill identifier (e.g., `phi:veritas`)
`total_activations`	int	`0`	Lifetime activation count
`useful_activations`	int	`0`	Activations marked helpful by user
`suppressed_count`	int	`0`	Times scoring was skipped (muted)
`hot_topics`	string[]	—	3-5 keywords for strongest activation domains
`cold_topics`	string[]	—	2-3 keywords where skill should suppress
`context_window`	int	`20`	Number of recent exchanges to consider
`confidence_calibration`	float	`0.7`	Base confidence threshold (0.0-1.0)
`last_activated`	string|null	`null`	ISO 8601 timestamp of last activation
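
If you are migrating many skills, generating and checking `state.json` programmatically may be less error-prone than hand-editing. The sketch below builds a state dict from the documented defaults and checks field presence and types. `new_state`, `validate_state`, and `FIELD_TYPES` are hypothetical names introduced for illustration; the SEA scripts may perform their own validation.

```python
import json
from typing import Any, Dict, List

# Field names and types follow the field reference table above.
# last_activated may be null; calibration accepts int or float from JSON.
FIELD_TYPES = {
    "skill": str,
    "total_activations": int,
    "useful_activations": int,
    "suppressed_count": int,
    "hot_topics": list,
    "cold_topics": list,
    "context_window": int,
    "confidence_calibration": (int, float),
    "last_activated": (str, type(None)),
}


def new_state(skill: str, hot_topics: List[str], cold_topics: List[str]) -> Dict[str, Any]:
    """Build a fresh state dict using the documented defaults."""
    return {
        "skill": skill,
        "total_activations": 0,
        "useful_activations": 0,
        "suppressed_count": 0,
        "hot_topics": list(hot_topics),
        "cold_topics": list(cold_topics),
        "context_window": 20,
        "confidence_calibration": 0.7,
        "last_activated": None,
    }


def validate_state(state: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the state looks well-formed."""
    problems = []
    for field, expected in FIELD_TYPES.items():
        if field not in state:
            problems.append(f"missing field: {field}")
        elif not isinstance(state[field], expected):
            problems.append(f"wrong type for field: {field}")
    return problems


state = new_state(
    "pwr:power",
    ["self-empowerment", "motivation", "resilience"],
    ["career advice", "therapy"],
)
print(json.dumps(state, indent=2))
print(validate_state(state))  # []
```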

2b. Create empty `activation-log.jsonl`

bash

touch [home-path]

This append-only log records every activation event:

json

{"timestamp": "2026-02-18T04:30:15.123456+00:00", "event_type": "activation", "message_hash": "sha256_of_user_message", "score": 0.85, "was_injected": true, "hot_topics_matched": ["fact-checking"], "cold_topics_hit": 0}

Event types: `activation`, `injection`, `mute`, `unmute`, `reset`.
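
To make the log format concrete, here is a rough sketch of how one event line could be produced and appended. The field names are taken from the sample line above; `build_event` and `append_event` are hypothetical helpers, and hashing the raw UTF-8 message with SHA-256 is an assumption based on the `sha256_of_user_message` placeholder.

```python
import hashlib
import json
import os
import tempfile
from datetime import datetime, timezone
from typing import Any, Dict, List


def build_event(message: str, score: float, was_injected: bool,
                hot_topics_matched: List[str], cold_topics_hit: int,
                event_type: str = "activation") -> Dict[str, Any]:
    """Assemble one log record; the user message is stored only as a hash."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event_type": event_type,
        "message_hash": hashlib.sha256(message.encode("utf-8")).hexdigest(),
        "score": score,
        "was_injected": was_injected,
        "hot_topics_matched": hot_topics_matched,
        "cold_topics_hit": cold_topics_hit,
    }


def append_event(log_path: str, event: Dict[str, Any]) -> None:
    """Append a single JSON object as one line; existing lines are never rewritten."""
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")


# Demonstration against a throwaway directory rather than the real skill-memory path.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "activation-log.jsonl")
    append_event(log_path, build_event("is this claim true?", 0.85, True, ["fact-checking"], 0))
    append_event(log_path, build_event("fix my build", 0.1, False, [], 0))
    with open(log_path, encoding="utf-8") as fh:
        logged = [json.loads(line) for line in fh]

print(len(logged), logged[0]["event_type"])
```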

---

Step 3: Register in the Tier 2 Scorer

The Tier 2 scorer (`[home-path]`) needs a skill description profile.

3a. Add to `SKILL_DESCRIPTIONS`

Open `tier2_scorer.py` and add your skill to the `SKILL_DESCRIPTIONS` dict:

python

SKILL_DESCRIPTIONS = {
    # ... existing skills ...

    "your:skill": {
        "name": "YourSkill",
        "purpose": "One-line summary of what this skill does",
        "activates_for": "comma-separated activation triggers from hot_topics",
        "suppresses_for": "comma-separated suppression signals from cold_topics and beyond",
    },
}

How the scorer uses this

The Tier 2 scorer builds a prompt for MiniMax using your profile:

You are {name}'s activation judge. Your only job is to decide
if this skill should contribute to the current conversation.

SKILL IDENTITY:
{purpose}

SKILL STRONGEST DOMAINS (from history):
{hot_topics from state.json}

SKILL WEAKEST FIT (suppress here):
{cold_topics from state.json}

CURRENT MESSAGE:
"{user_message}"

CONVERSATION CONTEXT (last 3 exchanges):
{recent_history}

SCORE: Output ONLY a JSON object:
{"score": 0.0-1.0, "reason": "one sentence", "inject": true/false}
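
As an illustration of how the template slots might be filled, the sketch below assembles the prompt from a profile entry and a `state.json` dict. `build_scoring_prompt` is a hypothetical name, and treating each history entry as one pre-formatted exchange string is an assumption; the actual scorer may format context differently.

```python
from typing import Dict, List


def build_scoring_prompt(profile: Dict[str, str], state: Dict[str, list],
                         user_message: str, recent_history: List[str]) -> str:
    """Fill the Tier 2 template; only the last 3 history entries are included."""
    history = "\n".join(recent_history[-3:])  # assumption: one string per exchange
    lines = [
        f"You are {profile['name']}'s activation judge. Your only job is to decide",
        "if this skill should contribute to the current conversation.",
        "",
        "SKILL IDENTITY:",
        profile["purpose"],
        "",
        "SKILL STRONGEST DOMAINS (from history):",
        ", ".join(state["hot_topics"]),
        "",
        "SKILL WEAKEST FIT (suppress here):",
        ", ".join(state["cold_topics"]),
        "",
        "CURRENT MESSAGE:",
        f'"{user_message}"',
        "",
        "CONVERSATION CONTEXT (last 3 exchanges):",
        history,
        "",
        "SCORE: Output ONLY a JSON object:",
        '{"score": 0.0-1.0, "reason": "one sentence", "inject": true/false}',
    ]
    return "\n".join(lines)


prompt = build_scoring_prompt(
    {"name": "Power", "purpose": "Self-empowerment, motivation, resilience, and personal growth"},
    {"hot_topics": ["motivation", "resilience"], "cold_topics": ["therapy"]},
    "I need motivation to push through this challenge",
    ["exchange 1", "exchange 2", "exchange 3", "exchange 4"],
)
print(prompt)
```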

Typed output schema

The scorer returns a typed `ScoreResult`:

python

{
    "score": float,    # 0.0 to 1.0 — activation confidence
    "reason": str,     # One-sentence explanation
    "inject": bool     # True if score >= 0.7 (INJECT_THRESHOLD)
}

Score interpretation:
- `1.0` — exactly what this skill was built for
- `0.7+` — inject threshold, skill contributes its perspective
- `0.5` — relevant but peripheral, no injection
- `0.0` — completely off-domain
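
A caller may want to sanity-check the model's output before acting on it. The following is a minimal sketch, assuming the `inject` flag is meant to be a pure function of the score and the 0.7 threshold as the schema comment suggests; `normalize_score_result` is a hypothetical helper, and recomputing `inject` instead of trusting the model's value is a design assumption.

```python
from typing import Any, Dict

INJECT_THRESHOLD = 0.7  # value taken from the schema comment above


def normalize_score_result(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Coerce types, range-check the score, and derive inject from the threshold."""
    score = float(raw["score"])
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    return {
        "score": score,
        "reason": str(raw.get("reason", "")),
        # Derived locally so a model that returns an inconsistent flag cannot force injection.
        "inject": score >= INJECT_THRESHOLD,
    }


print(normalize_score_result({"score": 0.85, "reason": "on-domain", "inject": False}))
print(normalize_score_result({"score": 0.5, "reason": "peripheral", "inject": True}))
```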

---

Step 4: Register in the Embedding Indexer

The Tier 1 router uses embeddings for fast pre-screening (~50ms for all skills). Your skill needs to be added to the embedding cache.

4a. Add to `SEA_SKILLS` list

Open `[home-path]` and add your skill ID:

python

SEA_SKILLS = [
    "phi:veritas", "phi:paradox", "phi:metaphysical",
    "art:creative", "art:convergent", "art:divergent",
    "art:synthesis", "art:snark", "art:movement", "art:dj",
    "nav:nonlinear", "nav:organic", "nav:perspective",
    "your:skill",  # ← add here
]

4b. Rebuild the embedding cache

bash

cd [home-path]
python embedding_indexer.py --build

This:
1. Reads each skill's SKILL.md (full text, 800+ chars typical)
2. Sends text to Ollama `all-minilm` model (384-dim embeddings)
3. Saves `Nx384` matrix to `[home-path]`

Typed input/output for Tier 1

Input: `route_message(user_message: str, threshold: float = 0.6, top_k: int = 5)`

Output:

python

{
    "candidates": [
        {
            "skill": str,         # e.g., "phi:veritas"
            "similarity": float,  # Cosine similarity (0.0-1.0)
            "tier": str           # "fast_pass" (≥0.6) or "tier2_review"
        }
    ],
    "message": str  # Original user message
}

Routing tiers:
- `similarity ≥ 0.6` → `fast_pass` — high confidence, can skip Tier 2
- `similarity < 0.6` but in top_k → `tier2_review` — needs MiniMax confirmation
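
The tier labelling can be illustrated with toy vectors. This sketch uses pure-Python cosine similarity on 3-dimensional vectors purely for readability; the real router presumably operates on the 384-dimensional NumPy cache. `cosine` and `route_vectors` are hypothetical names, and the function takes pre-computed vectors rather than a raw message.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route_vectors(message_vec: Sequence[float],
                  skill_vecs: Dict[str, Sequence[float]],
                  threshold: float = 0.6, top_k: int = 5) -> List[dict]:
    """Rank skills by similarity, keep the top_k, and label each with a tier."""
    scored = [
        {"skill": skill, "similarity": cosine(message_vec, vec)}
        for skill, vec in skill_vecs.items()
    ]
    scored.sort(key=lambda c: c["similarity"], reverse=True)
    for candidate in scored:
        candidate["tier"] = "fast_pass" if candidate["similarity"] >= threshold else "tier2_review"
    return scored[:top_k]


# Toy embeddings: "a" is identical to the message, "b" partially aligned, "c" orthogonal.
skills = {"a": [1.0, 0.0, 0.0], "b": [1.0, 2.0, 0.0], "c": [0.0, 1.0, 0.0]}
candidates = route_vectors([1.0, 0.0, 0.0], skills, top_k=2)
for c in candidates:
    print(c["skill"], round(c["similarity"], 3), c["tier"])
```

With `top_k=2`, skill `c` is dropped entirely, `a` is a `fast_pass`, and `b` falls below the threshold and would go to Tier 2 review.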

SKILL.md requirements for good embeddings

Your SKILL.md content directly affects embedding quality. Ensure it:

- Is 800+ characters total (YAML header + body)
- Contains domain-specific vocabulary (not generic filler)
- Has a clear "When to Use" section with concrete triggers
- Includes key concepts as standalone terms (helps the embedding model)

Short or generic SKILL.md files produce embeddings that cluster with everything and discriminate nothing.
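
Some of these requirements can be checked before rebuilding the cache. The sketch below is one possible pre-flight lint, not part of the SEA tooling: `lint_skill_md` is a hypothetical helper, the generic-term list is borrowed from the Troubleshooting section, and the 5% ratio is an arbitrary illustrative cutoff.

```python
import re
from typing import List

GENERIC_TERMS = {"help", "improve", "better"}  # examples named in Troubleshooting
MAX_GENERIC_RATIO = 0.05  # assumption: illustrative cutoff, tune for your skills


def lint_skill_md(text: str, min_chars: int = 800) -> List[str]:
    """Return human-readable warnings; an empty list means no issues were spotted."""
    warnings = []
    if len(text) < min_chars:
        warnings.append(f"only {len(text)} characters; aim for {min_chars}+")
    if "when to use" not in text.lower():
        warnings.append('no "When to Use" section found')
    words = re.findall(r"[a-z']+", text.lower())
    generic = sum(1 for word in words if word in GENERIC_TERMS)
    if words and generic / len(words) > MAX_GENERIC_RATIO:
        warnings.append("high share of generic terms; use more domain vocabulary")
    return warnings


short_doc = "# Helper\nThis skill will help you improve and get better."
good_doc = "## When to Use\n" + "truth-seeking fact verification discernment " * 30

print(lint_skill_md(short_doc))
print(lint_skill_md(good_doc))  # []
```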

---

Step 5: Initialize Versioning

The versioning system (SEA-1.4) tracks SKILL.md changes via SHA-256 content hashing.

5a. Initialize the skill version

If the `skill_versioner.py` script is available:

bash

cd [home-path]
python skill_versioner.py init your:skill

This creates:
- `[home-path]`
- `[home-path]`

5b. Manual initialization (if versioner not available)

Create `versions.json`:

json

{
  "skill": "your:skill",
  "current_version": "1.0.0",
  "current_hash": "sha256:<first-16-chars-of-sha256>",
  "versions": [
    {
      "version": "1.0.0",
      "hash": "sha256:<first-16-chars-of-sha256>",
      "timestamp": "2026-02-18T12:00:00+00:00",
      "source": "initial",
      "changes": "Initial version tracking"
    }
  ]
}

Compute the hash:

bash

shasum -a 256 [home-path] | cut -c1-16
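
The same truncated hash can be computed in Python, which may be convenient where `shasum` is not installed. This is a sketch under the assumption that the stored hash is the first 16 hex characters of the SHA-256 digest with a `sha256:` prefix, as in the `versions.json` template above; `short_hash` and `initial_versions` are hypothetical helpers.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Any, Dict


def short_hash(content: bytes) -> str:
    """First 16 hex characters of the SHA-256 digest, prefixed as in versions.json."""
    return "sha256:" + hashlib.sha256(content).hexdigest()[:16]


def initial_versions(skill: str, content: bytes) -> Dict[str, Any]:
    """Build the v1.0.0 versions.json structure for a skill's SKILL.md content."""
    digest = short_hash(content)
    return {
        "skill": skill,
        "current_version": "1.0.0",
        "current_hash": digest,
        "versions": [
            {
                "version": "1.0.0",
                "hash": digest,
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "source": "initial",
                "changes": "Initial version tracking",
            }
        ],
    }


# In practice, read the bytes of your SKILL.md; a literal is used here for illustration.
versions = initial_versions("your:skill", b"# Example skill body\n")
print(json.dumps(versions, indent=2))
```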

Create the snapshot directory and copy the initial version:

bash

mkdir -p [home-path]
cp [home-path] \
   [home-path]

---

Step 6: Enable Hot-Reload

The skill watcher (`skill_watcher.py`) monitors SKILL.md files for changes and automatically:

1. Detects content changes via mtime + SHA-256 comparison
2. Bumps the version (patch by default)
3. Re-embeds the single changed skill (surgical row update in `embedding-cache.npz`)
4. Writes `reload-signal.json` to trigger downstream cache invalidation

How it works

You edit SKILL.md
    │
    ▼
skill_watcher.py detects mtime change
    │
    ├─ SHA-256 differs? → bump version, save snapshot
    │                     re-embed single row in cache
    │                     write reload-signal.json
    │
    └─ SHA-256 same? → skip (cosmetic mtime change)

    ▼
tier1_router.py polls reload-signal.json before route_message()
    └─ signal found → force-reload embedding cache from disk

tier2_scorer.py polls reload-signal.json before score_skill()
    └─ signal found → drop cached profiles for changed skills
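
The two-stage check in the diagram (cheap mtime comparison first, SHA-256 only when the mtime moved) could look roughly like the sketch below. `check_for_change` is a hypothetical function, not the watcher's actual API, and it omits the version bump, re-embedding, and signal-file steps.

```python
import hashlib
import os
import tempfile
from typing import Optional, Tuple


def check_for_change(path: str, last_mtime: Optional[float],
                     last_hash: Optional[str]) -> Tuple[bool, float, str]:
    """Return (content_changed, mtime, sha256_hex) for the file at path."""
    mtime = os.path.getmtime(path)
    if mtime == last_mtime and last_hash is not None:
        return False, mtime, last_hash  # cheap path: file was not touched
    with open(path, "rb") as fh:
        digest = hashlib.sha256(fh.read()).hexdigest()
    # An mtime change with an identical hash is a cosmetic touch and is skipped.
    return digest != last_hash, mtime, digest


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "SKILL.md")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("# Skill v1\n")
    _, mtime, digest = check_for_change(path, None, None)  # baseline

    os.utime(path, (mtime + 10, mtime + 10))  # touch only: mtime moves, content does not
    touched, mtime, digest = check_for_change(path, mtime, digest)

    with open(path, "a", encoding="utf-8") as fh:
        fh.write("## New Section\n")
    os.utime(path, (mtime + 10, mtime + 10))  # force a distinct mtime for the demo
    edited, mtime, digest = check_for_change(path, mtime, digest)

print(touched, edited)  # False True
```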

Requirements for hot-reload to work

Your skill needs:
- SKILL.md at `[home-path]` (the watcher monitors this path)
- Versioning initialized (Step 5)
- Embedding cache includes your skill (Step 4)

The watcher runs either as a daemon or a one-shot check:

bash

# Daemon mode (continuous polling)
python skill_watcher.py --daemon --interval 10

# One-shot check
python skill_watcher.py --check

---

Step 7: Verify the Migration

7a. Check state files

bash

# Verify state.json is valid JSON
python3 -c "import json; json.load(open('$HOME/.clawdbot/skill-memory/your:skill/state.json'))"

# Verify versions.json
python3 -c "import json; json.load(open('$HOME/.clawdbot/skill-memory/your:skill/versions.json'))"

7b. Test Tier 1 routing

bash

cd [home-path]
python tier1_router.py --message "a message your skill should match" --json

Your skill should appear in the candidates list with similarity > 0.3. If it doesn't appear at all, your SKILL.md may be too short or too generic.

7c. Test Tier 2 scoring

bash

python tier2_scorer.py "a message your skill should match" '["your:skill"]' --json

Expected output:

json

{"score": 0.85, "reason": "...", "inject": true}

If the score is consistently low for on-domain messages, refine your `SKILL_DESCRIPTIONS` entry and `hot_topics`.

7d. Test mute/unmute (optional)

bash

python skill_controller.py mute your:skill --reason "testing"
python skill_controller.py status your:skill
python skill_controller.py unmute your:skill

7e. Test hot-reload

bash

# Edit your SKILL.md
echo "## New Section" >> [home-path]

# Run watcher check
python skill_watcher.py --check

# Verify version bumped
python skill_versioner.py history your:skill

---

Complete Migration Checklist

Use this checklist for each skill you migrate:

Migration: {skill_id}
─────────────────────────────────

Preparation
  [ ] SKILL.md exists at [home-path]
  [ ] SKILL.md is 800+ characters with domain-specific vocabulary
  [ ] Hot topics extracted (3-5 keywords)
  [ ] Cold topics identified (2-3 keywords)

Skill Memory (Step 2)
  [ ] Directory created: [home-path]
  [ ] state.json created with all 9 fields
  [ ] activation-log.jsonl created (empty)

Tier 2 Scorer (Step 3)
  [ ] Entry added to SKILL_DESCRIPTIONS in tier2_scorer.py
  [ ] Dry-run validates: python tier2_scorer.py --dry-run

Tier 1 Router (Step 4)
  [ ] Skill added to SEA_SKILLS list in embedding_indexer.py
  [ ] Embedding cache rebuilt: python embedding_indexer.py --build
  [ ] Routing test passes: python tier1_router.py --message "..." --json

Versioning (Step 5)
  [ ] versions.json created with v1.0.0
  [ ] Snapshot saved: snapshots/v1.0.0.md

Hot-Reload (Step 6)
  [ ] Watcher detects changes to SKILL.md
  [ ] reload-signal.json is consumed by router and scorer

Validation (Step 7)
  [ ] Tier 1 returns skill as candidate for on-domain messages
  [ ] Tier 2 scores ≥ 0.7 for on-domain messages
  [ ] Tier 2 scores ≤ 0.3 for off-domain messages
  [ ] Mute/unmute cycle works
  [ ] Hot-reload cycle works

---

Migration Example: `pwr:power`

Full worked example converting the `pwr:power` skill.

1. Audit SKILL.md

From `[home-path]`:
- Name: `pwr:power`
- Description: "Meaning Full Power - self-empowerment and motivational technique"
- Hot topics: `self-empowerment`, `motivation`, `personal development`, `resilience`, `confidence`
- Cold topics: `fitness routines`, `career advice`, `therapy`

2. Create skill memory

bash

mkdir -p [home-path]

state.json:

json

{
  "skill": "pwr:power",
  "total_activations": 0,
  "useful_activations": 0,
  "suppressed_count": 0,
  "hot_topics": [
    "self-empowerment",
    "motivation",
    "personal development",
    "resilience",
    "confidence"
  ],
  "cold_topics": [
    "fitness routines",
    "career advice",
    "therapy"
  ],
  "context_window": 20,
  "confidence_calibration": 0.7,
  "last_activated": null
}

3. Add scorer profile

python

"pwr:power": {
    "name": "Power",
    "purpose": "Self-empowerment, motivation, resilience, and personal growth",
    "activates_for": "motivation, self-empowerment, confidence building, overcoming challenges, personal growth",
    "suppresses_for": "fitness plans, career planning, therapy, debugging, data queries",
},

4. Add to embedding indexer

python

SEA_SKILLS = [
    # ... existing 13 ...
    "pwr:power",
]

Then rebuild: `python embedding_indexer.py --build`

5. Initialize versioning

bash

python skill_versioner.py init pwr:power

6. Validate

bash

python tier1_router.py --message "I need motivation to push through this challenge" --json
# Expected: pwr:power appears with similarity > 0.5

python tier2_scorer.py "I need motivation to push through this challenge" '["pwr:power"]' --json
# Expected: score > 0.7, inject: true

python tier1_router.py --message "fix the null pointer exception in main.py" --json
# Expected: pwr:power does NOT appear (off-domain)

---

Troubleshooting

Skill never activates (Tier 1 similarity too low)

- SKILL.md too short: embedding quality degrades below ~500 characters. Add more domain-specific content, aiming for the 800+ characters recommended in Step 4.
- SKILL.md too generic: terms like "help", "improve", and "better" appear in every skill. Use precise domain vocabulary.
- Wrong embedding model: verify that Ollama is serving `all-minilm` by running `curl http://[ip]:11434/api/tags`.

Skill activates for everything (Tier 1 similarity too high)

- SKILL.md overlaps with other skills: check the similarity matrix with `python embedding_indexer.py --matrix`. Related skills should score 0.5-0.8; unrelated skills should score below 0.3.
- Hot topics too broad: narrow from "creativity" to "brainstorming SCAMPER techniques".

Tier 2 scorer returns invalid JSON

- MiniMax occasionally wraps its JSON in markdown fences, which the `_parse_score_json()` function handles. If you still get parse errors, check that MiniMax is responding at all: `curl http://localhost:18080/v1/models`
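
For reference, fence-tolerant parsing can be as simple as the sketch below. This is not the source of `_parse_score_json()`, whose implementation is not shown here; `parse_fenced_json` is a hypothetical stand-in that only handles a single fenced block with an optional language tag.

```python
import json
from typing import Any, Dict

FENCE = "`" * 3  # built programmatically to avoid a literal fence in this example


def parse_fenced_json(raw: str) -> Dict[str, Any]:
    """Parse JSON that may or may not be wrapped in a markdown code fence."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line, including any language tag such as "json".
        newline = text.find("\n")
        text = text[newline + 1:] if newline != -1 else ""
        # Drop the closing fence if present.
        if text.rstrip().endswith(FENCE):
            text = text.rstrip()[: -len(FENCE)]
    return json.loads(text)


plain = '{"score": 0.85, "reason": "on-domain", "inject": true}'
fenced = FENCE + "json\n" + plain + "\n" + FENCE

print(parse_fenced_json(plain))
print(parse_fenced_json(fenced))
```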

Hot-reload not detecting changes

- Watcher not running: start it with `python skill_watcher.py --daemon`.
- Only mtime changed, not content: the watcher compares SHA-256 hashes, so touching a file without changing its content is a no-op (by design).
- Skill not in SEA_SKILLS: the watcher only monitors skills listed in `SEA_SKILLS` in `embedding_indexer.py`.

Muted skill still appearing

- In the default flow, mute state is checked at scoring time, not at routing time, so a muted skill can still show up among Tier 1 candidates. Verify that `is_muted()` is imported in both `tier1_router.py` and `tier2_scorer.py`.

---

Architecture Reference

File Layout After Migration

[home-path]
├── skills/
│   └── your:skill/
│       └── SKILL.md                    ← Source of truth (edit this)
│
├── skill-memory/
│   ├── your:skill/
│   │   ├── state.json                  ← Mutable runtime state
│   │   ├── activation-log.jsonl        ← Append-only event log
│   │   ├── versions.json               ← Version history
│   │   └── snapshots/
│   │       └── v1.0.0.md               ← Rollback snapshot
│   │
│   ├── embedding-cache.npz             ← Nx384 embedding matrix
│   └── reload-signal.json              ← Hot-reload trigger (transient)
│
└── scripts/sea/
    ├── embedding_indexer.py            ← Embedding generation & cache
    ├── tier1_router.py                 ← Fast cosine similarity routing
    ├── tier2_scorer.py                 ← MiniMax LLM scoring
    ├── skill_versioner.py              ← Version tracking & rollback
    ├── skill_watcher.py                ← File change detection & reload
    └── skill_controller.py             ← Mute/unmute/reset commands

Data Flow

User Message
    │
    ▼
┌──────────────────┐
│ Tier 1 Router    │  Embed message → cosine similarity vs all skills
│ (~50ms)          │  Output: ranked candidates with tier labels
└────────┬─────────┘
         │ candidates where similarity ≥ threshold
         ▼
┌──────────────────┐
│ Tier 2 Scorer    │  Build scoring prompt → call MiniMax → parse JSON
│ (~3.4s async)    │  Output: {score, reason, inject} per candidate
└────────┬─────────┘
         │ skills where inject=true (score ≥ 0.7)
         ▼
┌──────────────────┐
│ Injection        │  Activated skill perspectives appended to response
│ (pending spec)   │  Format TBD: prepend / footnote / sidebar
└──────────────────┘

Version Lifecycle

v1.0.0 (init)  →  v1.0.1 (patch: wording tweak)  →  v1.1.0 (minor: new section)
    │                                                        │
    └── snapshots/v1.0.0.md                                  └── snapshots/v1.1.0.md
                                                                    │
                                                              rollback possible
                                                              via skill_versioner.py
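
The version steps in the diagram follow ordinary semantic-version arithmetic. A minimal sketch, assuming plain `MAJOR.MINOR.PATCH` strings with no pre-release suffixes; `bump_version` is a hypothetical helper and not necessarily how `skill_versioner.py` implements it.

```python
def bump_version(version: str, part: str = "patch") -> str:
    """Increment one component of a MAJOR.MINOR.PATCH string, resetting lower ones."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part}")


v = "1.0.0"
v = bump_version(v)            # wording tweak
print(v)                       # 1.0.1
v = bump_version(v, "minor")   # new section
print(v)                       # 1.1.0
```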

Scoring Hook Integration Points

The SEA router integrates with the Clawdbot gateway via hooks at `[home-path]`:

Gateway receives message
    │
    ├─→ Existing hooks run (message-logger, context-injector, etc.)
    │
    ├─→ SEA Router Hook (proposed, async fire-and-forget)
    │   ├─ Reads: embedding-cache.npz, skill state.json files
    │   ├─ Writes: activation-log.jsonl, state.json counters
    │   └─ Triggers: on response:sent event
    │
    └─→ Response sent to user

The hook is async and non-blocking — it does not add latency to the main response path. Tier 2 scoring happens after the response is already sent, building up activation history for future routing improvements.
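
The fire-and-forget pattern described here can be sketched with `asyncio`. Everything in this example is illustrative: the gateway's real hook interface is not shown in this guide, `handle_message` and `score_in_background` are hypothetical names, and the short sleep stands in for the Tier 2 call.

```python
import asyncio
from typing import List, Set


async def score_in_background(message: str, log: List[str]) -> None:
    """Stand-in for Tier 2 scoring plus the activation-log write."""
    await asyncio.sleep(0.01)  # placeholder for the slow LLM call
    log.append(message)


async def handle_message(message: str, log: List[str], tasks: Set[asyncio.Task]) -> str:
    """Produce the response immediately and schedule scoring without awaiting it."""
    response = f"echo: {message}"
    task = asyncio.create_task(score_in_background(message, log))
    tasks.add(task)                        # hold a reference so the task is not garbage-collected
    task.add_done_callback(tasks.discard)  # release the reference once scoring finishes
    return response


async def main():
    log: List[str] = []
    tasks: Set[asyncio.Task] = set()
    response = await handle_message("hello", log, tasks)
    logged_at_response = list(log)         # scoring has not run yet at this point
    await asyncio.gather(*tasks)           # in the demo, wait so the process can exit cleanly
    return response, logged_at_response, log


response, logged_at_response, log = asyncio.run(main())
print(response, logged_at_response, log)
```

The response is available before any scoring has been logged, which is the property the hook relies on.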

Source Anchor

skill-entity-architecture/MIGRATION-GUIDE.md
