# Skill Entity Architecture (SEA): DEP + Evo-Cubed Analysis
Version: 1.0
Date: 2025-07-18
Protocol: Deep Enhancement Protocol v2 + Evolution³
Analyst: Clawdbot Subagent (sea-dep-evocubed)
Concept Stage: Pre-implementation / Concept Paper

> Deprecation note (2026-05-13): Mac3 was the Tier 2 worker host at the time this design doc was authored. Mac3 has since been retired. Forward-looking references to Mac3 (worker pool, async queue, circuit breaker) should be read as Mac4:8100 (cognitive twin host) in any current/future implementation. The Mac3-era hardware-assignment sections (§2, Step 6, stress-test §🔴 Mac3 Async Worker Reliability) are kept for historical accuracy but are obsolete for v1.1 onward. See SOOP-2 launch memory for migration plan. A v1.1 rewrite is tracked as a separate later track.

---

PREAMBLE: What We're Evaluating

The Skill Entity Architecture (SEA) proposes inverting the current static-file skill system into a living ecosystem of autonomous, memory-bearing daemon entities. The creative/philosophical skills (`phi:*`, `art:*`, `nav:*`), 13 of the 139 skills, become persistent conversational agents with their own Discord channels, chat history, and relevance scoring via MiniMax local inference. A router middleware intercepts every prompt/response pair, scores all skills 0–1, and injects activated skill perspectives (threshold ≥ 0.7) into the final response before delivery.

This analysis applies DEP-2 scoring discipline, then runs Evo³ to generate, compound, and specify the best achievable version of this idea.

---

# ═══════════════════════════════════════════
# PART 1: DEP AUDIT
# ═══════════════════════════════════════════

1.1 Scoring Matrix (0–10 per category)

CATEGORY 1: Feature Completeness - 3.5/10

What exists:
- ✅ Concept of skill-as-daemon is defined
- ✅ Scoring via local MiniMax is identified
- ✅ Discord channel-per-skill metaphor is specified
- ✅ Activation threshold (0.7) is named

What's missing (major gaps):
- ❌ No scoring schema: what exactly does MiniMax receive? Full message? Summary? What prompt template?
- ❌ No injection format spec: "skill perspective" is undefined. How long? What structure?
- ❌ No context pruning strategy: rolling window? summarization? FIFO?
- ❌ No cold-start behavior: what happens when channel history is empty?
- ❌ No false-positive suppression mechanism designed
- ❌ No multi-skill coordination protocol
- ❌ No graceful degradation when MiniMax is unreachable

CATEGORY 2: Code Quality - 1.5/10

What exists:
- ✅ Hooks system exists at [home-path] (context-injector, synthesis-preprocessor)
- ✅ Skill-router SKILL.md exists as conceptual prior art
- ✅ MiniMax already running at localhost:18080 with OpenAI-compatible API

What's missing:
- ❌ Zero implementation code for SEA
- ❌ No router middleware prototype
- ❌ No MiniMax scoring harness
- ❌ No Discord channel management scripts
- ❌ No injection composition logic
- ❌ No test suite, benchmarks, or latency profiling

CATEGORY 3: Data Integrity - 3/10

What exists:
- ✅ Discord channels provide natural conversation persistence
- ✅ SKILL.md files provide stable system prompt templates

What's missing:
- ❌ No schema for skill activation history (what gets recorded, where, in what format)
- ❌ No context window budget enforcement: Discord history is unbounded
- ❌ No rollback or corruption recovery for skill memory
- ❌ No deduplication of injections across simultaneous skill activations
- ❌ No versioning for skill state evolution
- ❌ No audit trail for "what skill injected what, and why"

CATEGORY 4: Integration Depth - 4.5/10

What exists:
- ✅ Clawdbot gateway architecture is real and extensible
- ✅ Discord channels already in heavy use (routing, memory, voice threads)
- ✅ MiniMax provider configured in clawdbot.json
- ✅ Hooks directory has existing examples (context-injector hook exists)
- ✅ skill-router skill shows prior thinking about routing

What's missing:
- ❌ No integration point specified between gateway session handling and the new router
- ❌ No spec for how Discord channel reads insert into the hot path
- ❌ No consideration of Telegram gateway (which also exists); SEA would break channel parity
- ❌ No interaction spec between SEA and existing hooks (synthesis-preprocessor, context-injector)
- ❌ No consideration of subagent sessions (would skills activate for subagent prompts too?)

CATEGORY 5: UX - 2.5/10

What exists:
- ✅ "Skill perspective" injection is named
- ✅ Threshold gives a binary ON/OFF feel

What's missing:
- ❌ No injection format UX: prepend, footnote, sidebar all produce radically different experiences
- ❌ No user controls: can the user mute a skill? force-activate? see which fired?
- ❌ No transparency mechanism: injections happen invisibly, which is either magical or creepy
- ❌ No injection quality feedback loop: how does the system know an injection was good?
- ❌ No fatigue design: high-activation skills would fire on everything and become noise
- ❌ No kill switch or emergency bypass

CATEGORY 6: Production Readiness - 2/10

What exists:
- ✅ Hardware is appropriate (Mac1 as primary, Mac3/4 as inference)
- ✅ MiniMax already running in production configuration

What's missing:
- ❌ No latency budget defined or validated: N × MiniMax call latency is unknown
- ❌ No circuit breaker for when MiniMax is unavailable
- ❌ No Discord API rate limit handling
- ❌ No observability/metrics system
- ❌ No cost model (Discord channel reads, API limits)
- ❌ No deployment plan or rollout strategy
- ❌ No rollback procedure

---

1.2 Score Summary

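| Category | Score | Max |
|---|---|---|
| Feature Completeness | 3.5 | 10 |
| Code Quality | 1.5 | 10 |
| Data Integrity | 3.0 | 10 |
| Integration Depth | 4.5 | 10 |
| UX | 2.5 | 10 |
| Production Readiness | 2.0 | 10 |
| TOTAL | 17/60 | 60 |
| Average | 2.83/10 | - |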
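
The total and average follow directly from the six category scores. A quick arithmetic check, with the scores copied from the matrix above (the variable names are illustrative only):

```python
# Category scores copied from the scoring matrix.
scores = {
    "Feature Completeness": 3.5,
    "Code Quality": 1.5,
    "Data Integrity": 3.0,
    "Integration Depth": 4.5,
    "UX": 2.5,
    "Production Readiness": 2.0,
}

total = sum(scores.values())          # sum of the six category scores
max_total = 10 * len(scores)          # each category is scored out of 10
average = total / len(scores)

print(f"{total:g}/{max_total}")       # 17/60
print(f"{average:.2f}/10")            # 2.83/10
```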

---

1.3 Gap List

🔴 HIGH SEVERITY

| ID | Gap | Impact | Blocker? |
|---|---|---|---|
| H1 | Latency budget undefined: scoring 13 creative skills × ~80-200ms MiniMax call = 1-2.6s added to every message, synchronously | Unusable if >300ms | YES |
| H2 | Context accumulation unbounded: Discord channel history will grow forever; no pruning means context window exhaustion | Data integrity failure | YES |
| H3 | No gateway integration spec: where exactly in the Clawdbot session lifecycle does the router hook in? Pre-prompt? Post-response? Async? | Architecture undefined | YES |
| H4 | False positive accumulation: A skill that fires 70 | | |
| H5 | Multi-skill collision undefined: when phi:veritas AND art:creative both hit 0.7+, what happens? Whose injection goes first? Do they interact? | Output incoherence | YES |

🟡 MEDIUM SEVERITY

| ID | Gap | Impact |
|---|---|---|
| M1 | Injection format not designed: prepend vs. footnote vs. sidebar are architecturally different experiences; this choice cascades into everything | UX fundamental |
| M2 | Scoring prompt template missing: what does MiniMax actually receive? Quality of scoring depends entirely on this | Accuracy risk |
| M3 | Cold start behavior undefined: empty channel = no context = low-quality scoring for new skills | Early UX poor |
| M4 | Telegram parity: system only considers Discord; Telegram gateway will lack skill injections | Inconsistent |
| M5 | Subagent boundary: should skills activate inside subagent sessions? This has deep implications for API cost and context pollution | Scope creep |
| M6 | Skill evolution feedback loop: how does a skill "learn when NOT to activate"? This is the hardest ML problem in the system | Core promise unfulfilled |

🟢 LOW SEVERITY

| ID | Gap | Impact |
|---|---|---|
| L1 | Discord API rate limits: reading 13 channels per message at high volume could hit rate limits | Operational |
| L2 | MiniMax 3B vs embeddings tradeoff: 3B model may be overqualified for binary relevance scoring; embeddings are faster and deterministic | Efficiency |
| L3 | Skill visibility: user has no way to know which skills are active, which fired, which were suppressed | Transparency |
| L4 | No mute/force-activate controls: user can't express preferences about skill behavior | UX control |
| L5 | SKILL.md as system prompt stability: skills evolve but their system prompts are static files; who updates them? | Maintenance |
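
The latency figure in gap H1 is simple multiplication, and the same arithmetic shows why parallel scoring alone may not rescue the synchronous design. The sketch below assumes an idealized server that handles `concurrency` calls at once with no slowdown; the function names are hypothetical.

```python
SKILLS = 13
CALL_MS = (80, 200)   # per-call MiniMax latency range quoted in H1
BUDGET_MS = 300       # H1 impact column: "Unusable if >300ms"


def sequential_ms(n_skills: int, call_ms: int) -> int:
    """Added latency when skills are scored one after another."""
    return n_skills * call_ms


def parallel_ms(n_skills: int, call_ms: int, concurrency: int) -> int:
    """Added latency with up to `concurrency` calls in flight (idealized)."""
    batches = -(-n_skills // concurrency)  # ceiling division
    return batches * call_ms


low, high = (sequential_ms(SKILLS, ms) for ms in CALL_MS)
print(low, high)                          # 1040 2600
print(parallel_ms(SKILLS, 200, 4))        # 800, still over the 300ms budget
print(parallel_ms(SKILLS, 200, 13))       # 200, only if all 13 calls truly overlap
```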

---

1.4 Commitment × Uncertainty → DEP Decision

Commitment Score: 8/10
Why high: The idea is genuinely excellent: giving skills persistent memory and autonomous activation is a meaningful capability upgrade. The existing infrastructure (MiniMax running, hooks system, Discord channels) makes this less speculative than it appears. The value proposition is clear and unique.

Uncertainty Score: 8.5/10
Why high: H1 (latency), H3 (gateway integration), and H4 (false positives) are individually architecture-defining problems. Any one of them could force a fundamental redesign. We don't have benchmarks, integration proofs, or feedback loop designs. The concept paper describes the what but not the how for any of its critical mechanisms.

```
Commitment: 8    |████████  |
Uncertainty: 8.5 |████████▌ |

         Uncertainty
              High
               │
    RECURSE    │    BRANCH
               │
    ───────────┼───────────  Commitment
               │
    ABORT      │    COMMIT
               │
              Low
```

Position: TOP-RIGHT → BRANCH

Rationale: Both dimensions are high. This is the BRANCH quadrant: the idea is worth pursuing but has too many open forks to serialize. The correct move is parallel path exploration. Evo-Cubed Stage 1 will generate the branches; Stage 2 will collapse them.

DEP Decision: BRANCH → Proceed to Evo-Cubed
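
The quadrant lookup can be written as a small function. This is a sketch only: the document does not state where "high" begins, so the midpoint of 5.0 on the 0-10 scale is an assumption, and `dep_decision` is a hypothetical name.

```python
def dep_decision(commitment: float, uncertainty: float, midpoint: float = 5.0) -> str:
    """Map a (commitment, uncertainty) pair on 0-10 scales to a DEP quadrant."""
    high_commitment = commitment > midpoint
    high_uncertainty = uncertainty > midpoint
    if high_commitment and high_uncertainty:
        return "BRANCH"     # top-right
    if high_commitment:
        return "COMMIT"     # bottom-right
    if high_uncertainty:
        return "RECURSE"    # top-left
    return "ABORT"          # bottom-left


print(dep_decision(8, 8.5))  # BRANCH
```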

---

# ═══════════════════════════════════════════
# PART 2: EVO-CUBED (Evolution³)
# ═══════════════════════════════════════════

---

# STAGE 1: EXPLORE (Divergent Paths)
Generating 5 genuinely different architecture directions, not variations on the same idea

---

## PATH A: "The Daemon Fleet" - Full Autonomous Skill Agents
The original proposal, pushed to its logical limit

Concept:
Every creative/philosophical skill becomes a genuinely autonomous agent process: not just a Discord channel, but a running MiniMax instance with its own system prompt loaded, persistent state in SQLite, and a subscription to the main session event stream via a lightweight IPC bus. Skills are processes, not files.

Why it works:
True autonomy enables skills to develop reasoning patterns over time. Each daemon can watch patterns across sessions, not just within one. phi:veritas can notice "he asks about truth whenever a major decision is being made" because it has full history.

Value angle:
The skill becomes the equivalent of a specialized advisor who remembers every conversation they've had with you. This is qualitatively different from the current static system.

Architecture fingerprint:
- Skills run as separate processes (Python daemons, managed by supervisord or launchd)
- Each process: SKILL.md as system prompt, SQLite for activation history, subscribes to event bus
- Event bus: Unix socket or Redis pub/sub, emits every user message + assistant response
- Scoring: each daemon independently queries MiniMax with a compact scoring prompt
- Injection: winning daemons write to a shared injection queue; orchestrator composes final response

Risks:
- 13 daemon processes on Mac1 is feasible but messy
- Each MiniMax call is sequential within each daemon; parallel calls are needed for speed
- Process crash = skill disappears silently; needs health monitoring
- Highest complexity, highest capability

---

## PATH B: "The Embedding Index" - Fast Vector Routing, No LLM per Message
Replace LLM scoring with embedding similarity: 10× faster

Concept:
Encode each skill's SKILL.md into an embedding vector (one-time, cached). On every incoming message, embed the message and compute cosine similarity against all skill vectors. Activate skills whose similarity > threshold. No MiniMax call in the hot path.

Why it works:
Embeddings are deterministic, cached, and compute in <5ms on CPU. You can score all 13 skills in under 50ms total, well under the 300ms UX budget. MiniMax is still used, but only for generating injections (after activation is decided), not for routing.

Value angle:
Removes the latency problem entirely. Enables much lower activation thresholds (0.5 instead of 0.7) without cost. Makes the system testable with offline benchmarks.

Architecture fingerprint:
- Embed each skill on startup → cache vectors in memory (~13 × 1536 floats)
- On each message: embed message → cosine similarity → rank skills → threshold filter
- Activated skills → generate injection via MiniMax (this is async, doesn't block response)
- Injection appended as post-response footnote (doesn't delay the main response)
- Evolution: re-embed skills periodically as SKILL.md files are updated

Risks:
- Embedding model needed (could use MiniMax's embedding endpoint or a local model)
- Embedding similarity ≠ conceptual relevance: phi:veritas may score high for any factual question
- Loss of skill "personality": injection is still from MiniMax, but routing is mechanical
- Async injection means the user sees response first, then skill perspective arrives
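
A minimal sketch of the routing step in PATH B, using toy 3-dimensional vectors in place of real embeddings. The `cosine` and `route` helpers and the vector values are illustrative assumptions; a real deployment would obtain vectors from whichever embedding model is chosen.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(message_vec: List[float],
          skill_vecs: Dict[str, List[float]],
          threshold: float) -> List[Tuple[str, float]]:
    """Return (skill, similarity) pairs above the threshold, best first."""
    scored = [(name, cosine(message_vec, vec)) for name, vec in skill_vecs.items()]
    hits = [(name, sim) for name, sim in scored if sim > threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for cached SKILL.md embeddings.
skill_vecs = {
    "phi:veritas": [1.0, 0.0, 0.0],
    "art:creative": [0.0, 1.0, 0.0],
    "nav:nonlinear": [0.6, 0.8, 0.0],
}
message_vec = [0.9, 0.1, 0.0]

activated = [name for name, _ in route(message_vec, skill_vecs, threshold=0.5)]
print(activated)  # ['phi:veritas', 'nav:nonlinear']
```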

---

## PATH C: "The Event Bus Listener" - Regex/Topic Triggers, Zero Inference Cost
Ultra-lightweight: skills register as pattern subscribers

Concept:
No LLM scoring at all. Skills declare their activation patterns in SKILL.md as structured metadata: keywords, regex, topic tags, semantic categories. A router reads these patterns at startup and compiles them into a fast matching automaton. Zero inference cost, <1ms activation.

Why it works:
Many activations are predictable. phi:veritas fires when words like "true", "authentic", "verify", "honest" appear. nav:nonlinear fires on "chaos", "uncertainty", "complex". art:creative fires on "brainstorm", "idea", "creative". Pattern matching captures 80

Value angle:
Instant, deterministic, free. No API calls, no rate limits, no latency. Skills can declare their own patterns and update them without code changes.

Architecture fingerprint:
- Extended SKILL.md frontmatter:

```yaml
triggers:
  keywords: [truth, authentic, verify, honest, integrity]
  # Single quotes keep the backslash literal; "\?" is not a valid escape in a double-quoted YAML string
  regex: ['(true|false)\?', 'is .+ real']
  topics: [epistemology, fact-checking, philosophy]
  negative_keywords: [fiction, hypothetical, joke]
```

- Compiled into Aho-Corasick automaton on startup
- No LLM in routing path; LLM only for injection generation
- Injection generation is still async via MiniMax
- Skills "learn" by updating their own trigger patterns based on feedback

Risks:
- Misses semantic triggers that aren't lexical ("what should I do?" → might need nav:perspective but no obvious keywords)
- High maintenance burden as language evolves
- No sophisticated contextual judgment, just pattern matching
- Skills can't detect their own relevance in ambiguous situations
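
A rough sketch of how the trigger metadata could drive activation. It deliberately swaps the Aho-Corasick automaton for plain set lookups and Python's `re` module to stay dependency-free, and the trigger dict is hand-copied rather than parsed from SKILL.md; `compile_triggers` and `match` are hypothetical names.

```python
import re
from typing import Dict, List

# Hand-copied from the frontmatter example; a real router would parse SKILL.md.
TRIGGERS: Dict[str, dict] = {
    "phi:veritas": {
        "keywords": ["truth", "authentic", "verify", "honest", "integrity"],
        "regex": [r"(true|false)\?", r"is .+ real"],
        "negative_keywords": ["fiction", "hypothetical", "joke"],
    },
}


def compile_triggers(triggers: Dict[str, dict]) -> Dict[str, dict]:
    """Pre-compile each skill's patterns once at startup."""
    return {
        skill: {
            "keywords": set(spec.get("keywords", [])),
            "negative": set(spec.get("negative_keywords", [])),
            "regex": [re.compile(p, re.IGNORECASE) for p in spec.get("regex", [])],
        }
        for skill, spec in triggers.items()
    }


def match(message: str, compiled: Dict[str, dict]) -> List[str]:
    """Return skills whose triggers fire; negative keywords veto activation."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    fired = []
    for skill, spec in compiled.items():
        if words & spec["negative"]:
            continue
        if words & spec["keywords"] or any(r.search(message) for r in spec["regex"]):
            fired.append(skill)
    return fired


compiled = compile_triggers(TRIGGERS)
print(match("Can you verify this claim?", compiled))   # ['phi:veritas']
print(match("Tell me a joke about truth", compiled))   # []
print(match("What should I do?", compiled))            # []
```

The last call illustrates the first risk listed above: a message with no lexical hook activates nothing.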

---

## PATH D: "The Lazy Oracle" - Post-Hoc Enrichment, Never Blocks
Decouple skill activation entirely from the response path

Concept:
The main response is never delayed. Every user message and assistant response is logged to a fast queue. Skills process the queue asynchronously and post their perspectives to a dedicated Discord thread (not the main channel). The user sees the main response immediately. Skill perspectives arrive 5-30 seconds later as Discord thread replies.

Why it works:
Completely solves the latency problem. Skills can take as long as they need because they're never in the critical path. This also changes the UX fundamentally: skill perspectives feel more like "second thoughts" or "expert follow-up" than injections. More like having a panel of advisors who comment after the fact.

Value angle:
Reduces the complexity of the main session by removing the router entirely. Skills become Discord bot threads. The experience is richer, not faster. Philosophy and art perspectives arriving 10 seconds after a response can create a more contemplative dynamic.

Architecture fingerprint:
- Main session: zero changes, full speed
- All messages β†’ async queue (Redis, file queue, or direct Discord webhook)
- Skill worker pool (one worker per skill, can run on Mac3/Mac4)
- Each worker: reads queue, scores relevance (its own domain), generates if relevant
- Output: Discord thread reply in main conversation thread, labeled by skill
- User can read or ignore; no injection into the main response

Risks:
- Temporal disconnect: skill perspective arrives after conversation has moved on
- Discord thread UX may be confusing (is this me? the bot? another bot?)
- Skills never directly influence the main response; indirect value only
- May feel like noise rather than enrichment over time
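
The worker-pool shape of PATH D can be sketched with `asyncio` queues. Everything here is a stand-in: each skill gets its own in-memory queue so every skill sees every message, the relevance judges are trivial keyword lambdas, and appending to a list replaces posting a Discord thread reply.

```python
import asyncio
from typing import Callable, List


async def skill_worker(skill: str, queue: asyncio.Queue,
                       is_relevant: Callable[[str], bool], out: List[str]) -> None:
    """Consume messages forever; 'post' a perspective when the skill is relevant."""
    while True:
        message = await queue.get()
        try:
            if is_relevant(message):
                out.append(f"[{skill}] perspective on: {message}")
        finally:
            queue.task_done()


async def main() -> List[str]:
    posted: List[str] = []
    judges = {
        "phi:veritas": lambda m: "true" in m.lower(),
        "art:creative": lambda m: "idea" in m.lower(),
    }
    queues = {skill: asyncio.Queue() for skill in judges}
    workers = [asyncio.create_task(skill_worker(s, queues[s], judges[s], posted))
               for s in judges]

    # The main session only enqueues; it never waits on a skill.
    for message in ["Is this true?", "Good morning"]:
        for queue in queues.values():
            queue.put_nowait(message)

    # Drain the queues here only so the demo can exit cleanly.
    for queue in queues.values():
        await queue.join()
    for worker in workers:
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return posted


posted = asyncio.run(main())
print(posted)  # ['[phi:veritas] perspective on: Is this true?']
```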

---

## PATH E: "The Synthesizer Singularity" - One Meta-Skill to Rule Them All
Instead of N daemons, distill all creative/philosophical skills into one rich meta-skill

Concept:
Don't run 13 separate skills. Build a single "Creative-Philosophical Synthesizer" (CPS) meta-skill whose system prompt is a distillation of all `phi:*`, `art:*`, and `nav:*` SKILL.md files (their wisdom, frameworks, guiding questions, and activation patterns) compressed into a single, rich context. This CPS skill activates once per message (simple binary trigger: does the message have any creative/philosophical dimension?) and provides a unified multi-perspective injection.

Why it works:
Eliminates routing complexity entirely. One activation decision vs. 13. The synthesized skill is richer than any individual skill because it carries all their frameworks simultaneously. A single MiniMax call produces a perspective that integrates Veritas, Convergence, Divergence, and Non-Linear navigation simultaneously.

Value angle:
Lowest complexity path to high capability. Can build the meta-skill today. Zero new infrastructure. Just one new skill file that's a synthesis of 13.

Architecture fingerprint:
- Build CPS SKILL.md by synthesizing all 13 source skills
- Simple routing: if message_has_creative_or_philosophical_dimension → activate CPS
- Single MiniMax call; injection is multi-perspective unified output
- Individual skills remain for explicit invocation (/veritas, /creative, etc.)
- Evolution: CPS system prompt updated periodically as source skills evolve

Risks:
- Skills lose individual identity: phi:veritas doesn't exist as a distinct entity
- Combined context may be too large for MiniMax 3B context window
- Less targeted: can't say "only veritas activated for this question"
- Harder to understand why a perspective was included

---

# STAGE 2: COMPOUND (Sequential Linear)
Building the best elements into a unified system. Each step builds on ALL previous steps.

---

## STEP 1: Foundation - The Non-Negotiable Constraints
No inheritance yet; this is ground truth

Before choosing any architecture, establish the invariants that any solution must satisfy:

Performance invariant: Main session response latency must not increase by more than 200ms (P50). Users on Mac1 with typical Discord messages should not feel friction.

Architecture invariant: The system must run without code changes to the Clawdbot gateway source. We only have hooks and skill files to work with.

Correctness invariant: A skill injection must be clearly attributable (user knows which skill fired and why), must be suppressible (user can mute any skill), and must degrade gracefully when MiniMax is unavailable.

Evolution invariant: Skills must be able to develop genuine memory: not just conversation history, but patterns about when they've been useful vs. not.

Hardware reality: Mac1 is the hot path. Mac3 (M1, 8GB) and Mac4 (M4 Mini, 16GB) are available for async workloads. MiniMax-3B-v0.1 at localhost:18080 is the local inference endpoint. Discord API is the persistence layer.

These five constraints immediately eliminate or heavily constrain several paths:
- PATH A (Daemon Fleet): Process management complexity violates the "no gateway code changes" invariant if daemons need to intercept sessions
- PATH C (Event Bus): Pure keyword matching violates the "genuine memory" evolution invariant
- PATH E (Synthesizer Singularity): Single meta-skill loses individual skill identity; violates "attributable injection" correctness invariant

Surviving paths: B (Embedding Index) and D (Lazy Oracle), plus hybrid combinations.

---

## STEP 2: Routing Architecture - Embedding + Async Split
Builds on Step 1: we need <200ms latency AND genuine skill activation quality

From Step 1, we know:
- LLM scoring in the hot path (PATH A's approach) is too slow
- Pure pattern matching (PATH C) isn't smart enough
- Async injection (PATH D) is valid but loses the ability to influence the main response

Step 2 Resolution: Two-tier routing architecture

```
Tier 1 (HOT PATH, <50ms):
  Embedding similarity → FAST PASS / SLOW QUEUE decision

Tier 2 (ASYNC, 5-30s):
  MiniMax scoring → Injection generation → Discord thread
```

The Hot Path (Tier 1):
On every user message, embed the message using a local embedding model (or MiniMax's embedding endpoint). Compute cosine similarity against pre-cached skill embeddings. Skills above 0.6 similarity go into the ACTIVATION QUEUE. This entire operation: <50ms.

The Async Path (Tier 2):
The activation queue is processed asynchronously on Mac3 or Mac4. Each activated skill:
1. Reads its Discord channel history (last 20 messages, cached)
2. Constructs scoring prompt: `[SKILL context] + [conversation history] + [this message] → score 0-1 + reasoning`
3. If final score ≥ 0.7, generates injection
4. Posts injection to a designated Discord thread or delivery channel

What this resolves from Step 1:
- Performance invariant: ✅ Hot path is embedding-only, <50ms
- Architecture invariant: ✅ Hooks handle the embedding call; no gateway changes
- Correctness invariant: ✅ Both tiers produce attributable injections
- Evolution invariant: ✅ Tier 2 has full context to develop patterns

New constraint introduced: We need an embedding model. MiniMax 3B may not have an embeddings endpoint; need to verify or use an alternative (all-MiniLM-L6-v2 via Ollama on Mac4 would work).
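
The two gates can be sketched as a pair of filters. The thresholds come from the text (similarity above 0.6 for Tier 1, score of at least 0.7 for Tier 2); the similarity and score values below are made-up inputs, and `score_fn` is a stand-in for the asynchronous MiniMax call.

```python
from typing import Callable, Dict, List

TIER1_THRESHOLD = 0.6   # embedding similarity gate (hot path)
TIER2_THRESHOLD = 0.7   # MiniMax score gate (async path)


def tier1(similarities: Dict[str, float]) -> List[str]:
    """Hot path: skills whose embedding similarity clears the coarse gate."""
    return [skill for skill, sim in similarities.items() if sim > TIER1_THRESHOLD]


def tier2(activation_queue: List[str], score_fn: Callable[[str], float]) -> List[str]:
    """Async path: keep only skills whose model score reaches the threshold."""
    return [skill for skill in activation_queue if score_fn(skill) >= TIER2_THRESHOLD]


# Made-up numbers for illustration only.
similarities = {"phi:veritas": 0.82, "art:creative": 0.65, "nav:organic": 0.31}
fake_minimax_scores = {"phi:veritas": 0.91, "art:creative": 0.55}

activation_queue = tier1(similarities)
to_inject = tier2(activation_queue, fake_minimax_scores.__getitem__)
print(activation_queue)  # ['phi:veritas', 'art:creative']
print(to_inject)         # ['phi:veritas']
```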

---

## STEP 3: Persistence Layer - The Skill Memory Schema
Builds on Steps 1+2: we have two-tier routing; now define what skills actually remember

From Steps 1-2 we have:
- Tier 1 embedding filter → Tier 2 MiniMax scoring → Discord injection
- Skills need to remember patterns about when they've been useful

The problem: Discord channel history is unbounded and unstructured. Skills can't learn from it; they can only scroll it.

Step 3 Resolution: Skill State Documents (SSDs)

Each skill gets a `[home-path]` directory containing:

```
skill-memory/
  phi:veritas/
    state.json           # Current skill state
    activation-log.jsonl # Immutable log of every activation
    pattern-model.json   # Learned trigger patterns
    discord-cursor.json  # Last-read position in Discord channel
```

state.json schema (the `//` comments are annotations only; the stored file is plain JSON):

```jsonc
{
  "skill": "phi:veritas",
  "total_activations": 847,
  "useful_activations": 312,  // user engaged with injection
  "suppressed_count": 43,     // user muted this skill
  "hot_topics": ["decision-making", "fact-checking", "authenticity"],
  "cold_topics": ["casual greetings", "code review", "shopping lists"],
  "context_window": 20,       // current rolling window size
  "confidence_calibration": 0.73,  // adjusted threshold
  "last_activated": "2025-07-18T03:22:11Z",
  "last_useful": "2025-07-18T01:45:33Z"
}
```

activation-log.jsonl (append-only, one JSON object per line):

```json
{"ts": "...", "message_hash": "...", "embedding_similarity": 0.82, "minimax_score": 0.91, "injected": true, "feedback": "engaged", "context_summary": "..."}
```

Context pruning:
Discord channel serves as long-term conversation reference only. For scoring, each skill reads only its last `context_window` activations from its local JSONL log (not Discord). Context window grows from 5 to 50 as skill accumulates activation history.

What this resolves from Steps 1+2:
- Evolution invariant: ✅ Skills now have structured, queryable history
- Data integrity: ✅ JSONL is append-only, so existing records are never rewritten; what each scoring call reads is bounded by `context_window`
- Cold start: ✅ State begins at defaults; grows with experience
- Feedback loop: ✅ `feedback` field enables learning which activations were useful
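
A minimal sketch of the append-and-window access pattern described under context pruning. The helper names are hypothetical, and skipping unparseable lines is an assumption about how a partially written final line should be handled.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def append_activation(log_path: Path, record: dict) -> None:
    """Append one activation record as a single JSON line."""
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def recent_activations(log_path: Path, context_window: int) -> List[dict]:
    """Return the last `context_window` records; empty list on cold start."""
    if not log_path.exists():
        return []
    records = []
    with log_path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                continue  # skip a torn or partially written line
    return records[-context_window:]


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "activation-log.jsonl"
    cold = recent_activations(log, context_window=5)
    for i in range(7):
        append_activation(log, {"n": i, "minimax_score": 0.9, "injected": True})
    window = recent_activations(log, context_window=5)

print(cold)                       # []
print([r["n"] for r in window])   # [2, 3, 4, 5, 6]
```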

---

## STEP 4: Scoring Prompt Engineering - The Quality Gate
Builds on Steps 1+2+3: routing is two-tier, memory is structured; now make scoring accurate

From Steps 1-3 we have:
- Tier 1 embedding filter (fast, coarse)
- Tier 2 MiniMax scoring (slow, accurate, async)
- Structured skill memory with activation history

The problem: MiniMax 3B's scoring quality depends entirely on what we send it. A bad prompt produces garbage scores.

Step 4 Resolution: The Activation Prompt Template

```python
SCORING_PROMPT = """You are {skill_name}'s activation judge. Your only job is to decide if this skill should contribute to the current conversation.

SKILL IDENTITY:
{skill_description_2_sentences}

SKILL STRONGEST DOMAINS (from history):
{hot_topics_list}

SKILL WEAKEST FIT (suppress here):
{cold_topics_list}

CURRENT MESSAGE:
"{user_message}"

CONVERSATION CONTEXT (last 3 exchanges):
{recent_history}

SCORE: Output ONLY a JSON object:
{{"score": 0.0-1.0, "reason": "one sentence", "inject": true/false}}

Score 1.0 = this is exactly what this skill was built for.
Score 0.5 = relevant but peripheral.
Score 0.0 = completely off-domain.
Threshold for inject=true: 0.7
"""
```

Injection generation prompt (separate, only called when inject=true):

```python
INJECTION_PROMPT = """You are {skill_name}. {skill_system_prompt}

The user said: "{user_message}"
The assistant replied: "{assistant_response}"

Add your perspective in 2-4 sentences. Be specific to THIS exchange, not generic. Draw on your guiding frameworks. Your voice is {skill_voice_descriptor}.

Format: Start with "**{skill_display_name}:** " then your contribution.
"""
```

Pattern learning loop:
After 30 activations, the skill reads its JSONL log, identifies the `hot_topics` and `cold_topics` from engaged vs. suppressed activations, and updates `state.json`. Next scoring calls include these learned patterns.

What this resolves from Steps 1+2+3:
- Correctness invariant: ✅ Structured prompts produce attributable, controllable injections
- False positive suppression: ✅ Hot/cold topic injection into prompt guides the model away from bad activations
- Evolution invariant: ✅ Learned patterns feed back into next scoring call
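
One way the pattern learning loop could derive `hot_topics` and `cold_topics` from the log. Note the assumptions: the log schema shown earlier has no per-record topic label, so the `topic` field here is hypothetical, as is the net-count ranking; only the 30-activation trigger and the engaged/suppressed split come from the text.

```python
from typing import Dict, List

MIN_ACTIVATIONS = 30  # learning pass runs after 30 activations


def learn_topics(log: List[dict], top_n: int = 3) -> Dict[str, List[str]]:
    """Rank topics by (engaged - suppressed) count; {} until enough history exists."""
    if len(log) < MIN_ACTIVATIONS:
        return {}
    net: Dict[str, int] = {}
    for record in log:
        topic = record.get("topic")          # hypothetical field, see lead-in
        if topic is None:
            continue
        if record.get("feedback") == "engaged":
            net[topic] = net.get(topic, 0) + 1
        elif record.get("feedback") == "suppressed":
            net[topic] = net.get(topic, 0) - 1
    hot = sorted((t for t, n in net.items() if n > 0), key=lambda t: (-net[t], t))
    cold = sorted((t for t, n in net.items() if n < 0), key=lambda t: (net[t], t))
    return {"hot_topics": hot[:top_n], "cold_topics": cold[:top_n]}


log = ([{"topic": "decision-making", "feedback": "engaged"}] * 20
       + [{"topic": "code review", "feedback": "suppressed"}] * 10)
print(learn_topics(log))
# {'hot_topics': ['decision-making'], 'cold_topics': ['code review']}
```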

---

## STEP 5: Multi-Skill Coordination - The Injection Compositor
Builds on Steps 1+2+3+4: routing, memory, and scoring are designed; handle simultaneous activations

From Steps 1-4 we have a complete single-skill pipeline. Now: what happens when multiple skills activate simultaneously?

The problem scenarios:
1. phi:veritas (score 0.85) + phi:metaphysical (score 0.78) + art:creative (score 0.71) all activate for one message
2. Two skills say contradictory things
3. Injections make the total response 3× longer than the main response

Step 5 Resolution: The Injection Budget + Ranking Protocol

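```python
from dataclasses import dataclass
from typing import List


@dataclass
class Skill:
    name: str                            # e.g. "phi:veritas"
    useful_activation_rate: float = 0.0  # useful / total activations, from state.json


@dataclass
class Activation:
    skill: Skill
    score: float     # Tier 2 MiniMax score, 0-1
    injection: str   # generated injection text


@dataclass
class InjectionResult:
    selected: List[Activation]


class InjectionCompositor:
    MAX_INJECTIONS_PER_MESSAGE = 2      # Hard cap
    MAX_INJECTION_TOTAL_CHARS = 600     # Budget
    COOLDOWN_SAME_SKILL = 5             # Messages before same skill can fire again
    FAMILY_COOLDOWN = {"phi:*": 3}      # Only 1 phi:* per 3 messages

    def compose(self, activations: List[Activation]) -> InjectionResult:
        # Step 1: Apply cooldown filter
        eligible = [a for a in activations if not self.in_cooldown(a.skill)]

        # Step 2: Apply family limits
        eligible = self.apply_family_limits(eligible)

        # Step 3: Rank by score × (1 + historical_usefulness_rate)
        ranked = sorted(eligible,
                        key=lambda a: a.score * (1 + a.skill.useful_activation_rate),
                        reverse=True)

        # Step 4: Take top N within budget
        selected: List[Activation] = []
        char_budget = self.MAX_INJECTION_TOTAL_CHARS
        for activation in ranked[:self.MAX_INJECTIONS_PER_MESSAGE]:
            if len(activation.injection) <= char_budget:
                selected.append(activation)
                char_budget -= len(activation.injection)

        return InjectionResult(selected)

    def in_cooldown(self, skill: Skill) -> bool:
        raise NotImplementedError  # cooldown bookkeeping is not specified in this doc

    def apply_family_limits(self, eligible: List[Activation]) -> List[Activation]:
        raise NotImplementedError  # family-limit bookkeeping is not specified in this doc
```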
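
The compositor relies on `in_cooldown` and `apply_family_limits`, which are not defined anywhere in the design. The standalone sketch below shows one way the bookkeeping might work. It assumes cooldowns are counted in messages since the last firing, that a skill's family is its prefix plus `:*`, and that the caller advances the message counter; `CooldownTracker` is a hypothetical name.

```python
from typing import Dict, Optional


class CooldownTracker:
    """Tracks when each skill and each skill family last fired, in message counts."""

    def __init__(self, same_skill: int = 5, family: Optional[Dict[str, int]] = None):
        self.same_skill = same_skill                 # mirrors COOLDOWN_SAME_SKILL
        self.family = family or {"phi:*": 3}         # mirrors FAMILY_COOLDOWN
        self.message_index = 0
        self.last_fired: Dict[str, int] = {}

    @staticmethod
    def family_of(skill: str) -> str:
        return skill.split(":", 1)[0] + ":*"

    def next_message(self) -> None:
        """Call once per user message."""
        self.message_index += 1

    def _recent(self, key: str, window: int) -> bool:
        last = self.last_fired.get(key)
        return last is not None and self.message_index - last < window

    def in_cooldown(self, skill: str) -> bool:
        if self._recent(skill, self.same_skill):
            return True
        fam = self.family_of(skill)
        return fam in self.family and self._recent(fam, self.family[fam])

    def record(self, skill: str) -> None:
        """Call when a skill's injection is actually delivered."""
        self.last_fired[skill] = self.message_index
        self.last_fired[self.family_of(skill)] = self.message_index


tracker = CooldownTracker()
tracker.record("phi:veritas")                 # fires on message 0
print(tracker.in_cooldown("phi:paradox"))     # True: family cooldown
print(tracker.in_cooldown("art:creative"))    # False: different family
for _ in range(3):
    tracker.next_message()
print(tracker.in_cooldown("phi:paradox"))     # False: family window has passed
print(tracker.in_cooldown("phi:veritas"))     # True: same-skill window still open
```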

Injection format (resolved here):

Injections appear as a collapsible footer block in Discord, formatted with Discord spoiler syntax for non-intrusive delivery:

```
[Main assistant response]

||**Skill Perspectives** (tap to reveal)
**Veritas:** The factual anchor here is X. Worth verifying Y before committing.
**Creative:** What if the constraint itself is the creative material?||
```

This is:
- Non-blocking (main response reads complete)
- Optional (user can ignore the spoiler block)
- Attributable (clearly labeled by skill)
- Bounded (budget enforced)

What this resolves from all previous steps:
- Multi-skill collision: ✅ Compositor with budget, ranking, and cooldowns
- UX invariant: ✅ Collapsible footer is non-intrusive but always accessible
- Fatigue prevention: ✅ Cooldowns prevent any skill from dominating
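
Rendering the footer is a small string-building step. A sketch, assuming the compositor hands over `(display_name, text)` pairs; `format_footer` is a hypothetical helper and the header wording is illustrative.

```python
from typing import List, Tuple


def format_footer(perspectives: List[Tuple[str, str]]) -> str:
    """Wrap labeled skill perspectives in a single Discord spoiler block."""
    if not perspectives:
        return ""   # nothing activated: post no footer at all
    lines = ["**Skill Perspectives** (tap to reveal)"]
    lines += [f"**{name}:** {text}" for name, text in perspectives]
    return "||" + "\n".join(lines) + "||"


footer = format_footer([
    ("Veritas", "The factual anchor here is X."),
    ("Creative", "What if the constraint itself is the creative material?"),
])
print(footer)
```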

---

## STEP 6: Gateway Integration - The Hooks Implementation
Builds on Steps 1+2+3+4+5: full system designed; now wire it into Clawdbot's actual architecture

From Steps 1-5 we have a complete SEA design:
- Two-tier routing (embedding → MiniMax)
- Structured skill memory (JSONL + state.json)
- Quality-engineered scoring prompts
- Multi-skill compositor with cooldowns and budget
- Collapsible Discord footer injection format

Step 6 Resolution: The Hook Integration Architecture

Clawdbot's hooks system runs before/after session messages. The SEA integrates as a post-response hook to avoid any hot-path latency impact:

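```python
#!/usr/bin/env python3
# [home-path]
"""
SEA Router Hook: runs after the assistant response is sent.
Async: never blocks the main session.
"""
import subprocess
import sys

# Worker entry point: parse the event from stdin and run the router to completion.
WORKER = (
    "import asyncio, json, sys; from sea.router import SEARouter; "
    "asyncio.run(SEARouter().process(json.load(sys.stdin)))"
)

payload = sys.stdin.buffer.read()  # {user_message, assistant_response, session_id, channel_id}

# Fire and forget: the user already has the response. asyncio.create_task()
# needs a running event loop and the task would die at sys.exit(), so the
# event is handed to a detached worker process instead.
worker = subprocess.Popen(
    [sys.executable, "-c", WORKER],
    stdin=subprocess.PIPE, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    start_new_session=True,
)
worker.stdin.write(payload)
worker.stdin.close()
sys.exit(0)  # Return immediately, don't wait
```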

The SEA runs entirely in its own process space. The hook fires the router, which runs async on Mac3 (via SSH) or locally in a background thread, and the injection appears in Discord independently of the main response.

Full process flow:

```
User Message → Clawdbot Gateway → [Session Processing] → Assistant Response (fast)
                                         ↓ (async, non-blocking)
                                    SEA Hook fires
                                         ↓
                               [Mac3] SEA Router process
                                         ↓
                          Tier 1: Embedding similarity (<50ms)
                          Tier 2: MiniMax scoring (async per skill)
                          Compositor: Rank, budget, format
                                         ↓
                          Discord: Post injection to conversation
                                     (5-30 seconds later)
```

What this resolves from all previous steps:
- Performance invariant: ✅ Zero hot-path impact; main response fires first
- Architecture invariant: ✅ Only a new hook file; no gateway code changes
- All other invariants: ✅ Inherited from previous steps

---

## STEP 7: Synthesis - The Unified SEA System
Final compound step: integrate everything into a coherent named architecture

The compounded system is now fully specified. Here is the unified SEA architecture:

The Skill Entity Architecture (SEA): v1 Spec

Name: Skill Entity Architecture
Version: 1.0
Mode: Post-response async enrichment
Latency impact: Zero (0ms added to main session)
Injection delivery: Discord collapsible footer, 5-30s post-response

Three-layer stack:

╔══════════════════════════════════════════════════════════════╗
║  LAYER 1: ROUTING ENGINE (Mac1, hook)                        ║
║  Input: user_message + assistant_response                    ║
║  Tier 1: Embedding similarity → ACTIVATION QUEUE             ║
║  Latency: <50ms (async, doesn't block response)              ║
╠══════════════════════════════════════════════════════════════╣
║  LAYER 2: SKILL COUNCIL (Mac3 async workers)                 ║
║  Each activated skill:                                       ║
║    → Reads skill memory (state.json + JSONL log)             ║
║    → MiniMax scoring with engineered prompt                  ║
║    → If score ≥ calibrated threshold: generate injection     ║
║    → Submit to Compositor                                    ║
╠══════════════════════════════════════════════════════════════╣
║  LAYER 3: COMPOSITOR + DELIVERY (Mac1 or Mac3)               ║
║  → Apply budget (600 chars), cooldowns, family limits        ║
║  → Format as Discord spoiler footer                          ║
║  → Post to conversation thread                               ║
║  → Update activation logs                                    ║
╚══════════════════════════════════════════════════════════════╝

Skill set: 13 creative/philosophical skills
`phi:veritas, phi:paradox, phi:metaphysical, art:creative, art:convergent, art:divergent, art:synthesis, art:snark, art:movement, art:dj, nav:nonlinear, nav:organic, nav:perspective`

State per skill: `[home-path]` with `state.json` + `activation-log.jsonl`

Scoring: Embedding (Tier 1) → MiniMax-3B (Tier 2) with hot/cold topic injection

Injection format: Discord spoiler block, max 2 skills, 600 chars total

Evolution: After 30 activations, skills auto-update hot/cold topic patterns from JSONL feedback
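The injection limits above (max 2 skills, 600 chars total) can be illustrated with a minimal compositor sketch. It ranks candidates by `score × (1 + useful_rate)`, the ranking rule given in the final architecture summary. The `Injection` dataclass and the `compose` function are hypothetical names, not part of the spec; cooldowns and family limits are left out, and the budget here counts only the injection lines, not the `||` spoiler markers.

```python
from dataclasses import dataclass

MAX_SKILLS = 2      # spec: max 2 skills per injection
CHAR_BUDGET = 600   # spec: 600 chars total


@dataclass
class Injection:            # hypothetical container, not defined in the spec
    skill: str
    text: str
    score: float            # Tier 2 relevance score, 0..1
    useful_rate: float      # share of past activations marked useful, 0..1


def compose(candidates: list[Injection]) -> str:
    """Pick the top-ranked injections that fit the budget; wrap them as a Discord spoiler."""
    ranked = sorted(candidates, key=lambda c: c.score * (1 + c.useful_rate), reverse=True)
    chosen, used = [], 0
    for cand in ranked:
        if len(chosen) == MAX_SKILLS:
            break
        line = f"{cand.skill}: {cand.text}"
        if used + len(line) > CHAR_BUDGET:
            continue  # too long for the remaining budget; a shorter candidate may still fit
        chosen.append(line)
        used += len(line)
    return "||" + "\n".join(chosen) + "||" if chosen else ""
```

Skipping an oversized candidate instead of truncating it is a design choice of this sketch; the spec does not say which behavior is intended.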

---

STAGE 3: EXPAND + MASTER PLAN

3a. Stress Test β€” What Breaks Under Pressure?

### 🔴 CRITICAL STRESS: Embedding Model Gap
Scenario: We don't have a local embedding model. MiniMax 3B's primary interface is completion, not embeddings.
Failure mode: Tier 1 routing can't function without embeddings.
Resolution: Use `all-MiniLM-L6-v2` via Ollama on Mac4 (http://[ip]:11434). It's 384-dim, 22M parameters, runs in <10ms on Mac4. Pre-cache all 13 skill embeddings on startup. Mac1 calls Mac4's embedding endpoint for Tier 1 routing.
Validation: `curl http://[ip]:11434/api/embeddings -d '{"model":"all-minilm","prompt":"test"}'`
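Once embeddings are available, Tier 1 routing reduces to a cosine-similarity ranking against the cached skill vectors. The sketch below uses only the standard library and toy vectors; `rank_skills` is a hypothetical helper, and the 0.4 cutoff is the value quoted in the final architecture summary. Real vectors would come from the embedding endpoint validated above.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_skills(message_vec, skill_vecs, min_score=0.4):
    """Return [(skill, similarity)] for skills above min_score, best match first."""
    scored = [(name, cosine(message_vec, vec)) for name, vec in skill_vecs.items()]
    kept = [pair for pair in scored if pair[1] > min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors stand in for the 384-dimensional embeddings.
skills = {"phi:veritas": [1.0, 0.0, 0.0], "art:dj": [0.0, 1.0, 0.0]}
print(rank_skills([1.0, 0.1, 0.0], skills))
```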

### 🔴 CRITICAL STRESS: Mac3 Async Worker Reliability
> Obsolete as of 2026-05-13. Mac3 has been retired. Current Tier 2 host is Mac4:8100 (cognitive twin). Skip this stress-test section if reading for current architecture; it analyzed a worker pool that no longer exists.

Scenario: Mac3 (M1 8GB) goes to sleep, runs out of memory, or loses network.
Failure mode: Tier 2 scoring never happens; no injections ever delivered.
Resolution: Dual-track:
a. Primary: Mac3 async worker pool (5 concurrent skills)
b. Fallback: Local Mac1 thread pool if Mac3 unreachable (slower but functional)
c. Circuit breaker: If Mac3 unreachable for >60s, disable Tier 2, run Tier 1 only as "mood indicator" (emoji reaction to show which skills detected relevance, no full injection)
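The three tracks above can be read as a small state machine keyed on how long the Tier 2 host has been unreachable. The following is a sketch only: `Tier2Breaker` and its mode names are hypothetical, the health check itself is left to the caller, and nothing here is specific to Mac3, so it applies equally to whichever host runs Tier 2.

```python
import time


class Tier2Breaker:
    """Tracks Tier 2 host outages and picks the operating mode."""

    def __init__(self, open_after: float = 60.0, clock=time.monotonic):
        self.open_after = open_after   # seconds unreachable before Tier 2 is disabled
        self.clock = clock             # injectable for tests
        self.first_failure = None      # start time of the current outage, if any

    def record(self, reachable: bool) -> None:
        """Feed in the result of each connectivity check."""
        if reachable:
            self.first_failure = None
        elif self.first_failure is None:
            self.first_failure = self.clock()

    def mode(self) -> str:
        if self.first_failure is None:
            return "remote"            # (a) primary worker pool
        if self.clock() - self.first_failure > self.open_after:
            return "tier1_only"        # (c) emoji mood indicator, no injection
        return "local_fallback"        # (b) local thread pool
```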

### 🟡 MEDIUM STRESS: Discord Rate Limits
Scenario: 50+ messages/hour × posting injections = 100 Discord API calls/hour.
Failure mode: 429 Too Many Requests; injections drop silently.
Resolution: Queue Discord posts with exponential backoff. If an injection is >90 seconds stale (the conversation has moved on), discard it rather than deliver it. Implement this as an injection TTL.

### 🟡 MEDIUM STRESS: Context Window Pollution
Scenario: MiniMax 3B scoring prompt includes 3 conversation exchanges + skill memory + scoring instructions. At MiniMax 3B's context window (varies by quantization, likely 4K-8K tokens), this could overflow.
Resolution: Hard-cap scoring prompt at 1500 tokens. Truncate conversation context from oldest. Never truncate skill identity or hot/cold topics (those are most informative).
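The truncation rule can be sketched as follows: the skill identity, hot/cold topics, and scoring instructions are always kept, and conversation exchanges are dropped oldest-first until the prompt fits. `build_scoring_prompt` is a hypothetical helper, and `count_tokens` is a crude whitespace stand-in for the model's real tokenizer, so the counts are approximate.

```python
def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate tokens.
    # Replace with the model's tokenizer for accurate budgeting.
    return len(text.split())


def build_scoring_prompt(identity, topics, instructions, exchanges, max_tokens=1500):
    """Assemble a scoring prompt under max_tokens, dropping the oldest exchanges first."""
    fixed_cost = sum(count_tokens(part) for part in (identity, topics, instructions))
    budget = max_tokens - fixed_cost
    kept = []
    for exchange in reversed(exchanges):   # walk newest to oldest
        cost = count_tokens(exchange)
        if cost > budget:
            break                          # this exchange and everything older is dropped
        kept.append(exchange)
        budget -= cost
    kept.reverse()                         # restore chronological order
    return "\n\n".join([identity, topics, *kept, instructions])
```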

### 🟡 MEDIUM STRESS: Skill Cooldown Gaming
Scenario: phi:veritas activates every message and hits cooldown (5 messages). User keeps having philosophical conversations.
Failure mode: Most relevant skill is suppressed; less relevant skills fire instead.
Resolution: Cooldown applies per-skill, not globally. If a skill is suppressed by cooldown but scores >0.9, log the miss for learning but don't deliver. Users can explicitly invoke `/veritas` to bypass cooldown.
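A per-skill cooldown with miss logging and an explicit-invocation bypass might look like the sketch below. `CooldownTracker` and its method names are hypothetical. Whether an explicit invocation should itself start a cooldown is not specified in the text; this sketch assumes it does not.

```python
class CooldownTracker:
    def __init__(self, cooldown_msgs: int = 5):
        self.cooldown_msgs = cooldown_msgs
        self.remaining = {}   # skill -> messages left before it may fire again
        self.missed = []      # (skill, score) suppressed despite a high score

    def on_message(self) -> None:
        """Call once per conversation message to tick active cooldowns down."""
        self.remaining = {s: n - 1 for s, n in self.remaining.items() if n > 1}

    def try_fire(self, skill: str, score: float,
                 explicit: bool = False, miss_score: float = 0.9) -> bool:
        if explicit:
            return True       # e.g. /veritas bypasses the cooldown
        if self.remaining.get(skill, 0) > 0:
            if score > miss_score:
                self.missed.append((skill, score))   # log for learning, don't deliver
            return False
        self.remaining[skill] = self.cooldown_msgs
        return True
```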

### 🟢 LOW STRESS: SKILL.md Drift
Scenario: A skill's SKILL.md is updated but cached embedding is stale.
Resolution: Watch for file changes with `fsevents` (Mac). On SKILL.md change, re-embed and update cache. Add startup check: if SKILL.md mtime > embedding cache mtime, re-embed.
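The startup check described here is a plain modification-time comparison. A minimal sketch, with `needs_reembed` as a hypothetical helper name and both paths supplied by the caller:

```python
import os


def needs_reembed(skill_md_path: str, cache_path: str) -> bool:
    """True if the embedding cache is missing or older than the SKILL.md file."""
    if not os.path.exists(cache_path):
        return True
    return os.path.getmtime(skill_md_path) > os.path.getmtime(cache_path)
```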

### 🟢 LOW STRESS: First 30 Activations (Cold Start)
Scenario: New installation; no activation history; hot/cold topics empty.
Resolution: Skills ship with `default_hot_topics` and `default_cold_topics` in their SKILL.md frontmatter. These seed the scoring prompt until learned patterns are available. After 30 activations, learned patterns take over.
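The seed-then-learn switch can be sketched as a single lookup. `default_hot_topics` and `default_cold_topics` come from the text; the `activation_count`, `hot_topics`, and `cold_topics` keys are assumed names for the learned state, and `active_topics` is a hypothetical helper.

```python
LEARNING_THRESHOLD = 30  # activations before learned patterns take over


def active_topics(state: dict) -> dict:
    """Return the hot/cold topics the scoring prompt should use right now."""
    learned_ready = state.get("activation_count", 0) >= LEARNING_THRESHOLD
    has_learned = bool(state.get("hot_topics") or state.get("cold_topics"))
    if learned_ready and has_learned:
        return {"hot": state.get("hot_topics", []), "cold": state.get("cold_topics", [])}
    # Cold start, or nothing learned yet: fall back to the shipped defaults.
    return {"hot": state.get("default_hot_topics", []),
            "cold": state.get("default_cold_topics", [])}
```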

---

3b. Missing Components (Gap Fill)

### GAP FILL 1: Feedback Collection System
What's missing: No mechanism to tell the system "that injection was good/bad."
Solution: Discord reaction collector. When skill injection is posted:
- Bot adds 👍 and 🤫 reactions to the injection message
- If user reacts 👍: log `feedback=engaged` in activation JSONL
- If user reacts 🤫: log `feedback=suppressed`, increment skill's suppression count
- After 10 suppression reacts for a skill, raise its threshold from 0.7 to 0.8 (less sensitive)
- After 20 engage reacts, lower its threshold from 0.7 to 0.6 (more sensitive)
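The two feedback rules translate into a small calibration function: 10 suppression reactions move the threshold to 0.8, and 20 engage reactions move it to 0.6. `calibrated_threshold` is a hypothetical name. The text does not say which rule wins when both counts are reached; this sketch assumes suppression takes precedence, as the more conservative choice.

```python
def calibrated_threshold(suppressed: int, engaged: int, default: float = 0.7) -> float:
    """Per-skill activation threshold derived from reaction counts."""
    if suppressed >= 10:
        return 0.8   # fires less often (assumption: checked first when both apply)
    if engaged >= 20:
        return 0.6   # fires more often
    return default
```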

### GAP FILL 2: Skill Dashboard
What's missing: No visibility into which skills are active, what's fired, what patterns they've learned.
Solution: `/skills status` command in Discord:

🧠 SEA Status
├── phi:veritas    ▓▓▓▓▓▓▓░░░ 847 acts | 73% useful | 🔥 decision-making
├── art:creative   ▓▓▓▓▓░░░░░ 412 acts | 61% useful | 🔥 brainstorming
├── nav:nonlinear  ▓▓░░░░░░░░ 89 acts  | 45% useful | ❄️  casual greetings
└── ... (10 more)
Top injection: phi:veritas (today: 12 injections)
Cooldown active: art:snark (3 more messages)

### GAP FILL 3: Mute Controls
What's missing: User can't suppress specific skills.
Solution: `/skill mute phi:veritas [duration]` adds the skill to the cooldown list for the specified duration, and `/skill unmute phi:veritas` removes it. `/skill reset phi:veritas` wipes the activation history and starts fresh.

### GAP FILL 4: Manual Invocation Bridge
What's missing: Explicit skill invocation (/veritas) should update the skill's history too.
Solution: When user explicitly invokes a skill (via slash command or `/veritas` prefix), log it as a forced activation with `explicit=true` in the JSONL. These don't count toward threshold calibration but do inform topic learning.
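A sketch of how forced activations could be logged and then filtered. The `explicit` field comes from the text; the other record fields (`ts`, `skill`, `score`) and both function names are assumptions for illustration. Calibration would read the log with `include_explicit=False`, while topic learning would read every record.

```python
import json
import time


def log_activation(log_path, skill, score=None, explicit=False, now=None):
    """Append one activation record to the skill's JSONL log and return it."""
    record = {
        "ts": now if now is not None else time.time(),
        "skill": skill,
        "score": score,        # None for explicit invocations that were never scored
        "explicit": explicit,
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


def read_log(log_path, include_explicit=True):
    """Load records; drop explicit ones when feeding threshold calibration."""
    with open(log_path, encoding="utf-8") as f:
        records = [json.loads(line) for line in f if line.strip()]
    return [r for r in records if include_explicit or not r["explicit"]]
```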

### GAP FILL 5: Skill Retirement Protocol
What's missing: A skill with <20
Solution: SEA tracks "zombie skills": skills that have been activated 100+ times with <25

---

3c. Master Execution Checklist

---

### PHASE 0: Prerequisites & Validation (Days 1-2)
Validate all assumed infrastructure before writing any SEA code

  • [ ] 0.1 Verify Ollama embedding model on Mac4
  • Owner: Claude Code
  • Input: Mac4 SSH access (`ssh mac4`)
  • Output: `all-minilm` model confirmed running, test embedding returns 384-dim vector
  • Validation: `curl http://[ip]:11434/api/embeddings -d '{"model":"all-minilm","prompt":"philosophical truth"}' | python3 -m json.tool`
  • Depends on: Nothing
  • Status: Not Started
  • [ ] 0.2 Benchmark MiniMax scoring prompt latency
  • Owner: Claude Code
  • Input: MiniMax-3B-v0.1 at localhost:18080, scoring prompt template from Step 4
  • Output: P50/P95 latency for a single scoring call; confirm it's <5s for async use case
  • Validation: Script runs 20 scoring calls, reports min/mean/max
  • Depends on: Nothing
  • Status: Not Started
  • [ ] 0.3 Confirm Mac3 availability and connectivity
  • Owner: Mohamed
  • Input: Mac3 IP/hostname on Tailscale
  • Output: SSH access confirmed, Python 3.x available, sufficient RAM for 5 concurrent workers
  • Validation: `ssh mac3 "python3 --version && sysctl -n hw.memsize && vm_stat"`
  • Depends on: Nothing
  • Status: Not Started
  • [ ] 0.4 Audit existing hooks for conflicts
  • Owner: Claude Code
  • Input: `[home-path]` directory
  • Output: Inventory of all hooks; confirm no existing hooks will conflict with SEA router
  • Validation: Document hook names, trigger conditions, and any shared state
  • Depends on: Nothing
  • Status: Not Started
  • [ ] 0.5 Identify Discord channel IDs for 13 skill channels
  • Owner: Claude Code
  • Input: Discord bot token, guild ID
  • Output: Channel ID map `{skill_name: channel_id}` saved to `[home-path]`
  • Validation: All 13 skill channels exist and are readable by bot
  • Depends on: 0.3
  • Status: Not Started
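For task 0.2, the benchmark script could be as small as the sketch below. It times any callable and reports min/mean/max plus P50/P95. `benchmark` and `summarize` are hypothetical names, and `call` stands in for one scoring request to the MiniMax endpoint, which is not shown here.

```python
import statistics
import time


def summarize(samples):
    """Latency summary for a list of durations in seconds."""
    cuts = statistics.quantiles(samples, n=100, method="inclusive")  # 99 cut points
    return {
        "min": min(samples),
        "mean": statistics.fmean(samples),
        "max": max(samples),
        "p50": cuts[49],
        "p95": cuts[94],
    }


def benchmark(call, runs=20):
    """Time `call()` `runs` times and summarize the wall-clock latencies."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return summarize(samples)
```

With only 20 runs, P95 is interpolated between the two slowest samples, so it is a rough indicator; more runs give a steadier estimate.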

---

### PHASE 1: Core Infrastructure (Days 3-6)
Build the minimum viable SEA: routing + single skill working end-to-end

  • [ ] 1.1 Create skill-memory directory structure
  • Owner: Claude Code
  • Input: List of 13 creative skills
  • Output: `[home-path]` + `activation-log.jsonl` for all 13 skills
  • Validation: `ls [home-path]` shows all 13 skill directories with valid JSON
  • Depends on: 0.1
  • Status: Not Started
  • [ ] 1.2 Build embedding indexer
  • Owner: Claude Code
  • Input: 13 SKILL.md files, Mac4 Ollama endpoint
  • Output: `[home-path]` (13 × 384 numpy array + skill name index)
  • Validation: Load cache, compute cosine similarity for test message, top result matches expected skill
  • Depends on: 0.1, 1.1
  • Status: Not Started
  • [ ] 1.3 Implement Tier 1 router (embedding filter)
  • Owner: Claude Code
  • Input: embedding-cache.npz, Mac4 embedding endpoint
  • Output: `[home-path]` (accepts message string, returns ranked skill list)
  • Validation: Test with 10 messages; confirm correct skills rank highest for each domain
  • Depends on: 1.2
  • Status: Not Started
  • [ ] 1.4 Implement Tier 2 scorer for one skill (phi:veritas)
  • Owner: Claude Code
  • Input: Scoring prompt template (Step 4), MiniMax endpoint, phi:veritas state.json
  • Output: `[home-path]` (accepts skill + context, returns {score, reason, inject})
  • Validation: Run 5 test prompts; verify scores are calibrated (philosophical prompt >0.7, shopping list <0.3)
  • Depends on: 0.2, 1.3
  • Status: Not Started
  • [ ] 1.5 Implement injection generator
  • Owner: Claude Code
  • Input: Injection prompt template (Step 4), skill state.json, MiniMax endpoint
  • Output: `[home-path]` (accepts activated skill + message + response, returns injection string)
  • Validation: Generated injection is 2-4 sentences, starts with `{skill}:`, is contextually relevant
  • Depends on: 1.4
  • Status: Not Started
  • [ ] 1.6 Implement Compositor
  • Owner: Claude Code
  • Input: Compositor spec from Step 5 (budget, cooldowns, family limits)
  • Output: `[home-path]` (accepts list of injections, returns formatted Discord spoiler block)
  • Validation: Test with 3 simultaneous activations; confirm only top 2 are included within 600-char budget
  • Depends on: 1.5
  • Status: Not Started

---

### PHASE 2: Gateway Integration (Days 7-9)
Wire the SEA into Clawdbot's hook system

  • [ ] 2.1 Implement SEA hook (post-response, async)
  • Owner: Claude Code
  • Input: Hook system spec, all Phase 1 scripts
  • Output: `[home-path]` (executable script; reads event JSON from stdin, fires async SEA pipeline)
  • Validation: Hook fires after test message; confirm it returns immediately (no blocking)
  • Depends on: 1.6
  • Status: Not Started
  • [ ] 2.2 Implement Discord delivery
  • Owner: Claude Code
  • Input: Discord bot token, channel map (0.5), compositor output
  • Output: `[home-path]` (posts spoiler block to correct Discord thread)
  • Validation: End-to-end test: send a message in Discord, confirm injection appears as spoiler reply within 30s
  • Depends on: 2.1, 0.5
  • Status: Not Started

> Obsolete as of 2026-05-13. Tasks 2.3 and 2.4 below targeted the Mac3 worker pool, which has been retired. For v1.1, retarget to Mac4:8100 (cognitive twin host); the `mac3-worker-config/` artifacts are archived to `archive/mac3-era/`. Skip these two tasks if planning current work.

  • [ ] 2.3 Mac3 worker pool setup
  • Owner: Claude Code + Mohamed
  • Input: Mac3 SSH access, SEA scripts
  • Output: Supervisord/launchd config running 5 SEA worker processes on Mac3; Mac1 can enqueue jobs via SSH or Redis
  • Validation: Send 10 messages rapidly; confirm all process without dropping
  • Depends on: 2.2, 0.3
  • Status: Not Started
  • [ ] 2.4 Circuit breaker implementation
  • Owner: Claude Code
  • Input: Mac3 connectivity check logic
  • Output: `[home-path]` (monitors Mac3, falls back to local thread pool, posts emoji reactions in fallback mode)
  • Validation: Kill Mac3 workers; confirm emoji fallback activates within 60s; restore Mac3, confirm full mode resumes
  • Depends on: 2.3
  • Status: Not Started

---

### PHASE 3: All 13 Skills + Feedback Loop (Days 10-14)
Extend to full skill set and add learning mechanisms

  • [ ] 3.1 Extend Tier 2 scorer to all 13 skills
  • Owner: Claude Code
  • Input: 12 remaining SKILL.md files, default hot/cold topics
  • Output: State.json files with default_hot_topics + default_cold_topics seeded for all 13 skills
  • Validation: All skills produce calibrated scores for domain-appropriate and domain-inappropriate test messages
  • Depends on: 2.4
  • Status: Not Started
  • [ ] 3.2 Implement feedback collector (Discord reactions)
  • Owner: Claude Code
  • Input: Discord bot token, injection message IDs
  • Output: Reaction listener that logs 👍/🤫 to activation JSONL
  • Validation: React to test injection; confirm JSONL updated within 5s
  • Depends on: 3.1
  • Status: Not Started
  • [ ] 3.3 Implement pattern learner (30-activation trigger)
  • Owner: Claude Code
  • Input: activation-log.jsonl (after 30+ entries), state.json
  • Output: `[home-path]` (reads JSONL, extracts hot/cold topics, updates state.json)
  • Validation: Seed 30 synthetic activations; run learner; confirm state.json reflects correct hot/cold topics
  • Depends on: 3.2
  • Status: Not Started
  • [ ] 3.4 Implement threshold calibration
  • Owner: Claude Code
  • Input: state.json useful_activation_rate, suppression_count
  • Output: Calibration script that adjusts skill threshold based on feedback signals
  • Validation: Skill with 80
  • Depends on: 3.3
  • Status: Not Started

---

### PHASE 4: User Controls + Dashboard (Days 15-18)
Make the system transparent and controllable

  • [ ] 4.1 `/skills status` Discord command
  • Owner: Claude Code
  • Input: state.json files for all 13 skills
  • Output: Formatted Discord embed showing activation counts, useful rates, top hot topics, cooldown status
  • Validation: Command returns within 2s; data is accurate against JSONL files
  • Depends on: 3.4
  • Status: Not Started
  • [ ] 4.2 `/skill mute` and `/skill unmute` commands
  • Owner: Claude Code
  • Input: Skill name, duration
  • Output: Mute list in `[home-path]`; compositor respects mute list
  • Validation: Mute phi:veritas; confirm no injections from it; unmute; confirm injections resume
  • Depends on: 4.1
  • Status: Not Started
  • [x] 4.3 `/skill reset` command
  • Owner: Claude Code
  • Input: Skill name
  • Output: Wipes activation-log.jsonl, resets state.json to defaults
  • Validation: After reset, skill behaves as if newly installed
  • Depends on: 4.2
  • Status: Complete (`skill_controller.py reset` command with `--confirm`/`--json` flags)
  • [ ] 4.4 Zombie skill reporter
  • Owner: Claude Code
  • Input: All skill state.json files, 30-day JSONL history
  • Output: Monthly Discord post listing skills with <25
  • Validation: Synthetic test with artificially poor skill data triggers report
  • Depends on: 3.4
  • Status: Not Started

---

### PHASE 5: Hardening + Production (Days 19-21)
The system should run without attention

  • [ ] 5.1 Comprehensive logging + alerting
  • Owner: Claude Code
  • Input: All SEA scripts
  • Output: Structured JSON logs to `[home-path]` with daily rotation; alerts to Discord #alerts if error rate >10
  • Validation: Inject synthetic error; confirm alert fires within 60s
  • Depends on: 4.4
  • Status: Not Started
  • [ ] 5.2 SKILL.md file watcher + re-embedder
  • Owner: Claude Code
  • Input: `[home-path]` directory, fsevents or `watchdog`
  • Output: Watcher process that re-embeds skill on SKILL.md change, updates embedding-cache.npz
  • Validation: Edit phi:veritas SKILL.md; confirm embedding cache updated within 30s
  • Depends on: 1.2
  • Status: Not Started
  • [ ] 5.3 Injection staleness TTL
  • Owner: Claude Code
  • Input: Injection timestamp, conversation activity since injection was queued
  • Output: Discard injection if >90 seconds old (conversation has moved on)
  • Validation: Queue injection, wait 120 seconds, confirm it's discarded rather than delivered
  • Depends on: 5.1
  • Status: Not Started
  • [ ] 5.4 Latency benchmark + SLO documentation
  • Owner: Claude Code + Mohamed
  • Input: Production system after Phase 4
  • Output: Documented P50/P95/P99 for injection delivery time; confirmed zero hot-path impact
  • Validation: 100-message benchmark; main response latency before/after = same; injection latency <30s P95
  • Depends on: 5.3
  • Status: Not Started
  • [ ] 5.5 Full system runbook
  • Owner: Claude Code
  • Input: All Phase 0-5 learnings
  • Output: `Desktop/skill-entity-architecture/RUNBOOK.md` (startup, shutdown, troubleshooting, skill onboarding guide)
  • Validation: Mohamed reads and can restart the SEA system from scratch using only the runbook
  • Depends on: 5.4
  • Status: Not Started

---

Final Architecture Summary

╔══════════════════════════════════════════════════════════════════════╗
║               SKILL ENTITY ARCHITECTURE: v1 FINAL SPEC               ║
╠══════════════════════════════════════════════════════════════════════╣
║                                                                      ║
║  TRIGGER: Every message in monitored Discord channels                ║
║                                                                      ║
║  ┌──────────────────────────────────────────────────────────┐        ║
║  │  POST-RESPONSE HOOK (Mac1, <1ms, async)                  │        ║
║  │  Fires after assistant response is already delivered     │        ║
║  └──────────────────────┬───────────────────────────────────┘        ║
║                         │ (event: user_msg + assistant_response)     ║
║  ┌──────────────────────▼───────────────────────────────────┐        ║
║  │  TIER 1: EMBEDDING FILTER (Mac4, <50ms)                  │        ║
║  │  all-MiniLM-L6-v2 via Ollama                             │        ║
║  │  Cosine similarity against 13 skill embeddings           │        ║
║  │  Output: ranked skill list (top N with score >0.4)       │        ║
║  └──────────────────────┬───────────────────────────────────┘        ║
║                         │ (async queue to Mac3)                      ║
║  ┌──────────────────────▼───────────────────────────────────┐        ║
║  │  TIER 2: SKILL COUNCIL (Mac3, 5 parallel workers)        │        ║
║  │  Each worker: skill memory + MiniMax scoring prompt      │        ║
║  │  Scoring: MiniMax-3B with hot/cold topic injection       │        ║
║  │  Threshold: calibrated per skill (default 0.7)           │        ║
║  │  Output: scored injections → Compositor queue            │        ║
║  └──────────────────────┬───────────────────────────────────┘        ║
║                         │                                            ║
║  ┌──────────────────────▼───────────────────────────────────┐        ║
║  │  COMPOSITOR (Mac1 or Mac3)                               │        ║
║  │  Budget: 600 chars, max 2 skills                         │        ║
║  │  Ranking: score × (1 + useful_rate)                      │        ║
║  │  Cooldowns: per-skill 5-msg, family 3-msg                │        ║
║  │  Format: Discord spoiler block                           │        ║
║  └──────────────────────┬───────────────────────────────────┘        ║
║                         │                                            ║
║  ┌──────────────────────▼───────────────────────────────────┐        ║
║  │  DISCORD DELIVERY                                        │        ║
║  │  Posts spoiler block to conversation                     │        ║
║  │  Adds 👍 🤫 reactions for feedback                       │        ║
║  │  Discards if injection >90s stale                        │        ║
║  └──────────────────────┬───────────────────────────────────┘        ║
║                         │                                            ║
║  ┌──────────────────────▼───────────────────────────────────┐        ║
║  │  LEARNING LOOP (cron, every 24h)                         │        ║
║  │  Reads activation JSONL + reaction feedback              │        ║
║  │  Updates hot/cold topics in state.json                   │        ║
║  │  Calibrates threshold per skill                          │        ║
║  │  Flags zombie skills (>100 activations, <25% useful)     │        ║
║  └──────────────────────────────────────────────────────────┘        ║
║                                                                      ║
║  SKILLS: phi:veritas, phi:paradox, phi:metaphysical                  ║
║          art:creative, art:convergent, art:divergent, art:synthesis  ║
║          art:snark, art:movement, art:dj                             ║
║          nav:nonlinear, nav:organic, nav:perspective                 ║
║                                                                      ║
║  HARDWARE: Routing→Mac1 | Inference→Mac3 | Embeddings→Mac4           ║
║  LATENCY: 0ms hot path | 5-30s injection delivery                    ║
║  EVOLUTION: Continuous via JSONL feedback + 24h calibration cron     ║
╚══════════════════════════════════════════════════════════════════════╝

---

DEP Final Re-Score (Post Evo-Cubed)

After completing the Evo-Cubed analysis and producing the unified spec, the concept is now significantly better defined. Re-scoring:

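| Category | Initial | Post-Evo | Delta |
|---|---|---|---|
| Feature Completeness | 3.5 | 7.5 | +4.0 |
| Code Quality | 1.5 | 4.0 | +2.5 |
| Data Integrity | 3.0 | 7.0 | +4.0 |
| Integration Depth | 4.5 | 7.5 | +3.0 |
| UX | 2.5 | 6.5 | +4.0 |
| Production Readiness | 2.0 | 5.5 | +3.5 |
| TOTAL | 17/60 | 38/60 | +21 |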

Revised DEP Decision: COMMIT to Phase 0-1 immediately. RECURSE on Phase 2+ after Phase 0 benchmarks validate latency assumptions.

The single most important Phase 0 task is 0.2, benchmarking MiniMax scoring latency. If async Tier 2 calls take >10s each, the 30-second injection delivery SLO fails and the architecture needs adjustment (likely moving Tier 2 to Path B, pure embedding, as well).

---

Analysis complete. SEA v1 Spec is ready for Phase 0 execution.
Generated: 2025-07-18 | Protocol: DEP v2 + Evo³ | Session: sea-dep-evocubed

Promotion Decision

Promote into a technical note or architecture paper with implementation anchors.

Source Anchor

skill-entity-architecture/CREATIVE_EVOLUTION_SEA_v1.md

Detected Structure

Method · Evaluation · References · Figures · Code Anchors · Architecture