research note | experiment writeup candidate | score 24

CLAUDE.md

KARL (Knowledge Agents via Reinforcement Learning) records what AI coding agents do during real work sessions as trajectories, scores them with a multi-signal reward engine, and uses the best trajectories for LoRA fine-tuning via MLX. It also provides vector-based skill routing that shadows and can replace regex routing.

Extracted abstract or opening context

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

KARL (Knowledge Agents via Reinforcement Learning) records what AI coding agents do during real work sessions as trajectories, scores them with a multi-signal reward engine, and uses the best trajectories for LoRA fine-tuning via MLX. It also provides vector-based skill routing that shadows and can replace regex routing.

The extended CLI lives in `karl/karl_cli.py` (67 commands). The simpler `karl/cli.py` is the pip-installed entry point (`karl` command) with 10 core subcommands.

The system records agent sessions through 4 tap points in `trajectory_tap.py`:

1. **Tap A** - `init_session_buffer()`: Start recording, run shadow skill routing
2. **Tap B** - `append_tool_event()`: Log each tool call (name, params, success)
3. **Tap C** - `flush_session()`: End recording, compute reward, write to `trajectories.jsonl`
4. **Tap D** - `annotate_previous()`: Detect corrections on next prompt, update prior trajectory

`reward_engine.py` computes composite scores. Note: the README documents 3 signals with weights 0.40/0.35/0.25, but the actual code uses 5 signals:

- Outcome (0.30), Process (0.25), Efficiency (0.15), Verification (0.15), Consistency (0.15)
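To make the five-signal weighting concrete, here is a minimal sketch of how a composite score could be formed from those weights. Only the signal names and weights come from the text above. The function name `composite_score`, the assumption that each signal is normalized to the range 0 to 1, and the use of a plain weighted sum are illustrative guesses, not the actual contents of `reward_engine.py`.

```python
# Hypothetical sketch: the weights are taken from the text above; the function
# name, the [0, 1] signal range, and the weighted-sum form are assumptions.
WEIGHTS = {
    "outcome": 0.30,
    "process": 0.25,
    "efficiency": 0.15,
    "verification": 0.15,
    "consistency": 0.15,
}


def composite_score(signals: dict[str, float]) -> float:
    """Combine per-signal scores into one value as a weighted sum."""
    missing = WEIGHTS.keys() - signals.keys()
    if missing:
        raise ValueError(f"missing signals: {sorted(missing)}")

    total = 0.0
    for name, weight in WEIGHTS.items():
        value = signals[name]
        # Assumed convention: every signal is already normalized to [0, 1].
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"signal {name!r} out of range: {value}")
        total += weight * value
    return total


if __name__ == "__main__":
    example = {
        "outcome": 1.0,
        "process": 0.8,
        "efficiency": 0.5,
        "verification": 1.0,
        "consistency": 0.6,
    }
    print(round(composite_score(example), 3))
```

Because the five weights sum to 1.0, a composite built this way stays in the same 0 to 1 range as its inputs, which would make a single promotion threshold easy to apply across trajectories.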

Promotion decision

What has to happen next

Attach run IDs, datasets, metrics, and reproduction commands.

Why this is not always a full paper yet

Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.