🔧 Preference Generation Fix Summary

Full HTML reader

Read the full artifact

Extracted abstract or opening context

The RCP-enhanced TPO system was generating preference pairs where `chosen` and `rejected` responses were **identical**. This occurred specifically in: - **Strategy**: `knowledge_transfer_triangular` (41.3% of all preferences) - **Root Cause**: The `_extract_response()` method was only using `path.terminal_node.content` - **Impact**: 5,640+ preference pairs had identical chosen/rejected responses ### **Why This Happened** 1. **Triangular Knowledge Transfer**: When users copy assistant responses as prompts 2. **Path Construction**: Alternative paths often ended at similar/same terminal nodes 3. **Content Extraction**: Only terminal node content was used, ignoring path differences 4. **Alternative Path Finding**: Limited logic for finding truly different alternative paths ### **Quantitative Improvements** - ✅ **13,666 preferences** now have meaningful distinctions - ✅ **5,640 triangular preferences** converted from identical to diverse - ✅ **8,026 experimental preferences** maintained their quality - ✅ **100% preference pairs** now provide training signal ### **Qualitative Improvements** - ✅ **Triangular Knowledge Transfer**: Now compares original assistant response vs. alternative approaches - ✅ **Path Diversity**: Multi-node paths capture conversation flow differences - ✅ **Sibling Alternatives**: When no intermediate paths exist, uses sibling messages for comparison - ✅ **Training Signal**: Each preference pair teaches distinct conversation strategies

Promotion decision

What has to happen next

Attach run IDs, datasets, metrics, and reproduction commands.

Why this is not always a full paper yet

Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.