IRCP Optimization Strategy: Beyond Traditional Preference Optimization
**IRCP is not just another optimizer**: it is a fundamentally different mathematical framework that inverts the traditional learning paradigm. TPO, DPO, and GRPO optimize P(v|u), the probability of an assistant response v given a user input u. **IRCP instead optimizes P(u|v), the inverse mapping that models how users respond to assistant messages**.
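To make the two conditioning directions concrete, here is a minimal sketch of how training pairs might be drawn from a chat transcript in each case. It is an illustration under stated assumptions, not IRCP's actual data pipeline: the `(role, text)` transcript format and the function names `forward_pairs` and `inverse_pairs` are hypothetical, and it assumes "how users respond to assistant messages" means the user turn that directly follows an assistant turn.

```python
from typing import List, Tuple

# Hypothetical transcript format: (role, text), role is "user" or "assistant".
Turn = Tuple[str, str]


def forward_pairs(turns: List[Turn]) -> List[Tuple[str, str]]:
    """(u, v) pairs for P(v|u): a user message and the assistant reply after it."""
    return [
        (prev[1], nxt[1])
        for prev, nxt in zip(turns, turns[1:])
        if prev[0] == "user" and nxt[0] == "assistant"
    ]


def inverse_pairs(turns: List[Turn]) -> List[Tuple[str, str]]:
    """(v, u) pairs for P(u|v): an assistant message and the user reply after it.

    Assumption: the inverse mapping is read as "next user turn given the
    assistant turn"; the source does not spell out the exact pairing.
    """
    return [
        (prev[1], nxt[1])
        for prev, nxt in zip(turns, turns[1:])
        if prev[0] == "assistant" and nxt[0] == "user"
    ]


chat: List[Turn] = [
    ("user", "How do I reverse a list?"),
    ("assistant", "Use reversed(xs) or xs[::-1]."),
    ("user", "Thanks, and in place?"),
    ("assistant", "Call xs.reverse()."),
]

fwd = forward_pairs(chat)   # conditioning on the user, predicting the assistant
inv = inverse_pairs(chat)   # conditioning on the assistant, predicting the user
```

In this toy transcript the forward direction yields two pairs and the inverse direction yields one, because the final assistant message has no user reply after it.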
Promotion decision
What has to happen next
Attach run IDs, datasets, metrics, and reproduction commands.
Why this is not always a full paper yet
Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.