research note, experiment writeup candidate, score 18

OmniContext

As part of OmniGen2, we introduce a new benchmark for in-context generation, **OmniContext**, which aims to provide a more comprehensive evaluation of models' in-context generation abilities. It incorporates a diverse set of input images and instructions, and utilizes GPT-4.1 for interpretable, metric-driven assessment.

Extracted abstract or opening context

<p align="center"> <img src="../assets/omnicontext_overview.png" width="95%"> <br> <em>Overview of the OmniContext benchmark.</em> </p>

<p align="center"> <img src="../assets/omnicontext_evaluation.png" width="95%"> <br> <em>An illustrative evaluation case in the OmniContext benchmark.</em> </p>

Note: We fix the resolution of the output images at 1024 × 1024 so that settings are consistent across models. You may generate results with OmniGen2 or with other models; please ensure that the output image directory structure and format are consistent with the format specified below.

1. We use GPT-4.1 to evaluate the quality of the generated images. Please make sure to set up your API key before running the script.
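Before spending API calls on evaluation, it can help to confirm the two preconditions mentioned above: every output image is 1024 × 1024, and an API key is available. The following is a minimal sketch under stated assumptions, not part of the OmniContext scripts: it assumes outputs are saved as PNG files, that the key is exposed through the `OPENAI_API_KEY` environment variable, and the helper names (`png_size`, `find_wrong_size`, `api_key_is_set`) are hypothetical. It uses only the standard library and reads the image size directly from the PNG header.

```python
import os
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
EXPECTED_SIZE = (1024, 1024)  # resolution stated in the note above


def png_size(path):
    """Return (width, height) read from a PNG file's IHDR chunk."""
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} is not a valid PNG file")
    # Width and height are two big-endian 32-bit integers at bytes 16..23.
    return struct.unpack(">II", header[16:24])


def find_wrong_size(output_dir, expected=EXPECTED_SIZE):
    """List (path, size) for every PNG under output_dir whose size differs."""
    bad = []
    for path in sorted(Path(output_dir).rglob("*.png")):
        size = png_size(path)
        if size != expected:
            bad.append((path, size))
    return bad


def api_key_is_set(env_var="OPENAI_API_KEY"):
    """True if the (assumed) API key environment variable is non-empty."""
    return bool(os.environ.get(env_var))
```

Calling `find_wrong_size` on the directory that holds your generated images returns an empty list when every PNG matches the expected resolution. The check deliberately says nothing about the required directory layout, which is defined by the benchmark's own format specification.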
