Mohamed Diomande

# Model configs

The model config file for a diffusion model should set the `model_type` to `diffusion_cond` if the model uses conditioning, or `diffusion_uncond` if it does not. The `model` object should have the following properties:

- `diffusion`
    - The configuration for the diffusion model itself. See below for more information on the diffusion model config.
- `pretransform`
    - The configuration of the diffusion model's [pretransform](pretransforms.md), such as an autoencoder for latent diffusion.
    - Optional
- `conditioning`
    - The configuration of the various [conditioning](conditioning.md) modules for the diffusion model.
    - Only required for `diffusion_cond`
- `io_channels`
    - The base number of input/output channels for the diffusion model.
    - Used by inference scripts to determine the shape of the noise to generate for the diffusion model.

# Diffusion configs

- `type`
    - The underlying model type for the diffusion model backbone.
    - For conditioned diffusion models, must be one of `dit` ([Diffusion Transformer](#diffusion-transformers-dit)), `DAU1d` ([Dance Diffusion U-Net](#dance-diffusion-u-net)), or `adp_cfg_1d` ([audio-diffusion-pytorch U-Net](#audio-diffusion-pytorch-u-net-adp)).
    - Unconditioned diffusion models can also use `adp_1d`.
- `cross_attention_cond_ids`
    - Conditioner ids for conditioning information to be used as cross-attention input.
    - If multiple ids are specified, the conditioning tensors will be concatenated along the sequence dimension.
- `global_cond_ids`
    - Conditioner ids for conditioning information to be used as global conditioning input.
    - If multiple ids are specified, the conditioning tensors will be concatenated along the channel dimension.
- `prepend_cond_ids`
    - Conditioner ids for conditioning information to be prepended to the model input.
    - If multiple ids are specified, the conditioning tensors will be concatenated along the sequence dimension.
    - Only works with diffusion transformer models.
- `input_concat_ids`
    - Conditioner ids for conditioning information to be concatenated to the model input.
    - If multiple ids are specified, the conditioning tensors will be concatenated along the channel dimension.
    - If the conditioning tensors are not the same length as the model input, they will be interpolated along the sequence dimension to match it.
    - The interpolation algorithm is model-dependent, but usually uses nearest-neighbor resampling.
- `config`
    - The configuration for the model backbone itself.
    - Model-dependent

# Training configs

The `training` config in the diffusion model config file should have the following properties:

- `learning_rate`
    - The learning rate to use during training.
    - Uses a constant learning rate by default; can be overridden with `optimizer_configs`.
- `use_ema`
    - If true, a copy of the model weights is maintained duri
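
The model and diffusion properties described above can be sketched as a small Python dictionary plus a checker. This is only an illustration under stated assumptions: `check_diffusion_config` is a hypothetical helper, not part of any library, and the conditioner ids (`prompt`, `seconds_total`), the `io_channels` value, and the empty `conditioning` and `config` objects are placeholders chosen for the example.

```python
import json

# Backbone types named in the text above.
COND_TYPES = {"dit", "DAU1d", "adp_cfg_1d"}
UNCOND_TYPES = COND_TYPES | {"adp_1d"}


def check_diffusion_config(cfg):
    """Hypothetical helper: return a list of problems (empty if none found)."""
    model_type = cfg.get("model_type")
    if model_type not in ("diffusion_cond", "diffusion_uncond"):
        return [f"unexpected model_type: {model_type!r}"]

    problems = []
    model = cfg.get("model", {})
    for key in ("diffusion", "io_channels"):
        if key not in model:
            problems.append(f"model.{key} is missing")
    # `conditioning` is only required for conditioned models.
    if model_type == "diffusion_cond" and "conditioning" not in model:
        problems.append("model.conditioning is required for diffusion_cond")

    diffusion = model.get("diffusion", {})
    allowed = COND_TYPES if model_type == "diffusion_cond" else UNCOND_TYPES
    if "type" in diffusion and diffusion["type"] not in allowed:
        problems.append(f"diffusion.type {diffusion['type']!r} not allowed for {model_type}")
    # Prepend conditioning only works with diffusion transformer models.
    if diffusion.get("prepend_cond_ids") and diffusion.get("type") != "dit":
        problems.append("prepend_cond_ids only works with dit")
    return problems


# Placeholder values throughout; real configs are model-dependent.
example_config = {
    "model_type": "diffusion_cond",
    "model": {
        "io_channels": 64,
        "conditioning": {},
        "diffusion": {
            "type": "dit",
            "cross_attention_cond_ids": ["prompt"],
            "global_cond_ids": ["seconds_total"],
            "config": {},
        },
    },
}

print(json.dumps(example_config, indent=2))
print(check_diffusion_config(example_config))
```

Printing the dictionary with `json.dumps` gives the shape the config file would take on disk. The checker encodes only the rules stated in this document, so it is a sanity check rather than a full validator.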

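The `input_concat_ids` behaviour described above (resample each conditioning tensor to the input length, then concatenate along the channel dimension) can be illustrated in plain Python. This sketch makes several assumptions: tensors are modelled as nested lists of shape channels × length with no batch dimension, `resample_nearest` and `concat_to_input` are hypothetical names, and the index rule shown is one common nearest-neighbor convention, which may differ from what a given model actually uses.

```python
def resample_nearest(channels, target_len):
    """Nearest-neighbor resample each channel to target_len samples."""
    resampled = []
    for row in channels:
        src_len = len(row)
        # Map each output index back to a source index (floor of the scaled position).
        resampled.append([row[(i * src_len) // target_len] for i in range(target_len)])
    return resampled


def concat_to_input(model_input, cond_tensors):
    """Stack conditioning channels under the input channels after matching lengths."""
    target_len = len(model_input[0])
    combined = [list(row) for row in model_input]
    for cond in cond_tensors:
        combined.extend(resample_nearest(cond, target_len))
    return combined


model_input = [[0.0] * 8, [1.0] * 8]  # 2 channels, length 8
cond = [[10.0, 20.0, 30.0, 40.0]]     # 1 channel, length 4

combined = concat_to_input(model_input, [cond])
print(len(combined), len(combined[0]))  # 3 8
print(combined[2])  # [10.0, 10.0, 20.0, 20.0, 30.0, 30.0, 40.0, 40.0]
```

The channel count grows from 2 to 3 while the sequence length stays at 8, which is the channel-dimension concatenation the text describes. Concatenation along the sequence dimension, as used for `cross_attention_cond_ids` and `prepend_cond_ids`, would instead lengthen each row and leave the channel count unchanged.
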