
# Stable Audio 3 Model

All configurations (`small-music`, `small-sfx`, and `medium`) share the same interface — see the [model table](../../README.md#models) for hardware requirements and generation speed.


> For a more in-depth breakdown of Stable Audio 3, please see our [tech report](https://arxiv.org/abs/2605.17991).

| Input | Description |
|---|---|
| `prompt` | Text description of the audio to generate |
| `duration` | Length of audio to generate, in seconds |

| Output | Value |
|---|---|
| Format | 44.1 kHz stereo audio |
| Bit depth | 32-bit float |

**Limitations**

- Not designed for speech or voice generation
- Trained on English descriptions; other languages will underperform
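The input and output tables above fix enough detail to estimate how large a generated clip will be before running the model. The sketch below is illustrative only and does not call the Stable Audio 3 API: `GenerationRequest`, `expected_output_shape`, and `expected_output_bytes` are hypothetical names, and the `(channels, frames)` layout is an assumption. Only the two inputs and the 44.1 kHz, stereo, 32-bit float figures come from the tables.

```python
from __future__ import annotations

from dataclasses import dataclass

# Values taken from the output table.
SAMPLE_RATE_HZ = 44_100   # 44.1 kHz
NUM_CHANNELS = 2          # stereo
BYTES_PER_SAMPLE = 4      # 32-bit float


@dataclass(frozen=True)
class GenerationRequest:
    """Hypothetical container for the two documented inputs."""

    prompt: str       # text description of the audio to generate
    duration: float   # length of audio to generate, in seconds

    def __post_init__(self) -> None:
        if not self.prompt.strip():
            raise ValueError("prompt must be a non-empty text description")
        if self.duration <= 0:
            raise ValueError("duration must be a positive number of seconds")


def expected_output_shape(request: GenerationRequest) -> tuple[int, int]:
    """Return (channels, frames); this layout is an assumption."""
    frames = round(request.duration * SAMPLE_RATE_HZ)
    return NUM_CHANNELS, frames


def expected_output_bytes(request: GenerationRequest) -> int:
    """Size of the raw float32 buffer, without any container overhead."""
    channels, frames = expected_output_shape(request)
    return channels * frames * BYTES_PER_SAMPLE


if __name__ == "__main__":
    request = GenerationRequest(prompt="warm analog synth arpeggio", duration=10.0)
    print(expected_output_shape(request))  # (2, 441000)
    print(expected_output_bytes(request))  # 3528000
```

At these settings, raw output costs 352,800 bytes per second of audio, so a 10-second clip is about 3.5 MB before any file-format overhead.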
