# Stable Audio 3 Model

> For a more in-depth breakdown of Stable Audio 3, please see our [tech report](https://arxiv.org/abs/2605.17991).

All configurations (`small-music`, `small-sfx`, and `medium`) share the same interface; see the [model table](../../README.md#models) for hardware requirements and generation speed.

| Input | Description |
|---|---|
| `prompt` | Text description of the audio to generate |
| `duration` | Length of audio to generate, in seconds |

| Output | Value |
|---|---|
| Format | 44.1 kHz stereo audio |
| Bit depth | 32-bit float |

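
The input and output tables imply a simple size calculation for the generated audio. The sketch below is illustrative only and does not call the model: `GenerationRequest`, `expected_output_shape`, and `expected_output_bytes` are hypothetical names, and the channel-first `(channels, frames)` layout is an assumption rather than something this README specifies. Only the sample rate, channel count, and bit depth come from the table above.

```python
from dataclasses import dataclass

# Values taken from the output table above.
SAMPLE_RATE_HZ = 44_100   # 44.1 kHz
NUM_CHANNELS = 2          # stereo
BYTES_PER_SAMPLE = 4      # 32-bit float


@dataclass(frozen=True)
class GenerationRequest:
    """Hypothetical container for the two documented inputs."""
    prompt: str       # text description of the audio to generate
    duration: float   # length of audio to generate, in seconds

    def __post_init__(self):
        if not self.prompt.strip():
            raise ValueError("prompt must be a non-empty text description")
        if self.duration <= 0:
            raise ValueError("duration must be a positive number of seconds")


def expected_output_shape(request):
    """Return (channels, frames); the channel-first order is an assumption."""
    frames = round(request.duration * SAMPLE_RATE_HZ)
    return (NUM_CHANNELS, frames)


def expected_output_bytes(request):
    """Return the size of the raw, uncompressed float32 sample data in bytes."""
    channels, frames = expected_output_shape(request)
    return channels * frames * BYTES_PER_SAMPLE


request = GenerationRequest(prompt="gentle rain on a tin roof", duration=10.0)
print(expected_output_shape(request))  # (2, 441000)
print(expected_output_bytes(request))  # 3528000
```

At these settings, ten seconds of audio is 441,000 frames per channel, or roughly 3.5 MB of raw sample data before any container or compression is applied.
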
**Limitations**

- Not designed for speech or voice generation
- Trained on English descriptions; other languages will underperform