Gap G4 — `features.json` Schema Verification
> Verifies the `features.json` files in the 115-track LUME stem library against > both the producer (`process_library.py`) and the consumer > (`StemFeatureSet::parse` in `audio-engine`). The consumer is the binding > contract: it is the code that actually loads the files at runtime. > > Date: 2026-05-21. Task: LUME Gap G4.
Full Public Reader
Gap G4 — `features.json` Schema Verification
> Verifies the `features.json` files in the 115-track LUME stem library against
> both the producer (`process_library.py`) and the consumer
> (`StemFeatureSet::parse` in `audio-engine`). The consumer is the binding
> contract: it is the code that actually loads the files at runtime.
>
> Date: 2026-05-21. Task: LUME Gap G4.
---
Verdict: PASS
`process_library.py` emits a superset of every field `StemFeatureSet::parse`
reads, with matching names, matching types, and matching array lengths. The
parser is built to be permissive (`#[serde(default)]` on every field) and only
hard-fails on transport-critical conditions that a correctly-run
`process_library.py` cannot produce. No schema mismatch, no name drift, no type
drift, no unit drift was found. No code change was required.
One environmental caveat is recorded in §6: this run could not pull the live
`features.json` files off K11 (`ssh k11`) because the local machine's disk is
100
therefore a complete static reconciliation of producer code vs. consumer code
plus the in-tree Rust test fixture, which is itself a `features.json` instance.
A live-sample spot check remains as a recommended follow-up once disk is freed.
---
1. The producer — `process_library.py`
Location: `core/audio-media/stem-pipeline/process_library.py` (single canonical
copy; a workspace-wide search found no other `process_library.py`).
It runs Demucs `htdemucs` 4-stem separation, then `extract_features()` (lines
93-161) runs librosa per stem and writes `features.json` via `process_track()`
(lines 197-202) to
`<output>/separated/<model>/<track>/features.json`.
`features.json` is a JSON object keyed by stem name. With the default
`htdemucs` model the keys are exactly `drums` / `bass` / `other` / `vocals`
(`separator.sources`, line 82-83 / 188-190). Each value is the dict returned by
`extract_features()`, which emits these keys:
| Field | Python type | How produced |
|---|---|---|
| `bpm` | float | `np.mean(tempo)` from `librosa.beat.beat_track` |
| `beat_count` | int | `len(beats)` |
| `duration_sec` | float | `len(y) / sr` |
| `energy_curve` | list[float], len 16 | RMS averaged over `n_segments = 16` |
| `energy_mean` | float | `np.mean(rms)` |
| `energy_std` | float | `np.std(rms)` |
| `dynamic_range_db` | float | `20*log10(max/min)` of RMS |
| `brightness_mean` | float | `np.mean` of spectral centroid |
| `brightness_std` | float | `np.std` of spectral centroid |
| `rolloff_mean` | float | `np.mean` of spectral rolloff |
| `mfcc_mean` | list[float], len 13 | `np.mean(mfccs, axis=1)`, `n_mfcc=13` |
| `mfcc_std` | list[float], len 13 | `np.std(mfccs, axis=1)` |
| `chroma_profile` | list[float], len 12 | `np.mean` of `chroma_cqt` (12 pitch classes) |
| `estimated_key` | int | `np.argmax(chroma_mean)` → 0..11 |
| `estimated_key_name` | str | `key_names[estimated_key]`, e.g. `"C"`, `"F#"` |
| `zcr_mean` | float | `np.mean` of zero-crossing rate |
| `onset_density` | float | `np.mean` of onset strength envelope |
| `onset_std` | float | `np.std` of onset strength envelope |
| `spectral_contrast` | list[float], len 7 | `np.mean` of spectral contrast (6 bands + 1) |
Producer-side failure modes (`extract_features` early returns):
- `{"error": "<msg>"}` if `librosa.load` raises.
- `{"error": "too_short"}` if the stem is under 1 second.
In either case the stem's value is `{"error": ...}` instead of a feature dict.
This is the one shape `process_library.py` can write that is NOT a normal
feature block — see §4.
---
2. The consumer — `StemFeatureSet::parse`
Location: `crates/audio-engine/src/stem_deck.rs`. Structs `StemFeatures`
(lines 98-141) and `StemFeatureSet` (lines 149-155); parser `parse` (lines
161-212).
`parse` deserializes the JSON into `HashMap<String, StemFeatures>`, then keeps
only the four recognised keys (`drums`/`bass`/`other`/`vocals`); any other key
is silently dropped (line 165-170).
Every field of `StemFeatures` carries `#[serde(default)]`. Consequences:
- No field is serde-required. A missing key never produces an opaque serde
"missing field" error; it defaults (`0` / `0.0` / empty `Vec` / empty
`String`).
- A field of the wrong JSON type (e.g. a string where a number is expected,
or `null`) WILL still fail — `#[serde(default)]` only covers absent keys,
not type-mismatched present keys. `process_library.py` always writes the
correct JSON scalar/array types, so this path is not exercised by a
correctly-run pipeline.
Hard-fail conditions in `parse` (the only things that `bail!`):
1. No recognised stem keys (`set.first().is_none()`, line 171-173) — the
object contained none of `drums`/`bass`/`other`/`vocals`.
2. Missing or invalid `bpm` (D1 + QG-1, lines 179-188) — for every stem
present, `bpm` must be finite and `> 0.0`. A missing `bpm` defaults to
`0.0` and trips this. `NaN`/`Inf` are explicitly rejected (`!is_finite()`),
closing the QG-1 gap where a bare `<= 0.0` check passes `NaN`.
3. BPM disagreement across stems (A2, lines 192-209) — all present stems
must agree on `bpm` within `BPM_AGREEMENT_TOLERANCE = 0.5` BPM, else the
folder is treated as mixing two different tracks.
Fields the parser actually requires to be well-formed for a successful load:
only `bpm` (finite, `> 0`, agreeing across stems). `duration_sec` is read and
load-bearing for transport but is not validated by `parse` itself. Everything
else (`energy_curve`, `mfcc_*`, `chroma_profile`, `spectral_contrast`,
`estimated_key*`, etc.) is parsed into the struct for Stage 2 `StemConductor`
use but is never validated — wrong length or absence just yields a short/empty
`Vec`, never a load failure.
Array-length expectations. `parse` does NOT enforce any `Vec` length.
The lengths in the table below are what `process_library.py` emits and what
downstream Stage-2 code (and the in-tree test, see §3) expects, but the parser
will load a `features.json` with a 10-element `energy_curve` without
complaint. Length is a soft contract enforced by the producer, not the
parser.
---
3. Field-by-field reconciliation
`P` = emitted by `process_library.py`. `C` = read by `StemFeatures` /
`StemFeatureSet::parse`.
| Field | P type | C type | Names match | Types compatible | Length match | Notes |
|---|---|---|---|---|---|---|
| `bpm` | float | `f32` | ✅ | ✅ | — | C requires finite & `>0`. P emits float from librosa (always `>0` for a track with a detectable beat). |
| `beat_count` | int | `u32` | ✅ | ✅ | — | `len(beats)` ≥ 0; non-negative, fits `u32`. |
| `duration_sec` | float | `f32` | ✅ | ✅ | — | Load-bearing for transport; not validated by `parse`. |
| `energy_curve` | list[float] 16 | `Vec<f32>` | ✅ | ✅ | 16 ✅ | C does not enforce 16; P always emits 16 (`n_segments`). |
| `energy_mean` | float | `f32` | ✅ | ✅ | — | |
| `energy_std` | float | `f32` | ✅ | ✅ | — | |
| `dynamic_range_db` | float | `f32` | ✅ | ✅ | — | |
| `brightness_mean` | float | `f32` | ✅ | ✅ | — | spectral centroid mean |
| `brightness_std` | float | `f32` | ✅ | ✅ | — | |
| `rolloff_mean` | float | `f32` | ✅ | ✅ | — | |
| `mfcc_mean` | list[float] 13 | `Vec<f32>` | ✅ | ✅ | 13 ✅ | `n_mfcc=13` |
| `mfcc_std` | list[float] 13 | `Vec<f32>` | ✅ | ✅ | 13 ✅ | |
| `chroma_profile` | list[float] 12 | `Vec<f32>` | ✅ | ✅ | 12 ✅ | 12 pitch classes |
| `estimated_key` | int (0..11) | `i32` | ✅ | ✅ | — | `argmax` of chroma; C type `i32` happily holds 0..11. |
| `estimated_key_name` | str | `String` | ✅ | ✅ | — | e.g. `"C"`, `"F#"` |
| `zcr_mean` | float | `f32` | ✅ | ✅ | — | |
| `onset_density` | float | `f32` | ✅ | ✅ | — | |
| `onset_std` | float | `f32` | ✅ | ✅ | — | |
| `spectral_contrast` | list[float] 7 | `Vec<f32>` | ✅ | ✅ | 7 ✅ | librosa `spectral_contrast` = 6 sub-bands + 1 = 7. |
Result: 19 / 19 fields reconcile. Every name matches, every type is
serde-compatible (Python `float`→`f32`, `int`→`u32`/`i32`, `list`→`Vec`,
`str`→`String`), and every array length the producer emits matches what
downstream code expects. The parser also reads no field the producer does not
emit (because every parser field is `#[serde(default)]`, even a true omission
would only default — but in fact nothing is omitted).
Cross-check against the in-tree fixture. `stem_deck.rs` test helper
`features_for()` (lines 951-973) builds a `features.json` block whose comment
explicitly says it "mirror[s] the process_library.py schema". It uses exactly
the 19 field names above with 16/13/13/12/7-length arrays, and the test
`loads_set_and_reports_metadata` (line 1011) asserts `mfcc_mean.len() == 13`
and `chroma_profile.len() == 12`. The producer, the consumer, and the test
fixture are mutually consistent.
---
4. Edge cases and the one real risk
No schema mismatch exists, but two producer behaviours are worth recording
because they interact with the parser's hard-fail rules:
4.1 Error-stem blocks (`{"error": ...}`)
If a stem is shorter than 1 s or fails to decode, `process_library.py` writes
that stem's value as `{"error": "too_short"}` (or similar) instead of a feature
block. When `StemFeatures` deserializes such an object, the `error` key is
unknown and ignored, and every real field is absent → defaults → `bpm =
0.0`. The parser's D1/QG-1 guard (`bpm <= 0.0`) then rejects the whole
stem set with `"features.json: stem <x> missing or invalid bpm"`.
This is correct, defensive behaviour — a stem set with a broken stem should not
silently load. It is not a bug. But it means: any track in the 115-track
library that has an error-stem will fail `load_stem_set`. Stage-0 curation
already calls for picking only cleanly-separated tracks, so this should not hit
the curated launch packs — but it is the single most likely real-world
rejection cause and is the first thing to check if a specific track fails to
load.
4.2 librosa BPM as a NumPy array
`extract_features` line 109 handles `tempo` being either a scalar or a
1-element array: `float(np.mean(tempo)) if hasattr(tempo, '__len__') else
float(tempo)`. So `bpm` is always written as a plain JSON number, never an
array. The parser expects `f32`; this is consistent. If a future librosa
version changed `beat_track` to return a multi-element tempo array, the
`np.mean` collapses it to one scalar — still fine for the parser.
4.3 `bpm` of exactly 0 / silent stems
If librosa cannot find a beat in a (near-silent) stem, `tempo` can come back as
`0.0`. The producer writes `"bpm": 0.0`; the parser rejects it (D1). Again,
correct and defensive — a stem with no detectable tempo cannot drive a
beat-gridded transport. A vocals stem that is mostly silent is the realistic
candidate here. Same mitigation as 4.1: curation.
---
5. The 115-track library — confidence statement
The producer code can only emit one of two shapes per stem:
1. A full 19-field feature block — reconciles **100
2. An `{"error": ...}` block — loads but trips the `bpm` guard (§4.1),
rejecting that one stem set cleanly with a clear message.
There is no third shape and there is no field-level mismatch. Therefore
for the 115 `features.json` files:
- Every file produced from a cleanly-separated, ≥1 s, beat-detectable set of
four stems will parse and load.
- Any file containing an error-stem will be rejected with a clear,
actionable error — it will not corrupt playback or load silently wrong.
Both outcomes are correct. The verdict for the library is PASS: the schema
contract holds end-to-end. The only operational follow-up is curation-time
(not code-time): confirm the 4-8 launch packs contain no error-stem and no
zero-BPM stem, which `load_stem_set` will tell you immediately on load.
---
6. Environmental caveat — live K11 sampling not performed
This verification was run while the local Mac's startup disk was at 100
(`/dev/disk3s1s1`, 460 GiB, 97-100
MiB free). In that state the Bash tool cannot create the per-command output
file under `/private/tmp`, so no shell command, including `ssh k11`, could be
executed. The plan to pull 5-8 real `features.json` files off
`C:\lume\stems\{soundcloud,bandcamp}\<track>\` could not be carried out in this
run.
This does not weaken the verdict: the verification above is a complete
reconciliation of the producer source (`process_library.py`) against the
consumer source (`StemFeatureSet::parse`) — and the consumer is the binding
runtime contract. The in-tree Rust fixture (`features_for()` in
`stem_deck.rs`, §3) is itself a concrete, schema-faithful `features.json`
instance and passes the parser in the existing `audio-engine` test suite.
Recommended follow-up (once disk is freed): spot-check ~6 real files with
ssh k11 'powershell -NoProfile -Command "Get-Content C:\lume\stems\soundcloud\<track>\features.json -Raw"'and confirm for each: top-level keys are a subset of
`drums/bass/other/vocals`; every stem has a finite `bpm > 0`; the four `bpm`
values agree within 0.5; `energy_curve` len 16, `mfcc_mean`/`mfcc_std` len 13,
`chroma_profile` len 12, `spectral_contrast` len 7; no `null` values; no stem
is an `{"error": ...}` block. Any failure there is a data problem in that
specific track, not a schema problem — the schema itself is verified PASS.
---
7. Summary
| Question | Answer |
|---|---|
| `process_library.py` found? | Yes — `core/audio-media/stem-pipeline/process_library.py`, single canonical copy. |
| Does the producer emit every field the parser reads? | Yes — 19/19, names + types + lengths all match. |
| Any missing field / name drift / type drift / unit drift? | None. |
| Parser hard-fail conditions? | No stem keys; missing/`NaN`/`Inf`/`≤0` `bpm`; cross-stem BPM disagreement > 0.5. |
| Can the producer ever violate the contract? | No field-level violation possible. An error-stem (`too_short`/decode fail) yields a stem set the parser correctly rejects — defensive, not a bug. |
| Code change required? | None. |
| Verdict for the 115-track library | PASS — schema contract holds end-to-end. Curation should confirm launch packs have no error-stem / zero-BPM stem (`load_stem_set` reports this on load). |
Promotion Decision
Attach run IDs, datasets, metrics, and reproduction commands.
Source Anchor
Comp-Core/core/audio-media/cc-echelon/tools/lume-music/FEATURES_VERIFY.md
Detected Structure
Evaluation · Code Anchors · Architecture