# Motion Training Data Pipeline Protocol
> Version: 1.0
> Status: DRAFT
> Scope: End-to-end data architecture for motion capture, processing, and training
---
## I. Overview
This document defines the complete data pipeline from sensor capture to ML training corpus. The system is designed as an "infinite training grind": a continuously running motion data collection environment that:
1. Captures motion from multiple sensor sources
2. Synchronizes with audio/phrase playback
3. Processes and normalizes data in real-time
4. Stores for immediate and future training
5. Supports upstream (generation) and downstream (analysis) tasks
---
## II. Sensor Sources & Data Formats
### II.1 Supported Sensor Types
| Source | Data Type | Sample Rate | Tracked Data | Priority |
|---|---|---|---|---|
| MediaPipe (Webcam) | Holistic pose | 30 fps | 33 pose + 21×2 hands + 468 face | Primary |
| Mocopi | Full body IMU | 50 fps | 27 bones (Sony BVH) | Primary |
| Dual Phones | Accelerometer + Gyro | 100 Hz | 2 devices (hands) | Secondary |
| Apple Watch | Motion + Heart Rate | 50 Hz | Wrist orientation + HR | Secondary |
| Headphone Sensors | Head orientation | 100 Hz | 3-axis rotation | Tertiary |
| LIDAR/Depth | Point cloud | 30 fps | Sparse body points | Experimental |
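
The sources in the table run at different rates, which is what later makes time alignment necessary. As a minimal sketch of that mismatch, the snippet below computes the nominal sample period of each source and how many samples of a faster source arrive during one sample of a slower one. The constant and function names are illustrative assumptions; only the rates come from the table, with fps treated as Hz.

```typescript
// Nominal rates copied from the table above (fps treated as Hz).
// NOMINAL_RATE_HZ and the helpers below are illustrative names, not part of the protocol.
const NOMINAL_RATE_HZ = {
  mediapipe_holistic: 30,
  mocopi_bvh: 50,
  phone_imu: 100,
  watch_motion: 50,
  headphone_imu: 100,
  lidar_depth: 30,
} as const

type RateKey = keyof typeof NOMINAL_RATE_HZ

// Time between two consecutive samples of one source, in milliseconds.
function samplePeriodMs(source: RateKey): number {
  return 1000 / NOMINAL_RATE_HZ[source]
}

// Number of `fast` samples that arrive during one sample period of `slow`.
function samplesPerFrame(fast: RateKey, slow: RateKey): number {
  return NOMINAL_RATE_HZ[fast] / NOMINAL_RATE_HZ[slow]
}

const mocopiPeriodMs = samplePeriodMs("mocopi_bvh")                // 20
const phonePerMocopi = samplesPerFrame("phone_imu", "mocopi_bvh")  // 2
```

Ratios that are not whole numbers (for example phone IMU against the webcam) are the cases where resampling or interpolation is needed before fusion.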
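### II.2 Canonical Data Schema
All sensor data is normalized to this schema. The helper types `Vec3`, `Quaternion`, `FingerState`, `FaceLandmark`, and `ExpressionWeights` are referenced but not defined in this document: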
interface MotionFrame {
// Identity
frame_id: string // UUID
session_id: string // Parent session
phrase_id?: string // Linked audio phrase
// Timing
timestamp_ms: number // Relative to session start
audio_position_ms?: number // Position in phrase audio
beat_position?: number // Beat number (fractional)
// Source metadata
source_type: SensorSource
source_device_id: string
source_confidence: number // 0-1 overall confidence
// Normalized body state (canonical representation)
body: {
// Root position (world space)
root_position: Vec3 // x, y, z in meters
root_rotation: Quaternion // World-space orientation
// Joint rotations (local space, relative to parent)
joints: Record<JointName, {
rotation: Quaternion
confidence: number
}>
// Derived features (computed)
center_of_mass: Vec3
facing_direction: Vec3
velocity: Vec3
angular_velocity: Vec3
}
// Hand state (if available)
hands?: {
left?: HandState
right?: HandState
}
// Face state (if available)
face?: {
landmarks?: FaceLandmark[]
expression?: ExpressionWeights
gaze_direction?: Vec3
}
// Raw sensor data (preserved for reprocessing)
raw_data?: {
format: string
data: Uint8Array | object
}
}
type SensorSource =
| 'mediapipe_holistic'
| 'mocopi_bvh'
| 'phone_imu'
| 'watch_motion'
| 'headphone_imu'
| 'lidar_depth'
type JointName =
| 'hips' | 'spine' | 'chest' | 'neck' | 'head'
| 'shoulder_l' | 'upper_arm_l' | 'lower_arm_l' | 'hand_l'
| 'shoulder_r' | 'upper_arm_r' | 'lower_arm_r' | 'hand_r'
| 'hip_l' | 'upper_leg_l' | 'lower_leg_l' | 'foot_l' | 'toes_l'
| 'hip_r' | 'upper_leg_r' | 'lower_leg_r' | 'foot_r' | 'toes_r'
interface HandState {
fingers: {
thumb: FingerState
index: FingerState
middle: FingerState
ring: FingerState
pinky: FingerState
}
open_amount: number // 0 = fist, 1 = fully open
gesture?: string // Detected gesture name
}
---
## III. Session Architecture
### III.1 Session Hierarchy
TrainingCorpus
└── Session (one continuous recording)
    ├── Metadata (performer, date, sensors, quality)
    ├── PhraseSegments[] (aligned to audio phrases)
    │   ├── phrase_id → motion_phrases table
    │   ├── frames[] → motion data
    │   └── features → computed motion features
    └── FreeformSegments[] (unaligned exploration)
        ├── frames[]
        └── detected_patterns[]
### III.2 Session States
┌─────────┐    start()    ┌─────────┐   phrase_start   ┌──────────────┐
│  IDLE   │ ────────────▶ │ ACTIVE  │ ───────────────▶ │ PHRASE_SYNC  │
└─────────┘               └─────────┘                  └──────────────┘
     ▲                         │                              │
     │         stop()          │          phrase_end          │
     └─────────────────────────┴──────────────────────────────┘
### III.3 Session Table Schema
-- The enum must be created before the table that uses it
CREATE TYPE session_status AS ENUM (
  'active',
  'completed',
  'error',
  'processing',
  'validated',
  'archived'
);
-- The referenced tables (training_corpora, performers) must already exist
CREATE TABLE motion_sessions (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  -- Identity
  corpus_id UUID REFERENCES training_corpora(id),
  performer_id UUID REFERENCES performers(id),
  session_name TEXT,
  -- Timing
  started_at TIMESTAMPTZ NOT NULL,
  ended_at TIMESTAMPTZ,
  duration_seconds FLOAT,
  -- Sensor config
  sensor_sources JSONB NOT NULL, -- Array of active sensors
  primary_source TEXT NOT NULL, -- Which sensor is authoritative
  -- Quality metrics
  total_frames INTEGER DEFAULT 0,
  avg_fps FLOAT,
  avg_confidence FLOAT,
  tracking_gaps_count INTEGER DEFAULT 0,
  tracking_gaps_total_ms INTEGER DEFAULT 0,
  -- Training metadata
  is_validated BOOLEAN DEFAULT FALSE,
  validation_score FLOAT,
  notes TEXT,
  tags TEXT[],
  -- Status
  status session_status NOT NULL DEFAULT 'active',
  error_message TEXT,
  created_at TIMESTAMPTZ DEFAULT NOW()
);
---
## IV. Phrase Synchronization Protocol
### IV.1 The Phrase-Motion Binding
When a phrase plays, all captured motion becomes bound to that phrase:
PHRASE TIMELINE
│ phrase.t_start                                    phrase.t_end │
▼                                                                ▼
┌────────────────────────────────────────────────────────────────┐
│                         Audio Waveform                         │
└────────────────────────────────────────────────────────────────┘
                         ↕ synchronized ↕
┌────────────────────────────────────────────────────────────────┐
│ Motion Frames (body_energy, arm_spread, etc.)                  │
└────────────────────────────────────────────────────────────────┘
                         ↕ beat-aligned ↕
┌────────────────────────────────────────────────────────────────┐
│ Beat Grid (from phrase.tempo_bpm)                              │
│    │    │    │    │    │    │    │    │    │    │    │    │    │
│    1    2    3    4    1    2    3    4    1    2    3    4    │
└────────────────────────────────────────────────────────────────┘
### IV.2 Phrase Segment Table
CREATE TABLE phrase_motion_segments (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
-- Links
session_id UUID REFERENCES motion_sessions(id) ON DELETE CASCADE,
phrase_id UUID REFERENCES motion_phrases(id),
-- Timing (relative to phrase)
phrase_start_beat FLOAT, -- When recording started (beat number)
phrase_end_beat FLOAT, -- When recording ended
motion_start_ms INTEGER, -- Session-relative start
motion_end_ms INTEGER, -- Session-relative end
-- Alignment quality
audio_motion_offset_ms INTEGER, -- Latency correction applied
beat_alignment_score FLOAT, -- How well motion aligns to beats
-- Frame references
start_frame_id UUID,
end_frame_id UUID,
frame_count INTEGER,
-- Computed features (denormalized for fast access)
features JSONB, -- Aggregated motion features
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Index for fast phrase lookups
CREATE INDEX idx_phrase_motion_segments_phrase
ON phrase_motion_segments(phrase_id);
### IV.3 Real-Time Sync Protocol
interface PhraseSyncState {
phrase_id: string
phrase: MotionPhrase
// Audio state
audio_started_at: number // System timestamp
audio_position_ms: number // Current playback position
// Beat tracking
tempo_bpm: number
beat_offset: number // Phase offset
current_beat: number // Fractional beat number
// Motion binding
motion_start_frame_id: string
frame_count: number
}
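
The relationship between `audio_position_ms`, `tempo_bpm`, `beat_offset`, and `current_beat` is not spelled out in the interface above. A minimal sketch of one plausible reading follows; it assumes `beat_offset` is expressed in beats and is simply added after converting the playback position, which this document does not confirm. The function names are illustrative.

```typescript
// ASSUMPTION: beat_offset is measured in beats and added after the conversion.
function currentBeat(audioPositionMs: number, tempoBpm: number, beatOffset = 0): number {
  const msPerBeat = 60000 / tempoBpm // e.g. 120 BPM -> 500 ms per beat
  return audioPositionMs / msPerBeat + beatOffset
}

// Fractional position inside the current beat, in [0, 1).
function beatPhase(beat: number): number {
  return beat - Math.floor(beat)
}

const beatAt1500ms = currentBeat(1500, 120)              // 3
const phaseAt1750ms = beatPhase(currentBeat(1750, 120))  // 0.5
```

Under this reading, `beat_phase` in the real-time features is the fractional part of `current_beat`.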
// Protocol messages
type SyncMessage =
| { type: 'PHRASE_START', phrase: MotionPhrase, timestamp: number }
| { type: 'PHRASE_BEAT', beat: number, timestamp: number }
| { type: 'PHRASE_END', phrase_id: string, timestamp: number }
| { type: 'MOTION_FRAME', frame: MotionFrame }
---
## V. Multi-Sensor Fusion
### V.1 Fusion Strategy
When multiple sensors capture simultaneously:
┌─────────────────────────────────────────────────────────────────┐
│ SENSOR FUSION PIPELINE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ MediaPipe ──┐ │
│ │ ┌──────────────┐ ┌─────────────────┐ │
│ Mocopi ─────┼───▶│ Time Align │───▶│ Confidence │ │
│ │ │ (sync clocks)│ │ Weighting │ │
│ Phones ─────┤ └──────────────┘ └────────┬────────┘ │
│ │ │ │
│ Watch ──────┘ ▼ │
│ ┌──────────────┐ │
│ │ Joint-Level │ │
│ │ Fusion │ │
│ └──────┬───────┘ │
│ │ │
│ ▼ │
│ ┌──────────────┐ │
│ │ Canonical │ │
│ │ MotionFrame │ │
│ └──────────────┘ │
└─────────────────────────────────────────────────────────────────┘
### V.2 Confidence-Weighted Fusion
interface FusionConfig {
// Per-joint source priorities
joint_sources: Record<JointName, {
primary: SensorSource
fallbacks: SensorSource[]
min_confidence: number
}>
// Fusion weights by source
source_weights: Record<SensorSource, number>
// Smoothing
temporal_smoothing: number // 0-1, higher = more smoothing
confidence_threshold: number // Below this = interpolate
}
function fuseFrame(
  frames: Map<SensorSource, MotionFrame>,
  config: FusionConfig
): MotionFrame {
  // createEmptyFrame() and ALL_JOINTS are helpers assumed to be defined elsewhere
  const fused: MotionFrame = createEmptyFrame()
  for (const joint of ALL_JOINTS) {
    const sources = config.joint_sources[joint]
    let bestRotation: Quaternion | null = null
    let bestConfidence = 0
    // Visit the primary source, then the fallbacks, and keep the candidate with
    // the highest weighted confidence that clears this joint's minimum
    for (const source of [sources.primary, ...sources.fallbacks]) {
      const frame = frames.get(source)
      if (!frame) continue
      const jointData = frame.body.joints[joint]
      if (!jointData) continue
      const weightedConf = jointData.confidence * config.source_weights[source]
      // min_confidence is configured per joint, not on the top-level config
      if (weightedConf > bestConfidence && weightedConf >= sources.min_confidence) {
        bestRotation = jointData.rotation
        bestConfidence = weightedConf
      }
    }
    // Joints with no qualifying source are left unset
    if (bestRotation) {
      fused.body.joints[joint] = {
        rotation: bestRotation,
        confidence: bestConfidence
      }
    }
  }
  return fused
}
### V.3 Default Fusion Priorities
| Body Region | Primary | Fallback 1 | Fallback 2 |
|---|---|---|---|
| Head | Headphones | MediaPipe | Mocopi |
| Torso | Mocopi | MediaPipe | - |
| Arms | Mocopi | MediaPipe | Phones |
| Hands | MediaPipe | Phones | Mocopi |
| Legs | Mocopi | MediaPipe | - |
| Feet | Mocopi | MediaPipe | - |
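
The table is organized by body region, while `FusionConfig.joint_sources` is keyed by joint. The sketch below shows one way the table could be expanded into per-joint entries. The assignment of joints to regions (`JOINT_REGION`) and the `PLACEHOLDER_MIN_CONFIDENCE` value are assumptions made for illustration; this document specifies neither.

```typescript
type SensorSource =
  | "mediapipe_holistic" | "mocopi_bvh" | "phone_imu"
  | "watch_motion" | "headphone_imu" | "lidar_depth"

type BodyRegion = "head" | "torso" | "arms" | "hands" | "legs" | "feet"

interface JointSourcePriority {
  primary: SensorSource
  fallbacks: SensorSource[]
  min_confidence: number
}

// Priority order per region, copied from the table above (first entry = primary).
const REGION_PRIORITIES: Record<BodyRegion, [SensorSource, ...SensorSource[]]> = {
  head: ["headphone_imu", "mediapipe_holistic", "mocopi_bvh"],
  torso: ["mocopi_bvh", "mediapipe_holistic"],
  arms: ["mocopi_bvh", "mediapipe_holistic", "phone_imu"],
  hands: ["mediapipe_holistic", "phone_imu", "mocopi_bvh"],
  legs: ["mocopi_bvh", "mediapipe_holistic"],
  feet: ["mocopi_bvh", "mediapipe_holistic"],
}

// ASSUMPTION: which JointName belongs to which region is not defined in the protocol.
const JOINT_REGION: Record<string, BodyRegion> = {
  neck: "head", head: "head",
  hips: "torso", spine: "torso", chest: "torso",
  shoulder_l: "arms", upper_arm_l: "arms", lower_arm_l: "arms",
  shoulder_r: "arms", upper_arm_r: "arms", lower_arm_r: "arms",
  hand_l: "hands", hand_r: "hands",
  hip_l: "legs", upper_leg_l: "legs", lower_leg_l: "legs",
  hip_r: "legs", upper_leg_r: "legs", lower_leg_r: "legs",
  foot_l: "feet", toes_l: "feet", foot_r: "feet", toes_r: "feet",
}

// ASSUMPTION: placeholder threshold, not a value taken from the protocol.
const PLACEHOLDER_MIN_CONFIDENCE = 0.5

// Expand the region table into one priority entry per joint.
function buildJointSources(): Record<string, JointSourcePriority> {
  const result: Record<string, JointSourcePriority> = {}
  for (const [joint, region] of Object.entries(JOINT_REGION)) {
    const [primary, ...fallbacks] = REGION_PRIORITIES[region]
    result[joint] = { primary, fallbacks, min_confidence: PLACEHOLDER_MIN_CONFIDENCE }
  }
  return result
}

const defaultJointSources = buildJointSources()
```

Keeping the region table as the single source of truth means a change of priority for, say, the arms only has to be made in one place.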
---
## VI. Feature Extraction Pipeline
### VI.1 Real-Time Features (Computed Every Frame)
interface RealtimeFeatures {
// Energy metrics
body_energy: number // Overall movement intensity 0-1
upper_body_energy: number // Arms + torso
lower_body_energy: number // Legs + hips
// Pose metrics
arm_spread: number // 0 = arms down, 1 = T-pose
arm_raise: number // 0 = down, 1 = overhead
crouch_level: number // 0 = standing, 1 = crouching
lean_angle: number // Forward/back lean in degrees
// Dynamics
velocity_magnitude: number // Overall movement speed
acceleration_magnitude: number // Rate of speed change
angular_velocity: number // Rotation speed
// Hand features
left_hand_open: number // 0 = fist, 1 = open
right_hand_open: number
hands_together: boolean // Are hands near each other?
// Face features (if available)
head_yaw: number // Left/right turn
head_pitch: number // Up/down tilt
head_roll: number // Side tilt
mouth_open: number
smile_intensity: number
// Beat alignment (if phrase active)
on_beat: boolean // Peak motion near beat?
beat_phase: number // 0-1 position in beat cycle
}
### VI.2 Segment-Level Features (Computed Per Phrase)
interface SegmentFeatures {
// Statistical aggregates
energy_mean: number
energy_std: number
energy_min: number
energy_max: number
// Pattern detection
repetition_score: number // How repetitive is the motion?
complexity_score: number // Movement variety
symmetry_score: number // Left/right balance
// Rhythm analysis
beat_hit_rate: number // % of beats with motion peak
off_beat_rate: number // % motion on off-beats
rhythmic_consistency: number // How steady is the rhythm?
// Pose vocabulary
dominant_poses: string[] // Most common pose clusters
pose_transition_matrix: number[][] // Markov transitions
// Quality metrics
tracking_coverage: number // % frames with good tracking
jitter_score: number // Unwanted noise level
}
### VI.3 Feature Storage
-- Real-time features (one per frame)
CREATE TABLE motion_frame_features (
frame_id UUID PRIMARY KEY REFERENCES motion_frames(id),
-- Core features (fast access)
body_energy FLOAT,
arm_spread FLOAT,
arm_raise FLOAT,
crouch_level FLOAT,
velocity_magnitude FLOAT,
-- All features (JSONB for flexibility)
features JSONB NOT NULL,
-- Beat alignment
beat_phase FLOAT,
on_beat BOOLEAN
);
-- Segment features (one per phrase recording)
CREATE TABLE segment_features (
segment_id UUID PRIMARY KEY REFERENCES phrase_motion_segments(id),
-- Aggregates
energy_mean FLOAT,
energy_std FLOAT,
beat_hit_rate FLOAT,
complexity_score FLOAT,
-- Full feature set
features JSONB NOT NULL,
-- Embedding for similarity search (the VECTOR type requires the pgvector extension)
motion_embedding VECTOR(256)
);
---
## VII. Training Corpus Organization
### VII.1 Corpus Structure
TrainingCorpus
├── metadata.json                  # Corpus config and stats
├── performers/
│   ├── performer_001/
│   │   ├── profile.json
│   │   └── sessions/
│   │       ├── session_001/
│   │       │   ├── metadata.json
│   │       │   ├── frames.parquet
│   │       │   └── features.parquet
│   │       └── ...
│   └── ...
├── phrases/
│   ├── phrase_001/
│   │   ├── audio_features.json
│   │   ├── motion_samples/        # All recordings for this phrase
│   │   │   ├── sample_001.parquet
│   │   │   ├── sample_002.parquet
│   │   │   └── ...
│   │   └── aggregated_features.json
│   └── ...
└── exports/
    ├── training_v1/               # Versioned training sets
    │   ├── train.parquet
    │   ├── val.parquet
    │   └── test.parquet
    └── ...
### VII.2 Corpus Table
CREATE TABLE training_corpora (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL,
description TEXT,
version TEXT NOT NULL,
-- Stats
total_sessions INTEGER DEFAULT 0,
total_frames BIGINT DEFAULT 0,
total_duration_hours FLOAT DEFAULT 0,
unique_performers INTEGER DEFAULT 0,
unique_phrases INTEGER DEFAULT 0,
-- Quality thresholds
min_confidence_threshold FLOAT DEFAULT 0.7,
min_segment_duration_sec FLOAT DEFAULT 2.0,
-- Export config
export_format TEXT DEFAULT 'parquet',
feature_version TEXT,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
---
## VIII. Real-Time Parameter Signaling
### VIII.1 The Dance-to-Parameter Protocol
While dancing, motion features map to controllable parameters:
interface ParameterMapping {
// Source feature
feature: keyof RealtimeFeatures
// Target parameter
parameter: string // e.g., "synth.filter_cutoff"
// Mapping curve
curve: 'linear' | 'exponential' | 'sigmoid' | 'stepped'
// Range mapping
input_range: [number, number] // Feature value range
output_range: [number, number] // Parameter value range
// Smoothing
smoothing: number // 0-1
// Activation
active_condition?: string // e.g., "body_energy > 0.3"
}
// Example mappings
const DEFAULT_MAPPINGS: ParameterMapping[] = [
{
feature: 'arm_raise',
parameter: 'synth.filter_cutoff',
curve: 'exponential',
input_range: [0, 1],
output_range: [200, 8000],
smoothing: 0.8
},
{
feature: 'body_energy',
parameter: 'effects.reverb_wet',
curve: 'linear',
input_range: [0, 1],
output_range: [0.1, 0.7],
smoothing: 0.9
},
{
feature: 'crouch_level',
parameter: 'synth.pitch_bend',
curve: 'linear',
input_range: [0, 1],
output_range: [0, -12], // Semitones
smoothing: 0.7
}
]
### VIII.2 Parameter Event Stream
interface ParameterEvent {
timestamp: number
parameter: string
value: number
source_feature: string
raw_feature_value: number
}
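
The `applyMapping` helper used below is not defined in this document. As a rough sketch of what it might do, the following code normalizes the feature value into the input range, applies a curve, and scales into the output range. The curve shapes (`t * t` for exponential, a logistic with steepness 12 for sigmoid, four steps for stepped) and the one-pole reading of `smoothing` are assumptions chosen for illustration, and the names `applyMappingSketch` and `smoothValue` are hypothetical.

```typescript
type Curve = "linear" | "exponential" | "sigmoid" | "stepped"

interface RangeMapping {
  curve: Curve
  input_range: [number, number]
  output_range: [number, number]
}

function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x))
}

// Shape functions on the normalized interval [0, 1]. ASSUMPTION: the exact
// shapes are not specified by the protocol; these are common simple choices.
function shapeCurve(t: number, curve: Curve): number {
  switch (curve) {
    case "exponential":
      return t * t
    case "sigmoid":
      // Logistic centered at 0.5; it approaches but does not exactly reach 0 and 1.
      return 1 / (1 + Math.exp(-12 * (t - 0.5)))
    case "stepped":
      return Math.round(t * 4) / 4
    default:
      return t // linear
  }
}

// Normalize into input_range, shape, then scale into output_range.
function applyMappingSketch(raw: number, m: RangeMapping): number {
  const [inLo, inHi] = m.input_range
  const [outLo, outHi] = m.output_range
  const t = clamp01((raw - inLo) / (inHi - inLo))
  return outLo + shapeCurve(t, m.curve) * (outHi - outLo)
}

// One-pole smoothing: higher `smoothing` keeps more of the previous value.
function smoothValue(previous: number, next: number, smoothing: number): number {
  return smoothing * previous + (1 - smoothing) * next
}

// crouch_level 0.5 with the default linear pitch-bend mapping -> -6 semitones
const pitchBend = applyMappingSketch(0.5, {
  curve: "linear", input_range: [0, 1], output_range: [0, -12],
})
// arm_raise 0.5 with the quadratic "exponential" reading -> 2150
const cutoff = applyMappingSketch(0.5, {
  curve: "exponential", input_range: [0, 1], output_range: [200, 8000],
})
```

Because inverted output ranges such as `[0, -12]` are handled by the same linear scaling, no special case is needed for parameters that decrease as the feature grows.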
// Real-time emission
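function emitParameterEvents(
  features: RealtimeFeatures,
  mappings: ParameterMapping[]
): ParameterEvent[] {
  return mappings
    // Keep mappings with no condition, or whose condition currently holds
    .filter(m => !m.active_condition || evaluateCondition(m.active_condition, features))
    .map(mapping => {
      // Boolean features (hands_together, on_beat) are coerced to 0/1 so that
      // raw_feature_value is always numeric
      const rawValue = Number(features[mapping.feature])
      const mappedValue = applyMapping(rawValue, mapping)
      return {
        // Wall-clock milliseconds; convert to session-relative milliseconds
        // before storing in an INTEGER timestamp_ms column
        timestamp: Date.now(),
        parameter: mapping.parameter,
        value: mappedValue,
        source_feature: mapping.feature,
        raw_feature_value: rawValue
      }
    })
}
### VIII.3 Parameter Recording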
All parameter changes during a session are recorded:
CREATE TABLE session_parameter_events (
id BIGSERIAL PRIMARY KEY,
session_id UUID REFERENCES motion_sessions(id),
timestamp_ms INTEGER NOT NULL,
parameter TEXT NOT NULL,
value FLOAT NOT NULL,
-- Source tracking
source_feature TEXT,
raw_feature_value FLOAT,
mapping_id UUID
);
-- Efficient time-series queries
CREATE INDEX idx_param_events_session_time
ON session_parameter_events(session_id, timestamp_ms);
---
## IX. LIMRPS Integration
### IX.1 LIMRPS Feature Mapping
LIMRPS (Latent Intent Motion - Reactive Parameter System, also written LIM-RPS) treats motion features as intent signals:
┌─────────────────────────────────────────────────────────────────┐
│ LIMRPS ARCHITECTURE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Motion Input Intent Layer Output Layer │
│ ──────────── ──────────── ──────────── │
│ │
│ body_energy ──────────▶ INTENSITY ──────────▶ tempo_scale │
│ ──────────▶ volume │
│ ──────────▶ density │
│ │
│ arm_spread ───────────▶ OPENNESS ───────────▶ chord_spread │
│ arm_raise ───────────▶ register │
│ ───────────▶ reverb │
│ │
│ crouch_level ─────────▶ TENSION ────────────▶ filter_cutoff │
│ velocity ────────────▶ distortion │
│ ────────────▶ attack │
│ │
│ hand_gestures ────────▶ ARTICULATION ───────▶ note_duration │
│ finger_state ─────────▶ staccato/legato│
│ ─────────▶ ornaments │
│ │
│ head_direction ───────▶ ATTENTION ──────────▶ spatial_pan │
│ gaze ──────────▶ focus_freq │
│ │
└─────────────────────────────────────────────────────────────────┘
### IX.2 Intent-to-Generation Pipeline
interface MotionIntent {
intensity: number // 0-1: How much energy/urgency
openness: number // 0-1: Expansive vs contained
tension: number // 0-1: Tight/aggressive vs relaxed
articulation: number // 0-1: Precise vs flowing
attention: Vec3 // Direction of focus
}
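
The `weightedAverage` helper used by `extractIntent` is not defined in this document. A minimal sketch of a compatible implementation is shown here, assuming each term is a `[value, weight]` pair and that the result should stay inside the 0-1 range documented for `MotionIntent`; the clamping is an assumption, not something the protocol states.

```typescript
// Each term is a [value, weight] pair, matching the call sites in extractIntent.
function weightedAverage(terms: Array<[number, number]>): number {
  let weightedSum = 0
  let totalWeight = 0
  for (const [value, weight] of terms) {
    weightedSum += value * weight
    totalWeight += weight
  }
  // No terms (or all-zero weights): return a neutral 0 instead of NaN
  if (totalWeight === 0) return 0
  // ASSUMPTION: clamp so the intent dimensions stay within 0-1
  return Math.min(1, Math.max(0, weightedSum / totalWeight))
}

// arm_spread = 1.0 and arm_raise = 0.5 with the openness weights: about 0.8
const exampleOpenness = weightedAverage([[1.0, 0.6], [0.5, 0.4]])
```

Dividing by the total weight means the weights do not have to sum to exactly 1, which keeps the helper safe if a term is later added or removed.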
// Inputs are assumed to be normalized to 0-1 before weighting; VI.1 gives no
// range for velocity_magnitude, acceleration_magnitude, or angular_velocity
function extractIntent(features: RealtimeFeatures): MotionIntent {
  return {
    intensity: weightedAverage([
      [features.body_energy, 0.5],
      [features.velocity_magnitude, 0.3],
      [features.acceleration_magnitude, 0.2]
    ]),
    openness: weightedAverage([
      [features.arm_spread, 0.6],
      [features.arm_raise, 0.4]
    ]),
    tension: weightedAverage([
      [1 - features.crouch_level, 0.4],
      [features.angular_velocity, 0.3],
      // lean_angle is in degrees and can be negative (backward lean), so use
      // its magnitude and cap it at 45 degrees
      [Math.min(Math.abs(features.lean_angle) / 45, 1), 0.3]
    ]),
    articulation: weightedAverage([
      [features.left_hand_open, 0.25],
      [features.right_hand_open, 0.25],
      [features.beat_phase < 0.1 ? 1 : 0, 0.5] // On-beat = articulated
    ]),
    attention: calculateAttentionVector(features)
  }
}
---
## X. CC-Motion-Gen Integration
### X.1 Training Data Format
For the motion generation model:
interface TrainingSample {
// Input: Audio features
audio: {
mel_spectrogram: Float32Array // [n_frames, n_mels]
tempo_bpm: number
beat_positions: number[] // Frame indices of beats
key: string
energy_curve: number[] // Audio energy over time
}
// Output: Motion sequence
motion: {
joint_rotations: Float32Array // [n_frames, n_joints, 4] quaternions
root_velocities: Float32Array // [n_frames, 3]
features: Float32Array // [n_frames, n_features]
}
// Metadata
metadata: {
phrase_id: string
performer_id: string
session_id: string
quality_score: number
}
}
### X.2 Export Pipeline
async function exportTrainingData(
corpusId: string,
outputPath: string,
config: ExportConfig
): Promise<void> {
const corpus = await loadCorpus(corpusId)
const samples: TrainingSample[] = []
for (const session of corpus.sessions) {
for (const segment of session.phrase_segments) {
// Skip low quality
if (segment.quality_score < config.min_quality) continue
// Load audio features
const audioFeatures = await loadPhraseAudioFeatures(segment.phrase_id)
// Load motion frames
const frames = await loadMotionFrames(segment)
// Resample to fixed frame rate
const resampledMotion = resampleMotion(frames, config.target_fps)
// Align to audio
const alignedMotion = alignToAudio(
resampledMotion,
audioFeatures,
segment.audio_motion_offset_ms
)
samples.push({
audio: audioFeatures,
motion: alignedMotion,
metadata: {
phrase_id: segment.phrase_id,
performer_id: session.performer_id,
session_id: session.id,
quality_score: segment.quality_score
}
})
}
}
// Split into train/val/test
const splits = splitDataset(samples, config.split_ratios)
// Write parquet files
await writeParquet(`${outputPath}/train.parquet`, splits.train)
await writeParquet(`${outputPath}/val.parquet`, splits.val)
await writeParquet(`${outputPath}/test.parquet`, splits.test)
}
---
## XI. Quality Assurance Pipeline
### XI.1 Automated Quality Checks
interface QualityReport {
session_id: string
// Tracking quality
frame_coverage: number // % frames with valid tracking
avg_confidence: number
tracking_gaps: {
count: number
total_duration_ms: number
longest_gap_ms: number
}
// Motion quality
jitter_score: number // Lower = smoother
physics_violations: number // Impossible poses
frozen_frames: number // Duplicate poses
// Sync quality
beat_alignment: number // How well motion matches beats
latency_estimate_ms: number // Audio-motion delay
// Overall
usable: boolean
usable_segments: number
quality_score: number // 0-100
flags: string[] // Issues found
}
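
The `tracking_gaps` summary in the report can be derived from per-frame confidence. As a hedged sketch, the function below treats a gap as a run of consecutive frames whose confidence is below a threshold, measured from the first low-confidence frame to the next good frame. That definition, the default threshold of 0.6, and the name `findTrackingGapsSketch` are assumptions; the protocol does not define how gaps are detected.

```typescript
interface FrameSample {
  timestamp_ms: number       // relative to session start, so never negative
  source_confidence: number  // 0-1
}

interface TrackingGaps {
  count: number
  total_duration_ms: number
  longest_gap_ms: number
}

function findTrackingGapsSketch(samples: FrameSample[], minConfidence = 0.6): TrackingGaps {
  const durations: number[] = []
  let gapStartMs = -1 // -1 means "not currently inside a gap"
  let lastMs = 0
  for (const s of samples) {
    const isLow = s.source_confidence < minConfidence
    if (isLow && gapStartMs < 0) gapStartMs = s.timestamp_ms
    if (!isLow && gapStartMs >= 0) {
      durations.push(s.timestamp_ms - gapStartMs)
      gapStartMs = -1
    }
    lastMs = s.timestamp_ms
  }
  // A gap still open at the end of the recording is closed at the last frame
  if (gapStartMs >= 0) durations.push(lastMs - gapStartMs)
  return {
    count: durations.length,
    total_duration_ms: durations.reduce((sum, d) => sum + d, 0),
    longest_gap_ms: durations.reduce((longest, d) => Math.max(longest, d), 0),
  }
}

const exampleGaps = findTrackingGapsSketch([
  { timestamp_ms: 0, source_confidence: 0.9 },
  { timestamp_ms: 33, source_confidence: 0.2 },
  { timestamp_ms: 66, source_confidence: 0.1 },
  { timestamp_ms: 100, source_confidence: 0.9 },
  { timestamp_ms: 133, source_confidence: 0.3 },
  { timestamp_ms: 166, source_confidence: 0.9 },
])
// exampleGaps -> { count: 2, total_duration_ms: 100, longest_gap_ms: 67 }
```

The same fields map directly onto `tracking_gaps_count` and `tracking_gaps_total_ms` in the session table.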
async function assessQuality(sessionId: string): Promise<QualityReport> {
  const session = await loadSession(sessionId)
  const frames = await loadFrames(sessionId)
  // Measured metrics
  const metrics = {
    session_id: sessionId,
    frame_coverage: calculateCoverage(frames),
    avg_confidence: calculateAvgConfidence(frames),
    tracking_gaps: findTrackingGaps(frames),
    jitter_score: calculateJitter(frames),
    physics_violations: countPhysicsViolations(frames),
    frozen_frames: countFrozenFrames(frames),
    beat_alignment: calculateBeatAlignment(frames, session.phrase_segments),
    latency_estimate_ms: estimateLatency(frames, session)
  }
  // How usable, usable_segments, quality_score, and flags are derived is not
  // specified here; summarizeQuality is a placeholder for that step and must
  // return those four fields
  return { ...metrics, ...summarizeQuality(metrics, session) }
}
### XI.2 Quality Thresholds
| Metric | Acceptable | Good | Excellent |
|---|---|---|---|
| Frame Coverage | > 80% | | |
| Avg Confidence | > 0.6 | > 0.75 | > 0.9 |
| Jitter Score | < 0.3 | < 0.15 | < 0.05 |
| Beat Alignment | > 0.5 | > 0.7 | > 0.85 |
| Quality Score | > 50 | > 70 | > 90 |
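
The complete rows of the table can be applied mechanically. The sketch below classifies a metric value into a tier, treating jitter as "lower is better" and the other metrics as "higher is better". Frame coverage is left out because its Good and Excellent thresholds are not given above. The type and function names are illustrative, and the comparisons are strict, following the `>` and `<` signs in the table.

```typescript
type QualityTier = "below" | "acceptable" | "good" | "excellent"
type GradedMetric = "avg_confidence" | "jitter_score" | "beat_alignment" | "quality_score"

interface ThresholdRow {
  higherIsBetter: boolean
  acceptable: number
  good: number
  excellent: number
}

// Values copied from the table above.
const QUALITY_THRESHOLDS: Record<GradedMetric, ThresholdRow> = {
  avg_confidence: { higherIsBetter: true, acceptable: 0.6, good: 0.75, excellent: 0.9 },
  jitter_score: { higherIsBetter: false, acceptable: 0.3, good: 0.15, excellent: 0.05 },
  beat_alignment: { higherIsBetter: true, acceptable: 0.5, good: 0.7, excellent: 0.85 },
  quality_score: { higherIsBetter: true, acceptable: 50, good: 70, excellent: 90 },
}

function classifyMetric(metric: GradedMetric, value: number): QualityTier {
  const row = QUALITY_THRESHOLDS[metric]
  // Strict comparison in the direction that counts as "better" for this metric
  const passes = (limit: number): boolean =>
    row.higherIsBetter ? value > limit : value < limit
  if (passes(row.excellent)) return "excellent"
  if (passes(row.good)) return "good"
  if (passes(row.acceptable)) return "acceptable"
  return "below"
}

const jitterTier = classifyMetric("jitter_score", 0.1)        // "good"
const confidenceTier = classifyMetric("avg_confidence", 0.95)  // "excellent"
const scoreTier = classifyMetric("quality_score", 50)          // "below" (not strictly above 50)
```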
---
## XII. API Endpoints
### XII.1 Session Management
POST /api/motion/sessions # Start new session
GET /api/motion/sessions/:id # Get session details
PATCH /api/motion/sessions/:id # Update session
DELETE /api/motion/sessions/:id # Delete session
POST /api/motion/sessions/:id/frames # Upload frames (batch)
GET /api/motion/sessions/:id/frames # Get frames (paginated)
POST /api/motion/sessions/:id/complete # End and process session
GET /api/motion/sessions/:id/quality # Get quality report
### XII.2 Phrase Sync
POST /api/motion/sync/phrase_start # Signal phrase started
POST /api/motion/sync/phrase_end # Signal phrase ended
GET /api/motion/sync/current # Get current sync state
WS /api/motion/sync/stream # Real-time sync events
### XII.3 Training Corpus
GET /api/motion/corpus # List corpora
POST /api/motion/corpus # Create corpus
GET /api/motion/corpus/:id/stats # Corpus statistics
POST /api/motion/corpus/:id/export # Trigger export
GET /api/motion/corpus/:id/phrases # Phrases with motion data
GET /api/motion/corpus/:id/phrases/:phrase_id/samples # Samples for phrase
---
## XIII. Implementation Phases
### Phase 1: Foundation (Current)
- [x] MediaPipe capture to Supabase
- [x] Session management
- [x] Phrase synchronization
- [ ] Basic feature extraction
### Phase 2: Multi-Sensor
- [ ] Mocopi integration
- [ ] Phone IMU capture
- [ ] Sensor fusion pipeline
- [ ] Clock synchronization
### Phase 3: Quality & Processing
- [ ] Automated quality assessment
- [ ] Jitter filtering
- [ ] Gap interpolation
- [ ] Physics constraints
### Phase 4: Training Pipeline
- [ ] Corpus management
- [ ] Export pipeline
- [ ] CC-Motion-Gen integration
- [ ] Incremental training
### Phase 5: Real-Time Control
- [ ] Parameter mapping UI
- [ ] LIMRPS integration
- [ ] Live performance mode
- [ ] Latency optimization
---
## XIV. Boundaries & Rules
### XIV.1 Data Retention
- Raw frames: 30 days (then archived)
- Features: Indefinite
- Sessions: Indefinite
- Exports: Version-controlled
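
The 30-day rule for raw frames can be expressed as a simple age check. This is a minimal sketch only: the function name is hypothetical, and treating exactly 30 days as already due for archiving is an assumption about the boundary.

```typescript
const RAW_FRAME_RETENTION_DAYS = 30
const MS_PER_DAY = 24 * 60 * 60 * 1000

// True once the raw frames are at least 30 days old.
// ASSUMPTION: the boundary is inclusive (exactly 30 days counts as due).
function isRawDataDueForArchive(capturedAt: Date, now: Date): boolean {
  const ageMs = now.getTime() - capturedAt.getTime()
  return ageMs >= RAW_FRAME_RETENTION_DAYS * MS_PER_DAY
}

const capturedOn = new Date("2024-01-01T00:00:00Z")
const dueAfter29Days = isRawDataDueForArchive(capturedOn, new Date("2024-01-30T00:00:00Z")) // false
const dueAfter30Days = isRawDataDueForArchive(capturedOn, new Date("2024-01-31T00:00:00Z")) // true
```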
### XIV.2 Privacy
- Performer consent required
- Option to exclude facial data from exports
- Anonymization for public datasets
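
Two of these rules lend themselves to a small export-time transform: dropping the `face` block and replacing performer identifiers with stable aliases. The sketch below is illustrative only. The function names are hypothetical, the alias format simply mirrors the `performer_001` naming used in the corpus layout, and replacing identifiers alone is not a complete anonymization scheme.

```typescript
// Return a shallow copy of the frame without its face block.
function stripFace(frame: Record<string, unknown>): Record<string, unknown> {
  const copy: Record<string, unknown> = { ...frame }
  delete copy["face"]
  return copy
}

// Build a function that maps each real performer id to a stable alias
// (performer_001, performer_002, ...) in order of first appearance.
function makePerformerAnonymizer(): (performerId: string) => string {
  const aliases = new Map<string, string>()
  return (performerId: string): string => {
    let alias = aliases.get(performerId)
    if (alias === undefined) {
      alias = `performer_${String(aliases.size + 1).padStart(3, "0")}`
      aliases.set(performerId, alias)
    }
    return alias
  }
}

const anonymize = makePerformerAnonymizer()
const exportedFrame = stripFace({ frame_id: "f1", face: { landmarks: [] }, body: {} })
const firstAlias = anonymize("real-id-a")   // "performer_001"
const repeatAlias = anonymize("real-id-a")  // "performer_001"
const secondAlias = anonymize("real-id-b")  // "performer_002"
```

The alias table would itself need to be stored separately from any public dataset, since it links aliases back to real performers.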
### XIV.3 Quality Gates
- Sessions < 50
- Jitter > 0.5: Flagged for review
- Physics violations > 10
### XIV.4 Versioning
- Feature extraction version tracked
- Training exports versioned
- Model compatibility matrix maintained
---
This protocol defines the complete data pipeline from sensor capture to training corpus. Implementation proceeds phase by phase, and each phase is validated before the next begins.
Source Anchor
Comp-Core/.governance/architecture/DATA_PIPELINE_PROTOCOL.md