Motion Capture Pipeline
Comprehensive Technical Documentation
Version: 2.0.0
Last Updated: December 26, 2024
---
Table of Contents
1. [Overview](#overview)
2. [MediaPipe Integration](#mediapipe-integration)
3. [Mocopi Integration](#mocopi-integration)
4. [Sensor Fusion](#sensor-fusion)
5. [Skeleton System](#skeleton-system)
6. [API Reference](#api-reference)
7. [Performance Optimization](#performance-optimization)
---
1. Overview
The Motion Capture Pipeline provides real-time human motion tracking through multiple input sources, fusing them into a unified skeleton representation suitable for audio synthesis and motion generation.
Architecture Diagram
┌─────────────────────────────────────────────────────────────────────────────┐
│ MOTION CAPTURE PIPELINE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ INPUT LAYER │ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────────┐ │ │
│ │ │ WEBCAM │ │ MOCOPI │ │ iOS DEVICE │ │ │
│ │ │ (Camera) │ │ (6 IMUs) │ │ (CoreMotion) │ │ │
│ │ │ 720p/30fps │ │ 60Hz/sensor │ │ 100Hz │ │ │
│ │ └────────┬────────┘ └────────┬────────┘ └──────────┬──────────┘ │ │
│ │ │ │ │ │ │
│ └───────────┼────────────────────┼──────────────────────┼───────────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ CAPTURE LAYER │ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────────┐ │ │
│ │ │ MediaPipe │ │ Mocopi Parser │ │ Motion Manager │ │ │
│ │ │ Holistic │ │ │ │ │ │ │
│ │ │ │ │ BVH Protocol │ │ CMMotionManager │ │ │
│ │ │ 543 landmarks │ │ Quaternions │ │ Device Motion │ │ │
│ │ └────────┬────────┘ └────────┬────────┘ └──────────┬──────────┘ │ │
│ │ │ │ │ │ │
│ │ │ PoseOutput │ MocopiFrame │ iOSMotion │ │
│ │ │ FaceOutput │ │ │ │
│ │ │ HandOutput │ │ │ │
│ └───────────┼────────────────────┼──────────────────────┼───────────────┘ │
│ │ │ │ │
│ └────────────────────┼──────────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ FUSION LAYER │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────────────┐ │ │
│ │ │ Kalman Filter Fusion │ │ │
│ │ │ │ │ │
│ │ │ Position (MediaPipe) ─────┐ │ │ │
│ │ │ High visual accuracy │ │ │ │
│ │ │ Occlusion issues ├──▶ FusedSkeletonState │ │ │
│ │ │ │ │ │ │
│ │ │ Orientation (Mocopi) ─────┤ │ │ │
│ │ │ High rotational accuracy │ │ │ │
│ │ │ No absolute position │ │ │ │
│ │ │ │ │ │ │
│ │ │ Acceleration (iOS) ───────┘ │ │ │
│ │ │ Supplementary data │ │ │
│ │ │ │ │ │
│ │ └──────────────────────────────┬──────────────────────────────────┘ │ │
│ └─────────────────────────────────┼─────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ OUTPUT LAYER │ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────────┐ │ │
│ │ │ Skeleton │ │ Derived │ │ Motion │ │ │
│ │ │ State │ │ Kinematics │ │ Features │ │ │
│ │ │ │ │ │ │ │ │ │
│ │ │ - Positions │ │ - Velocities │ │ - Energy │ │ │
│ │ │ - Rotations │ │ - Accelerations│ │ - Gestures │ │ │
│ │ │ - Confidence │ │ - Jerk │ │ - Expressions │ │ │
│ │ └─────────────────┘ └─────────────────┘ └─────────────────────┘ │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
---
2. MediaPipe Integration
2.1 Architecture
The MediaPipe integration follows a modular, plugin-based architecture with three layers:
Core Layer
// Location: apps/web/cc-dashboard/src/lib/mediapipe/core/
// EventEmitter.ts - Type-safe pub/sub
export class EventEmitter<Events extends EventMap = EventMap> {
private listeners: Map<keyof Events, Set<EventCallback<Events[keyof Events]>>>
on<K extends keyof Events>(event: K, callback: EventCallback<Events[K]>): () => void
off<K extends keyof Events>(event: K, callback: EventCallback<Events[K]>): void
emit<K extends keyof Events>(event: K, data: Events[K]): void
once<K extends keyof Events>(event: K, callback: EventCallback<Events[K]>): void
removeAllListeners(): void
}
// CameraService.ts - Webcam lifecycle management
export class CameraService extends EventEmitter<CameraEvents> {
private videoElement: HTMLVideoElement | null
private stream: MediaStream | null
private state: CameraState
async initialize(videoElement: HTMLVideoElement): Promise<boolean>
start(): void
stop(): void
destroy(): void
getVideoElement(): HTMLVideoElement | null
getState(): CameraState
isRunning(): boolean
}
Analyzer Layer
// Location: apps/web/cc-dashboard/src/lib/mediapipe/analyzers/
// BaseAnalyzer.ts - Abstract base class
export abstract class BaseAnalyzer<TInput, TOutput extends AnalyzerOutput, TEvents>
extends EventEmitter<TEvents> {
protected config: AnalyzerConfig
protected lastOutput: TOutput | null
protected frameCount: number
constructor(config: Partial<AnalyzerConfig>)
process(input: TInput, timestamp: number): TOutput | null
protected abstract analyze(input: TInput, timestamp: number): TOutput | null
protected applySmoothing(current: TOutput): TOutput
protected checkThreshold(value: number, threshold: number): boolean
}
// FaceAnalyzer.ts - Face expression extraction
export interface FaceOutput extends AnalyzerOutput {
// Mouth
mouthOpen: number // 0-1, how open the mouth is
smileIntensity: number // 0-1, smile strength
// Eyes
eyeOpenLeft: number // 0-1
eyeOpenRight: number // 0-1
eyebrowRaiseLeft: number // 0-1
eyebrowRaiseRight: number // 0-1
// Head pose
headYaw: number // -1 to 1 (left/right)
headPitch: number // -1 to 1 (up/down)
headRoll: number // -1 to 1 (tilt)
// Events
isBlinking: boolean
blinkCount: number
}
// HandAnalyzer.ts - Gesture recognition
export interface HandOutput extends AnalyzerOutput {
left: SingleHandOutput | null
right: SingleHandOutput | null
}
export interface SingleHandOutput {
openness: number // 0-1, how open the hand is
grip: number // 0-1, grip strength
gesture: HandGesture // Detected gesture
// Individual fingers
thumb: FingerState
index: FingerState
middle: FingerState
ring: FingerState
pinky: FingerState
// Palm
palmPosition: [number, number, number]
palmNormal: [number, number, number]
// Pinch detection
pinchDistance: number
isPinching: boolean
}
export type HandGesture =
| 'open' // All fingers extended
| 'fist' // All fingers closed
| 'point' // Index extended
| 'peace' // Index + middle extended
| 'thumbsUp' // Thumb up, others closed
| 'pinch' // Thumb + index touching
| 'unknown'
// PoseAnalyzer.ts - Body pose analysis
export interface PoseOutput extends AnalyzerOutput {
// Torso
bodyLean: number // -1 to 1 (left/right lean)
bodyTwist: number // -1 to 1 (rotation)
crouchLevel: number // 0-1 (standing to crouching)
// Arms
armSpread: number // 0-1 (arms at sides to spread)
armRaiseLeft: number // 0-1 (arm height)
armRaiseRight: number // 0-1
elbowBendLeft: number // 0-1
elbowBendRight: number // 0-1
// Legs
legSpread: number // 0-1
kneeFlexLeft: number // 0-1
kneeFlexRight: number // 0-1
// Energy metrics
bodyEnergy: number // 0-1, overall movement energy
upperBodyEnergy: number // 0-1
lowerBodyEnergy: number // 0-1
// Joint states
joints: Map<string, JointState>
limbs: Map<string, LimbState>
}
Pipeline Layer
// Location: apps/web/cc-dashboard/src/lib/mediapipe/pipeline/
// HolisticPipeline.ts - Main orchestrator
export interface PipelineConfig {
face: FaceAnalyzerConfig
hands: HandAnalyzerConfig
pose: PoseAnalyzerConfig
// Global settings
smoothing: number // 0-1, global smoothing factor
minDetectionConfidence: number
minTrackingConfidence: number
// Performance
maxFPS: number
enableGPU: boolean
}
export class HolisticPipeline extends EventEmitter<PipelineEvents> {
private holistic: Holistic | null
private camera: CameraService
private faceAnalyzer: FaceAnalyzer
private handAnalyzer: HandAnalyzer
private poseAnalyzer: PoseAnalyzer
private state: PipelineState
private lastOutput: PipelineOutput | null
constructor(config: Partial<PipelineConfig>)
async initialize(
videoElement: HTMLVideoElement,
canvasElement?: HTMLCanvasElement
): Promise<boolean>
start(): void
stop(): void
destroy(): void
// Event subscriptions
on(event: 'frame', callback: (output: PipelineOutput) => void): () => void
on(event: 'face', callback: (face: FaceOutput) => void): () => void
on(event: 'hands', callback: (hands: HandOutput) => void): () => void
on(event: 'pose', callback: (pose: PoseOutput) => void): () => void
on(event: 'error', callback: (error: Error) => void): () => void
getState(): PipelineState
getLastOutput(): PipelineOutput | null
}
2.2 Landmark Reference
Face Landmarks (468 points)
Key landmarks for expression detection:
Mouth:
- 13: Upper lip center
- 14: Lower lip center
- 78: Left mouth corner
- 308: Right mouth corner
Eyes:
- 159: Left eye upper
- 145: Left eye lower
- 386: Right eye upper
- 374: Right eye lower
Eyebrows:
- 70: Left eyebrow inner
- 107: Left eyebrow outer
- 336: Right eyebrow inner
- 300: Right eyebrow outer
Nose:
- 1: Nose tip
- 4: Nose bridge
Forehead:
- 10: Forehead center
Pose Landmarks (33 points)
POSE_LANDMARKS = {
0: 'nose',
1: 'left_eye_inner',
2: 'left_eye',
3: 'left_eye_outer',
4: 'right_eye_inner',
5: 'right_eye',
6: 'right_eye_outer',
7: 'left_ear',
8: 'right_ear',
9: 'mouth_left',
10: 'mouth_right',
11: 'left_shoulder',
12: 'right_shoulder',
13: 'left_elbow',
14: 'right_elbow',
15: 'left_wrist',
16: 'right_wrist',
17: 'left_pinky',
18: 'right_pinky',
19: 'left_index',
20: 'right_index',
21: 'left_thumb',
22: 'right_thumb',
23: 'left_hip',
24: 'right_hip',
25: 'left_knee',
26: 'right_knee',
27: 'left_ankle',
28: 'right_ankle',
29: 'left_heel',
30: 'right_heel',
31: 'left_foot_index',
32: 'right_foot_index'
}
Hand Landmarks (21 points per hand)
HAND_LANDMARKS = {
0: 'wrist',
1: 'thumb_cmc',
2: 'thumb_mcp',
3: 'thumb_ip',
4: 'thumb_tip',
5: 'index_mcp',
6: 'index_pip',
7: 'index_dip',
8: 'index_tip',
9: 'middle_mcp',
10: 'middle_pip',
11: 'middle_dip',
12: 'middle_tip',
13: 'ring_mcp',
14: 'ring_pip',
15: 'ring_dip',
16: 'ring_tip',
17: 'pinky_mcp',
18: 'pinky_pip',
19: 'pinky_dip',
20: 'pinky_tip'
}
2.3 Usage Examples
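The landmark indices from section 2.2 can be turned into normalized features with simple distance ratios. The sketch below is an assumption about how such features might be computed, not the actual `FaceAnalyzer` or `HandAnalyzer` code: the `Landmark` type, the function names, and the 0.05 pinch threshold are all hypothetical.

```typescript
// Hypothetical landmark shape (normalized coordinates); not the MediaPipe type itself.
interface Landmark {
  x: number;
  y: number;
  z: number;
}

function distance(a: Landmark, b: Landmark): number {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

// Lip gap (face landmarks 13 and 14) divided by mouth width (78 and 308).
// Dividing by the width keeps the value stable as the face moves toward
// or away from the camera.
function mouthOpenRatio(face: Landmark[]): number {
  const upper = face[13];
  const lower = face[14];
  const left = face[78];
  const right = face[308];
  if (!upper || !lower || !left || !right) return 0;
  const width = distance(left, right);
  return width > 0 ? distance(upper, lower) / width : 0;
}

// Distance between thumb tip (hand landmark 4) and index tip (hand landmark 8).
function pinchDistance(hand: Landmark[]): number {
  const thumbTip = hand[4];
  const indexTip = hand[8];
  if (!thumbTip || !indexTip) return Infinity;
  return distance(thumbTip, indexTip);
}

// The default threshold is an assumed value for illustration only.
function isPinching(hand: Landmark[], threshold = 0.05): boolean {
  return pinchDistance(hand) < threshold;
}
```

Ratios like these still need clamping and smoothing before they match the 0-1 ranges documented for `FaceOutput` and `SingleHandOutput`.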
Basic Usage
import { HolisticPipeline } from '@/lib/mediapipe'
// Initialize
const pipeline = new HolisticPipeline({
face: { smoothing: 0.3 },
hands: { detectGestures: true },
pose: { smoothing: 0.5 }
})
// Get video element
const video = document.getElementById('video') as HTMLVideoElement
// Initialize and start
await pipeline.initialize(video)
pipeline.start()
// Subscribe to events
pipeline.on('face', (face) => {
if (face.smileIntensity > 0.5) {
console.log('User is smiling!')
}
})
pipeline.on('hands', (hands) => {
if (hands.left?.gesture === 'fist') {
console.log('Left fist detected')
}
})
pipeline.on('pose', (pose) => {
console.log(`Body energy: ${pose.bodyEnergy}`)
})
// Cleanup
pipeline.destroy()
React Hook Usage
import { useMediaPipe } from '@/lib/mediapipe'
function MotionCapture() {
const videoRef = useRef<HTMLVideoElement>(null)
const {
start,
stop,
isRunning,
face,
hands,
pose,
error
} = useMediaPipe({
videoRef,
autoStart: true,
config: {
face: { detectBlinks: true },
hands: { detectGestures: true }
}
})
return (
<div>
<video ref={videoRef} />
{face && (
<div>Smile: {(face.smileIntensity * 100).toFixed(0)}%</div>
)}
{hands?.left && (
<div>Left hand: {hands.left.gesture}</div>
)}
{pose && (
<div>Energy: {(pose.bodyEnergy * 100).toFixed(0)}%</div>
)}
</div>
)
}
---
3. Mocopi Integration
3.1 Overview
Sony Mocopi provides 6 IMU sensors for full-body motion tracking:
| Sensor | Placement | Primary Function |
|---|---|---|
| HEAD | Forehead band | Head orientation |
| BODY | Chest/hip clip | Torso orientation |
| L_WRIST | Left wrist band | Left arm tracking |
| R_WRIST | Right wrist band | Right arm tracking |
| L_ANKLE | Left ankle band | Left leg tracking |
| R_ANKLE | Right ankle band | Right leg tracking |
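Because a full-body solve relies on all six sensors, it can help to confirm that a frame reports every one of them before using it. A minimal sketch, assuming readings arrive keyed by the lowercase IDs used in section 3.3; `MOCOPI_SENSOR_IDS` and `missingSensors` are illustrative names, not part of any Mocopi SDK.

```typescript
// Lowercase IDs matching the six placements in the table above.
const MOCOPI_SENSOR_IDS = [
  'head',
  'body',
  'l_wrist',
  'r_wrist',
  'l_ankle',
  'r_ankle',
] as const;

type SensorId = (typeof MOCOPI_SENSOR_IDS)[number];

// Returns the IDs that have no reading (undefined or null) in this frame.
function missingSensors(sensors: Partial<Record<SensorId, unknown>>): SensorId[] {
  return MOCOPI_SENSOR_IDS.filter((id) => sensors[id] == null);
}
```

An empty result means all six sensors reported; anything else can be surfaced to the user as a reconnect or recalibration hint.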
3.2 Data Flow
┌─────────────────────────────────────────────────────────────────────────────┐
│ MOCOPI DATA FLOW │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ │
│ │ Mocopi │ │
│ │ Sensors │ │
│ │ (6 IMUs) │ │
│ └────────┬────────┘ │
│ │ Bluetooth │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Mocopi App │ │
│ │ (iOS/Android) │ │
│ │ │ │
│ │ - Calibration │ │
│ │ - Skeleton │ │
│ │ solving │ │
│ └────────┬────────┘ │
│ │ UDP (Port 12351) │
│ │ BVH Protocol │
│ ▼ │
│ ┌─────────────────┐ │
│ │ BVH Sender │ │
│ │ (macOS App) │ │
│ │ │ │
│ │ - UDP receive │ │
│ │ - WebSocket │ │
│ │ broadcast │ │
│ └────────┬────────┘ │
│ │ WebSocket │
│ │ ws://localhost:8765 │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ LOCAL PROCESSING │ │
│ │ │ │
│ │ ┌───────────────────┐ ┌───────────────────┐ │ │
│ │ │ MocopiParser │───▶│ SkeletonState │ │ │
│ │ │ │ │ │ │ │
│ │ │ - Parse BVH │ │ - Joint pos │ │ │
│ │ │ - Quaternions │ │ - Rotations │ │ │
│ │ │ - Timestamps │ │ - Velocities │ │ │
│ │ └───────────────────┘ └───────────────────┘ │ │
│ │ │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │
│ OR │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ CLOUD RELAY (for remote access) │ │
│ │ │ │
│ │ BVH Sender ──▶ VPS Relay ──▶ Browser │ │
│ │ │ │
│ │ wss://your-vps.com:443/relay │ │
│ │ │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
3.3 Type Definitions
// Location: apps/web/cc-dashboard/src/lib/mocopi/types.ts
export interface MocopiSensorId {
HEAD: 'head'
BODY: 'body'
L_WRIST: 'l_wrist'
R_WRIST: 'r_wrist'
L_ANKLE: 'l_ankle'
R_ANKLE: 'r_ankle'
}
export interface MocopiFrame {
timestamp: number
frameId: number
sensors: {
[K in keyof MocopiSensorId]: SensorData
}
skeleton?: SkeletonData
}
export interface SensorData {
quaternion: Quaternion // [w, x, y, z]
acceleration: Vector3 // [x, y, z] in m/s²
angularVelocity?: Vector3 // [x, y, z] in rad/s
batteryLevel?: number // 0-100
signalStrength?: number // 0-100
}
export interface SkeletonData {
joints: SkeletonJoint[]
rootPosition: Vector3
rootRotation: Quaternion
}
export interface SkeletonJoint {
name: string
position: Vector3 // Local position
rotation: Quaternion // Local rotation
worldPosition?: Vector3 // Computed world position
worldRotation?: Quaternion // Computed world rotation
}
// BVH Joint hierarchy
export const MOCOPI_SKELETON_HIERARCHY = {
root: {
name: 'Hips',
children: ['Spine', 'LeftUpLeg', 'RightUpLeg']
},
Spine: {
children: ['Spine1']
},
Spine1: {
children: ['Spine2']
},
Spine2: {
children: ['Neck', 'LeftShoulder', 'RightShoulder']
},
Neck: {
children: ['Head']
},
LeftShoulder: {
children: ['LeftArm']
},
LeftArm: {
children: ['LeftForeArm']
},
LeftForeArm: {
children: ['LeftHand']
},
RightShoulder: {
children: ['RightArm']
},
RightArm: {
children: ['RightForeArm']
},
RightForeArm: {
children: ['RightHand']
},
LeftUpLeg: {
children: ['LeftLeg']
},
LeftLeg: {
children: ['LeftFoot']
},
LeftFoot: {
children: ['LeftToeBase']
},
RightUpLeg: {
children: ['RightLeg']
},
RightLeg: {
children: ['RightFoot']
},
RightFoot: {
children: ['RightToeBase']
}
}
3.4 React Hook
// Location: apps/web/cc-dashboard/src/lib/mocopi/useMocopiStream.ts
export interface UseMocopiStreamOptions {
url?: string // WebSocket URL
autoConnect?: boolean // Connect on mount
reconnect?: boolean // Auto-reconnect on disconnect
reconnectInterval?: number // ms between reconnect attempts
}
export interface UseMocopiStreamReturn {
// Connection
connect: () => void
disconnect: () => void
isConnected: boolean
// Data
frame: MocopiFrame | null
skeleton: SkeletonData | null
sensors: Record<string, SensorData>
// Metrics
fps: number
latency: number
// Errors
error: Error | null
}
export function useMocopiStream(options?: UseMocopiStreamOptions): UseMocopiStreamReturn
3.5 Cloud Relay Setup
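The relay in this section expects the first message on every connection to be a JSON registration carrying `type`, `sessionId`, and `clientType`. A minimal client-side sketch of that handshake, assuming nothing beyond those three fields; `buildRegisterMessage` and `parseRegisterMessage` are hypothetical helper names.

```typescript
type ClientType = 'sender' | 'receiver';

interface RegisterMessage {
  type: 'register';
  sessionId: string;
  clientType: ClientType;
}

// Serialize the registration a client sends right after connecting.
function buildRegisterMessage(sessionId: string, clientType: ClientType): string {
  const message: RegisterMessage = { type: 'register', sessionId, clientType };
  return JSON.stringify(message);
}

// Validate an incoming registration; returns null for anything malformed.
function parseRegisterMessage(raw: string): RegisterMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== 'object' || data === null) return null;
  const record = data as Record<string, unknown>;
  const sessionId = record['sessionId'];
  const clientType = record['clientType'];
  if (record['type'] !== 'register') return null;
  if (typeof sessionId !== 'string') return null;
  if (clientType === 'sender' || clientType === 'receiver') {
    return { type: 'register', sessionId, clientType };
  }
  return null;
}
```

This parser is stricter than the relay listing, which treats every client type other than `sender` as a receiver.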
# Location: deploy/mocopi-relay/mocopi_relay_v4.py
"""
Mocopi Cloud Relay Server

Features:
- WebSocket server with SSL
- Session management
- Multi-client broadcasting
- Authentication
- Metrics logging
"""
import asyncio
import json
import ssl
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

from websockets.server import WebSocketServerProtocol, serve


@dataclass
class Session:
    id: str
    sender: Optional[WebSocketServerProtocol] = None
    receivers: Set[WebSocketServerProtocol] = field(default_factory=set)
    created_at: float = field(default_factory=time.time)
    last_activity: float = field(default_factory=time.time)
    frame_count: int = 0


class MocopiRelay:
    def __init__(self, host: str, port: int, ssl_context: Optional[ssl.SSLContext] = None):
        self.host = host
        self.port = port
        self.ssl_context = ssl_context
        self.sessions: Dict[str, Session] = {}

    def get_or_create_session(self, session_id: str) -> Session:
        """Return the session for session_id, creating it on first use."""
        if session_id not in self.sessions:
            self.sessions[session_id] = Session(id=session_id)
        return self.sessions[session_id]

    async def handler(self, websocket, path):
        """Handle incoming WebSocket connections."""
        try:
            # First message should be registration
            message = await websocket.recv()
            data = json.loads(message)
            if data['type'] == 'register':
                session_id = data['sessionId']
                client_type = data['clientType']  # 'sender' or 'receiver'
                if client_type == 'sender':
                    await self.handle_sender(websocket, session_id)
                else:
                    await self.handle_receiver(websocket, session_id)
        except Exception as e:
            print(f"Error: {e}")

    async def handle_sender(self, websocket, session_id: str):
        """Handle sender (BVH Sender app)."""
        session = self.get_or_create_session(session_id)
        session.sender = websocket
        try:
            async for message in websocket:
                # Broadcast to all receivers; a failed send must not stop the others
                if session.receivers:
                    await asyncio.gather(
                        *[r.send(message) for r in session.receivers],
                        return_exceptions=True
                    )
                session.frame_count += 1
                session.last_activity = time.time()
        finally:
            session.sender = None

    async def handle_receiver(self, websocket, session_id: str):
        """Handle receiver (browser)."""
        session = self.get_or_create_session(session_id)
        session.receivers.add(websocket)
        try:
            await websocket.wait_closed()
        finally:
            session.receivers.discard(websocket)

    async def start(self):
        """Start the relay server."""
        async with serve(
            self.handler,
            self.host,
            self.port,
            ssl=self.ssl_context
        ):
            print(f"Mocopi Relay running on {self.host}:{self.port}")
            await asyncio.Future()  # Run forever


# Usage
if __name__ == "__main__":
    # Certificate file names are placeholders; point them at your own files.
    ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ssl_context.load_cert_chain("cert.pem", "key.pem")
    relay = MocopiRelay("[ip]", 443, ssl_context)
    asyncio.run(relay.start())
---
4. Sensor Fusion
4.1 Fusion Strategy
The fusion system combines data from multiple sources with different characteristics:
| Source | Strengths | Weaknesses |
|---|---|---|
| MediaPipe | Position accuracy, facial detail | Occlusion, lighting dependent |
| Mocopi | Rotation accuracy, no occlusion | No absolute position, drift |
| iOS Motion | High frequency, acceleration | Single device only |
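One way to act on these trade-offs is to weight each source by its current quality and normalize the result, which is the approach the fusion class in section 4.2 takes (33 pose landmarks, 6 sensors). The sketch below isolates that arithmetic; `SourceQuality` and `fusionWeights` are hypothetical names, and iOS motion is left out because the document treats it as supplementary data.

```typescript
// Hypothetical per-frame quality summary; field names are illustrative.
interface SourceQuality {
  mediapipeConfidence: number; // 0-1 overall pose confidence
  visibleLandmarks: number;    // 0-33 pose landmarks currently visible
  activeSensors: number;       // 0-6 Mocopi sensors with usable signal
}

interface FusionWeights {
  mediapipe: number;
  mocopi: number;
}

// Raw weights follow the section 4.2 formulas, then are normalized to sum to 1.
function fusionWeights(q: SourceQuality): FusionWeights {
  const mp = q.mediapipeConfidence * (q.visibleLandmarks / 33);
  const mocopi = q.activeSensors / 6;
  const total = mp + mocopi;
  if (total === 0) return { mediapipe: 0, mocopi: 0 };
  return { mediapipe: mp / total, mocopi: mocopi / total };
}
```

With a fully visible pose at 0.8 confidence and three of six sensors active, the raw weights are 0.8 and 0.5, so MediaPipe receives about 62% of the normalized weight.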
4.2 Fusion Algorithm
// Location: apps/web/cc-dashboard/src/lib/mocopi/MocopiLimbFusion.ts
export class MocopiLimbFusion {
private kalmanFilters: Map<string, KalmanFilter>
private lastMediaPipe: PoseOutput | null
private lastMocopi: MocopiFrame | null
constructor(config: FusionConfig) {
// Initialize Kalman filters for each joint
this.kalmanFilters = new Map()
for (const joint of FUSION_JOINTS) {
this.kalmanFilters.set(joint, new KalmanFilter({
processNoise: config.processNoise,
measurementNoise: config.measurementNoise,
initialState: [0, 0, 0, 0, 0, 0] // pos + vel
}))
}
}
fuse(mediapipe: PoseOutput | null, mocopi: MocopiFrame | null): FusedSkeletonState {
const result: FusedSkeletonState = {
joints: new Map(),
timestamp: Date.now(),
confidence: 0
}
// Weight sources based on availability and quality
const mpWeight = mediapipe ? this.computeMediaPipeWeight(mediapipe) : 0
const mocopiWeight = mocopi ? this.computeMocopiWeight(mocopi) : 0
const totalWeight = mpWeight + mocopiWeight
if (totalWeight === 0) return result
// Fuse each joint
for (const [jointName, kalman] of this.kalmanFilters) {
// Get position from MediaPipe (if available)
const mpPosition = mediapipe
? this.getMediaPipeJointPosition(mediapipe, jointName)
: null
// Get rotation from Mocopi (if available)
const mocopiRotation = mocopi
? this.getMocopiJointRotation(mocopi, jointName)
: null
// Kalman update
if (mpPosition) {
kalman.updatePosition(mpPosition, mpWeight / totalWeight)
}
if (mocopiRotation) {
kalman.updateRotation(mocopiRotation, mocopiWeight / totalWeight)
}
// Get fused state
const state = kalman.getState()
result.joints.set(jointName, {
position: state.position,
rotation: state.rotation,
velocity: state.velocity,
confidence: (mpWeight + mocopiWeight) / 2
})
}
result.confidence = totalWeight / 2
return result
}
private computeMediaPipeWeight(mp: PoseOutput): number {
// Weight based on visibility and confidence
return mp.confidence * (mp.visibleLandmarks / 33)
}
private computeMocopiWeight(mocopi: MocopiFrame): number {
// Weight based on sensor connectivity
const activeSensors = Object.values(mocopi.sensors)
.filter(s => (s.signalStrength ?? 0) > 50).length // signalStrength is optional; treat missing as 0
return activeSensors / 6
}
}
4.3 Kalman Filter Implementation
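Before the six-state listing, the predict/update cycle may be easier to follow in one dimension. The following is a scalar sketch under stated assumptions, not the project's filter: `ScalarKalman` and its parameters are hypothetical, and dividing the measurement noise by a source weight mirrors the weighting idea used in this section.

```typescript
// One-dimensional Kalman filter for a value assumed constant between frames.
class ScalarKalman {
  private x = 0; // state estimate
  private p = 1; // estimate variance
  private readonly q: number; // process noise variance
  private readonly r: number; // measurement noise variance

  constructor(processNoise: number, measurementNoise: number) {
    this.q = processNoise;
    this.r = measurementNoise;
  }

  // Predict: the state is unchanged, but uncertainty grows by q.
  predict(): void {
    this.p += this.q;
  }

  // Update: blend the measurement z into the estimate.
  // A lower weight inflates the measurement noise, so the source is trusted less.
  update(z: number, weight = 1): number {
    const r = this.r / Math.max(weight, 1e-6);
    const gain = this.p / (this.p + r);
    this.x += gain * (z - this.x);
    this.p *= 1 - gain;
    return this.x;
  }

  get estimate(): number {
    return this.x;
  }
}
```

The six-state filter applies the same two steps with matrices: `F` and `Q` play the role of the predict step, while `H`, `R`, and the gain `K` implement the update.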
// Simplified Kalman filter for position/velocity fusion
class KalmanFilter {
private state: Float64Array // [x, y, z, vx, vy, vz]
private covariance: Float64Array // 6x6 matrix
private F: Float64Array // State transition matrix
private Q: Float64Array // Process noise
private H: Float64Array // Measurement matrix
private R: Float64Array // Measurement noise
constructor(config: KalmanConfig) {
this.state = new Float64Array(6)
this.covariance = this.identity(6)
// State transition: x' = x + v*dt
this.F = new Float64Array([
1, 0, 0, 1, 0, 0, // x' = x + vx
0, 1, 0, 0, 1, 0, // y' = y + vy
0, 0, 1, 0, 0, 1, // z' = z + vz
0, 0, 0, 1, 0, 0, // vx' = vx
0, 0, 0, 0, 1, 0, // vy' = vy
0, 0, 0, 0, 0, 1 // vz' = vz
])
this.Q = this.diagonal(6, config.processNoise)
this.R = this.diagonal(3, config.measurementNoise)
// Measure position only
this.H = new Float64Array([
1, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0
])
}
predict(dt: number): void {
// Update F with actual dt
this.F[3] = this.F[10] = this.F[17] = dt
// x' = F * x
this.state = this.matVecMul(this.F, this.state)
// P' = F * P * F' + Q
this.covariance = this.matAdd(
this.matMul(this.F, this.matMul(this.covariance, this.transpose(this.F))),
this.Q
)
}
update(measurement: [number, number, number], weight: number): void {
// Innovation: y = z - H * x
const predicted = this.matVecMul(this.H, this.state)
const innovation = measurement.map((z, i) => z - predicted[i])
// Kalman gain: K = P * H' * (H * P * H' + R)^-1
const PHt = this.matMul(this.covariance, this.transpose(this.H))
const S = this.matAdd(
this.matMul(this.H, PHt),
this.R.map(r => r / weight) // Scale noise by weight
)
const K = this.matMul(PHt, this.invert(S))
// Update state: x = x + K * y
const correction = this.matVecMul(K, innovation)
this.state = this.state.map((s, i) => s + correction[i])
// Update covariance: P = (I - K * H) * P
const KH = this.matMul(K, this.H)
const IKH = this.identity(6).map((v, i) => v - KH[i])
this.covariance = this.matMul(IKH, this.covariance)
}
getState(): { position: Vector3, velocity: Vector3 } {
return {
position: [this.state[0], this.state[1], this.state[2]],
velocity: [this.state[3], this.state[4], this.state[5]]
}
}
}
---
5. Skeleton System
5.1 Unified Skeleton Representation
# Location: core/cc-core/cc_core/skeleton/pose_frame.py
from dataclasses import dataclass
from typing import Dict, Optional

import numpy as np


# JointData is defined first because PoseFrame's annotations refer to it.
@dataclass
class JointData:
    """Data for a single joint."""
    position: np.ndarray        # [3,] local position
    rotation: np.ndarray        # [4,] quaternion
    world_position: np.ndarray  # [3,] world position
    world_rotation: np.ndarray  # [4,] world quaternion
    velocity: Optional[np.ndarray] = None          # [3,]
    angular_velocity: Optional[np.ndarray] = None  # [3,]
    confidence: float = 1.0


@dataclass
class PoseFrame:
    """Single frame of skeleton data."""
    timestamp: float
    joints: Dict[str, JointData]
    root_position: np.ndarray  # [3,]
    root_rotation: np.ndarray  # [4,] quaternion
    # Metadata
    source: str  # 'mediapipe', 'mocopi', 'fused'
    confidence: float
    frame_id: int


# Standard joint names
JOINT_NAMES = [
    'Hips', 'Spine', 'Spine1', 'Spine2', 'Neck', 'Head',
    'LeftShoulder', 'LeftArm', 'LeftForeArm', 'LeftHand',
    'RightShoulder', 'RightArm', 'RightForeArm', 'RightHand',
    'LeftUpLeg', 'LeftLeg', 'LeftFoot', 'LeftToeBase',
    'RightUpLeg', 'RightLeg', 'RightFoot', 'RightToeBase'
]

# Parent-child relationships
SKELETON_PARENTS = {
    'Hips': None,
    'Spine': 'Hips',
    'Spine1': 'Spine',
    'Spine2': 'Spine1',
    'Neck': 'Spine2',
    'Head': 'Neck',
    'LeftShoulder': 'Spine2',
    'LeftArm': 'LeftShoulder',
    'LeftForeArm': 'LeftArm',
    'LeftHand': 'LeftForeArm',
    'RightShoulder': 'Spine2',
    'RightArm': 'RightShoulder',
    'RightForeArm': 'RightArm',
    'RightHand': 'RightForeArm',
    'LeftUpLeg': 'Hips',
    'LeftLeg': 'LeftUpLeg',
    'LeftFoot': 'LeftLeg',
    'LeftToeBase': 'LeftFoot',
    'RightUpLeg': 'Hips',
    'RightLeg': 'RightUpLeg',
    'RightFoot': 'RightLeg',
    'RightToeBase': 'RightFoot'
}
5.2 Derived Kinematics
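# Location: core/cc-core/cc_core/skeleton/derived_kinematics.py
from typing import Dict, List

import numpy as np

from .pose_frame import PoseFrame


class DerivedKinematics:
    """Compute velocities, accelerations, and jerk from pose sequences.

    Jerk is not computed in this excerpt.
    """

    def __init__(self, fps: float = 30.0):
        self.fps = fps
        self.dt = 1.0 / fps
        self.history: List[PoseFrame] = []
        self.max_history = 5
        # JointData has no acceleration field, so accelerations are kept here.
        self.accelerations: Dict[str, np.ndarray] = {}

    def update(self, frame: PoseFrame) -> PoseFrame:
        """Add frame and compute derivatives."""
        self.history.append(frame)
        if len(self.history) > self.max_history:
            self.history.pop(0)
        # Compute velocities
        if len(self.history) >= 2:
            self._compute_velocities(frame)
        # Compute accelerations
        if len(self.history) >= 3:
            self._compute_accelerations(frame)
        return frame

    def _compute_velocities(self, frame: PoseFrame):
        """Compute joint velocities."""
        prev = self.history[-2]
        for joint_name, joint in frame.joints.items():
            if joint_name in prev.joints:
                prev_joint = prev.joints[joint_name]
                # Linear velocity
                joint.velocity = (
                    joint.world_position - prev_joint.world_position
                ) / self.dt
                # Angular velocity (from quaternion difference)
                joint.angular_velocity = self._quat_to_angular_velocity(
                    prev_joint.world_rotation,
                    joint.world_rotation,
                    self.dt
                )

    def _compute_accelerations(self, frame: PoseFrame):
        """Compute joint accelerations from consecutive velocities."""
        prev = self.history[-2]
        for joint_name, joint in frame.joints.items():
            prev_joint = prev.joints.get(joint_name)
            if prev_joint is None or joint.velocity is None or prev_joint.velocity is None:
                continue
            self.accelerations[joint_name] = (
                joint.velocity - prev_joint.velocity
            ) / self.dt

    @staticmethod
    def _quat_mul(a: np.ndarray, b: np.ndarray) -> np.ndarray:
        """Hamilton product of two [w, x, y, z] quaternions."""
        aw, ax, ay, az = a
        bw, bx, by, bz = b
        return np.array([
            aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw,
        ])

    def _quat_to_angular_velocity(
        self,
        q0: np.ndarray,
        q1: np.ndarray,
        dt: float
    ) -> np.ndarray:
        """Convert quaternion change to angular velocity."""
        # q_diff = q1 * q0^-1 (the conjugate equals the inverse for unit quaternions)
        q0_inv = np.array([q0[0], -q0[1], -q0[2], -q0[3]])
        q_diff = self._quat_mul(q1, q0_inv)
        # q and -q encode the same rotation; pick the shorter arc
        if q_diff[0] < 0:
            q_diff = -q_diff
        # Extract axis-angle
        angle = 2 * np.arccos(np.clip(q_diff[0], -1, 1))
        if angle < 1e-6:
            return np.zeros(3)
        axis = q_diff[1:] / np.sin(angle / 2)
        return axis * angle / dt

    def get_energy(self, frame: PoseFrame) -> float:
        """Compute total kinetic energy (unit mass and unit inertia per joint)."""
        energy = 0.0
        for joint in frame.joints.values():
            if joint.velocity is not None:
                # Translational energy: 1/2 * m * v^2
                energy += 0.5 * np.sum(joint.velocity ** 2)
            if joint.angular_velocity is not None:
                # Rotational energy (simplified)
                energy += 0.5 * np.sum(joint.angular_velocity ** 2)
        return energy
---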
6. API Reference
6.1 MediaPipe Exports
// apps/web/cc-dashboard/src/lib/mediapipe/index.ts
// Types
export * from './types'
// Legacy API
export {
MediaPipeService,
getMediaPipeService,
destroyMediaPipeService,
} from './MediaPipeService'
// Legacy Hooks
export {
useMediaPipe,
useMediaPipeStore,
useFaceExpressions,
useHandGestures,
useBodyPose,
useMediaPipeEnhancements,
} from './useMediaPipe'
// Modular API - Core
export { EventEmitter, CameraService } from './core'
export type { CameraConfig, CameraState, CameraEvents } from './core'
// Modular API - Analyzers
export {
BaseAnalyzer,
FaceAnalyzer,
HandAnalyzer,
PoseAnalyzer,
} from './analyzers'
export type {
AnalyzerConfig,
AnalyzerOutput,
FaceAnalyzerConfig,
FaceOutput,
HandAnalyzerConfig,
HandOutput,
HandGesture,
FingerState,
PoseAnalyzerConfig,
PoseOutput,
JointState,
LimbState,
} from './analyzers'
// Modular API - Pipeline
export { HolisticPipeline } from './pipeline'
export type {
PipelineConfig,
PipelineState,
PipelineOutput,
PipelineEvents,
} from './pipeline'
6.2 Mocopi Exports
// apps/web/cc-dashboard/src/lib/mocopi/index.ts
export * from './types'
export { useMocopiStream } from './useMocopiStream'
export { useConductorStream } from './useConductorStream'
export { MocopiLimbFusion } from './MocopiLimbFusion'
export { mocopiToConductor } from './conductorAdapter'
---
7. Performance Optimization
7.1 Target Metrics
| Metric | Target | Measurement |
|---|---|---|
| MediaPipe FPS | 30 fps | requestAnimationFrame |
| Mocopi Latency | < 50ms | timestamp delta |
| Fusion Latency | < 10ms | processing time |
| Memory Usage | < 200MB | heap size |
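These targets are easier to track with a small meter that turns frame timestamps into a rate. A minimal sketch, assuming timestamps in milliseconds (for example from `performance.now()`); `RollingRate` and `withinTargets` are hypothetical helpers, and the thresholds are copied from the table above.

```typescript
// Counts events inside a sliding time window and reports events per second.
class RollingRate {
  private stamps: number[] = [];
  private readonly windowMs: number;

  constructor(windowMs = 1000) {
    this.windowMs = windowMs;
  }

  // Record one frame at nowMs and return the current rate.
  tick(nowMs: number): number {
    this.stamps.push(nowMs);
    const cutoff = nowMs - this.windowMs;
    let oldest = this.stamps[0];
    // Drop timestamps that have fallen out of the window.
    while (oldest !== undefined && oldest <= cutoff) {
      this.stamps.shift();
      oldest = this.stamps[0];
    }
    return (this.stamps.length * 1000) / this.windowMs;
  }
}

interface PipelineMetrics {
  mediapipeFps: number;
  mocopiLatencyMs: number;
  fusionLatencyMs: number;
}

// Thresholds taken from the target metrics table.
function withinTargets(m: PipelineMetrics): boolean {
  return m.mediapipeFps >= 30 && m.mocopiLatencyMs < 50 && m.fusionLatencyMs < 10;
}
```

Calling `tick` once per processed frame yields the FPS figure, and the two latency fields can be filled from timestamp deltas and measured processing time.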
7.2 Optimization Strategies
MediaPipe
// Configure Holistic for lower per-frame cost (these options trade detail for speed)
const holistic = new Holistic({
locateFile: (file) =>
`https://cdn.jsdelivr.net/npm/@mediapipe/holistic/${file}`,
});
holistic.setOptions({
modelComplexity: 1, // 0, 1, or 2
smoothLandmarks: true,
enableSegmentation: false, // Disable if not needed
smoothSegmentation: false,
refineFaceLandmarks: false, // Disable for performance
minDetectionConfidence: 0.5,
minTrackingConfidence: 0.5,
});
WebSocket
// Use binary protocol for Mocopi
const ws = new WebSocket(url);
ws.binaryType = 'arraybuffer';
ws.onmessage = (event) => {
// Parse binary directly instead of JSON
const view = new DataView(event.data);
const timestamp = view.getFloat64(0); // big-endian by default; pass true as the second argument for little-endian
// ... parse rest of frame
};
React Optimization
// Memoize expensive computations
const processedPose = useMemo(() => {
if (!pose) return null;
return computeExpensiveMetrics(pose);
}, [pose?.timestamp]);
// Use refs for high-frequency updates
const frameRef = useRef<PoseOutput | null>(null);
useEffect(() => {
const unsubscribe = pipeline.on('pose', (pose) => {
frameRef.current = pose; // Don't trigger re-render
});
return unsubscribe;
}, []);
---
Document Version: 2.0.0
Generated: December 26, 2024
Source Anchor
projects/Documentation/01-architecture/systems/MOTION_CAPTURE_PIPELINE.md