Motion Capture Pipeline

Comprehensive Technical Documentation

Version: 2.0.0
Last Updated: December 26, 2024

---

Table of Contents

1. [Overview](#overview)
2. [MediaPipe Integration](#mediapipe-integration)
3. [Mocopi Integration](#mocopi-integration)
4. [Sensor Fusion](#sensor-fusion)
5. [Skeleton System](#skeleton-system)
6. [API Reference](#api-reference)
7. [Performance Optimization](#performance-optimization)

---

1. Overview

The Motion Capture Pipeline provides real-time human motion tracking through multiple input sources, fusing them into a unified skeleton representation suitable for audio synthesis and motion generation.

Architecture Diagram

┌─────────────────────────────────────────────────────────────────────────────┐
│                     MOTION CAPTURE PIPELINE                                  │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌───────────────────────────────────────────────────────────────────────┐ │
│  │                        INPUT LAYER                                     │ │
│  │                                                                        │ │
│  │  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────────┐   │ │
│  │  │    WEBCAM       │  │    MOCOPI       │  │     iOS DEVICE      │   │ │
│  │  │   (Camera)      │  │   (6 IMUs)      │  │   (CoreMotion)      │   │ │
│  │  │   720p/30fps    │  │   60Hz/sensor   │  │   100Hz             │   │ │
│  │  └────────┬────────┘  └────────┬────────┘  └──────────┬──────────┘   │ │
│  │           │                    │                      │               │ │
│  └───────────┼────────────────────┼──────────────────────┼───────────────┘ │
│              │                    │                      │                  │
│              ▼                    ▼                      ▼                  │
│  ┌───────────────────────────────────────────────────────────────────────┐ │
│  │                        CAPTURE LAYER                                   │ │
│  │                                                                        │ │
│  │  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────────┐   │ │
│  │  │   MediaPipe     │  │  Mocopi Parser  │  │  Motion Manager     │   │ │
│  │  │   Holistic      │  │                 │  │                     │   │ │
│  │  │                 │  │  BVH Protocol   │  │  CMMotionManager    │   │ │
│  │  │  543 landmarks  │  │  Quaternions    │  │  Device Motion      │   │ │
│  │  └────────┬────────┘  └────────┬────────┘  └──────────┬──────────┘   │ │
│  │           │                    │                      │               │ │
│  │           │ PoseOutput         │ MocopiFrame          │ iOSMotion     │ │
│  │           │ FaceOutput         │                      │               │ │
│  │           │ HandOutput         │                      │               │ │
│  └───────────┼────────────────────┼──────────────────────┼───────────────┘ │
│              │                    │                      │                  │
│              └────────────────────┼──────────────────────┘                  │
│                                   │                                         │
│                                   ▼                                         │
│  ┌───────────────────────────────────────────────────────────────────────┐ │
│  │                        FUSION LAYER                                    │ │
│  │                                                                        │ │
│  │  ┌─────────────────────────────────────────────────────────────────┐ │ │
│  │  │                    Kalman Filter Fusion                          │ │ │
│  │  │                                                                  │ │ │
│  │  │   Position (MediaPipe)  ─────┐                                  │ │ │
│  │  │   High visual accuracy       │                                  │ │ │
│  │  │   Occlusion issues           ├──▶  FusedSkeletonState           │ │ │
│  │  │                              │                                  │ │ │
│  │  │   Orientation (Mocopi)  ─────┤                                  │ │ │
│  │  │   High rotational accuracy   │                                  │ │ │
│  │  │   No absolute position       │                                  │ │ │
│  │  │                              │                                  │ │ │
│  │  │   Acceleration (iOS)  ───────┘                                  │ │ │
│  │  │   Supplementary data                                            │ │ │
│  │  │                                                                  │ │ │
│  │  └──────────────────────────────┬──────────────────────────────────┘ │ │
│  └─────────────────────────────────┼─────────────────────────────────────┘ │
│                                    │                                        │
│                                    ▼                                        │
│  ┌───────────────────────────────────────────────────────────────────────┐ │
│  │                        OUTPUT LAYER                                    │ │
│  │                                                                        │ │
│  │  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────────┐   │ │
│  │  │  Skeleton       │  │  Derived        │  │  Motion             │   │ │
│  │  │  State          │  │  Kinematics     │  │  Features           │   │ │
│  │  │                 │  │                 │  │                     │   │ │
│  │  │  - Positions    │  │  - Velocities   │  │  - Energy           │   │ │
│  │  │  - Rotations    │  │  - Accelerations│  │  - Gestures         │   │ │
│  │  │  - Confidence   │  │  - Jerk         │  │  - Expressions      │   │ │
│  │  └─────────────────┘  └─────────────────┘  └─────────────────────┘   │ │
│  │                                                                        │ │
│  └───────────────────────────────────────────────────────────────────────┘ │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

---

2. MediaPipe Integration

2.1 Architecture

The MediaPipe integration follows a modular, plugin-based architecture with three layers:

Core Layer

typescript
// Location: apps/web/cc-dashboard/src/lib/mediapipe/core/

// EventEmitter.ts - Type-safe pub/sub
export class EventEmitter<Events extends EventMap = EventMap> {
  private listeners: Map<keyof Events, Set<EventCallback<Events[keyof Events]>>>

  on<K extends keyof Events>(event: K, callback: EventCallback<Events[K]>): () => void
  off<K extends keyof Events>(event: K, callback: EventCallback<Events[K]>): void
  emit<K extends keyof Events>(event: K, data: Events[K]): void
  once<K extends keyof Events>(event: K, callback: EventCallback<Events[K]>): void
  removeAllListeners(): void
}

// CameraService.ts - Webcam lifecycle management
export class CameraService extends EventEmitter<CameraEvents> {
  private videoElement: HTMLVideoElement | null
  private stream: MediaStream | null
  private state: CameraState

  async initialize(videoElement: HTMLVideoElement): Promise<boolean>
  start(): void
  stop(): void
  destroy(): void

  getVideoElement(): HTMLVideoElement | null
  getState(): CameraState
  isRunning(): boolean
}
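
The outline above lists only signatures. As a rough illustration of how such a type-safe emitter can work, here is a minimal sketch. The names `TinyEmitter`, `Listener`, and `DemoEvents` are hypothetical, and the method bodies are assumptions rather than the project's actual code.

```typescript
// Minimal sketch of a type-safe pub/sub emitter (assumed implementation).
type Listener<T> = (data: T) => void

class TinyEmitter<Events extends Record<string, unknown>> {
  private listeners = new Map<keyof Events, Set<Listener<any>>>()

  // Register a callback and return an unsubscribe function.
  on<K extends keyof Events>(event: K, callback: Listener<Events[K]>): () => void {
    let set = this.listeners.get(event)
    if (!set) {
      set = new Set<Listener<any>>()
      this.listeners.set(event, set)
    }
    set.add(callback)
    return () => this.off(event, callback)
  }

  off<K extends keyof Events>(event: K, callback: Listener<Events[K]>): void {
    const set = this.listeners.get(event)
    if (set) set.delete(callback)
  }

  emit<K extends keyof Events>(event: K, data: Events[K]): void {
    const set = this.listeners.get(event)
    if (!set) return
    // Copy first so callbacks may unsubscribe while we iterate.
    Array.from(set).forEach((cb) => cb(data))
  }

  // Fire the callback for the next emission only.
  once<K extends keyof Events>(event: K, callback: Listener<Events[K]>): void {
    const unsubscribe = this.on(event, (data) => {
      unsubscribe()
      callback(data)
    })
  }

  removeAllListeners(): void {
    this.listeners.clear()
  }
}

// Hypothetical event map used only for illustration.
type DemoEvents = { started: { at: number }; stopped: { reason: string } }
```

Returning the unsubscribe function from `on` keeps cleanup local to the subscriber, which is convenient in React effects.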

Analyzer Layer

typescript
// Location: apps/web/cc-dashboard/src/lib/mediapipe/analyzers/

// BaseAnalyzer.ts - Abstract base class
export abstract class BaseAnalyzer<
  TInput,
  TOutput extends AnalyzerOutput,
  TEvents extends EventMap = EventMap  // must satisfy EventEmitter's constraint
> extends EventEmitter<TEvents> {

  protected config: AnalyzerConfig
  protected lastOutput: TOutput | null
  protected frameCount: number

  constructor(config: Partial<AnalyzerConfig>)

  process(input: TInput, timestamp: number): TOutput | null
  protected abstract analyze(input: TInput, timestamp: number): TOutput | null
  protected applySmoothing(current: TOutput): TOutput
  protected checkThreshold(value: number, threshold: number): boolean
}

// FaceAnalyzer.ts - Face expression extraction
export interface FaceOutput extends AnalyzerOutput {
  // Mouth
  mouthOpen: number          // 0-1, how open the mouth is
  smileIntensity: number     // 0-1, smile strength

  // Eyes
  eyeOpenLeft: number        // 0-1
  eyeOpenRight: number       // 0-1
  eyebrowRaiseLeft: number   // 0-1
  eyebrowRaiseRight: number  // 0-1

  // Head pose
  headYaw: number            // -1 to 1 (left/right)
  headPitch: number          // -1 to 1 (up/down)
  headRoll: number           // -1 to 1 (tilt)

  // Events
  isBlinking: boolean
  blinkCount: number
}

// HandAnalyzer.ts - Gesture recognition
export interface HandOutput extends AnalyzerOutput {
  left: SingleHandOutput | null
  right: SingleHandOutput | null
}

export interface SingleHandOutput {
  openness: number           // 0-1, how open the hand is
  grip: number               // 0-1, grip strength
  gesture: HandGesture       // Detected gesture

  // Individual fingers
  thumb: FingerState
  index: FingerState
  middle: FingerState
  ring: FingerState
  pinky: FingerState

  // Palm
  palmPosition: [number, number, number]
  palmNormal: [number, number, number]

  // Pinch detection
  pinchDistance: number
  isPinching: boolean
}

export type HandGesture =
  | 'open'      // All fingers extended
  | 'fist'      // All fingers closed
  | 'point'     // Index extended
  | 'peace'     // Index + middle extended
  | 'thumbsUp'  // Thumb up, others closed
  | 'pinch'     // Thumb + index touching
  | 'unknown'

// PoseAnalyzer.ts - Body pose analysis
export interface PoseOutput extends AnalyzerOutput {
  // Torso
  bodyLean: number           // -1 to 1 (left/right lean)
  bodyTwist: number          // -1 to 1 (rotation)
  crouchLevel: number        // 0-1 (standing to crouching)

  // Arms
  armSpread: number          // 0-1 (arms at sides to spread)
  armRaiseLeft: number       // 0-1 (arm height)
  armRaiseRight: number      // 0-1
  elbowBendLeft: number      // 0-1
  elbowBendRight: number     // 0-1

  // Legs
  legSpread: number          // 0-1
  kneeFlexLeft: number       // 0-1
  kneeFlexRight: number      // 0-1

  // Energy metrics
  bodyEnergy: number         // 0-1, overall movement energy
  upperBodyEnergy: number    // 0-1
  lowerBodyEnergy: number    // 0-1

  // Joint states
  joints: Map<string, JointState>
  limbs: Map<string, LimbState>
}
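
`BaseAnalyzer` declares `applySmoothing`, but the body is not shown. One common choice for this kind of per-frame smoothing is an exponential moving average. The sketch below assumes that approach and a flat record of numeric fields; `smoothValue` and `smoothRecord` are hypothetical helper names.

```typescript
// Exponential smoothing of numeric analyzer fields (assumed behaviour,
// not the project's actual applySmoothing implementation).
function smoothValue(previous: number, current: number, smoothing: number): number {
  // smoothing = 0 passes the new value through; values near 1 change slowly.
  const s = Math.min(Math.max(smoothing, 0), 1)
  return previous * s + current * (1 - s)
}

function smoothRecord(
  previous: Record<string, number> | null,
  current: Record<string, number>,
  smoothing: number
): Record<string, number> {
  // First frame: nothing to blend with yet.
  if (!previous) return { ...current }
  const result: Record<string, number> = {}
  Object.keys(current).forEach((key) => {
    const prev = previous[key]
    result[key] = prev === undefined
      ? current[key]
      : smoothValue(prev, current[key], smoothing)
  })
  return result
}
```

With `smoothing: 0.5`, a field that jumps from 0 to 1 reports 0.5 on the first frame after the jump and approaches 1 over the following frames.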

Pipeline Layer

typescript
// Location: apps/web/cc-dashboard/src/lib/mediapipe/pipeline/

// HolisticPipeline.ts - Main orchestrator
export interface PipelineConfig {
  face: FaceAnalyzerConfig
  hands: HandAnalyzerConfig
  pose: PoseAnalyzerConfig

  // Global settings
  smoothing: number          // 0-1, global smoothing factor
  minDetectionConfidence: number
  minTrackingConfidence: number

  // Performance
  maxFPS: number
  enableGPU: boolean
}

export class HolisticPipeline extends EventEmitter<PipelineEvents> {
  private holistic: Holistic | null
  private camera: CameraService
  private faceAnalyzer: FaceAnalyzer
  private handAnalyzer: HandAnalyzer
  private poseAnalyzer: PoseAnalyzer

  private state: PipelineState
  private lastOutput: PipelineOutput | null

  constructor(config: Partial<PipelineConfig>)

  async initialize(
    videoElement: HTMLVideoElement,
    canvasElement?: HTMLCanvasElement
  ): Promise<boolean>

  start(): void
  stop(): void
  destroy(): void

  // Event subscriptions
  on(event: 'frame', callback: (output: PipelineOutput) => void): () => void
  on(event: 'face', callback: (face: FaceOutput) => void): () => void
  on(event: 'hands', callback: (hands: HandOutput) => void): () => void
  on(event: 'pose', callback: (pose: PoseOutput) => void): () => void
  on(event: 'error', callback: (error: Error) => void): () => void

  getState(): PipelineState
  getLastOutput(): PipelineOutput | null
}

2.2 Landmark Reference

Face Landmarks (468 points)

Key landmarks for expression detection:

Mouth:
  - 13: Upper lip center
  - 14: Lower lip center
  - 78: Left mouth corner
  - 308: Right mouth corner

Eyes:
  - 159: Left eye upper
  - 145: Left eye lower
  - 386: Right eye upper
  - 374: Right eye lower

Eyebrows:
  - 107: Left eyebrow inner
  - 70: Left eyebrow outer
  - 336: Right eyebrow inner
  - 300: Right eyebrow outer

Nose:
  - 1: Nose tip
  - 4: Nose bridge

Forehead:
  - 10: Forehead center
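
As an illustration of how these indices can feed an expression value such as `mouthOpen`, the sketch below divides the lip gap (landmarks 13 and 14) by the mouth width (78 and 308). The normalisation constant `maxRatio` and the function names are assumptions, not values taken from the project.

```typescript
// Mouth openness from face landmarks (illustrative sketch).
type FacePoint = { x: number; y: number; z: number }

const MOUTH = { upperLip: 13, lowerLip: 14, leftCorner: 78, rightCorner: 308 }

function pointDistance(a: FacePoint, b: FacePoint): number {
  const dx = a.x - b.x
  const dy = a.y - b.y
  const dz = a.z - b.z
  return Math.sqrt(dx * dx + dy * dy + dz * dz)
}

// Ratio of lip gap to mouth width, scaled and clamped to 0-1.
// maxRatio is a hypothetical tuning constant.
function mouthOpenness(landmarks: FacePoint[], maxRatio: number = 0.6): number {
  const gap = pointDistance(landmarks[MOUTH.upperLip], landmarks[MOUTH.lowerLip])
  const width = pointDistance(landmarks[MOUTH.leftCorner], landmarks[MOUTH.rightCorner])
  if (width === 0) return 0
  return Math.min(gap / width / maxRatio, 1)
}
```

Dividing by the mouth width makes the value independent of how far the face is from the camera.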

Pose Landmarks (33 points)

POSE_LANDMARKS = {
  0: 'nose',
  1: 'left_eye_inner',
  2: 'left_eye',
  3: 'left_eye_outer',
  4: 'right_eye_inner',
  5: 'right_eye',
  6: 'right_eye_outer',
  7: 'left_ear',
  8: 'right_ear',
  9: 'mouth_left',
  10: 'mouth_right',
  11: 'left_shoulder',
  12: 'right_shoulder',
  13: 'left_elbow',
  14: 'right_elbow',
  15: 'left_wrist',
  16: 'right_wrist',
  17: 'left_pinky',
  18: 'right_pinky',
  19: 'left_index',
  20: 'right_index',
  21: 'left_thumb',
  22: 'right_thumb',
  23: 'left_hip',
  24: 'right_hip',
  25: 'left_knee',
  26: 'right_knee',
  27: 'left_ankle',
  28: 'right_ankle',
  29: 'left_heel',
  30: 'right_heel',
  31: 'left_foot_index',
  32: 'right_foot_index'
}
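
A value such as `elbowBendLeft` can be derived from three of these landmarks. The sketch below measures the angle at the elbow between the shoulder and wrist segments and maps it to 0-1. The mapping (180° gives 0, 0° gives 1) and the function names are assumptions for illustration.

```typescript
// Elbow bend from pose landmarks (illustrative sketch).
type PosePoint = { x: number; y: number; z: number }

// Angle at vertex b, in degrees, between segments b->a and b->c.
function jointAngle(a: PosePoint, b: PosePoint, c: PosePoint): number {
  const u = [a.x - b.x, a.y - b.y, a.z - b.z]
  const v = [c.x - b.x, c.y - b.y, c.z - b.z]
  const dot = u[0] * v[0] + u[1] * v[1] + u[2] * v[2]
  const lenU = Math.sqrt(u[0] * u[0] + u[1] * u[1] + u[2] * u[2])
  const lenV = Math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2])
  if (lenU === 0 || lenV === 0) return 0
  // Clamp to guard against rounding slightly outside [-1, 1].
  const cos = Math.min(Math.max(dot / (lenU * lenV), -1), 1)
  return (Math.acos(cos) * 180) / Math.PI
}

// Straight arm (180 degrees) -> 0, fully folded (0 degrees) -> 1.
function elbowBend(pose: PosePoint[], side: 'left' | 'right'): number {
  const indices = side === 'left' ? [11, 13, 15] : [12, 14, 16]
  const angle = jointAngle(pose[indices[0]], pose[indices[1]], pose[indices[2]])
  return 1 - angle / 180
}
```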

Hand Landmarks (21 points per hand)

HAND_LANDMARKS = {
  0: 'wrist',
  1: 'thumb_cmc',
  2: 'thumb_mcp',
  3: 'thumb_ip',
  4: 'thumb_tip',
  5: 'index_mcp',
  6: 'index_pip',
  7: 'index_dip',
  8: 'index_tip',
  9: 'middle_mcp',
  10: 'middle_pip',
  11: 'middle_dip',
  12: 'middle_tip',
  13: 'ring_mcp',
  14: 'ring_pip',
  15: 'ring_dip',
  16: 'ring_tip',
  17: 'pinky_mcp',
  18: 'pinky_pip',
  19: 'pinky_dip',
  20: 'pinky_tip'
}
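
The `pinchDistance` and `isPinching` fields can be computed from the thumb tip (4) and index tip (8). The sketch below normalises the distance by the palm length (wrist 0 to middle MCP 9); that normalisation and the default `threshold` are assumptions chosen for illustration.

```typescript
// Pinch detection from hand landmarks (illustrative sketch).
type HandPoint = { x: number; y: number; z: number }

const WRIST = 0
const THUMB_TIP = 4
const INDEX_TIP = 8
const MIDDLE_MCP = 9

function handDistance(a: HandPoint, b: HandPoint): number {
  const dx = a.x - b.x
  const dy = a.y - b.y
  const dz = a.z - b.z
  return Math.sqrt(dx * dx + dy * dy + dz * dz)
}

// Thumb-to-index distance divided by palm length, so the value does not
// depend on how far the hand is from the camera.
function pinchDistance(hand: HandPoint[]): number {
  const palm = handDistance(hand[WRIST], hand[MIDDLE_MCP])
  if (palm === 0) return Infinity
  return handDistance(hand[THUMB_TIP], hand[INDEX_TIP]) / palm
}

// threshold is a hypothetical value, not one taken from the project.
function isPinching(hand: HandPoint[], threshold: number = 0.25): boolean {
  return pinchDistance(hand) < threshold
}
```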

2.3 Usage Examples

Basic Usage

typescript
import { HolisticPipeline } from '@/lib/mediapipe'

// Initialize
const pipeline = new HolisticPipeline({
  face: { smoothing: 0.3 },
  hands: { detectGestures: true },
  pose: { smoothing: 0.5 }
})

// Get video element
const video = document.getElementById('video') as HTMLVideoElement

// Initialize and start
await pipeline.initialize(video)
pipeline.start()

// Subscribe to events
pipeline.on('face', (face) => {
  if (face.smileIntensity > 0.5) {
    console.log('User is smiling!')
  }
})

pipeline.on('hands', (hands) => {
  if (hands.left?.gesture === 'fist') {
    console.log('Left fist detected')
  }
})

pipeline.on('pose', (pose) => {
  console.log(`Body energy: ${pose.bodyEnergy}`)
})

// Cleanup
pipeline.destroy()

React Hook Usage

typescript
import { useMediaPipe } from '@/lib/mediapipe'

function MotionCapture() {
  const videoRef = useRef<HTMLVideoElement>(null)

  const {
    start,
    stop,
    isRunning,
    face,
    hands,
    pose,
    error
  } = useMediaPipe({
    videoRef,
    autoStart: true,
    config: {
      face: { detectBlinks: true },
      hands: { detectGestures: true }
    }
  })

  return (
    <div>
      <video ref={videoRef} />
      {face && (
        <div>Smile: {(face.smileIntensity * 100).toFixed(0)}%</div>
      )}
      {hands?.left && (
        <div>Left hand: {hands.left.gesture}</div>
      )}
      {pose && (
        <div>Energy: {(pose.bodyEnergy * 100).toFixed(0)}%</div>
      )}
    </div>
  )
}

---

3. Mocopi Integration

3.1 Overview

Sony Mocopi provides 6 IMU sensors for full-body motion tracking:

| Sensor | Placement | Primary Function |
|--------|-----------|------------------|
| HEAD | Forehead band | Head orientation |
| BODY | Chest/hip clip | Torso orientation |
| L_WRIST | Left wrist band | Left arm tracking |
| R_WRIST | Right wrist band | Right arm tracking |
| L_ANKLE | Left ankle band | Left leg tracking |
| R_ANKLE | Right ankle band | Right leg tracking |

3.2 Data Flow

┌─────────────────────────────────────────────────────────────────────────────┐
│                         MOCOPI DATA FLOW                                     │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌─────────────────┐                                                        │
│  │   Mocopi        │                                                        │
│  │   Sensors       │                                                        │
│  │   (6 IMUs)      │                                                        │
│  └────────┬────────┘                                                        │
│           │ Bluetooth                                                        │
│           ▼                                                                  │
│  ┌─────────────────┐                                                        │
│  │   Mocopi App    │                                                        │
│  │   (iOS/Android) │                                                        │
│  │                 │                                                        │
│  │   - Calibration │                                                        │
│  │   - Skeleton    │                                                        │
│  │     solving     │                                                        │
│  └────────┬────────┘                                                        │
│           │ UDP (Port 12351)                                                │
│           │ BVH Protocol                                                    │
│           ▼                                                                  │
│  ┌─────────────────┐                                                        │
│  │   BVH Sender    │                                                        │
│  │   (macOS App)   │                                                        │
│  │                 │                                                        │
│  │   - UDP receive │                                                        │
│  │   - WebSocket   │                                                        │
│  │     broadcast   │                                                        │
│  └────────┬────────┘                                                        │
│           │ WebSocket                                                        │
│           │ ws://localhost:8765                                             │
│           ▼                                                                  │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │   LOCAL PROCESSING                                                   │   │
│  │                                                                      │   │
│  │  ┌───────────────────┐    ┌───────────────────┐                    │   │
│  │  │   MocopiParser    │───▶│   SkeletonState   │                    │   │
│  │  │                   │    │                   │                    │   │
│  │  │   - Parse BVH     │    │   - Joint pos     │                    │   │
│  │  │   - Quaternions   │    │   - Rotations     │                    │   │
│  │  │   - Timestamps    │    │   - Velocities    │                    │   │
│  │  └───────────────────┘    └───────────────────┘                    │   │
│  │                                                                      │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
│                             OR                                               │
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │   CLOUD RELAY (for remote access)                                    │   │
│  │                                                                      │   │
│  │   BVH Sender ──▶ VPS Relay ──▶ Browser                              │   │
│  │                                                                      │   │
│  │   wss://your-vps.com:443/relay                                      │   │
│  │                                                                      │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

3.3 Type Definitions

typescript
// Location: apps/web/cc-dashboard/src/lib/mocopi/types.ts

export interface MocopiSensorId {
  HEAD: 'head'
  BODY: 'body'
  L_WRIST: 'l_wrist'
  R_WRIST: 'r_wrist'
  L_ANKLE: 'l_ankle'
  R_ANKLE: 'r_ankle'
}

export interface MocopiFrame {
  timestamp: number
  frameId: number

  sensors: {
    [K in keyof MocopiSensorId]: SensorData
  }

  skeleton?: SkeletonData
}

export interface SensorData {
  quaternion: Quaternion      // [w, x, y, z]
  acceleration: Vector3       // [x, y, z] in m/s²
  angularVelocity?: Vector3   // [x, y, z] in rad/s

  batteryLevel?: number       // 0-100
  signalStrength?: number     // 0-100
}

export interface SkeletonData {
  joints: SkeletonJoint[]
  rootPosition: Vector3
  rootRotation: Quaternion
}

export interface SkeletonJoint {
  name: string
  position: Vector3           // Local position
  rotation: Quaternion        // Local rotation
  worldPosition?: Vector3     // Computed world position
  worldRotation?: Quaternion  // Computed world rotation
}

// BVH Joint hierarchy
export const MOCOPI_SKELETON_HIERARCHY = {
  root: {
    name: 'Hips',
    children: ['Spine', 'LeftUpLeg', 'RightUpLeg']
  },
  Spine: {
    children: ['Spine1']
  },
  Spine1: {
    children: ['Spine2']
  },
  Spine2: {
    children: ['Neck', 'LeftShoulder', 'RightShoulder']
  },
  Neck: {
    children: ['Head']
  },
  LeftShoulder: {
    children: ['LeftArm']
  },
  LeftArm: {
    children: ['LeftForeArm']
  },
  LeftForeArm: {
    children: ['LeftHand']
  },
  RightShoulder: {
    children: ['RightArm']
  },
  RightArm: {
    children: ['RightForeArm']
  },
  RightForeArm: {
    children: ['RightHand']
  },
  LeftUpLeg: {
    children: ['LeftLeg']
  },
  LeftLeg: {
    children: ['LeftFoot']
  },
  LeftFoot: {
    children: ['LeftToeBase']
  },
  RightUpLeg: {
    children: ['RightLeg']
  },
  RightLeg: {
    children: ['RightFoot']
  },
  RightFoot: {
    children: ['RightToeBase']
  }
}
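
The hierarchy is stored parent-to-children. Many skeleton operations need the reverse lookup (child to parent), which can be derived with a single traversal. The sketch below assumes the shape shown above, where the root entry carries a `name` and leaf joints have no entry of their own; `buildParentMap` is a hypothetical helper.

```typescript
// Derive a child -> parent lookup from a parent -> children hierarchy.
type JointHierarchy = Record<string, { name?: string; children: string[] }>

function buildParentMap(hierarchy: JointHierarchy): Record<string, string | null> {
  const rootNode = hierarchy['root']
  const rootName = rootNode.name ?? 'root'
  const parents: Record<string, string | null> = {}
  parents[rootName] = null

  // Each stack entry pairs the lookup key with the joint name it represents.
  const stack: Array<{ key: string; joint: string }> = [{ key: 'root', joint: rootName }]
  while (stack.length > 0) {
    const entry = stack.pop()!
    const node = hierarchy[entry.key]
    if (!node) continue // leaf joint: listed as a child but has no entry
    node.children.forEach((child) => {
      parents[child] = entry.joint
      stack.push({ key: child, joint: child })
    })
  }
  return parents
}
```

Applied to `MOCOPI_SKELETON_HIERARCHY`, this would map `Spine` to `Hips`, `Neck` to `Spine2`, and so on, with `Hips` mapped to `null`.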

3.4 React Hook

typescript
// Location: apps/web/cc-dashboard/src/lib/mocopi/useMocopiStream.ts

export interface UseMocopiStreamOptions {
  url?: string                // WebSocket URL
  autoConnect?: boolean       // Connect on mount
  reconnect?: boolean         // Auto-reconnect on disconnect
  reconnectInterval?: number  // ms between reconnect attempts
}

export interface UseMocopiStreamReturn {
  // Connection
  connect: () => void
  disconnect: () => void
  isConnected: boolean

  // Data
  frame: MocopiFrame | null
  skeleton: SkeletonData | null
  sensors: Record<string, SensorData>

  // Metrics
  fps: number
  latency: number

  // Errors
  error: Error | null
}

export function useMocopiStream(options?: UseMocopiStreamOptions): UseMocopiStreamReturn

3.5 Cloud Relay Setup

python
# Location: deploy/mocopi-relay/mocopi_relay_v4.py

"""
Mocopi Cloud Relay Server

Features:
- WebSocket server with SSL
- Session management
- Multi-client broadcasting
- Authentication
- Metrics logging
"""

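import asyncio
import json
import ssl
import time
from dataclasses import dataclass, field
from typing import Dict, Set

# WebSocketServerProtocol is the connection type of the legacy
# websockets.server API that `serve` belongs to.
from websockets.server import WebSocketServerProtocol, serve

@dataclass
class Session:
    id: str
    sender: WebSocketServerProtocol | None = None
    receivers: Set[WebSocketServerProtocol] = field(default_factory=set)
    created_at: float = field(default_factory=time.time)
    last_activity: float = field(default_factory=time.time)
    frame_count: int = 0

class MocopiRelay:
    def __init__(self, host: str, port: int, ssl_context: ssl.SSLContext | None = None):
        self.host = host
        self.port = port
        self.ssl_context = ssl_context
        self.sessions: Dict[str, Session] = {}

    def get_or_create_session(self, session_id: str) -> Session:
        """Return the session for this ID, creating it on first use.

        Not part of the original listing; a minimal assumed version.
        """
        session = self.sessions.get(session_id)
        if session is None:
            session = Session(id=session_id)
            self.sessions[session_id] = session
        return session
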
    async def handler(self, websocket, path):
        """Handle incoming WebSocket connections."""
        try:
            # First message should be registration
            message = await websocket.recv()
            data = json.loads(message)

            if data['type'] == 'register':
                session_id = data['sessionId']
                client_type = data['clientType']  # 'sender' or 'receiver'

                if client_type == 'sender':
                    await self.handle_sender(websocket, session_id)
                else:
                    await self.handle_receiver(websocket, session_id)
        except Exception as e:
            print(f"Error: {e}")

    async def handle_sender(self, websocket, session_id: str):
        """Handle sender (BVH Sender app)."""
        session = self.get_or_create_session(session_id)
        session.sender = websocket

        try:
            async for message in websocket:
                # Broadcast to all receivers
                if session.receivers:
                    await asyncio.gather(
                        *[r.send(message) for r in session.receivers],
                        return_exceptions=True
                    )
                session.frame_count += 1
                session.last_activity = time.time()
        finally:
            session.sender = None

    async def handle_receiver(self, websocket, session_id: str):
        """Handle receiver (browser)."""
        session = self.get_or_create_session(session_id)
        session.receivers.add(websocket)

        try:
            await websocket.wait_closed()
        finally:
            session.receivers.discard(websocket)

    async def start(self):
        """Start the relay server."""
        async with serve(
            self.handler,
            self.host,
            self.port,
            ssl=self.ssl_context
        ):
            print(f"Mocopi Relay running on {self.host}:{self.port}")
            await asyncio.Future()  # Run forever

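# Usage
if __name__ == "__main__":
    # Certificate paths are placeholders; point them at your own files.
    ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ssl_context.load_cert_chain("cert.pem", "key.pem")

    relay = MocopiRelay("[ip]", 443, ssl_context)
    asyncio.run(relay.start())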
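
The relay expects every client to send a registration message first, with `type`, `sessionId`, and `clientType` fields. A browser-side sketch of building and validating that message follows. The field names come from the handler above; the helper names and the `RegisterMessage` type are hypothetical.

```typescript
// Registration message helpers for the relay protocol (illustrative sketch).
type ClientType = 'sender' | 'receiver'

interface RegisterMessage {
  type: 'register'
  sessionId: string
  clientType: ClientType
}

// Build the first message the relay handler reads after a client connects.
function buildRegisterMessage(sessionId: string, clientType: ClientType): string {
  const message: RegisterMessage = { type: 'register', sessionId, clientType }
  return JSON.stringify(message)
}

// Validate a raw message; returns null if it is not a well-formed registration.
function parseRegisterMessage(raw: string): RegisterMessage | null {
  let data: unknown
  try {
    data = JSON.parse(raw)
  } catch (_err) {
    return null
  }
  if (typeof data !== 'object' || data === null) return null
  const record = data as Record<string, unknown>
  const sessionId = record['sessionId']
  const clientType = record['clientType']
  if (record['type'] !== 'register') return null
  if (typeof sessionId !== 'string') return null
  if (clientType !== 'sender' && clientType !== 'receiver') return null
  return { type: 'register', sessionId, clientType }
}
```

In a browser, the string from `buildRegisterMessage(sessionId, 'receiver')` would typically be sent from the WebSocket's `onopen` handler, before any frames are expected.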

---

4. Sensor Fusion

4.1 Fusion Strategy

The fusion system combines data from multiple sources with different characteristics:

| Source | Strengths | Weaknesses |
|--------|-----------|------------|
| MediaPipe | Position accuracy, facial detail | Occlusion, lighting dependent |
| Mocopi | Rotation accuracy, no occlusion | No absolute position, drift |
| iOS Motion | High frequency, acceleration | Single device only |

4.2 Fusion Algorithm

typescript
// Location: apps/web/cc-dashboard/src/lib/mocopi/MocopiLimbFusion.ts

export class MocopiLimbFusion {
  private kalmanFilters: Map<string, KalmanFilter>
  private lastMediaPipe: PoseOutput | null
  private lastMocopi: MocopiFrame | null

  constructor(config: FusionConfig) {
    // Initialize Kalman filters for each joint
    this.kalmanFilters = new Map()

    for (const joint of FUSION_JOINTS) {
      this.kalmanFilters.set(joint, new KalmanFilter({
        processNoise: config.processNoise,
        measurementNoise: config.measurementNoise,
        initialState: [0, 0, 0, 0, 0, 0]  // pos + vel
      }))
    }
  }

  fuse(mediapipe: PoseOutput | null, mocopi: MocopiFrame | null): FusedSkeletonState {
    const result: FusedSkeletonState = {
      joints: new Map(),
      timestamp: Date.now(),
      confidence: 0
    }

    // Weight sources based on availability and quality
    const mpWeight = mediapipe ? this.computeMediaPipeWeight(mediapipe) : 0
    const mocopiWeight = mocopi ? this.computeMocopiWeight(mocopi) : 0

    const totalWeight = mpWeight + mocopiWeight
    if (totalWeight === 0) return result

    // Fuse each joint
    for (const [jointName, kalman] of this.kalmanFilters) {
      // Get position from MediaPipe (if available)
      const mpPosition = mediapipe
        ? this.getMediaPipeJointPosition(mediapipe, jointName)
        : null

      // Get rotation from Mocopi (if available)
      const mocopiRotation = mocopi
        ? this.getMocopiJointRotation(mocopi, jointName)
        : null

      // Kalman update (skip zero-weight sources: the filter divides by the weight)
      if (mpPosition && mpWeight > 0) {
        kalman.updatePosition(mpPosition, mpWeight / totalWeight)
      }
      if (mocopiRotation && mocopiWeight > 0) {
        kalman.updateRotation(mocopiRotation, mocopiWeight / totalWeight)
      }

      // Get fused state
      const state = kalman.getState()

      result.joints.set(jointName, {
        position: state.position,
        rotation: state.rotation,
        velocity: state.velocity,
        confidence: (mpWeight + mocopiWeight) / 2
      })
    }

    result.confidence = totalWeight / 2
    return result
  }

  private computeMediaPipeWeight(mp: PoseOutput): number {
    // Weight based on visibility and confidence
    return mp.confidence * (mp.visibleLandmarks / 33)
  }

  private computeMocopiWeight(mocopi: MocopiFrame): number {
    // Weight based on sensor connectivity; signalStrength is optional,
    // so a sensor that does not report it counts as inactive
    const activeSensors = Object.values(mocopi.sensors)
      .filter(s => (s.signalStrength ?? 0) > 50).length
    return activeSensors / 6
  }
}
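
To make the weighting concrete, the sketch below reproduces the two weight formulas as standalone functions and normalises them the way `fuse` does before each Kalman update. The function names and the flat argument lists are assumptions for illustration; the formulas themselves follow the class above.

```typescript
// Source weighting used by the fusion step (standalone sketch).

// MediaPipe: confidence scaled by the fraction of the 33 pose landmarks visible.
function mediaPipeWeight(confidence: number, visibleLandmarks: number): number {
  return confidence * (visibleLandmarks / 33)
}

// Mocopi: fraction of the 6 sensors with signal strength above 50.
// Sensors that do not report a strength count as inactive.
function mocopiWeight(signalStrengths: Array<number | undefined>): number {
  const active = signalStrengths.filter((s) => (s ?? 0) > 50).length
  return active / 6
}

// Normalise so the two weights sum to 1 (or both are 0 if nothing is available).
function normaliseWeights(mp: number, mocopi: number): { mp: number; mocopi: number } {
  const total = mp + mocopi
  if (total === 0) return { mp: 0, mocopi: 0 }
  return { mp: mp / total, mocopi: mocopi / total }
}
```

For example, a fully visible pose with confidence 0.8 and four of six strong Mocopi sensors gives raw weights of 0.8 and about 0.67, so MediaPipe contributes a little more than half of the normalised weight.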

4.3 Kalman Filter Implementation

typescript
// Simplified Kalman filter for position/velocity fusion

class KalmanFilter {
  private state: Float64Array      // [x, y, z, vx, vy, vz]
  private covariance: Float64Array // 6x6 matrix
  private F: Float64Array          // State transition matrix
  private Q: Float64Array          // Process noise
  private H: Float64Array          // Measurement matrix
  private R: Float64Array          // Measurement noise

  constructor(config: KalmanConfig) {
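    // Start from the caller's initial state if provided, otherwise at rest
    this.state = Float64Array.from(config.initialState ?? [0, 0, 0, 0, 0, 0])
    this.covariance = this.identity(6)

    // State transition: x' = x + v*dt (dt = 1 here; predict() overwrites it)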
    this.F = new Float64Array([
      1, 0, 0, 1, 0, 0,  // x' = x + vx
      0, 1, 0, 0, 1, 0,  // y' = y + vy
      0, 0, 1, 0, 0, 1,  // z' = z + vz
      0, 0, 0, 1, 0, 0,  // vx' = vx
      0, 0, 0, 0, 1, 0,  // vy' = vy
      0, 0, 0, 0, 0, 1   // vz' = vz
    ])

    this.Q = this.diagonal(6, config.processNoise)
    this.R = this.diagonal(3, config.measurementNoise)

    // Measure position only
    this.H = new Float64Array([
      1, 0, 0, 0, 0, 0,
      0, 1, 0, 0, 0, 0,
      0, 0, 1, 0, 0, 0
    ])
  }

  predict(dt: number): void {
    // Update F with actual dt
    this.F[3] = this.F[10] = this.F[17] = dt

    // x' = F * x
    this.state = this.matVecMul(this.F, this.state)

    // P' = F * P * F' + Q
    this.covariance = this.matAdd(
      this.matMul(this.F, this.matMul(this.covariance, this.transpose(this.F))),
      this.Q
    )
  }

  update(measurement: [number, number, number], weight: number): void {
    // A non-positive weight would make R / weight infinite or NaN
    if (weight <= 0) return

    // Innovation: y = z - H * x
    const predicted = this.matVecMul(this.H, this.state)
    const innovation = measurement.map((z, i) => z - predicted[i])

    // Kalman gain: K = P * H' * (H * P * H' + R)^-1
    const PHt = this.matMul(this.covariance, this.transpose(this.H))
    const S = this.matAdd(
      this.matMul(this.H, PHt),
      this.R.map(r => r / weight)  // Scale noise by weight
    )
    const K = this.matMul(PHt, this.invert(S))

    // Update state: x = x + K * y
    const correction = this.matVecMul(K, innovation)
    this.state = this.state.map((s, i) => s + correction[i])

    // Update covariance: P = (I - K * H) * P
    const KH = this.matMul(K, this.H)
    const IKH = this.identity(6).map((v, i) => v - KH[i])
    this.covariance = this.matMul(IKH, this.covariance)
  }

  getState(): { position: Vector3, velocity: Vector3 } {
    return {
      position: [this.state[0], this.state[1], this.state[2]],
      velocity: [this.state[3], this.state[4], this.state[5]]
    }
  }
}
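
The class above relies on matrix helpers (`matMul`, `invert`, and so on) that are not shown. For a runnable view of the same predict/update cycle, the sketch below tracks a single axis with state [position, velocity], so the 2x2 algebra can be written out by hand. `AxisKalman` is a hypothetical name, and using `q * I` as process noise is an assumption that matches the diagonal `Q` above.

```typescript
// One-axis constant-velocity Kalman filter with explicit 2x2 algebra.
class AxisKalman {
  private pos = 0
  private vel = 0
  // Covariance P = [[p00, p01], [p10, p11]]
  private p00 = 1
  private p01 = 0
  private p10 = 0
  private p11 = 1
  private q: number
  private r: number

  constructor(processNoise: number, measurementNoise: number) {
    this.q = processNoise
    this.r = measurementNoise
  }

  predict(dt: number): void {
    // x' = F x with F = [[1, dt], [0, 1]]
    this.pos += this.vel * dt
    // P' = F P F^T + q * I
    const n00 = this.p00 + dt * (this.p10 + this.p01) + dt * dt * this.p11 + this.q
    const n01 = this.p01 + dt * this.p11
    const n10 = this.p10 + dt * this.p11
    const n11 = this.p11 + this.q
    this.p00 = n00
    this.p01 = n01
    this.p10 = n10
    this.p11 = n11
  }

  update(measurement: number, weight: number = 1): void {
    if (weight <= 0) return // zero weight means "ignore this source"
    const innovation = measurement - this.pos   // y = z - H x, H = [1, 0]
    const s = this.p00 + this.r / weight         // S = H P H^T + R / weight
    const k0 = this.p00 / s                      // K = P H^T / S
    const k1 = this.p10 / s
    this.pos += k0 * innovation
    this.vel += k1 * innovation
    // P = (I - K H) P
    const n00 = (1 - k0) * this.p00
    const n01 = (1 - k0) * this.p01
    const n10 = this.p10 - k1 * this.p00
    const n11 = this.p11 - k1 * this.p01
    this.p00 = n00
    this.p01 = n01
    this.p10 = n10
    this.p11 = n11
  }

  getState(): { position: number; velocity: number } {
    return { position: this.pos, velocity: this.vel }
  }
}
```

Three independent instances (one per axis) behave like the 6-state filter as long as the noise matrices are diagonal, since the axes are never coupled.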

---

5. Skeleton System

5.1 Unified Skeleton Representation

python
# Location: core/cc-core/cc_core/skeleton/pose_frame.py

from __future__ import annotations  # lets PoseFrame reference JointData before it is defined

from dataclasses import dataclass
from typing import Dict, List, Optional
import numpy as np

@dataclass
class PoseFrame:
    """Single frame of skeleton data."""

    timestamp: float
    joints: Dict[str, "JointData"]  # forward reference: JointData is defined below
    root_position: np.ndarray  # [3,]
    root_rotation: np.ndarray  # [4,] quaternion

    # Metadata
    source: str  # 'mediapipe', 'mocopi', 'fused'
    confidence: float
    frame_id: int

@dataclass
class JointData:
    """Data for a single joint."""

    position: np.ndarray       # [3,] local position
    rotation: np.ndarray       # [4,] quaternion
    world_position: np.ndarray # [3,] world position
    world_rotation: np.ndarray # [4,] world quaternion

    velocity: Optional[np.ndarray] = None      # [3,]
    angular_velocity: Optional[np.ndarray] = None  # [3,]

    confidence: float = 1.0


# Standard joint names
JOINT_NAMES = [
    'Hips', 'Spine', 'Spine1', 'Spine2', 'Neck', 'Head',
    'LeftShoulder', 'LeftArm', 'LeftForeArm', 'LeftHand',
    'RightShoulder', 'RightArm', 'RightForeArm', 'RightHand',
    'LeftUpLeg', 'LeftLeg', 'LeftFoot', 'LeftToeBase',
    'RightUpLeg', 'RightLeg', 'RightFoot', 'RightToeBase'
]

# Parent-child relationships
SKELETON_PARENTS = {
    'Hips': None,
    'Spine': 'Hips',
    'Spine1': 'Spine',
    'Spine2': 'Spine1',
    'Neck': 'Spine2',
    'Head': 'Neck',
    'LeftShoulder': 'Spine2',
    'LeftArm': 'LeftShoulder',
    'LeftForeArm': 'LeftArm',
    'LeftHand': 'LeftForeArm',
    'RightShoulder': 'Spine2',
    'RightArm': 'RightShoulder',
    'RightForeArm': 'RightArm',
    'RightHand': 'RightForeArm',
    'LeftUpLeg': 'Hips',
    'LeftLeg': 'LeftUpLeg',
    'LeftFoot': 'LeftLeg',
    'LeftToeBase': 'LeftFoot',
    'RightUpLeg': 'Hips',
    'RightLeg': 'RightUpLeg',
    'RightFoot': 'RightLeg',
    'RightToeBase': 'RightFoot'
}
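
The parent map is enough to walk any joint back to the root or to order joints so that parents are processed before children, which is what a forward-kinematics pass needs. The sketch below assumes a subset of the hierarchy above; `chain_to_root` and `parents_first` are hypothetical helper names, not part of `cc_core`.

```python
from typing import Dict, List, Optional

# Subset of the hierarchy above, enough to trace the left arm
PARENTS: Dict[str, Optional[str]] = {
    'Hips': None,
    'Spine': 'Hips',
    'Spine1': 'Spine',
    'Spine2': 'Spine1',
    'LeftShoulder': 'Spine2',
    'LeftArm': 'LeftShoulder',
    'LeftForeArm': 'LeftArm',
    'LeftHand': 'LeftForeArm',
}


def chain_to_root(joint: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Return the joints from `joint` up to the root, inclusive."""
    chain: List[str] = []
    current: Optional[str] = joint
    while current is not None:
        chain.append(current)
        current = parents[current]
    return chain


def parents_first(parents: Dict[str, Optional[str]]) -> List[str]:
    """Order joints by depth so every parent precedes its children."""
    return sorted(parents, key=lambda j: len(chain_to_root(j, parents)))


hand_chain = chain_to_root('LeftHand', PARENTS)
order = parents_first(PARENTS)
print(hand_chain)
print(order)
```

Iterating in `parents_first` order lets a world transform be composed from the parent's already-computed world transform in a single pass.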

5.2 Derived Kinematics

python
# Location: core/cc-core/cc_core/skeleton/derived_kinematics.py

import numpy as np
from typing import List
from .pose_frame import PoseFrame

class DerivedKinematics:
    """Compute velocities, accelerations, and jerk from pose sequences."""

    def __init__(self, fps: float = 30.0):
        self.fps = fps
        self.dt = 1.0 / fps
        self.history: List[PoseFrame] = []
        self.max_history = 5

    def update(self, frame: PoseFrame) -> PoseFrame:
        """Add frame and compute derivatives."""
        self.history.append(frame)
        if len(self.history) > self.max_history:
            self.history.pop(0)

        # Compute velocities
        if len(self.history) >= 2:
            self._compute_velocities(frame)

        # Compute accelerations (_compute_accelerations is not shown in this excerpt)
        if len(self.history) >= 3:
            self._compute_accelerations(frame)

        return frame

    def _compute_velocities(self, frame: PoseFrame):
        """Compute joint velocities."""
        prev = self.history[-2]

        for joint_name, joint in frame.joints.items():
            if joint_name in prev.joints:
                prev_joint = prev.joints[joint_name]

                # Linear velocity
                joint.velocity = (
                    joint.world_position - prev_joint.world_position
                ) / self.dt

                # Angular velocity (from quaternion difference)
                joint.angular_velocity = self._quat_to_angular_velocity(
                    prev_joint.world_rotation,
                    joint.world_rotation,
                    self.dt
                )

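    def _quat_to_angular_velocity(
        self,
        q0: np.ndarray,
        q1: np.ndarray,
        dt: float
    ) -> np.ndarray:
        """Convert the rotation from q0 to q1 into an angular velocity (rad/s).

        Assumes unit quaternions in [w, x, y, z] order.
        """
        # q_diff = q1 * q0^-1 (the conjugate is the inverse for unit quaternions)
        q0_inv = np.array([q0[0], -q0[1], -q0[2], -q0[3]])
        q_diff = self._quat_mul(q1, q0_inv)

        # q and -q encode the same rotation; keep w >= 0 to take the shorter arc
        if q_diff[0] < 0:
            q_diff = -q_diff

        # Extract axis-angle
        angle = 2 * np.arccos(np.clip(q_diff[0], -1, 1))

        if angle < 1e-6:
            return np.zeros(3)

        axis = q_diff[1:] / np.sin(angle / 2)

        return axis * angle / dt

    @staticmethod
    def _quat_mul(a: np.ndarray, b: np.ndarray) -> np.ndarray:
        """Hamilton product of two quaternions in [w, x, y, z] order."""
        aw, ax, ay, az = a
        bw, bx, by, bz = b
        return np.array([
            aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw,
        ])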

    def get_energy(self, frame: PoseFrame) -> float:
        """Compute a kinetic-energy proxy (unit mass and unit inertia per joint)."""
        energy = 0.0

        for joint in frame.joints.values():
            if joint.velocity is not None:
                # Translational term: 1/2 * |v|^2 (mass taken as 1)
                energy += 0.5 * np.sum(joint.velocity ** 2)

            if joint.angular_velocity is not None:
                # Rotational term (simplified): 1/2 * |w|^2 (inertia taken as 1)
                energy += 0.5 * np.sum(joint.angular_velocity ** 2)

        return energy
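
The class docstring mentions accelerations and jerk alongside velocities. One common way to obtain them is to apply the same backward difference repeatedly to the position history. The following is a hedged, dependency-free sketch of that idea; `finite_difference` and `derivatives` are hypothetical names, and a fixed sample interval `dt` is assumed.

```python
from typing import List, Sequence, Tuple

Vec = List[float]


def finite_difference(samples: Sequence[Vec], dt: float) -> List[Vec]:
    """Backward difference of consecutive samples; returns len(samples) - 1 vectors."""
    return [
        [(b - a) / dt for a, b in zip(prev, curr)]
        for prev, curr in zip(samples, samples[1:])
    ]


def derivatives(
    positions: Sequence[Vec], dt: float
) -> Tuple[List[Vec], List[Vec], List[Vec]]:
    """Velocity, acceleration, and jerk from a short position history."""
    velocity = finite_difference(positions, dt)
    acceleration = finite_difference(velocity, dt)
    jerk = finite_difference(acceleration, dt)
    return velocity, acceleration, jerk


# One joint moving along x as x(t) = t^2, sampled every 0.5 s
dt = 0.5
positions = [[(i * dt) ** 2, 0.0, 0.0] for i in range(4)]
velocity, acceleration, jerk = derivatives(positions, dt)
print(acceleration)  # constant 2.0 along x, as expected for t^2
print(jerk)          # zero: the acceleration does not change
```

Each differentiation consumes one frame, so jerk needs at least four positions, which fits inside the five-frame history kept by `DerivedKinematics`. Differencing also amplifies noise, so smoothing the positions first is usually worthwhile.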

---

6. API Reference

6.1 MediaPipe Exports

typescript
// apps/web/cc-dashboard/src/lib/mediapipe/index.ts

// Types
export * from './types'

// Legacy API
export {
  MediaPipeService,
  getMediaPipeService,
  destroyMediaPipeService,
} from './MediaPipeService'

// Legacy Hooks
export {
  useMediaPipe,
  useMediaPipeStore,
  useFaceExpressions,
  useHandGestures,
  useBodyPose,
  useMediaPipeEnhancements,
} from './useMediaPipe'

// Modular API - Core
export { EventEmitter, CameraService } from './core'
export type { CameraConfig, CameraState, CameraEvents } from './core'

// Modular API - Analyzers
export {
  BaseAnalyzer,
  FaceAnalyzer,
  HandAnalyzer,
  PoseAnalyzer,
} from './analyzers'

export type {
  AnalyzerConfig,
  AnalyzerOutput,
  FaceAnalyzerConfig,
  FaceOutput,
  HandAnalyzerConfig,
  HandOutput,
  HandGesture,
  FingerState,
  PoseAnalyzerConfig,
  PoseOutput,
  JointState,
  LimbState,
} from './analyzers'

// Modular API - Pipeline
export { HolisticPipeline } from './pipeline'
export type {
  PipelineConfig,
  PipelineState,
  PipelineOutput,
  PipelineEvents,
} from './pipeline'

6.2 Mocopi Exports

typescript
// apps/web/cc-dashboard/src/lib/mocopi/index.ts

export * from './types'

export { useMocopiStream } from './useMocopiStream'
export { useConductorStream } from './useConductorStream'
export { MocopiLimbFusion } from './MocopiLimbFusion'
export { mocopiToConductor } from './conductorAdapter'

---

7. Performance Optimization

7.1 Target Metrics

| Metric | Target | Measurement |
|--------|--------|-------------|
| MediaPipe FPS | 30 fps | requestAnimationFrame |
| Mocopi Latency | < 50 ms | timestamp delta |
| Fusion Latency | < 10 ms | processing time |
| Memory Usage | < 200 MB | heap size |
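
Latency targets like these are usually checked against a percentile of recorded samples rather than a single reading, since one slow frame should not dominate and one fast frame should not hide a problem. The sketch below is one way to do that offline; the choice of the 95th percentile, the `BUDGET_MS` keys, and the sample values are all assumptions for illustration.

```python
import math
from typing import Dict, Sequence

# Latency budgets from the targets above, in milliseconds
BUDGET_MS: Dict[str, float] = {'mocopi_latency': 50.0, 'fusion_latency': 10.0}


def percentile(samples: Sequence[float], q: float) -> float:
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q / 100.0 * len(ordered)))
    return ordered[rank - 1]


def within_budget(name: str, samples_ms: Sequence[float], q: float = 95.0) -> bool:
    """True if the q-th percentile of the samples is under the budget."""
    return percentile(samples_ms, q) < BUDGET_MS[name]


# Made-up fusion processing times with one slow frame
fusion_ms = [4.0, 5.5, 6.1, 7.2, 12.0]
print(percentile(fusion_ms, 50.0))                  # median
print(within_budget('fusion_latency', fusion_ms))   # p95 is the 12.0 ms outlier
```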

7.2 Optimization Strategies

MediaPipe

typescript
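import { Holistic } from '@mediapipe/holistic';

// Load Holistic model assets from the CDN; the options below trade accuracy for speed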
const holistic = new Holistic({
  locateFile: (file) =>
    `https://cdn.jsdelivr.net/npm/@mediapipe/holistic/${file}`,
});

holistic.setOptions({
  modelComplexity: 1,         // 0, 1, or 2
  smoothLandmarks: true,
  enableSegmentation: false,  // Disable if not needed
  smoothSegmentation: false,
  refineFaceLandmarks: false, // Disable for performance
  minDetectionConfidence: 0.5,
  minTrackingConfidence: 0.5,
});

WebSocket

typescript
// Use binary protocol for Mocopi
const ws = new WebSocket(url);
ws.binaryType = 'arraybuffer';

ws.onmessage = (event) => {
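  // Parse binary directly instead of JSON
  const view = new DataView(event.data);
  // DataView reads big-endian by default; pass `true` as the second
  // argument if the sender writes little-endian values
  const timestamp = view.getFloat64(0);
  // ... parse rest of frame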
};
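
A binary protocol only works if both ends agree on field order, field widths, and byte order. The sketch below packs and unpacks a frame with Python's `struct` module under a purely hypothetical layout (float64 timestamp, uint16 joint count, then seven float32 values per joint, all little-endian). It is not the actual Mocopi wire format, and `pack_frame` and `unpack_frame` are illustrative names.

```python
import struct
from typing import List, Sequence, Tuple

# Hypothetical layout, little-endian with no padding
HEADER = struct.Struct('<dH')   # float64 timestamp, uint16 joint count
JOINT = struct.Struct('<7f')    # px, py, pz, qw, qx, qy, qz as float32


def pack_frame(timestamp: float, joints: Sequence[Sequence[float]]) -> bytes:
    """Serialize one frame: header followed by fixed-size joint records."""
    payload = HEADER.pack(timestamp, len(joints))
    for joint in joints:
        payload += JOINT.pack(*joint)
    return payload


def unpack_frame(data: bytes) -> Tuple[float, List[Tuple[float, ...]]]:
    """Parse a frame produced by pack_frame."""
    timestamp, count = HEADER.unpack_from(data, 0)
    joints = [
        JOINT.unpack_from(data, HEADER.size + i * JOINT.size)
        for i in range(count)
    ]
    return timestamp, joints


frame = pack_frame(1.5, [(0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0)])
timestamp, joints = unpack_frame(frame)
print(len(frame), timestamp, joints)

# Reading the same bytes with the wrong byte order gives a different number
misread = struct.unpack('>d', frame[:8])[0]
print(misread)
```

On the browser side, `DataView.getFloat64(offset)` reads big-endian unless its second argument is `true`, so a little-endian sender such as this sketch would need `view.getFloat64(0, true)`.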

React Optimization

typescript
import { useEffect, useMemo, useRef } from 'react';

// Memoize expensive computations
const processedPose = useMemo(() => {
  if (!pose) return null;
  return computeExpensiveMetrics(pose);
}, [pose?.timestamp]);

// Use refs for high-frequency updates
const frameRef = useRef<PoseOutput | null>(null);

useEffect(() => {
  const unsubscribe = pipeline.on('pose', (pose) => {
    frameRef.current = pose;  // Don't trigger re-render
  });
  return unsubscribe;
}, [pipeline]);  // Re-subscribe if the pipeline instance changes

---

Document Version: 2.0.0
Generated: December 26, 2024


Source Anchor

projects/Documentation/01-architecture/systems/MOTION_CAPTURE_PIPELINE.md
