Motion Capture Pipeline

Comprehensive Technical Documentation

Version: 2.0.0
Last Updated: December 26, 2024

---

1. [Overview](#overview)
2. [MediaPipe Integration](#mediapipe-integration)
3. [Mocopi Integration](#mocopi-integration)
4. [Sensor Fusion](#sensor-fusion)
5. [Skeleton System](#skeleton-system)
6. [API Reference](#api-reference)
7. [Performance Optimization](#performance-optimization)

---

1. Overview

The Motion Capture Pipeline provides real-time human motion tracking through multiple input sources, fusing them into a unified skeleton representation suitable for audio synthesis and motion generation.

Architecture Diagram

┌─────────────────────────────────────────────────────────────────────────────┐
│                     MOTION CAPTURE PIPELINE                                  │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌───────────────────────────────────────────────────────────────────────┐ │
│  │                        INPUT LAYER                                     │ │
│  │                                                                        │ │
│  │  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────────┐   │ │
│  │  │    WEBCAM       │  │    MOCOPI       │  │     iOS DEVICE      │   │ │
│  │  │   (Camera)      │  │   (6 IMUs)      │  │   (CoreMotion)      │   │ │
│  │  │   720p/30fps    │  │   60Hz/sensor   │  │   100Hz             │   │ │
│  │  └────────┬────────┘  └────────┬────────┘  └──────────┬──────────┘   │ │
│  │           │                    │                      │               │ │
│  └───────────┼────────────────────┼──────────────────────┼───────────────┘ │
│              │                    │                      │                  │
│              ▼                    ▼                      ▼                  │
│  ┌───────────────────────────────────────────────────────────────────────┐ │
│  │                        CAPTURE LAYER                                   │ │
│  │                                                                        │ │
│  │  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────────┐   │ │
│  │  │   MediaPipe     │  │  Mocopi Parser  │  │  Motion Manager     │   │ │
│  │  │   Holistic      │  │                 │  │                     │   │ │
│  │  │                 │  │  BVH Protocol   │  │  CMMotionManager    │   │ │
│  │  │  543 landmarks  │  │  Quaternions    │  │  Device Motion      │   │ │
│  │  └────────┬────────┘  └────────┬────────┘  └──────────┬──────────┘   │ │
│  │           │                    │                      │               │ │
│  │           │ PoseOutput         │ MocopiFrame          │ iOSMotion     │ │
│  │           │ FaceOutput         │                      │               │ │
│  │           │ HandOutput         │                      │               │ │
│  └───────────┼────────────────────┼──────────────────────┼───────────────┘ │
│              │                    │                      │                  │
│              └────────────────────┼──────────────────────┘                  │
│                                   │                                         │
│                                   ▼                                         │
│  ┌───────────────────────────────────────────────────────────────────────┐ │
│  │                        FUSION LAYER                                    │ │
│  │                                                                        │ │
│  │  ┌─────────────────────────────────────────────────────────────────┐ │ │
│  │  │                    Kalman Filter Fusion                          │ │ │
│  │  │                                                                  │ │ │
│  │  │   Position (MediaPipe)  ─────┐                                  │ │ │
│  │  │   High visual accuracy       │                                  │ │ │
│  │  │   Occlusion issues           ├──▶  FusedSkeletonState           │ │ │
│  │  │                              │                                  │ │ │
│  │  │   Orientation (Mocopi)  ─────┤                                  │ │ │
│  │  │   High rotational accuracy   │                                  │ │ │
│  │  │   No absolute position       │                                  │ │ │
│  │  │                              │                                  │ │ │
│  │  │   Acceleration (iOS)  ───────┘                                  │ │ │
│  │  │   Supplementary data                                            │ │ │
│  │  │                                                                  │ │ │
│  │  └──────────────────────────────┬──────────────────────────────────┘ │ │
│  └─────────────────────────────────┼─────────────────────────────────────┘ │
│                                    │                                        │
│                                    ▼                                        │
│  ┌───────────────────────────────────────────────────────────────────────┐ │
│  │                        OUTPUT LAYER                                    │ │
│  │                                                                        │ │
│  │  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────────┐   │ │
│  │  │  Skeleton       │  │  Derived        │  │  Motion             │   │ │
│  │  │  State          │  │  Kinematics     │  │  Features           │   │ │
│  │  │                 │  │                 │  │                     │   │ │
│  │  │  - Positions    │  │  - Velocities   │  │  - Energy           │   │ │
│  │  │  - Rotations    │  │  - Accelerations│  │  - Gestures         │   │ │
│  │  │  - Confidence   │  │  - Jerk         │  │  - Expressions      │   │ │
│  │  └─────────────────┘  └─────────────────┘  └─────────────────────┘   │ │
│  │                                                                        │ │
│  └───────────────────────────────────────────────────────────────────────┘ │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

---

2. MediaPipe Integration

2.1 Architecture

The MediaPipe integration follows a modular, plugin-based architecture with three layers:

Core Layer

typescript

// Location: apps/web/cc-dashboard/src/lib/mediapipe/core/

// EventEmitter.ts - Type-safe pub/sub
export class EventEmitter<Events extends EventMap = EventMap> {
  private listeners: Map<keyof Events, Set<EventCallback<Events[keyof Events]>>>

  on<K extends keyof Events>(event: K, callback: EventCallback<Events[K]>): () => void
  off<K extends keyof Events>(event: K, callback: EventCallback<Events[K]>): void
  emit<K extends keyof Events>(event: K, data: Events[K]): void
  once<K extends keyof Events>(event: K, callback: EventCallback<Events[K]>): void
  removeAllListeners(): void
}

// CameraService.ts - Webcam lifecycle management
export class CameraService extends EventEmitter<CameraEvents> {
  private videoElement: HTMLVideoElement | null
  private stream: MediaStream | null
  private state: CameraState

  async initialize(videoElement: HTMLVideoElement): Promise<boolean>
  start(): void
  stop(): void
  destroy(): void

  getVideoElement(): HTMLVideoElement | null
  getState(): CameraState
  isRunning(): boolean
}
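
The `EventEmitter` declaration above lists signatures only. A minimal sketch of one way those methods could be implemented follows; the `EventMap` and `EventCallback` aliases and the `DemoEvents` map are assumptions made for illustration, since the document does not show their definitions.

```typescript
// Assumed helper types; the real definitions are not shown in this document.
type EventMap = Record<string, unknown>
type EventCallback<T> = (data: T) => void

class EventEmitter<Events extends EventMap = EventMap> {
  private listeners = new Map<keyof Events, Set<EventCallback<any>>>()

  // Register a callback and return a function that removes it again.
  on<K extends keyof Events>(event: K, callback: EventCallback<Events[K]>): () => void {
    let set = this.listeners.get(event)
    if (!set) {
      set = new Set()
      this.listeners.set(event, set)
    }
    set.add(callback)
    return () => this.off(event, callback)
  }

  off<K extends keyof Events>(event: K, callback: EventCallback<Events[K]>): void {
    this.listeners.get(event)?.delete(callback)
  }

  emit<K extends keyof Events>(event: K, data: Events[K]): void {
    const set = this.listeners.get(event)
    if (!set) return
    // Iterate over a copy so callbacks may unsubscribe during dispatch.
    for (const cb of [...set]) cb(data)
  }

  once<K extends keyof Events>(event: K, callback: EventCallback<Events[K]>): void {
    const unsubscribe: () => void = this.on(event, (data) => {
      unsubscribe()
      callback(data)
    })
  }

  removeAllListeners(): void {
    this.listeners.clear()
  }
}

// Hypothetical event map used only for this demonstration.
type DemoEvents = { started: { width: number; height: number }; stopped: undefined }

const emitter = new EventEmitter<DemoEvents>()
const seenWidths: number[] = []
const unsubscribeStarted = emitter.on('started', (e) => seenWidths.push(e.width))
emitter.emit('started', { width: 1280, height: 720 })
unsubscribeStarted()
emitter.emit('started', { width: 640, height: 480 }) // no longer delivered
```

Returning the unsubscribe function from `on` lets callers clean up without keeping a reference to the original callback.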

Analyzer Layer

typescript

// Location: apps/web/cc-dashboard/src/lib/mediapipe/analyzers/

// BaseAnalyzer.ts - Abstract base class
export abstract class BaseAnalyzer<TInput, TOutput extends AnalyzerOutput, TEvents extends EventMap = EventMap>
  extends EventEmitter<TEvents> {

  protected config: AnalyzerConfig
  protected lastOutput: TOutput | null
  protected frameCount: number

  constructor(config: Partial<AnalyzerConfig>)

  process(input: TInput, timestamp: number): TOutput | null
  protected abstract analyze(input: TInput, timestamp: number): TOutput | null
  protected applySmoothing(current: TOutput): TOutput
  protected checkThreshold(value: number, threshold: number): boolean
}

// FaceAnalyzer.ts - Face expression extraction
export interface FaceOutput extends AnalyzerOutput {
  // Mouth
  mouthOpen: number          // 0-1, how open the mouth is
  smileIntensity: number     // 0-1, smile strength

  // Eyes
  eyeOpenLeft: number        // 0-1
  eyeOpenRight: number       // 0-1
  eyebrowRaiseLeft: number   // 0-1
  eyebrowRaiseRight: number  // 0-1

  // Head pose
  headYaw: number            // -1 to 1 (left/right)
  headPitch: number          // -1 to 1 (up/down)
  headRoll: number           // -1 to 1 (tilt)

  // Events
  isBlinking: boolean
  blinkCount: number
}

// HandAnalyzer.ts - Gesture recognition
export interface HandOutput extends AnalyzerOutput {
  left: SingleHandOutput | null
  right: SingleHandOutput | null
}

export interface SingleHandOutput {
  openness: number           // 0-1, how open the hand is
  grip: number               // 0-1, grip strength
  gesture: HandGesture       // Detected gesture

  // Individual fingers
  thumb: FingerState
  index: FingerState
  middle: FingerState
  ring: FingerState
  pinky: FingerState

  // Palm
  palmPosition: [number, number, number]
  palmNormal: [number, number, number]

  // Pinch detection
  pinchDistance: number
  isPinching: boolean
}

export type HandGesture =
  | 'open'      // All fingers extended
  | 'fist'      // All fingers closed
  | 'point'     // Index extended
  | 'peace'     // Index + middle extended
  | 'thumbsUp'  // Thumb up, others closed
  | 'pinch'     // Thumb + index touching
  | 'unknown'

// PoseAnalyzer.ts - Body pose analysis
export interface PoseOutput extends AnalyzerOutput {
  // Torso
  bodyLean: number           // -1 to 1 (left/right lean)
  bodyTwist: number          // -1 to 1 (rotation)
  crouchLevel: number        // 0-1 (standing to crouching)

  // Arms
  armSpread: number          // 0-1 (arms at sides to spread)
  armRaiseLeft: number       // 0-1 (arm height)
  armRaiseRight: number      // 0-1
  elbowBendLeft: number      // 0-1
  elbowBendRight: number     // 0-1

  // Legs
  legSpread: number          // 0-1
  kneeFlexLeft: number       // 0-1
  kneeFlexRight: number      // 0-1

  // Energy metrics
  bodyEnergy: number         // 0-1, overall movement energy
  upperBodyEnergy: number    // 0-1
  lowerBodyEnergy: number    // 0-1

  // Joint states
  joints: Map<string, JointState>
  limbs: Map<string, LimbState>
}
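
The `HandGesture` comments above describe each gesture in terms of which fingers are extended. A minimal sketch of a rule-based classifier built from those descriptions follows. The `FingerFlags` shape is an assumption (the document does not define `FingerState`), and the rule order is one reasonable choice rather than the project's confirmed logic.

```typescript
type HandGesture = 'open' | 'fist' | 'point' | 'peace' | 'thumbsUp' | 'pinch' | 'unknown'

// Assumed input: one "is extended" flag per finger.
interface FingerFlags {
  thumb: boolean
  index: boolean
  middle: boolean
  ring: boolean
  pinky: boolean
}

function classifyGesture(extended: FingerFlags, isPinching: boolean): HandGesture {
  const { thumb, index, middle, ring, pinky } = extended
  // Pinch is checked first because it is defined by contact, not by extension.
  if (isPinching) return 'pinch'
  if (thumb && index && middle && ring && pinky) return 'open'
  if (!thumb && !index && !middle && !ring && !pinky) return 'fist'
  if (thumb && !index && !middle && !ring && !pinky) return 'thumbsUp'
  if (index && middle && !ring && !pinky) return 'peace'
  if (index && !middle && !ring && !pinky) return 'point'
  return 'unknown'
}

const allExtended: FingerFlags = { thumb: true, index: true, middle: true, ring: true, pinky: true }
const noneExtended: FingerFlags = { thumb: false, index: false, middle: false, ring: false, pinky: false }
const openGesture = classifyGesture(allExtended, false)
const peaceGesture = classifyGesture({ ...noneExtended, index: true, middle: true }, false)
```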

Pipeline Layer

typescript

// Location: apps/web/cc-dashboard/src/lib/mediapipe/pipeline/

// HolisticPipeline.ts - Main orchestrator
export interface PipelineConfig {
  face: FaceAnalyzerConfig
  hands: HandAnalyzerConfig
  pose: PoseAnalyzerConfig

  // Global settings
  smoothing: number          // 0-1, global smoothing factor
  minDetectionConfidence: number
  minTrackingConfidence: number

  // Performance
  maxFPS: number
  enableGPU: boolean
}

export class HolisticPipeline extends EventEmitter<PipelineEvents> {
  private holistic: Holistic | null
  private camera: CameraService
  private faceAnalyzer: FaceAnalyzer
  private handAnalyzer: HandAnalyzer
  private poseAnalyzer: PoseAnalyzer

  private state: PipelineState
  private lastOutput: PipelineOutput | null

  constructor(config: Partial<PipelineConfig>)

  async initialize(
    videoElement: HTMLVideoElement,
    canvasElement?: HTMLCanvasElement
  ): Promise<boolean>

  start(): void
  stop(): void
  destroy(): void

  // Event subscriptions
  on(event: 'frame', callback: (output: PipelineOutput) => void): () => void
  on(event: 'face', callback: (face: FaceOutput) => void): () => void
  on(event: 'hands', callback: (hands: HandOutput) => void): () => void
  on(event: 'pose', callback: (pose: PoseOutput) => void): () => void
  on(event: 'error', callback: (error: Error) => void): () => void

  getState(): PipelineState
  getLastOutput(): PipelineOutput | null
}
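
`PipelineConfig.smoothing` is documented as a 0-1 factor, but the formula is not shown. The sketch below assumes exponential smoothing, where 0 passes the new value through unchanged and values near 1 favor the previous output. The function name `smoothValue` is hypothetical.

```typescript
// Exponential smoothing of one scalar channel (assumed interpretation of the 0-1 factor).
function smoothValue(previous: number | null, current: number, smoothing: number): number {
  if (previous === null) return current // first frame: nothing to blend with
  const s = Math.min(1, Math.max(0, smoothing)) // clamp to the documented range
  return s * previous + (1 - s) * current
}

// Smooth a short sequence of raw smileIntensity readings.
const rawReadings = [0, 1, 1, 1]
const smoothedReadings: number[] = []
let lastSmoothed: number | null = null
for (const reading of rawReadings) {
  lastSmoothed = smoothValue(lastSmoothed, reading, 0.5)
  smoothedReadings.push(lastSmoothed)
}
```

With a factor of 0.5 the output moves halfway toward each new reading, so a step change is reached gradually instead of in one frame.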

2.2 Landmark Reference

Face Landmarks (468 points)

Key landmarks for expression detection:

Mouth:
  - 13: Upper lip center
  - 14: Lower lip center
  - 78: Left mouth corner
  - 308: Right mouth corner

Eyes:
  - 159: Left eye upper
  - 145: Left eye lower
  - 386: Right eye upper
  - 374: Right eye lower

Eyebrows:
  - 70: Left eyebrow outer
  - 107: Left eyebrow inner
  - 336: Right eyebrow inner
  - 300: Right eyebrow outer

Nose:
  - 1: Nose tip
  - 4: Nose bridge

Forehead:
  - 10: Forehead center
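
As an illustration of how the mouth landmarks can feed a value such as `mouthOpen`, the sketch below divides the lip gap (13 to 14) by the mouth width (78 to 308). The normalization and the clamp to 0-1 are assumptions, not the project's confirmed formula, and the helper names are hypothetical.

```typescript
type Landmark = { x: number; y: number; z: number }

// Indices taken from the mouth list above.
const UPPER_LIP = 13
const LOWER_LIP = 14
const MOUTH_LEFT = 78
const MOUTH_RIGHT = 308

function landmarkDistance(a: Landmark, b: Landmark): number {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z)
}

// Lip gap divided by mouth width, clamped to 0-1 (assumed normalization).
function mouthOpenRatio(face: Landmark[]): number {
  const width = landmarkDistance(face[MOUTH_LEFT], face[MOUTH_RIGHT])
  if (width === 0) return 0
  return Math.min(1, landmarkDistance(face[UPPER_LIP], face[LOWER_LIP]) / width)
}

// Synthetic face: all 468 landmarks at the image center, then the mouth points moved.
const demoFace: Landmark[] = Array.from({ length: 468 }, () => ({ x: 0.5, y: 0.5, z: 0 }))
demoFace[LOWER_LIP] = { x: 0.5, y: 0.625, z: 0 }
demoFace[MOUTH_LEFT] = { x: 0.25, y: 0.5625, z: 0 }
demoFace[MOUTH_RIGHT] = { x: 0.75, y: 0.5625, z: 0 }
const demoMouthOpen = mouthOpenRatio(demoFace)
```

Dividing by mouth width keeps the value roughly independent of how far the face is from the camera.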

Pose Landmarks (33 points)

POSE_LANDMARKS = {
  0: 'nose',
  1: 'left_eye_inner',
  2: 'left_eye',
  3: 'left_eye_outer',
  4: 'right_eye_inner',
  5: 'right_eye',
  6: 'right_eye_outer',
  7: 'left_ear',
  8: 'right_ear',
  9: 'mouth_left',
  10: 'mouth_right',
  11: 'left_shoulder',
  12: 'right_shoulder',
  13: 'left_elbow',
  14: 'right_elbow',
  15: 'left_wrist',
  16: 'right_wrist',
  17: 'left_pinky',
  18: 'right_pinky',
  19: 'left_index',
  20: 'right_index',
  21: 'left_thumb',
  22: 'right_thumb',
  23: 'left_hip',
  24: 'right_hip',
  25: 'left_knee',
  26: 'right_knee',
  27: 'left_ankle',
  28: 'right_ankle',
  29: 'left_heel',
  30: 'right_heel',
  31: 'left_foot_index',
  32: 'right_foot_index'
}
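
Pose metrics such as `elbowBendLeft` can be derived from three of these landmarks. The sketch below measures the angle at the elbow (13) between the shoulder (11) and the wrist (15) and maps it to 0-1; that mapping (0 for a straight arm, 1 for a fully folded arm) is an assumption about how the metric is defined.

```typescript
type PosePoint = [number, number, number]

const LEFT_SHOULDER = 11
const LEFT_ELBOW = 13
const LEFT_WRIST = 15

// Angle in radians at point b, between the segments b->a and b->c.
function jointAngle(a: PosePoint, b: PosePoint, c: PosePoint): number {
  const u = [a[0] - b[0], a[1] - b[1], a[2] - b[2]]
  const v = [c[0] - b[0], c[1] - b[1], c[2] - b[2]]
  const norm = Math.hypot(u[0], u[1], u[2]) * Math.hypot(v[0], v[1], v[2])
  if (norm === 0) return 0
  const cosine = (u[0] * v[0] + u[1] * v[1] + u[2] * v[2]) / norm
  return Math.acos(Math.min(1, Math.max(-1, cosine))) // clamp against rounding error
}

// 0 = straight arm (180 degrees at the elbow), 1 = fully folded (assumed mapping).
function elbowBendLeft(pose: PosePoint[]): number {
  const angle = jointAngle(pose[LEFT_SHOULDER], pose[LEFT_ELBOW], pose[LEFT_WRIST])
  return 1 - angle / Math.PI
}

// Synthetic pose with the forearm at a right angle to the upper arm.
const demoPose: PosePoint[] = Array.from({ length: 33 }, (): PosePoint => [0, 0, 0])
demoPose[LEFT_SHOULDER] = [0, 0, 0]
demoPose[LEFT_ELBOW] = [1, 0, 0]
demoPose[LEFT_WRIST] = [1, 1, 0]
const demoElbowBend = elbowBendLeft(demoPose)
```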

Hand Landmarks (21 points per hand)

HAND_LANDMARKS = {
  0: 'wrist',
  1: 'thumb_cmc',
  2: 'thumb_mcp',
  3: 'thumb_ip',
  4: 'thumb_tip',
  5: 'index_mcp',
  6: 'index_pip',
  7: 'index_dip',
  8: 'index_tip',
  9: 'middle_mcp',
  10: 'middle_pip',
  11: 'middle_dip',
  12: 'middle_tip',
  13: 'ring_mcp',
  14: 'ring_pip',
  15: 'ring_dip',
  16: 'ring_tip',
  17: 'pinky_mcp',
  18: 'pinky_pip',
  19: 'pinky_dip',
  20: 'pinky_tip'
}
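
The `pinchDistance` and `isPinching` fields can be illustrated with the thumb tip (4) and index tip (8). The sketch below normalizes their distance by the wrist-to-middle-MCP length (0 to 9) so the value does not depend on hand size or camera distance. Both the normalization and the 0.25 threshold are assumptions chosen for the example.

```typescript
type HandPoint = [number, number, number]

const WRIST = 0
const THUMB_TIP = 4
const INDEX_TIP = 8
const MIDDLE_MCP = 9

function pointDistance(a: HandPoint, b: HandPoint): number {
  return Math.hypot(a[0] - b[0], a[1] - b[1], a[2] - b[2])
}

// Thumb-to-index distance in units of palm length (assumed normalization).
function pinchDistance(hand: HandPoint[]): number {
  const palm = pointDistance(hand[WRIST], hand[MIDDLE_MCP])
  if (palm === 0) return Number.POSITIVE_INFINITY // degenerate hand: never a pinch
  return pointDistance(hand[THUMB_TIP], hand[INDEX_TIP]) / palm
}

function isPinching(hand: HandPoint[], threshold = 0.25): boolean {
  return pinchDistance(hand) < threshold
}

// Synthetic hand: palm length 1, fingertips 0.125 apart.
const demoHand: HandPoint[] = Array.from({ length: 21 }, (): HandPoint => [0, 0, 0])
demoHand[MIDDLE_MCP] = [0, 1, 0]
demoHand[THUMB_TIP] = [0.5, 1, 0]
demoHand[INDEX_TIP] = [0.625, 1, 0]
const demoPinch = pinchDistance(demoHand)
```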

2.3 Usage Examples

Basic Usage

typescript

import { HolisticPipeline } from '@/lib/mediapipe'

// Initialize
const pipeline = new HolisticPipeline({
  face: { smoothing: 0.3 },
  hands: { detectGestures: true },
  pose: { smoothing: 0.5 }
})

// Get video element
const video = document.getElementById('video') as HTMLVideoElement

// Initialize and start
await pipeline.initialize(video)
pipeline.start()

// Subscribe to events
pipeline.on('face', (face) => {
  if (face.smileIntensity > 0.5) {
    console.log('User is smiling!')
  }
})

pipeline.on('hands', (hands) => {
  if (hands.left?.gesture === 'fist') {
    console.log('Left fist detected')
  }
})

pipeline.on('pose', (pose) => {
  console.log(`Body energy: ${pose.bodyEnergy}`)
})

// Cleanup
pipeline.destroy()

React Hook Usage

typescript

import { useRef } from 'react'
import { useMediaPipe } from '@/lib/mediapipe'

function MotionCapture() {
  const videoRef = useRef<HTMLVideoElement>(null)

  const {
    start,
    stop,
    isRunning,
    face,
    hands,
    pose,
    error
  } = useMediaPipe({
    videoRef,
    autoStart: true,
    config: {
      face: { detectBlinks: true },
      hands: { detectGestures: true }
    }
  })

  return (
    <div>
      <video ref={videoRef} />
      {face && (
        <div>Smile: {(face.smileIntensity * 100).toFixed(0)}%</div>
      )}
      {hands?.left && (
        <div>Left hand: {hands.left.gesture}</div>
      )}
      {pose && (
        <div>Energy: {(pose.bodyEnergy * 100).toFixed(0)}%</div>
      )}
    </div>
  )
}

---

3. Mocopi Integration

3.1 Overview

Sony Mocopi provides 6 IMU sensors for full-body motion tracking:

| Sensor | Placement | Primary Function |
|--------|-----------|------------------|
| HEAD | Forehead band | Head orientation |
| BODY | Chest/hip clip | Torso orientation |
| L_WRIST | Left wrist band | Left arm tracking |
| R_WRIST | Right wrist band | Right arm tracking |
| L_ANKLE | Left ankle band | Left leg tracking |
| R_ANKLE | Right ankle band | Right leg tracking |

3.2 Data Flow

┌─────────────────────────────────────────────────────────────────────────────┐
│                         MOCOPI DATA FLOW                                     │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌─────────────────┐                                                        │
│  │   Mocopi        │                                                        │
│  │   Sensors       │                                                        │
│  │   (6 IMUs)      │                                                        │
│  └────────┬────────┘                                                        │
│           │ Bluetooth                                                        │
│           ▼                                                                  │
│  ┌─────────────────┐                                                        │
│  │   Mocopi App    │                                                        │
│  │   (iOS/Android) │                                                        │
│  │                 │                                                        │
│  │   - Calibration │                                                        │
│  │   - Skeleton    │                                                        │
│  │     solving     │                                                        │
│  └────────┬────────┘                                                        │
│           │ UDP (Port 12351)                                                │
│           │ BVH Protocol                                                    │
│           ▼                                                                  │
│  ┌─────────────────┐                                                        │
│  │   BVH Sender    │                                                        │
│  │   (macOS App)   │                                                        │
│  │                 │                                                        │
│  │   - UDP receive │                                                        │
│  │   - WebSocket   │                                                        │
│  │     broadcast   │                                                        │
│  └────────┬────────┘                                                        │
│           │ WebSocket                                                        │
│           │ ws://localhost:8765                                             │
│           ▼                                                                  │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │   LOCAL PROCESSING                                                   │   │
│  │                                                                      │   │
│  │  ┌───────────────────┐    ┌───────────────────┐                    │   │
│  │  │   MocopiParser    │───▶│   SkeletonState   │                    │   │
│  │  │                   │    │                   │                    │   │
│  │  │   - Parse BVH     │    │   - Joint pos     │                    │   │
│  │  │   - Quaternions   │    │   - Rotations     │                    │   │
│  │  │   - Timestamps    │    │   - Velocities    │                    │   │
│  │  └───────────────────┘    └───────────────────┘                    │   │
│  │                                                                      │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
│                             OR                                               │
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │   CLOUD RELAY (for remote access)                                    │   │
│  │                                                                      │   │
│  │   BVH Sender ──▶ VPS Relay ──▶ Browser                              │   │
│  │                                                                      │   │
│  │   wss://your-vps.com:443/relay                                      │   │
│  │                                                                      │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

3.3 Type Definitions

typescript

// Location: apps/web/cc-dashboard/src/lib/mocopi/types.ts

export interface MocopiSensorId {
  HEAD: 'head'
  BODY: 'body'
  L_WRIST: 'l_wrist'
  R_WRIST: 'r_wrist'
  L_ANKLE: 'l_ankle'
  R_ANKLE: 'r_ankle'
}

export interface MocopiFrame {
  timestamp: number
  frameId: number

  sensors: {
    [K in keyof MocopiSensorId]: SensorData
  }

  skeleton?: SkeletonData
}

export interface SensorData {
  quaternion: Quaternion      // [w, x, y, z]
  acceleration: Vector3       // [x, y, z] in m/s²
  angularVelocity?: Vector3   // [x, y, z] in rad/s

  batteryLevel?: number       // 0-100
  signalStrength?: number     // 0-100
}

export interface SkeletonData {
  joints: SkeletonJoint[]
  rootPosition: Vector3
  rootRotation: Quaternion
}

export interface SkeletonJoint {
  name: string
  position: Vector3           // Local position
  rotation: Quaternion        // Local rotation
  worldPosition?: Vector3     // Computed world position
  worldRotation?: Quaternion  // Computed world rotation
}

// BVH Joint hierarchy
export const MOCOPI_SKELETON_HIERARCHY = {
  root: {
    name: 'Hips',
    children: ['Spine', 'LeftUpLeg', 'RightUpLeg']
  },
  Spine: {
    children: ['Spine1']
  },
  Spine1: {
    children: ['Spine2']
  },
  Spine2: {
    children: ['Neck', 'LeftShoulder', 'RightShoulder']
  },
  Neck: {
    children: ['Head']
  },
  LeftShoulder: {
    children: ['LeftArm']
  },
  LeftArm: {
    children: ['LeftForeArm']
  },
  LeftForeArm: {
    children: ['LeftHand']
  },
  RightShoulder: {
    children: ['RightArm']
  },
  RightArm: {
    children: ['RightForeArm']
  },
  RightForeArm: {
    children: ['RightHand']
  },
  LeftUpLeg: {
    children: ['LeftLeg']
  },
  LeftLeg: {
    children: ['LeftFoot']
  },
  LeftFoot: {
    children: ['LeftToeBase']
  },
  RightUpLeg: {
    children: ['RightLeg']
  },
  RightLeg: {
    children: ['RightFoot']
  },
  RightFoot: {
    children: ['RightToeBase']
  }
}
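
`SkeletonJoint` carries optional `worldPosition` and `worldRotation` fields described as computed. A minimal forward-kinematics sketch of how they could be derived by walking the hierarchy from the root follows. It assumes unit quaternions in `[w, x, y, z]` order (as in `SensorData`), a plain name-to-children map rather than the exact `MOCOPI_SKELETON_HIERARCHY` object, and hypothetical helper names.

```typescript
type Vector3 = [number, number, number]
type Quaternion = [number, number, number, number] // [w, x, y, z]

interface LocalJoint { position: Vector3; rotation: Quaternion }
interface WorldJoint { worldPosition: Vector3; worldRotation: Quaternion }

// Hamilton product a * b.
function quatMul(a: Quaternion, b: Quaternion): Quaternion {
  const [aw, ax, ay, az] = a
  const [bw, bx, by, bz] = b
  return [
    aw * bw - ax * bx - ay * by - az * bz,
    aw * bx + ax * bw + ay * bz - az * by,
    aw * by - ax * bz + ay * bw + az * bx,
    aw * bz + ax * by - ay * bx + az * bw,
  ]
}

// Rotate v by unit quaternion q: v' = q * (0, v) * conj(q).
function rotateVector(q: Quaternion, v: Vector3): Vector3 {
  const conj: Quaternion = [q[0], -q[1], -q[2], -q[3]]
  const [, x, y, z] = quatMul(quatMul(q, [0, v[0], v[1], v[2]]), conj)
  return [x, y, z]
}

function solveWorldTransforms(
  children: Record<string, string[]>,
  local: Record<string, LocalJoint>,
  rootName: string
): Map<string, WorldJoint> {
  const out = new Map<string, WorldJoint>()
  const visit = (jointName: string, parentWorld: WorldJoint | null): void => {
    const joint = local[jointName]
    if (!joint) return // joint missing from this frame
    let world: WorldJoint
    if (parentWorld) {
      // Child offset is expressed in the parent's frame, so rotate it first.
      const offset = rotateVector(parentWorld.worldRotation, joint.position)
      const p = parentWorld.worldPosition
      world = {
        worldPosition: [p[0] + offset[0], p[1] + offset[1], p[2] + offset[2]],
        worldRotation: quatMul(parentWorld.worldRotation, joint.rotation),
      }
    } else {
      world = { worldPosition: joint.position, worldRotation: joint.rotation }
    }
    out.set(jointName, world)
    for (const child of children[jointName] ?? []) visit(child, world)
  }
  visit(rootName, null)
  return out
}

// Hips rotated 90 degrees about z; Spine sits one unit along the hips' local x axis.
const half = Math.SQRT1_2
const demoWorld = solveWorldTransforms(
  { Hips: ['Spine'], Spine: [] },
  {
    Hips: { position: [0, 1, 0], rotation: [half, 0, 0, half] },
    Spine: { position: [1, 0, 0], rotation: [1, 0, 0, 0] },
  },
  'Hips'
)
```

In this example the spine's local offset along x is rotated onto the world y axis, so it ends up directly above the hips.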

3.4 React Hook

typescript

// Location: apps/web/cc-dashboard/src/lib/mocopi/useMocopiStream.ts

export interface UseMocopiStreamOptions {
  url?: string                // WebSocket URL
  autoConnect?: boolean       // Connect on mount
  reconnect?: boolean         // Auto-reconnect on disconnect
  reconnectInterval?: number  // ms between reconnect attempts
}

export interface UseMocopiStreamReturn {
  // Connection
  connect: () => void
  disconnect: () => void
  isConnected: boolean

  // Data
  frame: MocopiFrame | null
  skeleton: SkeletonData | null
  sensors: Record<string, SensorData>

  // Metrics
  fps: number
  latency: number

  // Errors
  error: Error | null
}

export function useMocopiStream(options?: UseMocopiStreamOptions): UseMocopiStreamReturn
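
The hook reports an `fps` metric without showing how it is measured. One plausible approach, sketched below, keeps the arrival times of recent frames in a sliding window. The `FpsMeter` class and its one-second default window are assumptions for illustration only.

```typescript
// Sliding-window frame-rate estimate from frame arrival times (in milliseconds).
class FpsMeter {
  private timestamps: number[] = []
  private windowMs: number

  constructor(windowMs = 1000) {
    this.windowMs = windowMs
  }

  // Record a frame arriving at nowMs and return the current estimate.
  tick(nowMs: number): number {
    this.timestamps.push(nowMs)
    // Drop frames that have fallen out of the window.
    while (this.timestamps.length > 0 && nowMs - this.timestamps[0] > this.windowMs) {
      this.timestamps.shift()
    }
    if (this.timestamps.length < 2) return 0
    const span = nowMs - this.timestamps[0]
    return span > 0 ? ((this.timestamps.length - 1) * 1000) / span : 0
  }
}

// Simulate frames arriving every 20 ms for one second.
const meter = new FpsMeter()
let measuredFps = 0
for (let t = 0; t <= 1000; t += 20) {
  measuredFps = meter.tick(t)
}
```

In a hook, `tick` would be called with `performance.now()` each time a frame message arrives.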

3.5 Cloud Relay Setup

python

# Location: deploy/mocopi-relay/mocopi_relay_v4.py

"""
Mocopi Cloud Relay Server

Features:
- WebSocket server with SSL
- Session management
- Multi-client broadcasting
- Authentication
- Metrics logging
"""

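import asyncio
import ssl
import json
import time
from dataclasses import dataclass, field
from typing import Dict, Set

# Legacy websockets API, matching the (websocket, path) handler signature below
from websockets.server import WebSocketServerProtocol as WebSocket, serve

@dataclass
class Session:
    id: str
    sender: WebSocket | None = None
    receivers: Set[WebSocket] = field(default_factory=set)
    created_at: float = field(default_factory=time.time)
    last_activity: float = field(default_factory=time.time)
    frame_count: int = 0

class MocopiRelay:
    def __init__(self, host: str, port: int, ssl_context: ssl.SSLContext | None = None):
        self.host = host
        self.port = port
        self.ssl_context = ssl_context
        self.sessions: Dict[str, Session] = {}

    def get_or_create_session(self, session_id: str) -> Session:
        """Return the session for session_id, creating it on first use."""
        if session_id not in self.sessions:
            self.sessions[session_id] = Session(id=session_id)
        return self.sessions[session_id]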

    async def handler(self, websocket, path):
        """Handle incoming WebSocket connections."""
        try:
            # First message should be registration
            message = await websocket.recv()
            data = json.loads(message)

            if data['type'] == 'register':
                session_id = data['sessionId']
                client_type = data['clientType']  # 'sender' or 'receiver'

                if client_type == 'sender':
                    await self.handle_sender(websocket, session_id)
                else:
                    await self.handle_receiver(websocket, session_id)
        except Exception as e:
            print(f"Error: {e}")

    async def handle_sender(self, websocket, session_id: str):
        """Handle sender (BVH Sender app)."""
        session = self.get_or_create_session(session_id)
        session.sender = websocket

        try:
            async for message in websocket:
                # Broadcast to all receivers
                if session.receivers:
                    await asyncio.gather(
                        *[r.send(message) for r in session.receivers],
                        return_exceptions=True
                    )
                session.frame_count += 1
                session.last_activity = time.time()
        finally:
            session.sender = None

    async def handle_receiver(self, websocket, session_id: str):
        """Handle receiver (browser)."""
        session = self.get_or_create_session(session_id)
        session.receivers.add(websocket)

        try:
            await websocket.wait_closed()
        finally:
            session.receivers.discard(websocket)

    async def start(self):
        """Start the relay server."""
        async with serve(
            self.handler,
            self.host,
            self.port,
            ssl=self.ssl_context
        ):
            print(f"Mocopi Relay running on {self.host}:{self.port}")
            await asyncio.Future()  # Run forever

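# Usage
if __name__ == "__main__":
    # Certificate paths are placeholders; point them at your own TLS files
    ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ssl_context.load_cert_chain("cert.pem", "key.pem")

    relay = MocopiRelay("[ip]", 443, ssl_context)
    asyncio.run(relay.start())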
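
The relay's handler expects the first message on each connection to be a JSON registration carrying `type`, `sessionId`, and `clientType`. A small client-side sketch of building and validating that message follows. The function names are hypothetical, and the parser is stricter than the relay shown above, which treats any non-sender client as a receiver.

```typescript
type ClientType = 'sender' | 'receiver'

interface RegisterMessage {
  type: 'register'
  sessionId: string
  clientType: ClientType
}

// Serialize the registration a client sends right after the socket opens.
function buildRegisterMessage(sessionId: string, clientType: ClientType): string {
  const message: RegisterMessage = { type: 'register', sessionId, clientType }
  return JSON.stringify(message)
}

// Validate a raw message; returns null for anything that is not a well-formed registration.
function parseRegisterMessage(raw: string): RegisterMessage | null {
  let data: unknown
  try {
    data = JSON.parse(raw)
  } catch {
    return null
  }
  if (typeof data !== 'object' || data === null) return null
  const fields = data as Record<string, unknown>
  const sessionId = fields['sessionId']
  const clientType = fields['clientType']
  if (fields['type'] !== 'register') return null
  if (typeof sessionId !== 'string') return null
  if (clientType !== 'sender' && clientType !== 'receiver') return null
  return { type: 'register', sessionId, clientType }
}

const registration = buildRegisterMessage('studio-1', 'receiver')
const parsedRegistration = parseRegisterMessage(registration)
```

A browser client would pass the string from `buildRegisterMessage` to `WebSocket.send` in its `open` handler.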

---

4. Sensor Fusion

4.1 Fusion Strategy

The fusion system combines data from multiple sources with different characteristics:

| Source | Strengths | Weaknesses |
|--------|-----------|------------|
| MediaPipe | Position accuracy, facial detail | Occlusion, lighting dependent |
| Mocopi | Rotation accuracy, no occlusion | No absolute position, drift |
| iOS Motion | High frequency, acceleration | Single device only |

4.2 Fusion Algorithm

typescript

// Location: apps/web/cc-dashboard/src/lib/mocopi/MocopiLimbFusion.ts

export class MocopiLimbFusion {
  private kalmanFilters: Map<string, KalmanFilter>
  private lastMediaPipe: PoseOutput | null
  private lastMocopi: MocopiFrame | null

  constructor(config: FusionConfig) {
    // Initialize Kalman filters for each joint
    this.kalmanFilters = new Map()

    for (const joint of FUSION_JOINTS) {
      this.kalmanFilters.set(joint, new KalmanFilter({
        processNoise: config.processNoise,
        measurementNoise: config.measurementNoise,
        initialState: [0, 0, 0, 0, 0, 0]  // pos + vel
      }))
    }
  }

  fuse(mediapipe: PoseOutput | null, mocopi: MocopiFrame | null): FusedSkeletonState {
    const result: FusedSkeletonState = {
      joints: new Map(),
      timestamp: Date.now(),
      confidence: 0
    }

    // Weight sources based on availability and quality
    const mpWeight = mediapipe ? this.computeMediaPipeWeight(mediapipe) : 0
    const mocopiWeight = mocopi ? this.computeMocopiWeight(mocopi) : 0

    const totalWeight = mpWeight + mocopiWeight
    if (totalWeight === 0) return result

    // Fuse each joint
    for (const [jointName, kalman] of this.kalmanFilters) {
      // Get position from MediaPipe (if available)
      const mpPosition = mediapipe
        ? this.getMediaPipeJointPosition(mediapipe, jointName)
        : null

      // Get rotation from Mocopi (if available)
      const mocopiRotation = mocopi
        ? this.getMocopiJointRotation(mocopi, jointName)
        : null

      // Kalman update
      if (mpPosition) {
        kalman.updatePosition(mpPosition, mpWeight / totalWeight)
      }
      if (mocopiRotation) {
        kalman.updateRotation(mocopiRotation, mocopiWeight / totalWeight)
      }

      // Get fused state
      const state = kalman.getState()

      result.joints.set(jointName, {
        position: state.position,
        rotation: state.rotation,
        velocity: state.velocity,
        confidence: (mpWeight + mocopiWeight) / 2
      })
    }

    result.confidence = totalWeight / 2
    return result
  }

  private computeMediaPipeWeight(mp: PoseOutput): number {
    // Weight based on visibility and confidence
    return mp.confidence * (mp.visibleLandmarks / 33)
  }

  private computeMocopiWeight(mocopi: MocopiFrame): number {
    // Weight based on sensor connectivity
    const activeSensors = Object.values(mocopi.sensors)
      .filter(s => (s.signalStrength ?? 0) > 50).length
    return activeSensors / 6
  }
}
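
The weighting step in `fuse` can be followed with concrete numbers. The sketch below restates the two weight formulas from the class as standalone functions and normalizes them into shares that sum to 1. The function names and sample inputs are illustrative, and a missing `signalStrength` is treated as 0, which is an assumption.

```typescript
const POSE_LANDMARK_COUNT = 33
const MOCOPI_SENSOR_COUNT = 6

// MediaPipe weight: overall confidence scaled by the fraction of visible landmarks.
function mediaPipeWeight(confidence: number, visibleLandmarks: number): number {
  return confidence * (visibleLandmarks / POSE_LANDMARK_COUNT)
}

// Mocopi weight: fraction of sensors whose signal strength exceeds 50.
function mocopiWeight(signalStrengths: Array<number | undefined>): number {
  const active = signalStrengths.filter((s) => (s ?? 0) > 50).length
  return active / MOCOPI_SENSOR_COUNT
}

// Convert the two raw weights into shares of the total.
function normalizedShares(mp: number, mocopi: number): { mp: number; mocopi: number } {
  const total = mp + mocopi
  if (total === 0) return { mp: 0, mocopi: 0 }
  return { mp: mp / total, mocopi: mocopi / total }
}

// Full-confidence camera view, but only three of six sensors with a good signal.
const demoMpWeight = mediaPipeWeight(1, 33)
const demoMocopiWeight = mocopiWeight([90, 80, 70, 40, undefined, 10])
const demoShares = normalizedShares(demoMpWeight, demoMocopiWeight)
```

Here the camera contributes two thirds of the total weight and the IMU suit one third, so MediaPipe measurements are trusted more in the Kalman update.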

4.3 Kalman Filter Implementation

typescript

// Simplified Kalman filter for position/velocity fusion

class KalmanFilter {
  private state: Float64Array      // [x, y, z, vx, vy, vz]
  private covariance: Float64Array // 6x6 matrix
  private F: Float64Array          // State transition matrix
  private Q: Float64Array          // Process noise
  private H: Float64Array          // Measurement matrix
  private R: Float64Array          // Measurement noise

  constructor(config: KalmanConfig) {
    this.state = new Float64Array(6)
    this.covariance = this.identity(6)

    // State transition: x' = x + v*dt
    this.F = new Float64Array([
      1, 0, 0, 1, 0, 0,  // x' = x + vx
      0, 1, 0, 0, 1, 0,  // y' = y + vy
      0, 0, 1, 0, 0, 1,  // z' = z + vz
      0, 0, 0, 1, 0, 0,  // vx' = vx
      0, 0, 0, 0, 1, 0,  // vy' = vy
      0, 0, 0, 0, 0, 1   // vz' = vz
    ])

    this.Q = this.diagonal(6, config.processNoise)
    this.R = this.diagonal(3, config.measurementNoise)

    // Measure position only
    this.H = new Float64Array([
      1, 0, 0, 0, 0, 0,
      0, 1, 0, 0, 0, 0,
      0, 0, 1, 0, 0, 0
    ])
  }

  predict(dt: number): void {
    // Update F with actual dt
    this.F[3] = this.F[10] = this.F[17] = dt

    // x' = F * x
    this.state = this.matVecMul(this.F, this.state)

    // P' = F * P * F' + Q
    this.covariance = this.matAdd(
      this.matMul(this.F, this.matMul(this.covariance, this.transpose(this.F))),
      this.Q
    )
  }

  update(measurement: [number, number, number], weight: number): void {
    // A non-positive weight would divide R by zero below, so skip the update
    if (weight <= 0) return

    // Innovation: y = z - H * x
    const predicted = this.matVecMul(this.H, this.state)
    const innovation = measurement.map((z, i) => z - predicted[i])

    // Kalman gain: K = P * H' * (H * P * H' + R)^-1
    const PHt = this.matMul(this.covariance, this.transpose(this.H))
    const S = this.matAdd(
      this.matMul(this.H, PHt),
      this.R.map(r => r / weight)  // Scale noise by weight
    )
    const K = this.matMul(PHt, this.invert(S))

    // Update state: x = x + K * y
    const correction = this.matVecMul(K, innovation)
    this.state = this.state.map((s, i) => s + correction[i])

    // Update covariance: P = (I - K * H) * P
    const KH = this.matMul(K, this.H)
    const IKH = this.identity(6).map((v, i) => v - KH[i])
    this.covariance = this.matMul(IKH, this.covariance)
  }

  getState(): { position: Vector3, velocity: Vector3 } {
    return {
      position: [this.state[0], this.state[1], this.state[2]],
      velocity: [this.state[3], this.state[4], this.state[5]]
    }
  }
}
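
The class above relies on matrix helpers (`matMul`, `invert`, and so on) that are not shown. To make the predict and update steps concrete, here is a self-contained single-axis version with state `[position, velocity]` and the 2x2 covariance written out by hand. It is a simplified sketch of the same constant-velocity model, not the project's implementation; `AxisKalman` is a hypothetical name.

```typescript
// Constant-velocity Kalman filter for one axis. State: [x, v]; measurement: x only.
class AxisKalman {
  private x = 0
  private v = 0
  // Covariance P = [[p00, p01], [p10, p11]], initialized to the identity.
  private p00 = 1
  private p01 = 0
  private p10 = 0
  private p11 = 1
  private q: number
  private r: number

  constructor(processNoise: number, measurementNoise: number) {
    this.q = processNoise
    this.r = measurementNoise
  }

  // x' = F x and P' = F P F^T + Q, with F = [[1, dt], [0, 1]] and Q = q * I.
  predict(dt: number): void {
    this.x += this.v * dt
    const p00 = this.p00 + dt * (this.p10 + this.p01) + dt * dt * this.p11 + this.q
    const p01 = this.p01 + dt * this.p11
    const p10 = this.p10 + dt * this.p11
    const p11 = this.p11 + this.q
    this.p00 = p00
    this.p01 = p01
    this.p10 = p10
    this.p11 = p11
  }

  // Measurement update with H = [1, 0]; a lower weight inflates the measurement noise.
  update(z: number, weight: number): void {
    if (weight <= 0) return
    const s = this.p00 + this.r / weight // innovation covariance
    const k0 = this.p00 / s // Kalman gain for position
    const k1 = this.p10 / s // Kalman gain for velocity
    const innovation = z - this.x
    this.x += k0 * innovation
    this.v += k1 * innovation
    // P = (I - K H) P
    const p00 = (1 - k0) * this.p00
    const p01 = (1 - k0) * this.p01
    const p10 = this.p10 - k1 * this.p00
    const p11 = this.p11 - k1 * this.p01
    this.p00 = p00
    this.p01 = p01
    this.p10 = p10
    this.p11 = p11
  }

  getState(): { position: number; velocity: number } {
    return { position: this.x, velocity: this.v }
  }
}

// Feed a stationary joint at position 5, sampled at 30 Hz.
const axisFilter = new AxisKalman(0.01, 0.1)
for (let i = 0; i < 200; i++) {
  axisFilter.predict(1 / 30)
  axisFilter.update(5, 1)
}
const axisState = axisFilter.getState()
```

Running three such filters, one per axis, is equivalent to the 6-state filter only while the axes stay uncorrelated, which holds for the diagonal `Q` and `R` used above.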

---

5. Skeleton System

5.1 Unified Skeleton Representation

python

# Location: core/cc-core/cc_core/skeleton/pose_frame.py

from __future__ import annotations  # PoseFrame refers to JointData before it is defined

from dataclasses import dataclass
from typing import Dict, List, Optional
import numpy as np

@dataclass
class PoseFrame:
    """Single frame of skeleton data."""

    timestamp: float
    joints: Dict[str, JointData]
    root_position: np.ndarray  # [3,]
    root_rotation: np.ndarray  # [4,] quaternion

    # Metadata
    source: str  # 'mediapipe', 'mocopi', 'fused'
    confidence: float
    frame_id: int

@dataclass
class JointData:
    """Data for a single joint."""

    position: np.ndarray       # [3,] local position
    rotation: np.ndarray       # [4,] quaternion
    world_position: np.ndarray # [3,] world position
    world_rotation: np.ndarray # [4,] world quaternion

    velocity: Optional[np.ndarray] = None      # [3,]
    angular_velocity: Optional[np.ndarray] = None  # [3,]

    confidence: float = 1.0


# Standard joint names
JOINT_NAMES = [
    'Hips', 'Spine', 'Spine1', 'Spine2', 'Neck', 'Head',
    'LeftShoulder', 'LeftArm', 'LeftForeArm', 'LeftHand',
    'RightShoulder', 'RightArm', 'RightForeArm', 'RightHand',
    'LeftUpLeg', 'LeftLeg', 'LeftFoot', 'LeftToeBase',
    'RightUpLeg', 'RightLeg', 'RightFoot', 'RightToeBase'
]

# Parent-child relationships
SKELETON_PARENTS = {
    'Hips': None,
    'Spine': 'Hips',
    'Spine1': 'Spine',
    'Spine2': 'Spine1',
    'Neck': 'Spine2',
    'Head': 'Neck',
    'LeftShoulder': 'Spine2',
    'LeftArm': 'LeftShoulder',
    'LeftForeArm': 'LeftArm',
    'LeftHand': 'LeftForeArm',
    'RightShoulder': 'Spine2',
    'RightArm': 'RightShoulder',
    'RightForeArm': 'RightArm',
    'RightHand': 'RightForeArm',
    'LeftUpLeg': 'Hips',
    'LeftLeg': 'LeftUpLeg',
    'LeftFoot': 'LeftLeg',
    'LeftToeBase': 'LeftFoot',
    'RightUpLeg': 'Hips',
    'RightLeg': 'RightUpLeg',
    'RightFoot': 'RightLeg',
    'RightToeBase': 'RightFoot'
}
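
A parent map of this shape is typically consumed in two ways: walking from a joint up to the root, and iterating joints so that every parent is processed before its children (the order a forward-kinematics pass needs). The sketch below shows both on a trimmed copy of the map; `chain_to_root` and `traversal_order` are hypothetical helper names, not part of `pose_frame.py`.

```python
from typing import Dict, List, Optional

# Trimmed copy of the parent map (spine and left arm only), for illustration.
PARENTS: Dict[str, Optional[str]] = {
    'Hips': None,
    'Spine': 'Hips',
    'Spine1': 'Spine',
    'Spine2': 'Spine1',
    'LeftShoulder': 'Spine2',
    'LeftArm': 'LeftShoulder',
    'LeftForeArm': 'LeftArm',
    'LeftHand': 'LeftForeArm',
}


def chain_to_root(joint: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Return [joint, parent, grandparent, ..., root].

    Raises KeyError if a parent is missing from the map and
    ValueError if the map contains a cycle.
    """
    chain: List[str] = []
    current: Optional[str] = joint
    while current is not None:
        if current in chain:
            raise ValueError(f"cycle detected at {current!r}")
        chain.append(current)
        current = parents[current]
    return chain


def traversal_order(parents: Dict[str, Optional[str]]) -> List[str]:
    """Order joints by depth so each parent precedes its children."""
    return sorted(parents, key=lambda j: len(chain_to_root(j, parents)))


hand_chain = chain_to_root('LeftHand', PARENTS)
order = traversal_order(PARENTS)
```

Sorting by depth is quadratic in the worst case, which is negligible for a 22-joint skeleton; a larger rig would more likely cache the order once at load time.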

5.2 Derived Kinematics

python

# Location: core/cc-core/cc_core/skeleton/derived_kinematics.py

import numpy as np
from typing import List
from .pose_frame import PoseFrame

class DerivedKinematics:
    """Compute velocities, accelerations, and jerk from pose sequences."""

    def __init__(self, fps: float = 30.0):
        self.fps = fps
        self.dt = 1.0 / fps
        self.history: List[PoseFrame] = []
        self.max_history = 5

    def update(self, frame: PoseFrame) -> PoseFrame:
        """Add frame and compute derivatives."""
        self.history.append(frame)
        if len(self.history) > self.max_history:
            self.history.pop(0)

        # Compute velocities
        if len(self.history) >= 2:
            self._compute_velocities(frame)

        # Compute accelerations (needs three frames; _compute_accelerations is not shown in this excerpt)
        if len(self.history) >= 3:
            self._compute_accelerations(frame)

        return frame

    def _compute_velocities(self, frame: PoseFrame):
        """Compute joint velocities."""
        prev = self.history[-2]

        for joint_name, joint in frame.joints.items():
            if joint_name in prev.joints:
                prev_joint = prev.joints[joint_name]

                # Linear velocity
                joint.velocity = (
                    joint.world_position - prev_joint.world_position
                ) / self.dt

                # Angular velocity (from quaternion difference)
                joint.angular_velocity = self._quat_to_angular_velocity(
                    prev_joint.world_rotation,
                    joint.world_rotation,
                    self.dt
                )

    def _quat_to_angular_velocity(
        self,
        q0: np.ndarray,
        q1: np.ndarray,
        dt: float
    ) -> np.ndarray:
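        """Convert quaternion change to angular velocity (rad/s).

        Assumes unit quaternions in [w, x, y, z] order.
        """
        # q_diff = q1 * q0^-1 (for unit quaternions the conjugate is the inverse)
        q0_inv = np.array([q0[0], -q0[1], -q0[2], -q0[3]])
        q_diff = self._quat_mul(q1, q0_inv)

        # q and -q encode the same rotation; keep w >= 0 to get the shortest arc
        if q_diff[0] < 0:
            q_diff = -q_diff

        # Extract axis-angle
        angle = 2 * np.arccos(np.clip(q_diff[0], -1, 1))

        if angle < 1e-6:
            return np.zeros(3)

        axis = q_diff[1:] / np.sin(angle / 2)

        return axis * angle / dt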

    def get_energy(self, frame: PoseFrame) -> float:
        """Compute a kinetic-energy proxy (unit mass and unit inertia per joint)."""
        energy = 0.0

        for joint in frame.joints.values():
            if joint.velocity is not None:
                # Translational energy: 1/2 * m * v^2, with m = 1
                energy += 0.5 * np.sum(joint.velocity ** 2)

            if joint.angular_velocity is not None:
                # Rotational energy (simplified): 1/2 * I * w^2, with I = 1
                energy += 0.5 * np.sum(joint.angular_velocity ** 2)

        return energy
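
`DerivedKinematics` calls `_quat_mul` and `_compute_accelerations`, neither of which appears in this excerpt. The sketch below shows one plausible shape for the underlying math as free functions, assuming `[w, x, y, z]` quaternion order (consistent with the conjugate used in `_quat_to_angular_velocity`) and equally spaced frames. The function names and the three-sample difference scheme are assumptions, not the module's actual code.

```python
import numpy as np


def quat_mul(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Hamilton product of two quaternions in [w, x, y, z] order."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    ])


def acceleration(p0: np.ndarray, p1: np.ndarray, p2: np.ndarray, dt: float) -> np.ndarray:
    """Second difference over three equally spaced position samples."""
    return (p2 - 2.0 * p1 + p0) / (dt * dt)


# Two 90-degree turns about z compose into a 180-degree turn about z.
half = np.pi / 4
q90 = np.array([np.cos(half), 0.0, 0.0, np.sin(half)])
q180 = quat_mul(q90, q90)

# Samples of x(t) = t^2 at t = 0.0, 0.1, 0.2 have constant acceleration 2.
dt = 0.1
samples = [np.array([t * t, 0.0, 0.0]) for t in (0.0, dt, 2 * dt)]
accel = acceleration(*samples, dt)
```

A second difference amplifies measurement noise roughly in proportion to `1 / dt**2`, so smoothing positions first is usually worthwhile before differentiating twice.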

---

6. API Reference

6.1 MediaPipe Exports

typescript

// apps/web/cc-dashboard/src/lib/mediapipe/index.ts

// Types
export * from './types'

// Legacy API
export {
  MediaPipeService,
  getMediaPipeService,
  destroyMediaPipeService,
} from './MediaPipeService'

// Legacy Hooks
export {
  useMediaPipe,
  useMediaPipeStore,
  useFaceExpressions,
  useHandGestures,
  useBodyPose,
  useMediaPipeEnhancements,
} from './useMediaPipe'

// Modular API - Core
export { EventEmitter, CameraService } from './core'
export type { CameraConfig, CameraState, CameraEvents } from './core'

// Modular API - Analyzers
export {
  BaseAnalyzer,
  FaceAnalyzer,
  HandAnalyzer,
  PoseAnalyzer,
} from './analyzers'

export type {
  AnalyzerConfig,
  AnalyzerOutput,
  FaceAnalyzerConfig,
  FaceOutput,
  HandAnalyzerConfig,
  HandOutput,
  HandGesture,
  FingerState,
  PoseAnalyzerConfig,
  PoseOutput,
  JointState,
  LimbState,
} from './analyzers'

// Modular API - Pipeline
export { HolisticPipeline } from './pipeline'
export type {
  PipelineConfig,
  PipelineState,
  PipelineOutput,
  PipelineEvents,
} from './pipeline'

6.2 Mocopi Exports

typescript

// apps/web/cc-dashboard/src/lib/mocopi/index.ts

export * from './types'

export { useMocopiStream } from './useMocopiStream'
export { useConductorStream } from './useConductorStream'
export { MocopiLimbFusion } from './MocopiLimbFusion'
export { mocopiToConductor } from './conductorAdapter'

---

7. Performance Optimization

7.1 Target Metrics

| Metric | Target | Measurement |
|---|---|---|
| MediaPipe FPS | 30 fps | requestAnimationFrame |
| Mocopi Latency | < 50ms | timestamp delta |
| Fusion Latency | < 10ms | processing time |
| Memory Usage | < 200MB | heap size |
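
A target such as "Fusion Latency < 10ms" is only useful if it is checked against a rolling window of real samples rather than a single reading. The following is a minimal sketch of such a check on the Python side; `LatencyBudget`, its window size, and the nearest-rank 95th-percentile estimate are assumptions made for illustration.

```python
import time
from collections import deque
from typing import Any, Callable, Deque


class LatencyBudget:
    """Rolling latency samples compared against a target in milliseconds."""

    def __init__(self, target_ms: float, window: int = 120):
        self.target_ms = target_ms
        self.samples: Deque[float] = deque(maxlen=window)  # oldest samples drop off

    def record(self, elapsed_ms: float) -> None:
        self.samples.append(elapsed_ms)

    def measure(self, fn: Callable[..., Any], *args: Any) -> Any:
        """Run fn(*args), record its wall-clock duration, and return its result."""
        start = time.perf_counter()
        result = fn(*args)
        self.record((time.perf_counter() - start) * 1000.0)
        return result

    def p95(self) -> float:
        """Simple nearest-rank estimate of the 95th percentile."""
        if not self.samples:
            return 0.0
        ordered = sorted(self.samples)
        index = min(len(ordered) - 1, int(0.95 * len(ordered)))
        return ordered[index]

    def within_budget(self) -> bool:
        return self.p95() < self.target_ms


fusion_budget = LatencyBudget(target_ms=10.0)
for ms in [2.0, 3.0, 4.0, 5.0] * 5:
    fusion_budget.record(ms)
```

Using a high percentile instead of the mean keeps occasional long frames visible, which matters more for perceived smoothness than the average does.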

7.2 Optimization Strategies

MediaPipe

typescript

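import { Holistic } from '@mediapipe/holistic';

// Keep per-frame cost low; note that no option below explicitly selects a GPU backend
const holistic = new Holistic({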
  locateFile: (file) =>
    `https://cdn.jsdelivr.net/npm/@mediapipe/holistic/${file}`,
});

holistic.setOptions({
  modelComplexity: 1,         // 0, 1, or 2
  smoothLandmarks: true,
  enableSegmentation: false,  // Disable if not needed
  smoothSegmentation: false,
  refineFaceLandmarks: false, // Disable for performance
  minDetectionConfidence: 0.5,
  minTrackingConfidence: 0.5,
});

WebSocket

typescript

// Use binary protocol for Mocopi
const ws = new WebSocket(url);
ws.binaryType = 'arraybuffer';

ws.onmessage = (event) => {
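  // Parse binary directly instead of JSON
  const view = new DataView(event.data);
  // DataView reads big-endian by default; pass `true` as the second
  // argument if the sender writes little-endian
  const timestamp = view.getFloat64(0);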
  // ... parse rest of frame
};
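
A binary protocol only works if both ends agree on field order, field widths, and byte order. The sketch below illustrates this with Python's `struct` module, as a relay or test harness might do. The frame layout (float64 timestamp, uint16 joint count, then seven float32 values per joint) is a hypothetical example, not the actual Mocopi wire format. It is written big-endian to match what `DataView.getFloat64(0)` reads when no little-endian flag is passed.

```python
import struct
from typing import List, Sequence, Tuple

# Hypothetical layout, big-endian (">") with no padding:
#   float64 timestamp | uint16 joint_count | per joint: 7 x float32 (px, py, pz, qw, qx, qy, qz)
HEADER = struct.Struct(">dH")
JOINT = struct.Struct(">7f")


def pack_frame(timestamp: float, joints: Sequence[Sequence[float]]) -> bytes:
    """Serialize one frame into the hypothetical binary layout."""
    body = b"".join(JOINT.pack(*joint) for joint in joints)
    return HEADER.pack(timestamp, len(joints)) + body


def unpack_frame(data: bytes) -> Tuple[float, List[Tuple[float, ...]]]:
    """Parse a frame produced by pack_frame."""
    timestamp, count = HEADER.unpack_from(data, 0)
    joints = [
        JOINT.unpack_from(data, HEADER.size + i * JOINT.size)
        for i in range(count)
    ]
    return timestamp, joints


frame = pack_frame(12.5, [(0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0)])
timestamp, joints = unpack_frame(frame)

# Reading the same bytes with the wrong byte order silently yields garbage.
wrong_order = struct.unpack_from("<d", frame, 0)[0]
```

A byte-order mismatch raises no error on either side, so a version or magic field at the start of each frame is a cheap way to detect it early.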

React Optimization

typescript

import { useEffect, useMemo, useRef } from 'react';

// Memoize expensive computations
const processedPose = useMemo(() => {
  if (!pose) return null;
  return computeExpensiveMetrics(pose);
}, [pose?.timestamp]);

// Use refs for high-frequency updates
const frameRef = useRef<PoseOutput | null>(null);

useEffect(() => {
  const unsubscribe = pipeline.on('pose', (pose) => {
    frameRef.current = pose;  // Don't trigger re-render
  });
  return unsubscribe;
}, [pipeline]);  // Re-subscribe if the pipeline instance changes

---

Document Version: 2.0.0
Generated: December 26, 2024

Promotion Decision

Promote into a technical note or architecture paper with implementation anchors.

Source Anchor

projects/Documentation/01-architecture/systems/MOTION_CAPTURE_PIPELINE.md

Detected Structure

Method · Evaluation · Code Anchors · Architecture
