Grand Diomande Research · Full HTML Reader

Motion as Language: Semantic Meaning from Movement

> "The body speaks before the mind knows what it's saying."

Overview

This document explores how continuous human movement maps to discrete semantic meaning through the Comp-Core motion intelligence pipeline. At its heart is the 2.16ms latent motion window: the average time the pipeline takes to turn raw sensor data into meaningful intent, treated here as a quantum of embodied computation.

---

Table of Contents

1. [The 2.16ms Latent Window](#the-216ms-latent-window)
2. [Gesture Vocabulary Taxonomy](#gesture-vocabulary-taxonomy)
3. [Motion → Intent Translation Pipeline](#motion--intent-translation-pipeline)
4. [Connection to cc-inscription Sigils](#connection-to-cc-inscription-sigils)
5. [Theoretical Foundations](#theoretical-foundations)
6. [Implementation Architecture](#implementation-architecture)

---

The 2.16ms Latent Window

What Makes 2.16ms Special?

The Comp-Core system achieves 2.16ms average latency from sensor input to semantic output. The figure is not arbitrary. It sits at a sweet spot that satisfies three constraints:

1. Perceptual continuity: Well below the 10-20ms threshold at which humans perceive delay
2. Information sufficiency: Enough motion data for meaningful feature extraction
3. Causal coherence: Fast enough to feel like "now," slow enough to reason about

The Latent Space

Motion data flows through a 104-dimensional latent space produced by RPS (Recursive Polymodal Synthesis). Four modality encoders contribute 25 + 8 + 32 + 39 = 104 dimensions:

┌─────────────────────────────────────────────────────────────────┐
│                    104-D LATENT SPACE                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Motion (25D)  ──┐                                              │
│                  │    ┌──────────┐    ┌──────────┐              │
│  Heart Rate (8D) ├───►│  RPS     │───►│ Latent z │─► 104D       │
│                  │    │ Encoders │    │  Vector  │              │
│  Audio (32D)  ───┤    └──────────┘    └──────────┘              │
│                  │                                              │
│  Context (39D) ──┘                                              │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
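The dimension bookkeeping in the diagram can be checked with a minimal sketch. It assumes the latent vector is a plain concatenation of the four modality vectors in a fixed order, which the source does not confirm (the RPS encoders may mix modalities). `MODALITY_DIMS` and `assemble_latent` are hypothetical names used only for illustration.

```python
# Per-modality widths taken from the diagram.
MODALITY_DIMS = {"motion": 25, "heart_rate": 8, "audio": 32, "context": 39}
LATENT_DIM = sum(MODALITY_DIMS.values())


def assemble_latent(features):
    """Concatenate per-modality vectors in a fixed order (assumed layout)."""
    z = []
    for name, dim in MODALITY_DIMS.items():
        vec = list(features[name])
        if len(vec) != dim:
            raise ValueError(f"{name}: expected {dim} values, got {len(vec)}")
        z.extend(float(x) for x in vec)
    return z


# Usage: zero-filled inputs of the right widths produce a 104-D vector.
z = assemble_latent({name: [0.0] * dim for name, dim in MODALITY_DIMS.items()})
print(len(z))  # 104
```

Validating each width before concatenating means a mis-sized sensor vector fails loudly instead of silently shifting every later dimension.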

Temporal Dynamics

A single frame tells you where. A window tells you what:

| Window Size | Information Captured | Use Case |
|---|---|---|
| 1 frame (~16.7ms @ 60Hz) | Position only | State snapshot |
| 8 frames (~130ms) | Velocity + direction | Micro-gesture detection |
| 32 frames (~530ms) | Acceleration + rhythm | Phrase recognition |
| 128 frames (~2.1s) | Pattern + periodicity | Movement style, intent |
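The durations in the table follow directly from the frame count and the 60Hz frame rate. A small sketch of that arithmetic (the function name is illustrative, not part of Comp-Core):

```python
FRAME_RATE_HZ = 60.0


def window_duration_ms(n_frames, frame_rate_hz=FRAME_RATE_HZ):
    """Duration covered by n_frames consecutive frames, in milliseconds."""
    return 1000.0 * n_frames / frame_rate_hz


for n in (1, 8, 32, 128):
    print(f"{n:>3} frames = {window_duration_ms(n):7.1f} ms")
```

At 60Hz this gives roughly 16.7, 133.3, 533.3, and 2133.3 ms, which the table rounds.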

The anticipation kernel operates on a sliding window, extracting temporal features that map to meaning:

```python
# Conceptual representation
commitment = how_locked_in_is_the_trajectory(window)           # 0-1
uncertainty = entropy_over_possible_futures(window)            # 0-1
transition_pressure = rate_of_change(commitment, uncertainty)  # Signed
```
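One way to make these three signals concrete is sketched below for a window of 2-D positions. This is not the cc-anticipation implementation. The definitions are assumptions chosen for illustration: commitment as heading consistency, uncertainty as normalised heading entropy, and transition pressure as the change in both between the two halves of the window.

```python
import math


def _velocities(window):
    """Frame-to-frame displacement vectors for a list of (x, y) points."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(window, window[1:])]


def commitment(window):
    """Mean resultant length of unit velocity vectors, in [0, 1]."""
    units = []
    for vx, vy in _velocities(window):
        speed = math.hypot(vx, vy)
        if speed > 0:
            units.append((vx / speed, vy / speed))
    if not units:
        return 1.0  # assumption: a motionless hold counts as fully committed
    mx = sum(u[0] for u in units) / len(units)
    my = sum(u[1] for u in units) / len(units)
    return min(1.0, math.hypot(mx, my))


def uncertainty(window, bins=8):
    """Shannon entropy of the heading histogram, normalised to [0, 1]."""
    counts = [0] * bins
    for vx, vy in _velocities(window):
        if vx == 0 and vy == 0:
            continue
        angle = math.atan2(vy, vx) % (2 * math.pi)
        counts[min(bins - 1, int(angle / (2 * math.pi) * bins))] += 1
    total = sum(counts)
    if total == 0:
        return 0.0
    h = -sum((c / total) * math.log2(c / total) for c in counts if c)
    return max(0.0, h / math.log2(bins))


def transition_pressure(window):
    """Signed change between window halves: positive means 'locking in'."""
    half = len(window) // 2
    early, late = window[: half + 1], window[half:]
    return (commitment(late) - commitment(early)) - (
        uncertainty(late) - uncertainty(early)
    )
```

Under these definitions a straight swipe scores commitment 1.0 and uncertainty 0.0, while a full circle scores near 0.0 and 1.0 respectively.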

---

Gesture Vocabulary Taxonomy

Hierarchy of Motion Meaning

┌─────────────────────────────────────────────────────────────────┐
│                    MOTION MEANING HIERARCHY                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  LEVEL 4: INTENT                                                │
│    "Navigate to kitchen"   "Express agreement"   "Request help" │
│         ▲                        ▲                    ▲         │
│         │                        │                    │         │
│  LEVEL 3: SEMANTIC PHRASES                                      │
│    [reach → point → hold]    [nod × 2]    [wave + point]       │
│         ▲                        ▲                    ▲         │
│         │                        │                    │         │
│  LEVEL 2: GESTURES                                              │
│    swipe_left    tap    hold    circle    nod    wave          │
│         ▲                        ▲                    ▲         │
│         │                        │                    │         │
│  LEVEL 1: PRIMITIVES                                            │
│    velocity    acceleration    jerk    angular_velocity        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
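The Level 1 primitives are successive time derivatives of position. A minimal sketch using forward finite differences on a 1-D position trace follows. The choice of forward differences and the function names are assumptions, and angular velocity is left out because it needs orientation data.

```python
def finite_difference(samples, dt):
    """Forward difference: one fewer output sample than input samples."""
    return [(b - a) / dt for a, b in zip(samples, samples[1:])]


def primitives(positions, frame_rate_hz=60.0):
    """Velocity, acceleration, and jerk from evenly sampled positions."""
    dt = 1.0 / frame_rate_hz
    velocity = finite_difference(positions, dt)
    acceleration = finite_difference(velocity, dt)
    jerk = finite_difference(acceleration, dt)
    return {"velocity": velocity, "acceleration": acceleration, "jerk": jerk}


# Usage: constant acceleration of 2.0 units/s^2 sampled at 60Hz.
trace = [0.5 * 2.0 * (i / 60.0) ** 2 for i in range(6)]
result = primitives(trace)
```

Each derivative shortens the series by one sample, so jerk needs at least four frames. That is one reason a single frame carries position only.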

The Three Gesture Domains

#### 1. Full-Body Gestures (Mocopi/Vision)
Captured from full-body motion tracking:
- Postural: T-pose, arms crossed, crouch
- Locomotive: Walk, run, jump, pivot
- Expressive: Dance moves, shrug, bow

#### 2. Hand Gestures (iPhone/Watch IMU)
Lightweight 6DOF tracking:
- Directional: Swipe, flick, throw
- Spatial: Circle, spiral, figure-8
- Temporal: Tap, double-tap, hold, shake

#### 3. Compound Gestures (Sequences)
Multi-step intentional patterns:
- Navigation: Point-and-hold, sweep-and-select
- Communication: Attention-getting sequences
- Control: Volume dial, slider, switch

Mapping to Anticipation Signals

Each gesture type produces characteristic anticipation signatures:

| Gesture Type | Commitment | Uncertainty | Transition Pressure |
|---|---|---|---|
| Ballistic (swipe) | High → Peak → Low | Low throughout | Strong positive then negative |
| Exploratory (circle) | Medium, oscillating | High initially, decreases | Low amplitude oscillation |
| Static (hold) | High, stable | Very low | Near zero |
| Transitional (tap) | Spike pattern | Brief spike | Sharp bidirectional |

---

Motion → Intent Translation Pipeline

The Complete Pipeline

┌─────────────────────────────────────────────────────────────────┐
│                 MOTION → INTENT PIPELINE                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  SENSORS                                                        │
│    Mocopi IMU (full-body)                                       │
│    iPhone/Watch (hand)        ┌─────────────────┐               │
│    MediaPipe (vision)    ────►│ cc-collection   │               │
│                               │ (Sensor Fusion) │               │
│                               └────────┬────────┘               │
│                                        │                        │
│                                        ▼                        │
│  ALIGNMENT                    ┌─────────────────┐               │
│                               │ cc-window-      │               │
│                               │ aligner         │               │
│                               │ (Deterministic) │               │
│                               └────────┬────────┘               │
│                                        │                        │
│                                        ▼                        │
│  ANTICIPATION                 ┌─────────────────┐               │
│                               │ cc-anticipation │               │
│                               │ (2.16ms kernel) │               │
│                               └────────┬────────┘               │
│                                        │                        │
│                         ┌──────────────┼──────────────┐         │
│                         ▼              ▼              ▼         │
│  CLASSIFICATION   ┌──────────┐  ┌──────────┐  ┌──────────┐     │
│                   │ cc-gesture│  │cc-conduct│  │cc-inscri-│     │
│                   │ (labels) │  │ (policy) │  │ ption    │     │
│                   └────┬─────┘  └────┬─────┘  └────┬─────┘     │
│                        │             │             │            │
│                        └──────────────┼──────────────┘          │
│                                       ▼                         │
│  OUTPUT                      ┌─────────────────┐                │
│                              │     Intent      │                │
│                              │  + N'Ko Sigil   │                │
│                              │  + Confidence   │                │
│                              └─────────────────┘                │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Stage 1: Sensor Fusion (cc-collection)

Raw sensor data → Unified skeleton representation

```rust
// Extended Kalman Filter combining multiple sources.
// `session_id`, `mocopi_json`, `dt`, `frame_rate`, and `beat_phase` come from the caller.
let mut engine = FusionEngine::new(session_id);
let skeleton = engine.process(mocopi_json, dt);

// Transform to ML-ready format
let transform = To25DTransform::new(frame_rate);
let motion_vector = transform.transform(skeleton, beat_phase); // 25-D
```

Stage 2: Window Alignment (cc-window-aligner)

Streaming frames → Deterministic windows

- Beat-aligned: Snap boundaries to musical beat phase
- Replay-stable: Same input always produces same windows
- Gap-tolerant: Handles dropped frames gracefully
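The replay-stable and gap-tolerant properties can be illustrated with a small sketch. It is not the cc-window-aligner algorithm. It assumes frames carry an integer index, fills dropped frames by holding the previous value, and omits beat alignment entirely. `align_windows` is a hypothetical name.

```python
def align_windows(frames, window_size, hop):
    """Cut (frame_index, value) pairs into fixed-size windows.

    Boundaries depend only on frame indices, never on arrival order or
    wall-clock time, so replaying the same frames yields the same windows.
    """
    if not frames:
        return []
    by_index = dict(frames)
    first, last = min(by_index), max(by_index)

    # Gap tolerance: a dropped frame repeats the last value seen.
    dense, previous = [], by_index[first]
    for i in range(first, last + 1):
        previous = by_index.get(i, previous)
        dense.append(previous)

    return [
        dense[start:start + window_size]
        for start in range(0, len(dense) - window_size + 1, hop)
    ]


# Usage: frame 2 was dropped, so its slot repeats the value of frame 1.
frames = [(0, "a"), (1, "b"), (3, "d"), (4, "e")]
print(align_windows(frames, window_size=2, hop=2))  # [['a', 'b'], ['b', 'd']]
```

Keying on the frame index instead of arrival time is what makes the output independent of network jitter or reordering.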

Stage 3: Anticipation Kernel (cc-anticipation)

Windows → Anticipatory signals

```python
from cc_anticipation import AnticipationKernel, MotionWindow

kernel = AnticipationKernel(config)
packet = kernel.process(window)

# Extract semantic signals
commitment = packet.commitment                    # How locked-in is the motion?
uncertainty = packet.uncertainty                  # How many futures are possible?
transition_pressure = packet.transition_pressure  # Is a change happening?
novelty = packet.novelty                          # Is this new or familiar?
stability = packet.stability                      # Is this predictable?
```

Stage 4: Gesture Classification (cc-gesture)

Anticipation signals + Neighbor voting → Labeled gestures

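```rust
// Query the phrase library for the k = 5 most similar motions
let neighbors = library.query(&embedding, 5);

// Aggregate votes weighted by distance
let votes = classifier.aggregate_votes(&neighbors);

// Gate by anticipation signals: commitment must be high and uncertainty low,
// so each check uses its own threshold
if packet.commitment > min_commitment && packet.uncertainty < uncertainty_threshold {
    // `label` and `confidence` come from the winning entry in `votes`
    return GestureEvent { label, confidence, timestamp };
}
```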
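The voting and gating steps can be sketched in Python as follows. Inverse-distance weighting is an assumption, since the source says only that votes are "weighted by distance". The default thresholds reuse the `min_commitment` and `uncertainty_threshold` values from the configuration section. Function names are illustrative.

```python
from collections import defaultdict


def aggregate_votes(neighbors, eps=1e-9):
    """Turn (label, distance) pairs into normalised per-label weights."""
    weights = defaultdict(float)
    for label, distance in neighbors:
        weights[label] += 1.0 / (distance + eps)  # closer neighbours count more
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


def classify(neighbors, commitment, uncertainty,
             min_commitment=0.6, uncertainty_threshold=0.3):
    """Return (label, confidence), or None when the gate stays closed."""
    if not neighbors:
        return None
    if not (commitment > min_commitment and uncertainty < uncertainty_threshold):
        return None
    votes = aggregate_votes(neighbors)
    label = max(votes, key=votes.get)
    return label, votes[label]


# Usage with k = 5 replaced by three neighbours for brevity.
neighbors = [("swipe_left", 0.1), ("swipe_left", 0.2), ("tap", 0.4)]
result = classify(neighbors, commitment=0.85, uncertainty=0.12)
```

Gating before voting means an uncommitted or ambiguous motion never produces a label, however close its nearest neighbours are.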

Stage 5: Intent Inference

Gestures → Semantic intent

The final mapping from gesture sequences to intent uses:

1. Temporal grammar: Valid gesture sequences (e.g., point → hold = select)
2. Context priors: Location, time, previous actions
3. User adaptation: Learned preferences and patterns
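The temporal grammar can be sketched as a lookup from gesture sequences to intents, matching the longest recent suffix first. Only `point → hold = select` comes from the text. The other entries are illustrative readings of the hierarchy diagram, and context priors and user adaptation are left out.

```python
# Gesture sequences mapped to intents. Only ("point", "hold") is from the
# text; the remaining entries are hypothetical examples.
GRAMMAR = {
    ("point", "hold"): "select",
    ("reach", "point", "hold"): "navigate",
    ("nod", "nod"): "express_agreement",
    ("wave", "point"): "request_help",
}


def infer_intent(recent_gestures, grammar=GRAMMAR):
    """Return the intent for the longest matching suffix, or None."""
    longest = max(len(pattern) for pattern in grammar)
    for n in range(min(longest, len(recent_gestures)), 0, -1):
        suffix = tuple(recent_gestures[-n:])
        if suffix in grammar:
            return grammar[suffix]
    return None


print(infer_intent(["tap", "point", "hold"]))    # select
print(infer_intent(["reach", "point", "hold"]))  # navigate
```

Trying the longest suffix first lets a three-step phrase take precedence over the two-step phrase it ends with.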

---

Connection to cc-inscription Sigils

The Ten Claim Types as Motion Semantics

The cc-inscription system defines ten fundamental motion semantics, each with an N'Ko sigil:

| Sigil | Name | Motion Meaning |
|---|---|---|
| ߛ | Stabilization | Dispersion decreased — motion is "settling" |
| ߜ | Dispersion | Spread increased — motion is "exploring" |
| ߕ | Transition | Curvature spike — discrete change point |
| ߙ | Return | Re-entry to known basin — "coming back" |
| ߡ | Dwell | Sustained stay — "resting here" |
| ߚ | Oscillation | Rapid alternation — "bouncing between" |
| ߞ | Recovery | Latency to return — "how long to reset" |
| ߣ | Novelty | Unknown region — "something new" |
| ߠ | Place-Shift | Location changed with dynamics — "moved to" |
| ߥ | Echo | Pattern match — "like before" |
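The ten claim types can be held in a two-way lookup, sketched below. The dictionary names and lower-case keys are illustrative and are not the cc-inscription lexicon format.

```python
# Claim name -> N'Ko sigil, copied from the list of ten claim types.
SIGILS = {
    "stabilization": "ߛ",
    "dispersion": "ߜ",
    "transition": "ߕ",
    "return": "ߙ",
    "dwell": "ߡ",
    "oscillation": "ߚ",
    "recovery": "ߞ",
    "novelty": "ߣ",
    "place_shift": "ߠ",
    "echo": "ߥ",
}

# Reverse lookup, for reading an inscription back into claim names.
CLAIM_BY_SIGIL = {sigil: name for name, sigil in SIGILS.items()}

# A two-way mapping only works if every sigil is unique.
assert len(CLAIM_BY_SIGIL) == len(SIGILS)
```

The uniqueness check matters because a reused sigil would make inscriptions impossible to decode without extra context.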

From Gesture to Sigil

┌─────────────────────────────────────────────────────────────────┐
│               GESTURE → SIGIL MAPPING                            │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Gesture Event                                                  │
│       │                                                         │
│       ▼                                                         │
│  ┌─────────────┐     ┌─────────────┐     ┌─────────────┐       │
│  │ Anticipation│ ──► │ Claim       │ ──► │ N'Ko        │       │
│  │ Packet      │     │ Detector    │     │ Renderer    │       │
│  └─────────────┘     └─────────────┘     └─────────────┘       │
│       │                    │                   │                │
│  commitment: 0.85    stabilize_claim     ߛ ⟦t0–t1⟧: z(σ)↓     │
│  uncertainty: 0.12   window: [t0, t1]    ⟦home⟧; c=0.85        │
│  novelty: 0.03       place: home                               │
│                      confidence: 0.85                          │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Provenance Chain

Every sigil is traceable back to its motion source:

z-trajectory → Anticipation → Claim IR → Lexicon → N'Ko Surface → Proof

This ensures:
- Reproducibility: Same motion → Same inscription
- Auditability: Every claim has evidence
- Versioning: Lexicon evolves but corpus stays stable
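The reproducibility property (same motion, same inscription) can be tested by fingerprinting each claim. The sketch below hashes a canonical JSON form of the claim together with the lexicon version. The claim fields echo the example in the mapping diagram, but the hashing scheme and the numeric window values are assumptions, not the cc-inscription proof format.

```python
import hashlib
import json


def claim_fingerprint(claim, lexicon_version):
    """SHA-256 over a canonical JSON encoding of a claim and its lexicon."""
    canonical = json.dumps(
        {"claim": claim, "lexicon": lexicon_version},
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace variation
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Usage: the same claim with keys in a different order hashes identically.
a = claim_fingerprint(
    {"type": "stabilization", "window": [0.0, 1.5], "place": "home", "confidence": 0.85},
    "1.0",
)
b = claim_fingerprint(
    {"confidence": 0.85, "place": "home", "window": [0.0, 1.5], "type": "stabilization"},
    "1.0",
)
```

Including the lexicon version in the hash keeps older inscriptions distinguishable after the lexicon changes, which matches the versioning goal.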

---

Theoretical Foundations

Embodied Cognition

Motion-as-language rests on embodied cognition theory:

> "Cognition is not confined to the brain but is distributed across brain, body, and environment."

Key principles:
1. Grounding: Abstract concepts are grounded in bodily experience
2. Simulation: Understanding involves motor simulation
3. Affordances: The body shapes what meanings are possible

Dynamical Systems

Movement operates in a high-dimensional dynamical system with:

- Basins of attraction: Stable movement patterns
- Transitions: Passages between basins
- Bifurcations: Points where behavior qualitatively changes

The anticipation kernel tracks these dynamics in real time.

Information Theory

Gesture semantics can be quantified through:

- Entropy: Uncertainty over possible movements
- Mutual Information: How much gesture predicts intent
- Compression: Efficient encoding of motion patterns
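Entropy and mutual information can be estimated from paired gesture and intent labels using the identity I(G; I) = H(G) + H(I) − H(G, I). The sketch below uses empirical frequencies. The sample labels are made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Empirical Shannon entropy in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mutual_information(gestures, intents):
    """I(G; I) = H(G) + H(I) - H(G, I), in bits."""
    joint = list(zip(gestures, intents))
    return entropy(gestures) + entropy(intents) - entropy(joint)


# Gesture fully determines intent: one bit of mutual information.
gestures = ["nod", "nod", "wave", "wave"]
predictive = mutual_information(gestures, ["agree", "agree", "help", "help"])

# Gesture and intent independent: zero bits.
independent = mutual_information(gestures, ["agree", "help", "agree", "help"])
```

High mutual information means a gesture is a reliable carrier of intent. A value near zero means the gesture tells you little about what the person wants.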

---

Implementation Architecture

Core Dependencies

cc-types (foundation)
    └── cc-collection (sensor fusion)
        └── cc-window-aligner (deterministic windows)
            └── cc-anticipation (kernel)
                ├── cc-gesture (classification)
                ├── cc-conductor (policy)
                └── cc-inscription (sigils)
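Reading the tree as "each nested component builds on its parent" (an interpretation, since the source does not state the direction of the edges), a valid build order can be derived with the standard library's `graphlib`:

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Each key maps to the set of components it depends on.
DEPENDS_ON = {
    "cc-types": set(),
    "cc-collection": {"cc-types"},
    "cc-window-aligner": {"cc-collection"},
    "cc-anticipation": {"cc-window-aligner"},
    "cc-gesture": {"cc-anticipation"},
    "cc-conductor": {"cc-anticipation"},
    "cc-inscription": {"cc-anticipation"},
}

# static_order() yields every component after all of its dependencies.
build_order = list(TopologicalSorter(DEPENDS_ON).static_order())
print(build_order)
```

The first four components form a strict chain. The three consumers of cc-anticipation may appear in any order after it.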

Performance Guarantees

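| Component | Latency | Invariant |
|---|---|---|
| cc-collection | < 0.5ms | Deterministic fusion |
| cc-window-aligner | < 0.3ms | Replay-stable |
| cc-anticipation | < 2.0ms | No heap allocation in hot path |
| cc-gesture | < 0.5ms | O(k×m) classification |
| Total | < 2.16ms | Real-time capable |

Read the component rows as independent ceilings: taken together they allow up to 3.3ms, more than the 2.16ms end-to-end figure, which is quoted earlier as an average.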

Configuration

Key parameters for motion-language tuning:

```yaml
# Anticipation kernel
anticipation:
  commitment_threshold: 0.7   # When motion is "locked in"
  uncertainty_threshold: 0.3  # When futures are constrained
  novelty_window: 128         # Frames for novelty detection
  regime_embedding_dim: 16    # Latent embedding size

# Gesture classifier
gesture:
  min_commitment: 0.6         # Gate for classification
  cooldown_ms: 300            # Prevent duplicate detections
  k_neighbors: 5              # Phrase library query size

# Inscription
inscription:
  lexicon_version: "1.0"
  min_confidence: 0.5         # Minimum for claim emission
```
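The `cooldown_ms` parameter suppresses duplicate detections. A minimal sketch of such a gate follows. It assumes the cooldown applies per gesture label and that timestamps are in milliseconds, neither of which the source specifies. `CooldownGate` is a hypothetical name.

```python
from dataclasses import dataclass, field


@dataclass
class CooldownGate:
    """Drop repeat detections of a label that arrive within cooldown_ms."""

    cooldown_ms: float = 300.0
    _last_emit_ms: dict = field(default_factory=dict)

    def allow(self, label, timestamp_ms):
        last = self._last_emit_ms.get(label)
        if last is not None and timestamp_ms - last < self.cooldown_ms:
            return False  # still cooling down: treat as a duplicate
        self._last_emit_ms[label] = timestamp_ms
        return True


# Usage: the second tap at 100ms is suppressed, the one at 300ms passes.
gate = CooldownGate(cooldown_ms=300)
decisions = [gate.allow("tap", t) for t in (0, 100, 300)]
print(decisions)  # [True, False, True]
```

A suppressed detection does not reset the timer, so a steady stream of duplicates cannot block a label indefinitely.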

---

References

- cc-anticipation: Core anticipation kernel
- cc-gesture: Gesture classification system
- cc-inscription: N'Ko inscription compiler
- cc-conductor: Control-theoretic policy layer

See also:
- [GESTURE_LEXICON.md](./GESTURE_LEXICON.md) — Catalog of gestures and meanings
- [examples/](./examples/) — Implementation examples

Source Anchor

Comp-Core/docs/motion-language/MOTION_SEMANTICS.md
