Grand Diomande Research · Full HTML Reader

Motion-To-Inscription Source Authority And Mac4 Seated Protocol V1

Date: 2026-04-25

Purpose

This document defines the practical motion-to-inscription operating plan for the current hardware reality. It assumes the twelve-sensor Mocopi upgrade is not available yet and that full room-scale depth integration is not finished. The goal is to stop waiting for ideal hardware and establish a robust, usable capture regime that can generate a real motion library and feed the inscription system now.

The operating principle is simple. We do not need the perfect capture stack to begin learning stable motion-to-inscription mappings. We need one trustworthy capture regime that is consistent enough to support repeatable motion states, clean semantic labels, and replayable rendering.

The first trustworthy regime is the Mac4 seated capture zone.

Current State

The inscription engine is real. The `cc-inscription` Rust system already exists and compiles embodied dynamics into typed claims and N'Ko surface forms. The Unity and MotionMix visual stack also exists and has been validated enough to run and receive body state.

What is missing is not the symbolic layer. What is missing is a stable live body-source chain.

The current `:9404` motion surface is up, but the live body state is degraded. The `/skeleton-3d` endpoint currently reports an effectively empty fused body state. That means the main bottleneck is source quality, not rendering or inscription code.

The Femto side is scaffolded, not finished. There is a scene-manifest builder and a studio pipeline placeholder, but there is not yet a completed live Femto hardware ingest path that can be treated as authoritative.

Because of that, the right move is to formalize source authority and define a constrained capture regime that works with what is already physically available.

Source Authority Ladder

Source authority should be explicit and deterministic.

The priority order is:

1. Femto depth capture
2. Sony Mocopi skeleton
3. Multi-camera triangulation
4. Seated single-camera capture
5. No-body-state / hold-last-safe-state

This is not just a preference list. It is the policy the runtime should follow when deciding what data is allowed to drive inscription, Unity, or downstream visual behavior.
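A minimal sketch of how the ladder could be enforced as a deterministic selection rule. The `SourceAuthority` enum, the `select_authority` function, and the `healthy` map are hypothetical names for illustration; how each source decides it is healthy is not specified here.

```python
from enum import IntEnum
from typing import Mapping


class SourceAuthority(IntEnum):
    """Lower value means higher authority, matching the ladder order."""
    FEMTO = 1
    MOCOPI = 2
    TRIANGULATION = 3
    SEATED_SINGLE_CAM = 4
    HOLD_LAST_SAFE = 5


def select_authority(healthy: Mapping[SourceAuthority, bool]) -> SourceAuthority:
    """Return the highest-authority source that is currently healthy.

    `healthy` is an assumed health map supplied by the caller. Sources
    missing from the map are treated as not live.
    """
    for source in sorted(SourceAuthority):
        if source is SourceAuthority.HOLD_LAST_SAFE:
            break  # tier 5 is the fallback, never a live source
        if healthy.get(source, False):
            return source
    return SourceAuthority.HOLD_LAST_SAFE


# Example: Mocopi and the seated camera are live, Femto is not.
current = select_authority({
    SourceAuthority.MOCOPI: True,
    SourceAuthority.SEATED_SINGLE_CAM: True,
})
print(current.name)
```

Because the rule is a pure function of the health map, the runtime can log the chosen tier with every frame and replay the decision later.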

Tier 1: Femto

Femto is authoritative for room-scale spatial anchoring, coarse body volume, and spatial relationship to the environment. When Femto is healthy, it should own body position in space and any surface that depends on depth grounding.

Inscription work should not be blocked on Femto. If Femto comes online later, it becomes the highest-quality source. If it is absent, the system must continue without it.

Tier 2: Mocopi

Mocopi is authoritative for skeleton continuity and body articulation when depth is absent. It is the best current non-depth body source when the sensors are healthy.

If Mocopi is live and Femto is not, Mocopi should drive the skeletal interpretation and motion semantics. Unity and inscription should accept that regime as valid rather than treating it as second-class.

Tier 3: Multi-camera triangulation

Multi-camera triangulation is the fallback when wearable sensors are not live. It is valuable when it has enough camera coverage and stable subject tracking.

Its failure mode is usually partial-body ambiguity, low confidence, and posture collapse. Because of that, multi-camera should drive only the dimensions it is actually confident about. It should not fabricate full-body certainty.
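One way to keep triangulation from fabricating certainty is to mask joints below a confidence threshold instead of passing them downstream. This is a sketch under assumptions: the joint names, the per-joint confidence map, and the `0.6` threshold are illustrative and do not come from the existing pipeline.

```python
from typing import Dict, Optional, Tuple

Joint = Tuple[float, float, float]

MIN_CONFIDENCE = 0.6  # assumed threshold, to be tuned against real sessions


def mask_low_confidence(
    joints: Dict[str, Joint],
    confidence: Dict[str, float],
    threshold: float = MIN_CONFIDENCE,
) -> Dict[str, Optional[Joint]]:
    """Keep joints at or above `threshold`; mark the rest as None.

    A joint with no confidence entry is treated as confidence 0.0,
    so it is masked rather than trusted by default.
    """
    return {
        name: (pos if confidence.get(name, 0.0) >= threshold else None)
        for name, pos in joints.items()
    }


# Example: the head is well tracked, the left foot is not.
masked = mask_low_confidence(
    {"head": (0.0, 1.6, 0.4), "left_foot": (0.1, 0.0, 0.5)},
    {"head": 0.92, "left_foot": 0.21},
)
```

Downstream consumers then see an explicit `None` for the uncertain joint and can decide how to degrade, rather than receiving a guessed position.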

Tier 4: Seated single-camera capture

This is the minimum viable regime and the immediate plan. A seated single-camera regime is not a failed version of full-body capture. It is a distinct motion basin with its own vocabulary and its own valid inscriptions.

If the user is consistently sitting in front of the Mac4 camera, then upper-body dynamics become the capture domain. The system should learn from that regime cleanly instead of pretending it is waiting for something else.

Tier 5: No-body-state / hold-last-safe-state

If no source is confident enough, the system should preserve or degrade gracefully rather than hallucinating motion structure. It can hold the last safe stable state, lower visual intensity, or emit a “no fresh embodied evidence” condition. It should not invent a skeleton.
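The hold-last-safe-state behavior can be sketched as a small holder that never synthesizes a skeleton. The `SafeStateHolder` class, the `BodyOutput` fields, and the intensity decay factor are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BodyOutput:
    state: Optional[dict]  # last trusted body state, or None if none seen yet
    fresh: bool            # True only when backed by current evidence
    intensity: float       # visual intensity multiplier in [0, 1]


class SafeStateHolder:
    """Hold the last trusted state and fade intensity while evidence is stale."""

    def __init__(self, decay: float = 0.9) -> None:
        self.decay = decay  # assumed per-update fade factor
        self._last: Optional[dict] = None
        self._intensity = 0.0

    def update(self, state: Optional[dict]) -> BodyOutput:
        """Pass `None` when no source is confident enough this tick."""
        if state is not None:
            self._last = state
            self._intensity = 1.0
            return BodyOutput(state, True, 1.0)
        # No fresh evidence: reuse the last safe state and lower intensity.
        self._intensity *= self.decay
        return BodyOutput(self._last, False, self._intensity)
```

A consumer can treat `fresh == False` as the "no fresh embodied evidence" condition and suppress new inscription claims while still rendering the held state.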

Immediate Capture Regime

The immediate regime is the Mac4 seated zone.

This regime should be treated as the first real motion library, not as a temporary hack. It is valid because it is repeatable and constrained.

The setup assumptions are:

  • One fixed seated position
  • One fixed Mac4 camera angle
  • A known desk or chair zone
  • Stable lighting
  • Consistent torso framing
  • Reliable visibility of head, shoulders, chest, and hands when raised into frame

This regime should be named explicitly in data and manifests, for example:

`capture_regime = mac4_seated_v1`

That matters because later room-scale or full-body regimes should not be mixed invisibly into the same library.

What This Regime Can Reliably Learn

The seated Mac4 regime is strong enough for:

  • stillness
  • left lean
  • right lean
  • forward lean
  • return to center
  • head turn left
  • head turn right
  • chin up
  • chin down
  • chest expansion
  • chest contraction
  • hand rise left
  • hand rise right
  • both hands rise
  • hand near face
  • reach forward
  • retract
  • pause
  • restart after pause
  • small oscillation
  • abrupt transition

This is more than enough to begin inscription mapping because the first inscription vocabulary does not need every human pose. It needs repeatable dynamic states such as stabilize, dwell, transition, recover, and oscillate.

Motion Library V1

The first library should be a deliberately small set of motion primitives, each with clean repetitions.

Group A: Baseline postural states

These define the neutral and stable reference conditions.

  • centered seated stillness
  • mild left lean
  • mild right lean
  • forward attention lean
  • relaxed return to center

Group B: Head and gaze states

These provide small but semantically meaningful transitions.

  • look left
  • look right
  • look down
  • look up
  • slow return to center

Group C: Chest and torso states

These are important because chest motion already matters in the MotionMix stack.

  • chest expansion
  • chest contraction
  • torso pulse
  • torso settle

Group D: Hand and reach states

These provide the first clear boundary and transition gestures.

  • left hand rise
  • right hand rise
  • both hands rise
  • hand near face
  • forward reach
  • retract

Group E: Temporal states

These are not poses; they are dynamics over time.

  • sustained stillness
  • repeated micro-oscillation
  • abrupt interruption
  • recover to baseline
  • deliberate transition from one primitive to another
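The five groups above can be held in a small registry so that labels are checked against the library rather than typed freehand. The snake_case label spellings and the `group_of` helper are assumptions; the note itself only lists the primitives in prose.

```python
from typing import Dict, List, Optional

# Motion Library V1, one entry per group. Label spellings are illustrative.
MOTION_LIBRARY_V1: Dict[str, List[str]] = {
    "baseline_postural": [
        "centered_seated_stillness", "mild_left_lean", "mild_right_lean",
        "forward_attention_lean", "relaxed_return_to_center",
    ],
    "head_and_gaze": [
        "look_left", "look_right", "look_down", "look_up",
        "slow_return_to_center",
    ],
    "chest_and_torso": [
        "chest_expansion", "chest_contraction", "torso_pulse", "torso_settle",
    ],
    "hand_and_reach": [
        "left_hand_rise", "right_hand_rise", "both_hands_rise",
        "hand_near_face", "forward_reach", "retract",
    ],
    "temporal": [
        "sustained_stillness", "repeated_micro_oscillation",
        "abrupt_interruption", "recover_to_baseline", "deliberate_transition",
    ],
}


def group_of(label: str) -> Optional[str]:
    """Return the group a primitive belongs to, or None if it is unknown."""
    for group, labels in MOTION_LIBRARY_V1.items():
        if label in labels:
            return group
    return None
```

A labeling tool could call `group_of` before saving a take and reject any label that returns `None`.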

Inscription Mapping Target

The purpose of the first library is not to identify “all movements.” It is to create clean evidence for the early inscription claim types.

The initial mapping target should be:

  • `stabilize` for settling or contraction into a steady state
  • `dwell` for sustained occupancy of a stable basin
  • `transition` for discrete posture or motion changes
  • `recover` for return to baseline after interruption
  • `oscillate` for repeated small left-right or pulse-like alternation
  • `novel` only when a motion falls outside the known seated library

This means the first pass is not trying to use all ten inscription types equally. It is trying to get a small subset to become stable and trustworthy.
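To check whether the small claim subset is becoming trustworthy, each labeled primitive can be paired with the claim type it is expected to produce. The pairing below is an assumption derived from the descriptions above, not a confirmed mapping in `cc-inscription`, and the label spellings are illustrative.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical expected claim per primitive label.
EXPECTED_CLAIM: Dict[str, str] = {
    "torso_settle": "stabilize",
    "chest_contraction": "stabilize",
    "sustained_stillness": "dwell",
    "centered_seated_stillness": "dwell",
    "recover_to_baseline": "recover",
    "repeated_micro_oscillation": "oscillate",
    "torso_pulse": "oscillate",
    "mild_left_lean": "transition",
    "look_left": "transition",
    "forward_reach": "transition",
}


def expected_claim(label: str) -> str:
    """Labels outside the known seated library are expected to read as novel."""
    return EXPECTED_CLAIM.get(label, "novel")


def agreement(observed: Iterable[Tuple[str, str]]) -> float:
    """Fraction of (primitive_label, observed_claim) pairs matching expectation."""
    pairs = list(observed)
    if not pairs:
        raise ValueError("no observations to score")
    hits = sum(1 for label, claim in pairs if expected_claim(label) == claim)
    return hits / len(pairs)


# Example: three of four takes produced the expected claim type.
score = agreement([
    ("torso_settle", "stabilize"),
    ("sustained_stillness", "dwell"),
    ("look_left", "transition"),
    ("torso_pulse", "dwell"),
])
```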

Runtime Policy For This Regime

The runtime should treat seated capture as a first-class mode with constrained expectations.

That means:

  • upper-body states are allowed to drive inscription
  • lower-body certainty is assumed low or absent
  • room-scale claims are disabled unless a higher-authority source is live
  • novelty thresholds should be narrower than in full-body mode
  • a lack of lower-body data should not automatically invalidate the frame

In other words, the system should be strict about what it claims, but not so strict that it rejects the entire seated regime.
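A sketch of the seated policy as a per-frame check. The joint names, the required-joint set, and the `0.5` threshold are assumptions. Hands are left optional because the setup only expects them to be visible when raised into frame.

```python
from typing import Dict, List, TypedDict

UPPER_BODY = {
    "head", "left_shoulder", "right_shoulder", "chest",
    "left_hand", "right_hand",
}
# Joints that must be confident for a seated frame to count at all.
REQUIRED = {"head", "left_shoulder", "right_shoulder", "chest"}


class FrameDecision(TypedDict):
    valid: bool
    drivable_joints: List[str]
    room_scale_claims: bool


def seated_frame_policy(
    confidence: Dict[str, float],
    threshold: float = 0.5,
    higher_authority_live: bool = False,
) -> FrameDecision:
    """Decide what a seated frame is allowed to drive.

    Missing lower-body joints never invalidate the frame. Room-scale
    claims stay disabled unless a higher-authority source is live.
    """
    confident = {j for j, c in confidence.items() if c >= threshold}
    valid = REQUIRED <= confident
    return {
        "valid": valid,
        "drivable_joints": sorted(confident & UPPER_BODY) if valid else [],
        "room_scale_claims": higher_authority_live,
    }


# Example: upper body is tracked, the right hand is out of frame, no legs.
decision = seated_frame_policy({
    "head": 0.9, "left_shoulder": 0.8, "right_shoulder": 0.8,
    "chest": 0.7, "left_hand": 0.6, "right_hand": 0.1,
})
```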

Data Contract For Mac4 Seated Sessions

Every recorded session in this regime should include:

  • regime identifier
  • source authority used for the session
  • camera placement notes
  • framing notes
  • lighting notes
  • motion primitive label
  • take number
  • start and stop timestamps
  • any confidence or source-coverage diagnostics available from the body pipeline

At minimum, a session row should record which source authority and which capture regime applied, using values such as:

  • `source_authority = triangulation`
  • `source_authority = seated_single_cam`
  • `capture_regime = mac4_seated_v1`

This prevents confusion later, when sessions from richer sources are added alongside the seated library.
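The data contract could be captured as a validated record so that malformed rows are rejected at write time. This is a sketch: the field names, the epoch-seconds timestamp unit, and the restriction to the two listed authorities are assumptions, and `SeatedSessionRow` is a hypothetical type.

```python
from dataclasses import dataclass, field
from typing import Dict

CAPTURE_REGIME = "mac4_seated_v1"
SEATED_AUTHORITIES = {"triangulation", "seated_single_cam"}


@dataclass(frozen=True)
class SeatedSessionRow:
    capture_regime: str
    source_authority: str
    camera_placement_notes: str
    framing_notes: str
    lighting_notes: str
    primitive_label: str
    take_number: int
    start_ts: float  # seconds since epoch (assumed unit)
    stop_ts: float
    diagnostics: Dict[str, float] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject rows that would silently mix regimes or authorities.
        if self.capture_regime != CAPTURE_REGIME:
            raise ValueError(f"unexpected regime: {self.capture_regime!r}")
        if self.source_authority not in SEATED_AUTHORITIES:
            raise ValueError(f"unexpected authority: {self.source_authority!r}")
        if self.take_number < 1:
            raise ValueError("take_number starts at 1")
        if self.stop_ts <= self.start_ts:
            raise ValueError("stop_ts must be after start_ts")


row = SeatedSessionRow(
    capture_regime="mac4_seated_v1",
    source_authority="seated_single_cam",
    camera_placement_notes="fixed, desk level",
    framing_notes="torso centered",
    lighting_notes="stable overhead",
    primitive_label="mild_left_lean",
    take_number=1,
    start_ts=0.0,
    stop_ts=4.0,
)
```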

Minimum Session Protocol

Each seated recording session should be short and structured.

A good first protocol is:

1. ten seconds of centered stillness
2. five repetitions of left lean and return
3. five repetitions of right lean and return
4. five repetitions of forward lean and return
5. five head turns left and right
6. five chest expansions and relaxations
7. five left-hand rises and returns
8. five right-hand rises and returns
9. five forward reaches and retracts
10. one longer improvised thirty-second segment combining these states

This gives both atomic examples and a small transition-rich continuous segment.
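The ten-step protocol can be written down as data and expanded into individual takes, which keeps take numbering consistent across sessions. The step labels and the `expand_takes` helper are illustrative assumptions; durations are given only where the protocol states them.

```python
from typing import List, Optional, Tuple

# (label, repetitions, duration in seconds if the protocol specifies one)
Step = Tuple[str, int, Optional[float]]

PROTOCOL_V1: List[Step] = [
    ("centered_stillness", 1, 10.0),
    ("left_lean_and_return", 5, None),
    ("right_lean_and_return", 5, None),
    ("forward_lean_and_return", 5, None),
    ("head_turn_left_and_right", 5, None),
    ("chest_expansion_and_relaxation", 5, None),
    ("left_hand_rise_and_return", 5, None),
    ("right_hand_rise_and_return", 5, None),
    ("forward_reach_and_retract", 5, None),
    ("improvised_combination", 1, 30.0),
]


def expand_takes(protocol: List[Step]) -> List[dict]:
    """Expand each step into one record per repetition, numbered from 1."""
    takes = []
    for label, reps, seconds in protocol:
        for take_number in range(1, reps + 1):
            takes.append({
                "primitive_label": label,
                "take_number": take_number,
                "duration_s": seconds,
            })
    return takes


takes = expand_takes(PROTOCOL_V1)
print(len(takes), "takes per session")
```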

What Unity Should Do With It

Unity does not need full-body authority to begin.

For the seated regime, Unity should prioritize:

  • torso-centered visual fields
  • upper-body trails
  • chest-reactive color or density
  • head-turn camera bias
  • hand-near-face or reach events as transition accents

It should not depend on reliable feet or room translation for this regime.

The seated regime should produce a visually meaningful composition even if the only high-confidence body zones are head, shoulders, chest, and hands.

What The Next Wiring Pass Should Actually Implement

The next engineering pass should do five things in order.

First, formalize source authority in the runtime, so Femto, Mocopi, triangulation, and seated single-camera are not treated as ambiguous peers.

Second, add an explicit seated single-camera mode to the motion pipeline with its own confidence and claim rules.

Third, create the first labeled Mac4 seated motion library using the primitive set above.

Fourth, run that library through the existing inscription path and measure whether the claim distribution is stable and repeatable.

Fifth, wire the same seated regime into Unity so the scene can respond meaningfully before full room-scale depth is available.
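For the fourth step, "stable and repeatable" needs a number. One possible measure, chosen here as an assumption rather than taken from the existing tooling, is the largest total variation distance between the claim distributions of repeated takes of the same primitive: 0.0 means identical distributions and 1.0 means no overlap.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Sequence


def claim_distribution(claims: Sequence[str]) -> Dict[str, float]:
    """Normalize the claim types emitted during one take into frequencies."""
    if not claims:
        raise ValueError("a take must contain at least one claim")
    counts = Counter(claims)
    total = sum(counts.values())
    return {claim: n / total for claim, n in counts.items()}


def total_variation(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Total variation distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def max_pairwise_tv(takes: List[Sequence[str]]) -> float:
    """Worst-case disagreement across all pairs of takes (0.0 for one take)."""
    dists = [claim_distribution(t) for t in takes]
    return max(
        (total_variation(a, b) for a, b in combinations(dists, 2)),
        default=0.0,
    )


# Example: two takes agree, the third drifts toward "dwell".
spread = max_pairwise_tv([
    ["dwell", "transition"],
    ["dwell", "transition"],
    ["dwell", "dwell"],
])
```

A threshold on this value could serve as a concrete pass condition for the stability criterion, with the cutoff chosen after the first real sessions are recorded.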

Success Criteria

This phase is successful when:

  • the runtime can name its current authority source explicitly
  • seated single-camera is recognized as a valid operating mode
  • a labeled seated motion library exists
  • at least the first small claim family is stable across repeated takes
  • Unity responds coherently to the seated upper-body regime

The success condition is not “Femto works.” The success condition is “motion-to-inscription becomes real with the hardware available now.”

Non-Goals For This Phase

This phase is not trying to:

  • solve full room-scale depth capture
  • perfect multi-camera triangulation
  • wait for the twelve-sensor Mocopi configuration
  • prove final artistic visuals
  • unify every regime into one model immediately

Those are later phases. Right now the job is to make one regime trustworthy.

Recommended Next Step

The immediate next step is to treat Mac4 seated capture as the first production regime and build `mac4_seated_v1` as a labeled motion library. Once that exists, the source-authority contract can be enforced in code and the inscription system can be tuned against real repeated motions instead of abstract architectural intent.

Promotion Decision

Attach run IDs, datasets, metrics, and reproduction commands.

Source Anchor

Comp-Core/docs/research/motion-to-inscription-source-authority-and-mac4-seated-protocol-v1.md
