Grand Diomande Research · Full HTML Reader

26. Research Execution Fabric

**Status**: Active architecture
**Scope**: Shared agent architecture for research-driven execution, remote training, evaluation, meta-review, and paper synthesis
**Audience**: Claude Code, Codex, Gemini, orchestration services, paper-writing pipelines

---

Purpose

This document defines a global architecture for how the AI stack turns prompts such as:

"Read this paper and reproduce the experiment"
"Train a model on this dataset"
"Run the Vast.ai workflow and tell me the result"
"Take the findings through evaluation, meta-review, and paper drafting"

into a deterministic multi-stage process.

This is not an ASR-specific system.
This is not a `cog-rlm` feature.
This is a shared execution fabric that sits above individual workloads and below human intent.

The ASR Paper 6 run is one profile inside this fabric, not the definition of the fabric.

---

Core Principle

The system should treat research execution as a first-class AI capability with two linked rails:

1. Research Rail
Takes an idea or source artifact through framing, hypothesis formation, dataset discovery, experiment design, evaluation framing, and paper synthesis.

2. Execution Rail
Takes a concrete experiment plan through environment setup, remote execution, monitoring, recovery, verification, artifact collection, and result summarization.

The rails meet at a shared set of contracts:

hypothesis
data contract
execution profile
evaluation contract
publication contract

---

High-Level Flow

Prompt / Paper / Idea
        |
        v
1. Intent Intake
        |
        v
2. Divergent Rail
   research angles / failure expectations / workload classes
        |
        v
3. Research Synthesis
   sources / prior logs / prompt history / prior experiments
        |
        v
4. Hypothesis Contract
   what is being tested and how success is measured
        |
        v
5. Data Contract
   sources / schemas / transforms / risks / provenance
        |
        v
6. Execution Profile
   local / mesh / Vast.ai / benchmark / finetune / inference-only
        |
        +------------------------+
        |                        |
        v                        v
7a. Remote Execution Rail    7b. Local / Mesh Execution Rail
    bootstrap                    bootstrap
    run                          run
    monitor                      monitor
    recover                      recover
    verify                       verify
        |                        |
        +-----------+------------+
                    |
                    v
8. Evaluation
   metrics / baselines / regressions / artifact checks
                    |
                    v
9. Meta-Review
   bug hunt / invalid assumptions / missing controls / paper audit
                    |
                    v
10. Paper / Blog / Briefing Synthesis
                    |
                    v
11. Memory + Registry Update
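
The stage order in the flow above can be sketched as an ordered enumeration with a linear runner. This is a minimal illustration, not the fabric's implementation: the names `Stage`, `next_stage`, and `run_pipeline` are hypothetical, and stages 7a and 7b are collapsed into a single `EXECUTION` stage because only one substrate runs per workload.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Stages from the high-level flow, numbered as in the diagram."""
    INTENT_INTAKE = 1
    DIVERGENT_RAIL = 2
    RESEARCH_SYNTHESIS = 3
    HYPOTHESIS_CONTRACT = 4
    DATA_CONTRACT = 5
    EXECUTION_PROFILE = 6
    EXECUTION = 7          # 7a remote or 7b local / mesh
    EVALUATION = 8
    META_REVIEW = 9
    SYNTHESIS = 10
    MEMORY_UPDATE = 11


def next_stage(current):
    """Return the stage after `current`, or None after the last stage."""
    if current is Stage.MEMORY_UPDATE:
        return None
    return Stage(current + 1)


def run_pipeline(handlers, state):
    """Call one handler per stage, in order, threading `state` through."""
    for stage in Stage:
        handler = handlers.get(stage)
        if handler is None:
            # A missing handler is an error: no stage is silently skipped.
            raise KeyError(f"no handler registered for {stage.name}")
        state = handler(state)
    return state
```

Refusing to run when a handler is missing keeps the process deterministic: publication cannot be reached without passing through evaluation and meta-review.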

---

Execution Model

The Claude Code or Codex session is the default linear executor.

That matters because the session already has:

shell access
MCP tools
prompt logs
Orbit / context recovery
mesh dispatch
browser automation
file system access
paper-writing ability

The fabric assumes the active tool-rich session can do the whole chain end to end:

read sources
inspect prior failures
compile a workload
execute the workload
validate outputs
run meta-review
write the paper draft

Sub-agents remain optional accelerators, not required architecture.

---

The Two Shared Subsystems

A. Research Workflow Layer

This layer is responsible for:

reading prompt history and prior experiment logs
recovering prior hypotheses and failed paths
comparing possible experiment directions
turning a source paper or idea into a testable contract
defining what data is needed
defining what result would support or falsify the hypothesis
carrying the result into paper and blog synthesis

This is where Evoflow-style divergence belongs.
This is also where meta-review belongs.

B. Execution Workflow Layer

This layer is responsible for:

choosing compute substrate
compiling exact setup and run commands
defining monitor and recovery behavior
tracking expected artifacts and success markers
incorporating failure patterns from prompt logs
retrying only when verification fails
preserving resumability across instance death, process death, and package drift

This is where Vast.ai belongs.
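
The rule "retrying only when verification fails" and the resumability requirement can be sketched together as a verify-first loop. The function name `run_until_verified`, its parameters, and the default of three attempts are assumptions made for illustration; the document does not specify a retry budget.

```python
from typing import Callable


def run_until_verified(
    run: Callable[[], None],
    verify: Callable[[], bool],
    max_attempts: int = 3,
) -> int:
    """Run a workload until its artifacts verify.

    Returns the number of times `run` was invoked. Returns 0 when the
    artifacts already verify, so a restarted session never repeats
    finished work.
    """
    if verify():
        return 0
    for attempt in range(1, max_attempts + 1):
        run()
        # Retry only if verification fails; a passing check ends the loop.
        if verify():
            return attempt
    raise RuntimeError(
        f"artifacts still unverified after {max_attempts} attempts"
    )
```

Checking `verify` before the first run is the design choice that gives resumability: after instance death or process death, the same call can be reissued and it will pick up only the work that is still missing.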

---

Why Vast.ai Is Only One Profile

Vast.ai should be modeled as an execution profile under the fabric, not the whole workflow.

Examples of execution profiles:

`vastai.generic`
`vastai.training`
`vastai.paper_bundle`
`mesh.parallel`
`local.prototype`
`local.benchmark`
`remote.inference`

The `vastai.paper_bundle` profile is what the N'Ko Paper 6 run used:

remote bootstrap
dependency pinning
extraction
training bundle
monitor + relaunch
artifact verification
results download

But the global system must also support:

reading a paper and generating a reproduction plan
collecting or transforming data first
evaluating against a baseline
drafting the hypothesis and results section
running meta-review before claiming anything
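
The profile names and the `vastai.paper_bundle` steps listed above can be sketched as a small registry. The profile names come from this document; the step identifiers, the helper names `substrate_of` and `remaining_steps`, and the convention that the prefix before the dot names the substrate are assumptions for illustration.

```python
# Execution profiles named in this document.
PROFILES = (
    "vastai.generic",
    "vastai.training",
    "vastai.paper_bundle",
    "mesh.parallel",
    "local.prototype",
    "local.benchmark",
    "remote.inference",
)

# Ordered steps of the vastai.paper_bundle profile (identifiers are hypothetical).
PAPER_BUNDLE_STEPS = (
    "remote_bootstrap",
    "dependency_pinning",
    "extraction",
    "training_bundle",
    "monitor_and_relaunch",
    "artifact_verification",
    "results_download",
)


def substrate_of(profile):
    """Return the substrate prefix of a known profile, e.g. 'vastai'."""
    if profile not in PROFILES:
        raise ValueError(f"unknown execution profile: {profile}")
    return profile.split(".", 1)[0]


def remaining_steps(completed):
    """Return paper-bundle steps not yet completed, in their defined order."""
    return [step for step in PAPER_BUNDLE_STEPS if step not in completed]
```

For example, `remaining_steps({"remote_bootstrap", "dependency_pinning"})` starts at `"extraction"`, which is how a relaunched run would know where to resume.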

---

Shared Contracts

Every workload should compile into the following contracts.

1. Hypothesis Contract

experiment question
claim under test
expected directional outcome
falsifiers
baseline
metric set

2. Data Contract

source datasets
schemas and field assumptions
transforms
noise warnings
provenance
volume requirements

3. Execution Contract

substrate: local / mesh / Vast.ai
bootstrap commands
run commands
monitor interval
recovery rules
retry rules
artifact list
success markers

4. Evaluation Contract

metrics
held-out split policy
baselines
sanity checks
artifact verification
regression checks

5. Publication Contract

result summary
caveats
negative findings
paper section updates
blog / briefing outputs
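
Two of the five contracts can be sketched as plain data classes, with a helper that reports which fields are still empty. The field names mirror the lists above; the class names, the types, the `monitor_interval_s` unit, and the `missing_fields` helper are assumptions for illustration. The Data, Evaluation, and Publication contracts would follow the same pattern.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class HypothesisContract:
    experiment_question: str = ""
    claim_under_test: str = ""
    expected_direction: str = ""
    falsifiers: List[str] = field(default_factory=list)
    baseline: str = ""
    metric_set: List[str] = field(default_factory=list)


@dataclass
class ExecutionContract:
    substrate: str = ""  # "local", "mesh", or "vastai"
    bootstrap_commands: List[str] = field(default_factory=list)
    run_commands: List[str] = field(default_factory=list)
    monitor_interval_s: int = 0  # seconds; the unit is an assumption
    recovery_rules: List[str] = field(default_factory=list)
    retry_rules: List[str] = field(default_factory=list)
    artifacts: List[str] = field(default_factory=list)
    success_markers: List[str] = field(default_factory=list)


def missing_fields(contract):
    """Return the names of fields that are still empty or zero."""
    return [f.name for f in fields(contract) if not getattr(contract, f.name)]
```

A compiler could refuse to emit a plan while `missing_fields` returns anything, so that a run never starts without, for instance, its falsifiers or success markers.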

---

Incident-Aware Operation

The execution rail must be informed by prior incident logs.

Examples already recovered from the Vast.ai sessions:

never destroy active instances before SSH verification and artifact merge
do not rewrite scripts that are already producing output without a correctness reason
pin drifting dependencies
validate schema assumptions before long runs
assert feature flags are actually wired into runtime behavior
treat process death separately from instance death
verify artifacts before counting a run as complete
keep monitors portable across macOS and Linux

These incidents are not ASR-specific.
They are execution intelligence.

They belong in the shared fabric.
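
Two of the incidents above translate directly into guard functions: separating process death from instance death, and blocking instance teardown until SSH verification and artifact merge have happened. The names `Failure`, `classify_failure`, and `may_destroy_instance` are hypothetical, and how reachability or process liveness is probed is left out of this sketch.

```python
from enum import Enum


class Failure(Enum):
    NONE = "none"
    PROCESS_DEATH = "process_death"
    INSTANCE_DEATH = "instance_death"


def classify_failure(instance_reachable, process_alive):
    """Distinguish a dead instance from a dead process on a live instance."""
    if not instance_reachable:
        # Nothing can be said about the process if the host is unreachable.
        return Failure.INSTANCE_DEATH
    if not process_alive:
        return Failure.PROCESS_DEATH
    return Failure.NONE


def may_destroy_instance(ssh_verified, artifacts_merged):
    """Allow teardown only after SSH verification and artifact merge."""
    return bool(ssh_verified and artifacts_merged)
```

Keeping the two failure kinds distinct matters for recovery: a dead process can be relaunched in place, while a dead instance needs a fresh bootstrap.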

---

Relationship to Existing Systems

Evoflow

Evoflow belongs in the Research Workflow Layer.
It is a divergence and synthesis engine for shaping experiments before execution.

Meta-Review

Meta-review belongs after evaluation and before publication.
Its job is to attack assumptions, controls, methodology, missing tests, and overclaims.

Orbit / Context Recovery

Orbit belongs in the source-recovery stage.
It provides prior experiments, prompt logs, plans, and session context.

Vast.ai Workflow

Vast.ai belongs in the execution profile layer as one deterministic remote substrate.

Paper Pipeline

The paper pipeline belongs in the publication contract.
It should consume the verified experiment outputs, not raw optimistic notes.

---

Architectural Decision

The shared home for this architecture should be `Comp-Core`, not `cog-rlm`.

Reason:

`Comp-Core` is the system-level repository for shared agent and orchestration architecture.
`cog-rlm` can consume this architecture, but should not define it.
Claude Code, Codex, Gemini, and future orchestrators need a neutral home that is not tied to one product.

`cog-rlm` is therefore a consumer implementation.
The global architecture lives here in `Comp-Core`.

---

Initial Shared Deliverables

The first shared implementation should include:

1. A workflow manifest describing the stages and contracts.
2. An incident registry extracted from prompt logs and failure docs.
3. A compiler that turns a prompt-level objective into a deterministic execution plan.
4. A profile system for substrates like Vast.ai.
5. A publication pipeline hook so execution can flow into hypothesis writeup, meta-review, and paper drafting.
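
One way to make the compiler in item 3 checkably deterministic is to derive a plan identifier from a canonical serialization of the plan, so that identical inputs always produce an identical plan. This is a sketch under assumptions: `compile_plan`, its parameters, and the 16-character `plan_id` are hypothetical and not part of the described deliverables.

```python
import hashlib
import json


def compile_plan(objective, profile, steps):
    """Build a plan dict whose plan_id depends only on its contents."""
    body = {
        "objective": objective.strip(),
        "profile": profile,
        "steps": list(steps),
    }
    # Sorted keys and fixed separators give a canonical byte string.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    body["plan_id"] = digest[:16]
    return body
```

Two sessions that compile the same objective against the same profile would then agree on `plan_id`, which gives the registry a stable key for matching reruns to earlier results.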

---

Non-Goals

This architecture does not require:

full autonomous operation without a tool-rich session
one monolithic service for all execution substrates
ASR-only abstractions
paper writing without experiment verification

---

Immediate Consequence

Any implementation hidden inside one app repo should be treated as provisional until its logic is promoted into shared architecture and shared tooling.

That is the change made here:

`Comp-Core` owns the architecture
shared tooling owns the workflow manifest
app repos consume the compiled plans

Promotion Decision

Promote into a technical note or architecture paper with implementation anchors.

Source Anchor

Comp-Core/docs/architecture/26-RESEARCH_EXECUTION_FABRIC.md
