experiment | experiment writeup candidate | score 32

SEA Phase 0 — MiniMax Scoring Latency Benchmark Results

**Date:** 2026-02-17 23:30 (re-run with live endpoint)
**Endpoint:** `http://localhost:18080`
**Model:** MiniMax-M2.5 (229B params, TQ1_0 GGUF quantization, 55GB)
**Server:** llama.cpp (llamacpp)
**Prompt Template:** Tier 2 (Skill Activation Judge)
**Benchmark Script:** `benchmark_minimax_scoring.py`

Extracted abstract or opening context

> **Note:** The originally spec'd MiniMax-3B-v0.1 has been superseded by MiniMax-M2.5,
> a 229B parameter reasoning model. This is significantly more capable but slower than
> a 3B model would be. The benchmarks below reflect the actual available hardware.

| Property | Value |
|----------|-------|
| Model ID | `unsloth_MiniMax-M2.5-GGUF_MiniMax-M2.5-UD-TQ1_0.gguf` |
| Parameters | 229B |
| Quantization | TQ1_0 (GGUF) |
| File size | ~55 GB |
| Context window | 196,608 tokens (train) |
| Vocab size | 200,064 |
| Embedding dim | 3,072 |
| Capabilities | completion (with chain-of-thought reasoning) |

20 calls with the full SEA scoring prompt template, capped at 150 tokens to measure raw generation throughput.

| Metric | Value |
|--------|-------|
| **Min** | **1,341ms** |
| **Max** | **2,941ms** |
| **Mean** | **1,848ms** |
| **Median (P50)** | **1,573ms** |
| **P95** | **2,903ms** |
| **P99** | **2,934ms** |
| **Std Dev** | **508ms** |
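The latency summary above (min, max, mean, P50, P95, P99, standard deviation over 20 timed calls) could be produced by a harness along the following lines. This is a minimal sketch, not the contents of `benchmark_minimax_scoring.py`: the `/completion` route, the `n_predict` field, the `content` response key, the linear-interpolation percentile method, and the use of the sample standard deviation are all assumptions, and the function names are hypothetical.

```python
import json
import statistics
import time
import urllib.request

# Assumed llama.cpp server route; the writeup only gives the base URL.
ENDPOINT = "http://localhost:18080/completion"
MAX_TOKENS = 150  # token cap stated in the writeup
N_CALLS = 20      # number of calls stated in the writeup


def call_endpoint(prompt: str) -> str:
    """Send one completion request (request/response shape is an assumption)."""
    body = json.dumps({"prompt": prompt, "n_predict": MAX_TOKENS}).encode("utf-8")
    req = urllib.request.Request(
        ENDPOINT, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]


def time_calls(fn, n=N_CALLS):
    """Call fn() n times and return wall-clock latencies in milliseconds."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def percentile(sorted_values, p):
    """Percentile by linear interpolation between closest ranks (p in 0..100)."""
    if not sorted_values:
        raise ValueError("percentile() needs at least one value")
    rank = (len(sorted_values) - 1) * p / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (rank - lo)


def summarize(latencies_ms):
    """Return the same metrics as the results table, in milliseconds."""
    values = sorted(latencies_ms)
    return {
        "min": values[0],
        "max": values[-1],
        "mean": statistics.mean(values),
        "p50": percentile(values, 50),
        "p95": percentile(values, 95),
        "p99": percentile(values, 99),
        # Sample standard deviation; the original script may use the population form.
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }
```

A real run would look like `summarize(time_calls(lambda: call_endpoint(prompt)))`, where `prompt` is the rendered Tier 2 template. With only 20 samples, P95 and P99 are interpolated from the two or three slowest calls, so they sit very close to the maximum and should be read as rough tail indicators.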

Promotion decision

What has to happen next

Attach run IDs, datasets, metrics, and reproduction commands.

Why this is not always a full paper yet

Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.