T4 — Temporal Reasoning Layer: COMPLETE ✅

Completed: 2026-02-19
Status: All deliverables shipped, 54/54 tests passing

## Problem
Eval showed 80

## What Was Built

### 1. Temporal Retrieval Module
File: `cognitive_twin/v3/tools/temporal_retrieval.py` (43KB, ~1100 lines)

Core capabilities:
- Natural language date parsing — "yesterday", "last week", "3 days ago", "February 2026", "Q1", "recently", ISO dates
- Query classification — 6 query types: activity, timeline, recency, duration, sequence, search
- Recency scoring — Exponential decay with configurable half-life (default 30 days)
- Temporal ordering — Results ranked by combined relevance × recency score
- Timeline generation — Chronological event listing for any topic
- Temporal edge traversal — preceded_by, followed_by, concurrent_with, superseded_by
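
To make the recency scoring and combined ranking concrete, here is a minimal sketch of half-life decay multiplied by a relevance score. It is not the code in `temporal_retrieval.py`: the function names, the tuple layout, and the handling of missing or future timestamps are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple

Item = Tuple[str, float, Optional[datetime]]  # (label, relevance, timestamp), hypothetical layout


def recency_score(timestamp: Optional[datetime], now: datetime,
                  half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 at age zero, 0.5 after one half-life."""
    if timestamp is None:
        return 0.0  # assumption: undated items receive the lowest score
    # Assumption: timestamps in the future are clamped to age zero.
    age_days = max((now - timestamp).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def rank_by_relevance_and_recency(items: List[Item], now: datetime,
                                  half_life_days: float = 30.0) -> List[Item]:
    """Sort items by relevance x recency, best first."""
    return sorted(
        items,
        key=lambda item: item[1] * recency_score(item[2], now, half_life_days),
        reverse=True,
    )


NOW = datetime(2026, 2, 19, tzinfo=timezone.utc)
demo_items: List[Item] = [
    ("old but strong match", 0.9, NOW - timedelta(days=90)),
    ("recent weaker match", 0.6, NOW - timedelta(days=3)),
    ("undated match", 0.8, None),
]
ranked = rank_by_relevance_and_recency(demo_items, NOW)
```

With a 30-day half-life, a 90-day-old item keeps one eighth of its relevance, so the recent but weaker match ranks first in this invented example.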

Key classes:
- `TemporalRetriever` — Main retrieval engine over graph + RAG
- `TemporalRange` — Time range with start/end/granularity
- `TemporalResult` / `TemporalQueryResult` — Structured result types
- `parse_temporal_reference()` — NL date → TemporalRange
- `classify_temporal_query()` — Query → type + params
- `compute_recency_score()` — Exponential decay scoring
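
As an illustration of what NL date parsing involves, the sketch below resolves a few of the listed phrases to an inclusive date range. It is a simplified stand-in for `parse_temporal_reference()`: the function name, the plain tuple return value (instead of `TemporalRange`), and the Monday-to-Sunday reading of "last week" are all assumptions.

```python
import re
from datetime import date, timedelta
from typing import Optional, Tuple

DateRange = Tuple[date, date, str]  # (start, end inclusive, granularity), hypothetical shape


def parse_reference(text: str, today: date) -> Optional[DateRange]:
    """Resolve a small set of phrases relative to `today`; None if unrecognized."""
    phrase = text.strip().lower()

    if phrase == "yesterday":
        day = today - timedelta(days=1)
        return (day, day, "day")

    match = re.fullmatch(r"(\d+) days? ago", phrase)
    if match:
        day = today - timedelta(days=int(match.group(1)))
        return (day, day, "day")

    if phrase == "last week":
        # Assumption: "last week" means the previous Monday-to-Sunday week.
        this_monday = today - timedelta(days=today.weekday())
        start = this_monday - timedelta(days=7)
        return (start, start + timedelta(days=6), "week")

    try:
        day = date.fromisoformat(phrase)  # e.g. "2026-02-19"
    except ValueError:
        return None
    return (day, day, "day")


today = date(2026, 2, 19)  # a Thursday
last_week = parse_reference("last week", today)
```

Passing `today` explicitly, rather than reading the clock inside the function, keeps the parser deterministic and easy to test.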

### 2. Temporal Enrichment Pipeline
File: `scripts/enrich_temporal.py` (19KB)

Mines `[home-path]` (196K messages, 5934 sessions, 17K entities) to extract:
- First/last mention dates for entities via entity → message → session join
- Session-level topic co-occurrence for concurrent_with edges
- Known timeline data for high-confidence project dates
- Technology evolution supersession edges
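
The entity → message → session join can be sketched with an in-memory SQLite database. The table and column names below are hypothetical, since the real ledger schema is not shown here, and the rows are invented demo data.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with a hypothetical ledger-like schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sessions (id INTEGER PRIMARY KEY, started_at TEXT);
        CREATE TABLE messages (id INTEGER PRIMARY KEY, session_id INTEGER);
        CREATE TABLE entity_mentions (entity TEXT, message_id INTEGER);
        INSERT INTO sessions VALUES (1, '2026-01-05'), (2, '2026-01-20'), (3, '2026-02-10');
        INSERT INTO messages VALUES (10, 1), (11, 2), (12, 3);
        INSERT INTO entity_mentions VALUES ('alpha', 10), ('alpha', 12), ('beta', 11);
    """)
    return conn


def first_last_mentions(conn: sqlite3.Connection):
    """Return (entity, first_seen, last_seen) using session start dates."""
    # ISO-8601 date strings sort lexicographically, so MIN/MAX work on TEXT.
    return conn.execute("""
        SELECT em.entity, MIN(s.started_at), MAX(s.started_at)
        FROM entity_mentions AS em
        JOIN messages AS m ON m.id = em.message_id
        JOIN sessions AS s ON s.id = m.session_id
        GROUP BY em.entity
        ORDER BY em.entity
    """).fetchall()


mentions = first_last_mentions(build_demo_db())
```

Each resulting row gives an entity's first and last mention dates, which is the kind of value a node's temporal range could be derived from.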

### 3. Enriched Data Files

| File | Description | Stats |
|------|-------------|-------|
| `data/expanded_graph_v3_temporal.json` | Temporally enriched knowledge graph | 270 nodes, 1154 edges (up from 689) |
| `data/expanded_rag_entries_temporal.jsonl` | Temporally enriched RAG entries | 2140 entries |

### 4. Graph Enrichment Stats

| Metric | Count |
|--------|-------|
| Nodes with timestamps | 173 |
| Nodes with temporal ranges | 85 |
| RAG entries with `created_at` | 2139 |
| Temporal edges added | 465 |
| ↳ `concurrent_with` | 451 |
| ↳ `preceded_by` | 6 |
| ↳ `followed_by` | 6 |
| ↳ `superseded_by` | 2 |
| Total edges (was 689) | 1154 |

### 5. Tests
File: `tests/test_temporal.py` — 54/54 passing

| Test Class | Tests | Coverage |
|------------|-------|----------|
| `TestDateParsing` | 16 | NL date parsing (yesterday, last week, months, quarters, ISO, etc.) |
| `TestQueryClassification` | 5 | Activity, timeline, recency, sequence, search queries |
| `TestRecencyScoring` | 6 | Exponential decay, edge cases (null, future, very old) |
| `TestDataIntegrity` | 9 | Graph version, timestamps, edges, ranges, RAG coverage |
| `TestTemporalRange` | 5 | Midpoint, duration, overlap, serialization |
| `TestRetrieverIntegration` | 13 | Full queries against real data, scoring, serialization |
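
The range behaviours covered by `TestTemporalRange` (midpoint, duration, overlap, serialization) can be sketched as a small dataclass. `RangeSketch` and its method names are hypothetical and are not the actual `TemporalRange` API; treating range ends as inclusive for overlap is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RangeSketch:
    """Hypothetical stand-in for a time range with start/end/granularity."""
    start: datetime
    end: datetime
    granularity: str = "day"

    def duration(self) -> timedelta:
        return self.end - self.start

    def midpoint(self) -> datetime:
        return self.start + self.duration() / 2

    def overlaps(self, other: "RangeSketch") -> bool:
        # Assumption: ends are inclusive, so ranges that touch count as overlapping.
        return self.start <= other.end and other.start <= self.end

    def to_dict(self) -> dict:
        return {
            "start": self.start.isoformat(),
            "end": self.end.isoformat(),
            "granularity": self.granularity,
        }

    @classmethod
    def from_dict(cls, data: dict) -> "RangeSketch":
        return cls(
            datetime.fromisoformat(data["start"]),
            datetime.fromisoformat(data["end"]),
            data["granularity"],
        )


february = RangeSketch(datetime(2026, 2, 1), datetime(2026, 2, 28), "month")
mid_month = RangeSketch(datetime(2026, 2, 10), datetime(2026, 2, 12))
march = RangeSketch(datetime(2026, 3, 1), datetime(2026, 3, 31), "month")
```

Round-tripping through `to_dict()` and `from_dict()` is the property a serialization test would typically check.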

## Sample Queries Working

"what were we working on recently?"
→ Type: activity, Time range: last 2 weeks, 2244 candidates → 20 results

"when did we switch from fine-tuning to RLM?"
→ Type: timeline, finds Together AI, SFT, Cognitive Twin, superseded_by edges

"what happened in February?"
→ Type: search, Time range: Feb 1-28 2026, Sleep Sync and other Feb activity

"Shopify"
→ Type: search, finds MFP, Serenity Soother, storefronts

"Koji CRM"
→ Type: search, finds CRM service node + RAG entries about pipeline
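
One simple way to reproduce the type assignments above is an ordered list of keyword rules with `search` as the fallback. This is only a sketch of the idea: the patterns, their order, and the `classify()` name are assumptions, and the real `classify_temporal_query()` also returns parameters such as a time range.

```python
import re
from enum import Enum


class QueryType(Enum):
    ACTIVITY = "activity"
    TIMELINE = "timeline"
    RECENCY = "recency"
    DURATION = "duration"
    SEQUENCE = "sequence"
    SEARCH = "search"


# Rules are checked in order; the first matching pattern wins (hypothetical patterns).
_RULES = [
    (QueryType.ACTIVITY, re.compile(r"\bworking on\b")),
    (QueryType.TIMELINE, re.compile(r"^when did\b")),
    (QueryType.RECENCY, re.compile(r"\b(latest|most recent)\b")),
    (QueryType.DURATION, re.compile(r"\bhow long\b")),
    (QueryType.SEQUENCE, re.compile(r"\b(before|after)\b")),
]


def classify(query: str) -> QueryType:
    """Return the first matching query type, falling back to SEARCH."""
    normalized = query.strip().lower()
    for query_type, pattern in _RULES:
        if pattern.search(normalized):
            return query_type
    return QueryType.SEARCH


sample_types = {
    q: classify(q)
    for q in [
        "what were we working on recently?",
        "when did we switch from fine-tuning to RLM?",
        "what happened in February?",
        "Shopify",
    ]
}
```

Falling back to `search` means bare keywords such as "Shopify" still get answered by the general keyword path.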

## Architecture

temporal_retrieval.py
├── parse_temporal_reference()     # NL → TemporalRange
├── classify_temporal_query()      # Text → QueryType + params
├── compute_recency_score()        # Exponential decay
├── TemporalRetriever
│   ├── query()                    # Main entry point
│   ├── search()                   # Simple keyword + recency
│   ├── get_timeline()             # Chronological events
│   ├── get_temporal_neighbors()   # Graph edge traversal
│   ├── _activity_search()         # "What were we working on..."
│   ├── _timeline_search()         # "When did we switch..."
│   ├── _recency_search()          # "What's the latest..."
│   ├── _sequence_search()         # "What happened before/after..."
│   └── _general_search()          # Keyword + temporal filter
└── load_default_retriever()       # Convenience factory
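
The graph edge traversal step can be pictured as an adjacency lookup restricted to the four temporal edge types. The edge dictionary format (`source`, `target`, `type`), the function names, and the demo nodes below are assumptions; the actual JSON layout of the enriched graph is not shown here.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

TEMPORAL_EDGE_TYPES = {"preceded_by", "followed_by", "concurrent_with", "superseded_by"}

Index = Dict[str, List[Tuple[str, str]]]  # node -> [(edge_type, target), ...]


def build_temporal_index(edges: List[dict]) -> Index:
    """Index only temporal edges by source node, ignoring other edge types."""
    index: Index = defaultdict(list)
    for edge in edges:
        if edge["type"] in TEMPORAL_EDGE_TYPES:
            index[edge["source"]].append((edge["type"], edge["target"]))
    return index


def temporal_neighbors(index: Index, node: str,
                       edge_type: Optional[str] = None) -> List[Tuple[str, str]]:
    """Return (edge_type, target) pairs for a node, optionally filtered by type."""
    return [
        (kind, target)
        for kind, target in index.get(node, [])
        if edge_type is None or kind == edge_type
    ]


# Invented demo edges; edges are followed only in their stored direction.
demo_edges = [
    {"source": "a", "target": "b", "type": "superseded_by"},
    {"source": "a", "target": "c", "type": "concurrent_with"},
    {"source": "a", "target": "d", "type": "depends_on"},
]
demo_index = build_temporal_index(demo_edges)
successors = temporal_neighbors(demo_index, "a", edge_type="superseded_by")
```

A question like "when did we switch from X to Y?" would then reduce to following `superseded_by` edges from the node for X.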

## Dependencies
- T1: Knowledge graph v2 (270 nodes, 689 edges) → enriched to v3 (1154 edges)
- T2: RAG entries (2140) → enriched with timestamps (2139 timestamped)
- ledger.db: 196K messages, 5934 sessions, 17K entities — mined for temporal data

## Files Changed/Created
- Created: `cognitive_twin/v3/tools/temporal_retrieval.py`
- Created: `scripts/enrich_temporal.py`
- Created: `tests/test_temporal.py`
- Created: `data/expanded_graph_v3_temporal.json`
- Created: `data/expanded_rag_entries_temporal.jsonl`
- Created: `scripts/test_temporal_retriever.py` (manual verification script)
- Created: `T4-COMPLETE.md` (this file)


Source Anchor

Comp-Core/packages/cognitive-twin/T4-COMPLETE.md
