Grand Diomande Research · Full HTML Reader

Cross-Pollination Architecture Specification


> PULSE v1 — Component A3
> Status: COMPLETE
> Last updated: 2025-07-22

---

1. Overview

Cross-pollination is PULSE's proactive intelligence layer. It predicts what you need before you ask by synthesizing signals from your calendar, tasks, location, time patterns, and conversation history — then delivers contextual predictions to the right device at the right moment.

Think of it as a personal chief-of-staff who knows your schedule, remembers your habits, and quietly prepares what you'll need next.

Design Principles

| Principle | Description |
|---|---|
| Anticipatory, not intrusive | Predictions surface only when confidence is high and timing is right |
| Privacy-first | All context stays local or in user-controlled infrastructure; no third-party telemetry |
| Self-improving | Every accept/reject trains the system to be more accurate over time |
| Cost-aware | Hard caps on inference calls, token usage, and prediction frequency |
| Multi-modal delivery | Text cards, voice, push notifications, each matched to device and context |

---

2. System Architecture

┌──────────────────────────────────────────────────────────────────────┐
│                        CROSS-POLLINATION ENGINE                      │
│                                                                      │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐           │
│  │   CONTEXT     │    │  PREDICTION  │    │   DELIVERY   │           │
│  │  COLLECTOR    │───▶│   ENGINE     │───▶│   ROUTER     │           │
│  └──────────────┘    └──────────────┘    └──────────────┘           │
│         │                    │                    │                   │
│         ▼                    ▼                    ▼                   │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐           │
│  │   CONTEXT     │    │   SAFETY     │    │   FEEDBACK   │           │
│  │   STORE       │    │   RAILS      │    │   LOOP       │           │
│  └──────────────┘    └──────────────┘    └──────────────┘           │
│                              │                    │                   │
│                              └────────────────────┘                   │
│                              DPO Export ──▶ Twin Refinement           │
└──────────────────────────────────────────────────────────────────────┘

External Integrations:
  ├── Google Calendar API / Apple EventKit
  ├── Clawdbot Conversation History
  ├── Node Location Services
  ├── Twin Model (inference endpoint)
  └── Device Push (ACC / Watch / Glasses / Discord)

---

3. Component Deep-Dives

3.1 Context Collector

The Context Collector continuously gathers and normalizes signals from multiple sources into a unified context frame.

Signal Sources

| Source | Method | Refresh Rate | Priority |
|---|---|---|---|
| Calendar | Google Calendar API / Apple EventKit | Every 5 min | HIGH |
| Task History | Clawdbot memory files + conversation logs | On change | HIGH |
| Location | Node `location_get` (when available) | Every 15 min | MEDIUM |
| Time Patterns | Derived from historical interaction timestamps | Daily rebuild | MEDIUM |
| Conversation History | Last N messages from active channels | On message | HIGH |
| Weather | External API (OpenWeatherMap) | Every 30 min | LOW |
| Device State | Which devices are active/connected | On change | LOW |

Context Frame Schema

typescript
interface ContextFrame {
  id: string;                    // UUID
  timestamp: string;             // ISO 8601
  userId: string;                // User identifier

  calendar: {
    current: CalendarEvent | null;   // Currently in-progress event
    next: CalendarEvent[];           // Next 3 upcoming events
    todayRemaining: CalendarEvent[]; // Rest of today
    tomorrowPreview: CalendarEvent[]; // Tomorrow's events
  };

  tasks: {
    recentCompleted: Task[];     // Last 5 completed tasks
    activeProjects: Project[];   // Currently active projects
    overdue: Task[];             // Overdue items
    recurring: Task[];           // Recurring patterns detected
  };

  location: {
    current: GeoPoint | null;    // Lat/lng if available
    label: string | null;        // "home", "office", "gym", etc.
    recentPlaces: Place[];       // Last 3 places visited
    commuting: boolean;          // Currently in transit?
  };

  temporal: {
    localTime: string;           // HH:MM in user timezone
    dayOfWeek: string;           // "monday", etc.
    timeBlock: "early_morning" | "morning" | "afternoon" | "evening" | "night";
    isWorkHours: boolean;        // Based on learned patterns
    isWeekend: boolean;
  };

  conversation: {
    lastMessages: Message[];     // Last 10 messages across channels
    activeTopics: string[];      // Extracted topics/entities
    pendingQuestions: string[];   // Unanswered questions detected
    sentiment: "positive" | "neutral" | "stressed" | "rushed";
  };

  environment: {
    weather: WeatherData | null;
    activeDevices: string[];     // ["iphone", "macbook", "watch"]
  };
}
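
The `temporal` block is the one part of the frame that can be derived from the clock alone. The sketch below shows one way to do that. The hour boundaries for each `timeBlock` and the names `deriveTemporal` and `timeBlockForHour` are assumptions for illustration (the schema names the blocks but not their hours), and `isWorkHours` is hard-coded to a weekday 9-to-5 window as a stand-in for the learned pattern the schema describes.

```typescript
// Mirrors ContextFrame["temporal"] from the schema above.
type TimeBlock = "early_morning" | "morning" | "afternoon" | "evening" | "night";

interface TemporalContext {
  localTime: string;   // HH:MM in user timezone
  dayOfWeek: string;   // "monday", etc.
  timeBlock: TimeBlock;
  isWorkHours: boolean;
  isWeekend: boolean;
}

const DAY_NAMES: string[] = [
  "sunday", "monday", "tuesday", "wednesday", "thursday", "friday", "saturday",
];

// Assumed boundaries: the spec does not define where each block starts.
function timeBlockForHour(hour: number): TimeBlock {
  if (hour >= 5 && hour < 8) return "early_morning";
  if (hour >= 8 && hour < 12) return "morning";
  if (hour >= 12 && hour < 17) return "afternoon";
  if (hour >= 17 && hour < 22) return "evening";
  return "night";
}

// `local` is expected to already be expressed in the user's timezone.
function deriveTemporal(local: Date): TemporalContext {
  const hour = local.getHours();
  const day = local.getDay(); // 0 = Sunday
  const isWeekend = day === 0 || day === 6;
  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  return {
    localTime: pad(hour) + ":" + pad(local.getMinutes()),
    dayOfWeek: DAY_NAMES[day],
    timeBlock: timeBlockForHour(hour),
    isWorkHours: !isWeekend && hour >= 9 && hour < 17, // placeholder for the learned value
    isWeekend: isWeekend,
  };
}
```

Keeping the derivation pure (a `Date` in, a plain object out) makes the collector easy to test without touching any external source.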

Context Store

  • Storage: SQLite database at `[home-path]`
  • Retention: 30 days of context frames, then archived to compressed JSON
  • Indexing: By timestamp, userId, and derived topic tags
  • Privacy: All data local. No cloud sync unless user explicitly enables

3.2 Prediction Engine

The Prediction Engine takes context frames and generates actionable predictions.

Prediction Pipeline

Context Frame
     │
     ▼
┌─────────────┐
│  PATTERN     │  ← Historical patterns from context store
│  MATCHER     │    "Every Monday at 9am, user checks email"
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  TWIN MODEL  │  ← LLM inference with user's Twin persona
│  INFERENCE   │    "Given this context, what would user want?"
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  CONFIDENCE  │  ← Score 0.0–1.0 based on pattern strength
│  SCORER      │    + model certainty + historical accuracy
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  SAFETY      │  ← Rate limits, cost caps, loop detection
│  RAILS       │    Blocks or downgrades low-value predictions
└──────┬──────┘
       │
       ▼
  Prediction Output (or suppressed)

Pattern Matcher

The Pattern Matcher identifies recurring behaviors from historical context:

typescript
interface Pattern {
  id: string;
  type: "temporal" | "sequential" | "contextual" | "reactive";
  trigger: PatternTrigger;       // What activates this pattern
  action: string;                // What the user typically does
  confidence: number;            // 0.0–1.0 based on frequency
  occurrences: number;           // How many times observed
  lastSeen: string;              // ISO timestamp
  decayRate: number;             // How fast confidence drops without reinforcement
}

// Example patterns:
// temporal:    "Every weekday at 8:30am → check email"
// sequential:  "After finishing a meeting → write summary notes"
// contextual:  "When at gym → play workout playlist"
// reactive:    "When weather changes to rain → remind about umbrella"
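
The `decayRate` field says confidence drops without reinforcement, but the spec gives no formula. A minimal sketch, assuming exponential decay per day since `lastSeen`; the formula, the per-day unit, and the name `decayedConfidence` are all assumptions.

```typescript
interface PatternLike {
  confidence: number; // 0.0–1.0 at the time the pattern was last reinforced
  lastSeen: string;   // ISO timestamp
  decayRate: number;  // assumed unit: fraction of confidence lost per day
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the pattern's confidence after decaying from `lastSeen` to `now`.
function decayedConfidence(p: PatternLike, now: Date): number {
  const elapsedMs = now.getTime() - Date.parse(p.lastSeen);
  const elapsedDays = Math.max(0, elapsedMs / MS_PER_DAY); // ignore clock skew into the future
  const value = p.confidence * Math.pow(1 - p.decayRate, elapsedDays);
  return Math.min(1, Math.max(0, value));
}
```

With a decay rate of 0.05, a pattern last seen a day ago keeps 95% of its confidence; one that is never reinforced fades toward zero rather than being deleted outright.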

Twin Model Inference

The prediction prompt sent to the Twin model:

You are {user}'s Digital Twin. Based on the following context, predict
what {user} most likely needs right now. Be specific and actionable.

CURRENT CONTEXT:
{serialized_context_frame}

RECENT PATTERNS:
{top_5_matching_patterns}

PREVIOUSLY REJECTED PREDICTIONS (do not repeat):
{rejected_predictions_last_24h}

Generate 1-3 predictions. For each, provide:
- prediction: What the user likely needs (one sentence)
- action: Specific action to take (API call, message, reminder, etc.)
- reasoning: Why you think this (one sentence)
- confidence: 0.0-1.0
- urgency: "immediate" | "soon" | "whenever"
- category: "task" | "info" | "reminder" | "suggestion" | "preparation"
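
The prompt constrains each field, so the caller can reject malformed output before it reaches the scorer. The sketch below assumes the model is asked to return a JSON array of objects with the fields listed above, which the spec does not state. The names `parseModelOutput` and `isRawPrediction` are hypothetical, and `action` is treated as a plain string at this stage.

```typescript
interface RawPrediction {
  prediction: string;
  action: string;
  reasoning: string;
  confidence: number;
  urgency: string;
  category: string;
}

const URGENCIES: string[] = ["immediate", "soon", "whenever"];
const CATEGORIES: string[] = ["task", "info", "reminder", "suggestion", "preparation"];

// Type guard: true only if every field is present and within its allowed range.
function isRawPrediction(value: unknown): value is RawPrediction {
  if (typeof value !== "object" || value === null) return false;
  const v = value as { [key: string]: unknown };
  const confidence = v["confidence"];
  const urgency = v["urgency"];
  const category = v["category"];
  return (
    typeof v["prediction"] === "string" &&
    typeof v["action"] === "string" &&
    typeof v["reasoning"] === "string" &&
    typeof confidence === "number" && confidence >= 0 && confidence <= 1 &&
    typeof urgency === "string" && URGENCIES.indexOf(urgency) !== -1 &&
    typeof category === "string" && CATEGORIES.indexOf(category) !== -1
  );
}

// Parses raw model text; returns at most 3 valid predictions, or [] on any parse failure.
function parseModelOutput(raw: string): RawPrediction[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (err) {
    return [];
  }
  if (!Array.isArray(parsed)) return [];
  const valid: RawPrediction[] = [];
  for (let i = 0; i < parsed.length && valid.length < 3; i++) {
    const item: unknown = parsed[i];
    if (isRawPrediction(item)) valid.push(item);
  }
  return valid;
}
```

Dropping invalid entries silently, rather than throwing, keeps one bad item from discarding an otherwise usable batch.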

Confidence Scoring

Final confidence is a weighted sum of five signals. The weights sum to 1.0, so the result stays in 0.0–1.0 as long as each input does:

final_confidence = (
  0.3 × pattern_confidence +      // Historical pattern strength
  0.3 × model_confidence +        // LLM's self-assessed confidence
  0.2 × temporal_relevance +      // How time-appropriate is this?
  0.1 × historical_accuracy +     // User's past accept rate for similar
  0.1 × freshness_bonus           // New info gets a small boost
)
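
The same formula as a function, as a minimal sketch. The weights come from the formula above; the interface and function names are illustrative, and clamping each input to 0.0–1.0 is an added safeguard the spec does not mention.

```typescript
interface ConfidenceSignals {
  patternConfidence: number;  // historical pattern strength
  modelConfidence: number;    // LLM's self-assessed confidence
  temporalRelevance: number;  // how time-appropriate the prediction is
  historicalAccuracy: number; // user's past accept rate for similar predictions
  freshnessBonus: number;     // small boost for new information
}

// Weights taken from the composite formula; they sum to 1.0.
const WEIGHTS: ConfidenceSignals = {
  patternConfidence: 0.3,
  modelConfidence: 0.3,
  temporalRelevance: 0.2,
  historicalAccuracy: 0.1,
  freshnessBonus: 0.1,
};

function finalConfidence(s: ConfidenceSignals): number {
  const clamp = (x: number): number => Math.min(1, Math.max(0, x));
  const score =
    WEIGHTS.patternConfidence * clamp(s.patternConfidence) +
    WEIGHTS.modelConfidence * clamp(s.modelConfidence) +
    WEIGHTS.temporalRelevance * clamp(s.temporalRelevance) +
    WEIGHTS.historicalAccuracy * clamp(s.historicalAccuracy) +
    WEIGHTS.freshnessBonus * clamp(s.freshnessBonus);
  return clamp(score);
}
```

For example, inputs of 0.9, 0.8, 1.0, 0.7, and 0.5 (in the order above) give 0.27 + 0.24 + 0.20 + 0.07 + 0.05 = 0.83, which falls just below the `prominent_card` threshold of 0.85.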

Prediction Schema

typescript
interface Prediction {
  id: string;                      // UUID
  contextFrameId: string;          // Source context frame
  timestamp: string;               // When generated

  prediction: string;              // Human-readable prediction text
  action: PredictionAction;        // Structured action to execute
  reasoning: string;               // Why this prediction was made

  confidence: number;              // 0.0–1.0 composite score
  urgency: "immediate" | "soon" | "whenever";
  category: "task" | "info" | "reminder" | "suggestion" | "preparation";

  status: "pending" | "delivered" | "accepted" | "rejected" | "expired" | "auto_executed";
  deliveredAt: string | null;
  respondedAt: string | null;

  cost: {
    inputTokens: number;
    outputTokens: number;
    estimatedCostUsd: number;
  };

  metadata: {
    patternIds: string[];          // Patterns that contributed
    modelId: string;               // Which model was used
    generationTimeMs: number;      // How long inference took
  };
}

interface PredictionAction {
  type: "notify" | "prepare" | "execute" | "suggest";
  payload: Record<string, any>;
  // Examples:
  // { type: "notify", payload: { message: "Meeting in 15 min with X" } }
  // { type: "prepare", payload: { draft: "email summary of Q3 results" } }
  // { type: "execute", payload: { command: "set_timer", args: { minutes: 25 } } }
  // { type: "suggest", payload: { suggestion: "You usually review PRs now" } }
}

3.3 Safety Rails

Critical guardrails to prevent runaway costs, spam, and feedback loops.

Rate Limiting

typescript
interface RateLimits {
  maxPredictionsPerHour: 10;       // Hard cap
  maxPredictionsPerDay: 50;        // Daily cap
  minIntervalBetweenMs: 120_000;   // At least 2 min between predictions
  burstLimit: 3;                    // Max 3 predictions in a 5-min window

  // Adaptive: reduce frequency when rejection rate is high
  adaptiveThrottle: {
    enabled: true;
    recentWindow: 20;              // Look at last 20 predictions
    rejectRateThreshold: 0.6;      // If >60% rejected
    throttleFactor: 0.5;           // Cut rate in half
  };
}
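
These limits can be checked against a list of past emission times. The sketch below is one possible reading: windows are sliding rather than calendar-aligned, and the adaptive throttle is applied to the hourly cap only. Both choices, and the names `mayEmit` and `throttledHourlyCap`, are assumptions.

```typescript
interface RateLimitConfig {
  maxPerHour: number;
  maxPerDay: number;
  minIntervalMs: number;
  burstLimit: number;
  burstWindowMs: number;
}

const DEFAULT_LIMITS: RateLimitConfig = {
  maxPerHour: 10,
  maxPerDay: 50,
  minIntervalMs: 120000,        // 2 min
  burstLimit: 3,
  burstWindowMs: 5 * 60 * 1000, // 5 min
};

const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

// True if a new prediction may be emitted at `nowMs`, given the epoch-ms
// times at which earlier predictions were emitted.
function mayEmit(emittedAt: number[], nowMs: number, cfg: RateLimitConfig = DEFAULT_LIMITS): boolean {
  const countWithin = (windowMs: number): number =>
    emittedAt.filter((t) => t > nowMs - windowMs && t <= nowMs).length;
  const last = emittedAt.reduce((m, t) => Math.max(m, t), -Infinity);
  if (nowMs - last < cfg.minIntervalMs) return false;
  if (countWithin(cfg.burstWindowMs) >= cfg.burstLimit) return false;
  if (countWithin(HOUR_MS) >= cfg.maxPerHour) return false;
  if (countWithin(DAY_MS) >= cfg.maxPerDay) return false;
  return true;
}

// Adaptive throttle: scales the hourly cap when recent rejections exceed the threshold.
// `recentRejected` holds one flag per past prediction, oldest first (true = rejected).
function throttledHourlyCap(
  recentRejected: boolean[],
  baseCap: number = DEFAULT_LIMITS.maxPerHour,
  windowSize: number = 20,
  rejectRateThreshold: number = 0.6,
  throttleFactor: number = 0.5
): number {
  const recent = recentRejected.slice(-windowSize);
  if (recent.length === 0) return baseCap;
  const rejectRate = recent.filter((r) => r).length / recent.length;
  return rejectRate > rejectRateThreshold ? Math.floor(baseCap * throttleFactor) : baseCap;
}
```

Note that with a 2-minute minimum interval, the 3-per-5-minutes burst limit rarely binds on its own; it matters mainly if the interval is later relaxed.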

Cost Controls

typescript
interface CostControls {
  dailyBudgetUsd: 1.00;           // Max $1/day on predictions
  perPredictionCapUsd: 0.05;      // Max per single prediction
  modelTier: "efficient";          // Use cheaper models for predictions

  // Fallback: if budget exhausted, switch to pattern-only (no LLM)
  fallbackToPatternOnly: true;

  // Track cumulative spend
  tracking: {
    todaySpentUsd: number;
    monthSpentUsd: number;
    alertAtPercent: 80;            // Alert user at 80% of budget
  };
}
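
A sketch of how these controls might gate a single inference call. The spec only says that exhausting the daily budget triggers the pattern-only fallback; treating an over-cap estimate the same way is an assumption here, as are the names `decideInferenceMode` and `BudgetDecision`.

```typescript
type InferenceMode = "llm" | "pattern_only";

interface BudgetDecision {
  mode: InferenceMode;
  alertUser: boolean; // true once spend reaches the alert percentage
}

function decideInferenceMode(
  todaySpentUsd: number,
  estimatedCostUsd: number,
  dailyBudgetUsd: number = 1.0,
  perPredictionCapUsd: number = 0.05,
  alertAtPercent: number = 80
): BudgetDecision {
  const alertUser = todaySpentUsd >= dailyBudgetUsd * (alertAtPercent / 100);
  const overCap = estimatedCostUsd > perPredictionCapUsd;                // assumption: also falls back
  const overBudget = todaySpentUsd + estimatedCostUsd > dailyBudgetUsd;  // budget would be exhausted
  return { mode: overCap || overBudget ? "pattern_only" : "llm", alertUser: alertUser };
}
```

Checking the estimate before the call, rather than the actual cost after it, is what lets the daily budget act as a hard cap.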

Loop Prevention

typescript
interface LoopPrevention {
  // Don't repeat rejected predictions
  rejectionCooldown: {
    exactMatch: "never";           // Never repeat exact same prediction
    similarThreshold: 0.85;        // Cosine similarity threshold
    similarCooldownHours: 72;      // 3 days before similar can resurface
  };

  // Don't generate predictions about predictions
  metaPredictionBlock: true;       // No "you should check your predictions"

  // Detect oscillation (predict A → reject → predict B → reject → predict A)
  oscillationDetection: {
    windowSize: 10;
    maxRepeats: 2;                 // Max times a topic can appear in window
  };
}
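
The rejection cooldown depends on a cosine similarity between a candidate and previously rejected predictions. The sketch below assumes each prediction has already been turned into a numeric embedding vector; the spec does not say how, and the names `cosineSimilarity`, `RejectedEntry`, and `isBlockedByCooldown` are hypothetical.

```typescript
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector matches nothing
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface RejectedEntry {
  embedding: number[];
  rejectedAtMs: number; // epoch ms of the rejection
}

// True if `candidate` must be suppressed because of an earlier rejection.
function isBlockedByCooldown(
  candidate: number[],
  rejected: RejectedEntry[],
  nowMs: number,
  similarThreshold: number = 0.85,
  similarCooldownHours: number = 72
): boolean {
  const cooldownMs = similarCooldownHours * 60 * 60 * 1000;
  for (let i = 0; i < rejected.length; i++) {
    const sim = cosineSimilarity(candidate, rejected[i].embedding);
    if (sim >= 1 - 1e-9) return true; // exact match: never repeated
    if (sim >= similarThreshold && nowMs - rejected[i].rejectedAtMs < cooldownMs) return true;
  }
  return false;
}
```

The small tolerance on the exact-match check absorbs floating-point error; comparing prediction text directly would be a reasonable alternative for that case.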

Confidence Gating

typescript
interface ConfidenceGating {
  minimumToDeliver: 0.70;          // Below this → suppressed
  minimumToAutoExecute: 0.95;      // Only auto-execute at very high confidence

  // Tier-based delivery
  tiers: [
    { min: 0.95, delivery: "auto_execute", label: "Near-certain" },
    { min: 0.85, delivery: "prominent_card", label: "High confidence" },
    { min: 0.70, delivery: "subtle_suggestion", label: "Moderate confidence" },
    { min: 0.00, delivery: "suppressed", label: "Low confidence" }
  ];
}
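
The tier table reads as a first-match lookup from the highest threshold down. A minimal sketch using the thresholds above; `tierFor` is a name chosen for illustration.

```typescript
type DeliveryTier = "auto_execute" | "prominent_card" | "subtle_suggestion" | "suppressed";

// Ordered from highest threshold to lowest; the first match wins.
const TIERS: { min: number; delivery: DeliveryTier }[] = [
  { min: 0.95, delivery: "auto_execute" },
  { min: 0.85, delivery: "prominent_card" },
  { min: 0.70, delivery: "subtle_suggestion" },
  { min: 0.00, delivery: "suppressed" },
];

function tierFor(confidence: number): DeliveryTier {
  for (let i = 0; i < TIERS.length; i++) {
    if (confidence >= TIERS[i].min) return TIERS[i].delivery;
  }
  return "suppressed"; // reached only for NaN or negative input
}
```

Thresholds are inclusive, so a score of exactly 0.70 is delivered as a subtle suggestion while 0.69 is suppressed.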

3.4 Delivery Router

Routes predictions to the optimal device and format based on context.

Delivery Channels

| Channel | Format | When to Use |
|---|---|---|
| ACC (iPhone) | Swipeable prediction card | Default for most predictions |
| Apple Watch | Compact notification | Time-sensitive, user is mobile |
| VisionClaw (Glasses) | Voice + ambient overlay | Hands-busy, walking, driving |
| Discord | Embed message | User is at computer, in chat |
| macOS | Native notification | User is at laptop |

Routing Logic

typescript
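// Shape inferred from the return statements below and the channel table above.
interface DeliveryTarget {
  channel: "acc" | "watch" | "visionclaw" | "discord" | "macos";
  format: string;
}

function routePrediction(prediction: Prediction, context: ContextFrame): DeliveryTarget {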
  // Urgency-based routing
  if (prediction.urgency === "immediate") {
    if (context.environment.activeDevices.includes("watch")) {
      return { channel: "watch", format: "haptic_notification" };
    }
    return { channel: "acc", format: "banner_notification" };
  }

  // Context-based routing
  if (context.location.commuting) {
    return { channel: "visionclaw", format: "voice_brief" };
  }

  if (context.environment.activeDevices.includes("macbook")) {
    return { channel: "macos", format: "notification_center" };
  }

  // Default
  return { channel: "acc", format: "prediction_card" };
}

Prediction Card UI

┌─────────────────────────────────────────┐
│ 🔮 Cross-Pollination          85% conf │
│                                         │
│  "You have a meeting with Sarah in 30   │
│   minutes. Last time you discussed the  │
│   Q3 budget. Want me to pull up the     │
│   latest numbers?"                      │
│                                         │
│  [✅ Yes, prepare]  [❌ Dismiss]  [⏰ Later] │
│                                         │
│  Based on: calendar + conversation history │
└─────────────────────────────────────────┘

3.5 Feedback Loop

Every prediction outcome feeds back into the system.

Feedback Collection

typescript
interface PredictionFeedback {
  predictionId: string;
  outcome: "accepted" | "rejected" | "modified" | "expired" | "auto_executed";
  timestamp: string;

  // Optional: user can say why they rejected
  rejectionReason?: "not_relevant" | "bad_timing" | "already_done" | "wrong_info" | "too_obvious";

  // If modified, what did user actually want?
  modification?: string;

  // Implicit feedback
  responseTimeMs: number;          // How quickly did they respond?
  interactionDepth: number;        // Did they just dismiss or engage deeply?
}

DPO Export Pipeline

Feedback data is periodically exported for Twin model refinement:

Feedback Store
     │
     ▼ (every 24h or 50 new feedbacks)
┌─────────────┐
│  DPO PAIR    │  For each prediction:
│  GENERATOR   │  - chosen = accepted predictions (good)
│              │  - rejected = rejected predictions (bad)
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  EXPORT TO   │  Format: JSONL with DPO pairs
│  TWIN FINE-  │  { prompt, chosen, rejected }
│  TUNE QUEUE  │
└──────┬──────┘
       │
       ▼
  Twin Model Update (scheduled fine-tuning)

DPO Pair Schema

Each pair is a single JSON object. The example below is pretty-printed for readability; in the exported JSONL file, each object occupies one line.

jsonl
{
  "prompt": "Context: Monday 8:45am, user has standup at 9am, was working on PR #142 yesterday...",
  "chosen": "You have standup in 15 minutes. PR #142 is ready for review — want me to post a summary in #dev?",
  "rejected": "Good morning! It's Monday. Have a great week!",
  "metadata": { "category": "preparation", "confidence_delta": 0.3 }
}
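
The spec does not say how accepted and rejected predictions are matched into pairs. One plausible rule, sketched below, pairs them only when they came from the same context frame and therefore share a prompt. That rule, the reading of `confidence_delta` as chosen minus rejected confidence, and all names here are assumptions.

```typescript
interface LabeledPrediction {
  contextFrameId: string;
  prompt: string;      // serialized context that produced this prediction
  text: string;        // human-readable prediction text
  confidence: number;
  category: string;
  outcome: "accepted" | "rejected";
}

interface DpoPair {
  prompt: string;
  chosen: string;
  rejected: string;
  metadata: { category: string; confidence_delta: number };
}

// Pairs every accepted prediction with every rejected one from the same context frame.
function buildDpoPairs(items: LabeledPrediction[]): DpoPair[] {
  const pairs: DpoPair[] = [];
  const accepted = items.filter((i) => i.outcome === "accepted");
  const rejected = items.filter((i) => i.outcome === "rejected");
  for (let a = 0; a < accepted.length; a++) {
    for (let r = 0; r < rejected.length; r++) {
      if (accepted[a].contextFrameId !== rejected[r].contextFrameId) continue;
      const delta = accepted[a].confidence - rejected[r].confidence;
      pairs.push({
        prompt: accepted[a].prompt,
        chosen: accepted[a].text,
        rejected: rejected[r].text,
        metadata: {
          category: accepted[a].category,
          confidence_delta: Math.round(delta * 100) / 100, // rounded to 2 decimals
        },
      });
    }
  }
  return pairs;
}

// Serializes pairs as JSONL: one JSON object per line.
function toJsonl(pairs: DpoPair[]): string {
  return pairs.map((p) => JSON.stringify(p)).join("\n");
}
```

Under this rule, a context frame that produced only accepted or only rejected predictions contributes no pairs, so the export volume can be much lower than the feedback volume.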

Confidence Calibration

The system compares each prediction's confidence with the accept rate actually observed for predictions in the same confidence bucket:

typescript
interface CalibrationMetric {
  bucket: string;           // "0.7-0.8", "0.8-0.9", "0.9-1.0"
  predictedAcceptRate: number;
  actualAcceptRate: number;
  sampleSize: number;
  calibrationError: number; // |predicted - actual|
}

// Goal: calibrationError < 0.1 for each bucket
// If bucket "0.8-0.9" has actual accept rate of 0.6,
// we know the model is overconfident in that range → adjust weights
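
A sketch of how the per-bucket metrics might be computed from recorded outcomes. It assumes `predictedAcceptRate` is the mean confidence of the predictions in the bucket, which the spec does not define; `calibrate` and `Outcome` are illustrative names, and empty buckets are omitted from the result.

```typescript
interface Outcome {
  confidence: number; // composite confidence at delivery time
  accepted: boolean;
}

interface CalibrationMetric {
  bucket: string;
  predictedAcceptRate: number;
  actualAcceptRate: number;
  sampleSize: number;
  calibrationError: number; // |predicted - actual|
}

const BUCKETS: { label: string; lo: number; hi: number }[] = [
  { label: "0.7-0.8", lo: 0.7, hi: 0.8 },
  { label: "0.8-0.9", lo: 0.8, hi: 0.9 },
  { label: "0.9-1.0", lo: 0.9, hi: 1.0 },
];

function calibrate(outcomes: Outcome[]): CalibrationMetric[] {
  const metrics: CalibrationMetric[] = [];
  for (let i = 0; i < BUCKETS.length; i++) {
    const b = BUCKETS[i];
    const isLast = i === BUCKETS.length - 1;
    // Buckets are [lo, hi), except the last, which also includes 1.0.
    const inBucket = outcomes.filter(
      (o) => o.confidence >= b.lo && (o.confidence < b.hi || (isLast && o.confidence <= b.hi))
    );
    if (inBucket.length === 0) continue;
    const predicted = inBucket.reduce((sum, o) => sum + o.confidence, 0) / inBucket.length;
    const actual = inBucket.filter((o) => o.accepted).length / inBucket.length;
    metrics.push({
      bucket: b.label,
      predictedAcceptRate: predicted,
      actualAcceptRate: actual,
      sampleSize: inBucket.length,
      calibrationError: Math.abs(predicted - actual),
    });
  }
  return metrics;
}
```

Five predictions at 0.85 confidence with three accepted give an actual rate of 0.6 and a calibration error of 0.25, the overconfident case described in the comment above.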

---

4. Data Flow — Complete Lifecycle

1. COLLECT
   Timer fires (every 5 min) or event trigger
        │
        ▼
   Build ContextFrame from all sources
        │
        ▼
   Store in context.db

2. PREDICT
   Context frame enters prediction pipeline
        │
        ▼
   Pattern Matcher finds matching historical patterns
        │
        ▼
   Twin Model generates 1-3 predictions
        │
        ▼
   Confidence Scorer assigns composite scores
        │
        ▼
   Safety Rails filter (rate limit, cost, loops)
        │
        ▼
   Surviving predictions stored in predictions.db

3. DELIVER
   Delivery Router selects channel + format
        │
        ▼
   Prediction card/notification sent to device
        │
        ▼
   Status updated to "delivered"

4. FEEDBACK
   User responds (accept/reject/modify/ignore)
        │
        ▼
   Feedback recorded with metadata
        │
        ▼
   Confidence calibration updated
        │
        ▼
   DPO pairs queued for Twin refinement

5. REFINE (async, periodic)
   DPO export → Twin fine-tuning
   Pattern database updated
   Confidence weights recalibrated
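
The lifecycle implies an ordering on a prediction's `status` field. One way to enforce it is a small transition table. The allowed transitions below are an assumption read off the steps above rather than something the spec states, and `canTransition` is a hypothetical helper.

```typescript
type PredictionStatus =
  | "pending"
  | "delivered"
  | "accepted"
  | "rejected"
  | "expired"
  | "auto_executed";

// Assumed transitions: a prediction is delivered (or auto-executed) before
// the user can respond, and response states are terminal.
const ALLOWED: Record<PredictionStatus, PredictionStatus[]> = {
  pending: ["delivered", "auto_executed", "expired"],
  delivered: ["accepted", "rejected", "expired"],
  accepted: [],
  rejected: [],
  expired: [],
  auto_executed: [],
};

function canTransition(from: PredictionStatus, to: PredictionStatus): boolean {
  return ALLOWED[from].indexOf(to) !== -1;
}
```

Guarding writes with a check like this keeps late or duplicate feedback events from overwriting a status that is already final.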

---

5. Database Schema

context_frames

sql
CREATE TABLE context_frames (
  id TEXT PRIMARY KEY,
  user_id TEXT NOT NULL,
  timestamp TEXT NOT NULL,
  frame_json TEXT NOT NULL,          -- Full serialized ContextFrame
  topics TEXT,                        -- Comma-separated extracted topics
  time_block TEXT,                    -- "morning", "afternoon", etc.
  location_label TEXT,                -- "home", "office", etc.
  created_at TEXT DEFAULT (datetime('now'))
);

CREATE INDEX idx_context_timestamp ON context_frames(timestamp);
CREATE INDEX idx_context_user ON context_frames(user_id);
CREATE INDEX idx_context_topics ON context_frames(topics);

predictions

sql
CREATE TABLE predictions (
  id TEXT PRIMARY KEY,
  context_frame_id TEXT NOT NULL,
  user_id TEXT NOT NULL,
  timestamp TEXT NOT NULL,
  prediction_text TEXT NOT NULL,
  action_json TEXT NOT NULL,
  reasoning TEXT,
  confidence REAL NOT NULL,
  urgency TEXT NOT NULL,
  category TEXT NOT NULL,
  status TEXT DEFAULT 'pending',
  delivered_at TEXT,
  responded_at TEXT,
  cost_input_tokens INTEGER,
  cost_output_tokens INTEGER,
  cost_usd REAL,
  model_id TEXT,
  generation_time_ms INTEGER,
  pattern_ids TEXT,                   -- JSON array of contributing pattern IDs
  created_at TEXT DEFAULT (datetime('now')),

  FOREIGN KEY (context_frame_id) REFERENCES context_frames(id)
);

CREATE INDEX idx_predictions_status ON predictions(status);
CREATE INDEX idx_predictions_confidence ON predictions(confidence);
CREATE INDEX idx_predictions_timestamp ON predictions(timestamp);

feedback

sql
CREATE TABLE feedback (
  id TEXT PRIMARY KEY,
  prediction_id TEXT NOT NULL,
  user_id TEXT NOT NULL,
  outcome TEXT NOT NULL,              -- accepted/rejected/modified/expired/auto_executed
  rejection_reason TEXT,
  modification TEXT,
  response_time_ms INTEGER,
  interaction_depth INTEGER,
  created_at TEXT DEFAULT (datetime('now')),

  FOREIGN KEY (prediction_id) REFERENCES predictions(id)
);

CREATE INDEX idx_feedback_outcome ON feedback(outcome);

patterns

sql
CREATE TABLE patterns (
  id TEXT PRIMARY KEY,
  user_id TEXT NOT NULL,
  type TEXT NOT NULL,                 -- temporal/sequential/contextual/reactive
  trigger_json TEXT NOT NULL,
  action_description TEXT NOT NULL,
  confidence REAL NOT NULL,
  occurrences INTEGER DEFAULT 1,
  last_seen TEXT NOT NULL,
  decay_rate REAL DEFAULT 0.05,
  created_at TEXT DEFAULT (datetime('now')),
  updated_at TEXT DEFAULT (datetime('now'))
);

CREATE INDEX idx_patterns_type ON patterns(type);
CREATE INDEX idx_patterns_confidence ON patterns(confidence);

dpo_queue

sql
CREATE TABLE dpo_queue (
  id TEXT PRIMARY KEY,
  prompt TEXT NOT NULL,
  chosen TEXT NOT NULL,
  rejected TEXT NOT NULL,
  category TEXT,
  confidence_delta REAL,
  exported INTEGER DEFAULT 0,
  created_at TEXT DEFAULT (datetime('now'))
);

CREATE INDEX idx_dpo_exported ON dpo_queue(exported);

---

6. Configuration

Default Configuration

yaml
crosspoll:
  enabled: true

  collection:
    calendar_sync_interval_sec: 300
    location_interval_sec: 900
    conversation_lookback: 10
    weather_interval_sec: 1800

  prediction:
    model: "claude-haiku"             # Cheap, fast for predictions
    max_predictions_per_run: 3
    min_confidence: 0.70
    auto_execute_confidence: 0.95

  safety:
    max_per_hour: 10
    max_per_day: 50
    min_interval_sec: 120
    daily_budget_usd: 1.00
    per_prediction_cap_usd: 0.05
    rejection_cooldown_hours: 72

  delivery:
    default_channel: "acc"
    quiet_hours:
      start: "23:00"
      end: "08:00"
    urgent_override_quiet: true

  feedback:
    dpo_export_threshold: 50          # Export after 50 new feedbacks
    dpo_export_interval_hours: 24
    calibration_recalc_interval: 100  # Recalibrate every 100 predictions

  retention:
    context_frames_days: 30
    predictions_days: 90
    feedback_days: 365
    patterns_max_age_days: 180
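
The default quiet-hours window crosses midnight, which is easy to get wrong. The sketch below shows one way to evaluate it. It assumes `end` is exclusive and that "urgent" in `urgent_override_quiet` means a prediction with urgency `"immediate"`; the helper names are hypothetical.

```typescript
// Converts "HH:MM" to minutes since midnight.
function toMinutes(hhmm: string): number {
  const parts = hhmm.split(":");
  return parseInt(parts[0], 10) * 60 + parseInt(parts[1], 10);
}

// True if `localTime` falls inside [start, end), handling windows that wrap past midnight.
function inQuietHours(localTime: string, start: string = "23:00", end: string = "08:00"): boolean {
  const t = toMinutes(localTime);
  const s = toMinutes(start);
  const e = toMinutes(end);
  if (s === e) return false; // empty window
  return s < e ? t >= s && t < e : t >= s || t < e;
}

function shouldDeliverNow(
  localTime: string,
  urgency: "immediate" | "soon" | "whenever",
  urgentOverrideQuiet: boolean = true
): boolean {
  if (!inQuietHours(localTime)) return true;
  return urgentOverrideQuiet && urgency === "immediate";
}
```

With the defaults, a "soon" prediction generated at 02:00 is held, while an "immediate" one is still delivered.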

---

7. Integration Points

### With Twin (A1)
- Twin provides the inference model for predictions
- DPO pairs from feedback loop refine the Twin
- Twin's persona guides prediction tone and style

### With ACC (A2)
- ACC is the primary delivery surface for prediction cards
- Swipe gestures on ACC provide feedback (accept/reject)
- ACC's daily brief can include top predictions

### With VisionClaw (A4)
- Voice delivery of predictions through glasses
- Camera context can enrich predictions ("you're looking at a restaurant menu")
- Ambient overlays for low-urgency suggestions

### With PULSE Core
- Cross-pollination is a PULSE subsystem, not standalone
- Shares the PULSE event bus for real-time triggers
- Uses PULSE's unified auth and user model

---

8. Example Scenarios

Scenario 1: Meeting Prep

Context: Tuesday 9:45am, meeting with Sarah at 10am
Pattern: User always reviews notes before meetings with Sarah
Recent: User was discussing Q3 budget in yesterday's messages

→ Prediction (confidence: 0.88):
  "Meeting with Sarah in 15 min. You discussed Q3 budget last time.
   Want me to pull up the latest revenue numbers?"
  Action: { type: "prepare", payload: { query: "Q3 revenue summary" } }
  Delivery: ACC notification card

Scenario 2: Commute Intelligence

Context: Friday 5:30pm, location shows user leaving office
Pattern: User checks traffic on Friday evenings
Weather: Rain expected in 30 min

→ Prediction (confidence: 0.82):
  "Heads up — rain starting in 30 min on your commute.
   Route via I-95 is 10 min faster than usual right now."
  Action: { type: "notify", payload: { weather: true, traffic: true } }
  Delivery: Watch haptic + VisionClaw voice

Scenario 3: Task Momentum

Context: Wednesday 2pm, user just completed 3 Pomodoro sessions
Pattern: User takes a break after 3 sessions, then reviews PRs
Recent: 2 PRs awaiting review on GitHub

→ Prediction (confidence: 0.75):
  "Nice focus session! 2 PRs are waiting for your review.
   Ready to switch to code review, or take a longer break?"
  Action: { type: "suggest", payload: { prs: ["#142", "#147"] } }
  Delivery: macOS notification (user is at laptop)

---

9. Future Enhancements (v2+)

  • Multi-user cross-pollination: Predict team coordination needs
  • Proactive drafting: Don't just predict, pre-write the email/message
  • Ambient mode: Always-on subtle suggestions in VisionClaw peripheral vision
  • Emotional awareness: Adjust prediction style based on detected stress/energy levels
  • Third-party integrations: Slack, Linear, GitHub, Notion as context sources
  • Prediction chaining: One prediction triggers preparation for the next

---

Architecture designed for PULSE v1. Implementation target: Wave 2.

Source Anchor

PULSE-V1/crosspoll-architecture.md
