architecture · technical paper candidate · score 34

Privacy Architecture

Extracted abstract or opening context

### Layer 0: Local Extraction

Runs entirely on the user's machine. No network calls.

**Input**: Raw AI conversation exports (ChatGPT JSON, Claude history, Gemini data)

**Processing**:

1. Parse conversations into individual prompts with timestamps
2. Strip all named entities (NER pass): names, companies, URLs, emails, phone numbers, addresses
3. Strip code snippets containing credentials, API keys, file paths with usernames
4. Generate embeddings locally (or via privacy-preserving API with no logging)
5. Compute all 6 cognitive metrics locally
6. Cluster prompts into domain topics using embedding similarity
7. Label clusters with generic domain tags (not project-specific names)

**Output**:

- Metric vectors (6 metrics, each a time series)
- Domain topology graph (nodes = generic domain labels, edges = transition frequency)
- Embedding centroids per domain (not individual prompt embeddings)
- Session metadata (timestamps, durations, no content)

**What NEVER leaves the machine**:

- Raw prompt text
- AI response text
- Individual prompt embeddings (only centroids)
- Any personally identifiable information
- Project names, company names, people names
- Code, credentials, file paths
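A minimal sketch of how parts of Layer 0 might look, assuming prompts have already been parsed, embedded, and assigned generic domain labels. The function names (`scrub`, `domain_centroids`, `transition_edges`, `build_export`), the placeholder tokens, and the regular expressions are hypothetical and not taken from the artifact. The regexes are a crude stand-in for the NER pass and only catch URLs, emails, and phone-like digit runs. Counting an edge only when the domain changes between consecutive prompts is likewise an assumption about what "transition frequency" means.

```python
import re
from collections import Counter, defaultdict

# Illustrative patterns only (assumption): a real pipeline would add an NER
# model for names, companies, and addresses, plus a credential scanner.
_PATTERNS = [
    ("[URL]", re.compile(r"https?://\S+")),
    ("[EMAIL]", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("[PHONE]", re.compile(r"\+?\d[\d\s().-]{7,}\d")),
]


def scrub(text):
    """Replace URLs, emails, and phone-like strings with placeholders."""
    for placeholder, pattern in _PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


def domain_centroids(embeddings, labels):
    """Average the embeddings of each domain; individual vectors are discarded."""
    groups = defaultdict(list)
    for vector, label in zip(embeddings, labels):
        groups[label].append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in groups.items()
    }


def transition_edges(labels):
    """Count domain changes between consecutive prompts (in time order)."""
    return Counter((a, b) for a, b in zip(labels, labels[1:]) if a != b)


def build_export(embeddings, labels):
    """Assemble the only structures intended to leave the machine."""
    edges = transition_edges(labels)
    return {
        "centroids": domain_centroids(embeddings, labels),
        "edges": {f"{a}->{b}": count for (a, b), count in edges.items()},
    }


if __name__ == "__main__":
    print(scrub("Mail jane@example.com or visit https://example.com/docs now"))
    labels = ["web", "web", "data", "web"]
    embeddings = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [1.0, 1.0]]
    print(build_export(embeddings, labels))
```

The export built here contains only per-domain centroids and label-to-label counts, so raw prompt text and per-prompt embeddings never enter it. Whether a centroid can still leak information when a domain holds very few prompts is a separate question this sketch does not address.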

Promotion decision

What has to happen next

Promote into a technical note or architecture paper with implementation anchors.

Why this is not always a full paper yet

Corpus pages are public-safe readers for discovered workspace artifacts. They are not automatically final papers. A corpus item becomes a polished paper only after the editable source, evidence checkpoints, references, figures, render path, and release status are attached through the paper schema.