Grand Diomande Research · Full HTML Reader

DLM Package Reorganization Plan


Executive Summary

Goal: Improve code organization while maintaining all functionality for the conversation/chatbot system.

Approach: Create logical subfolders, split mega files, consolidate duplicates.

Current State Analysis

### Large Files Requiring Split
1. inference/artificial.py - 3,692 lines (AI chatbot core)
2. inference/prompt.py - 2,153 lines (prompt templates)
3. response/links.py - 2,084 lines (link handling)
4. response/motion.py - 1,783 lines (motion techniques)
5. response/system.py - 1,521 lines (response system)
6. engine/embedder.py - 1,543 lines (duplicate - to deprecate)

### Module Count
- inference/: 12 files (10,620 lines total)
- response/: 71 files (organized but needs subfolder cleanup)
- engine/: 20 files (some duplicates)
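
The line counts above can be re-measured before and after each phase. The following is a minimal sketch, not the tooling used to produce the figures in this plan; the function name `find_large_files` and the 1,000-line threshold are assumptions chosen for illustration.

```python
from pathlib import Path


def find_large_files(root, threshold=1000):
    """Return (relative_path, line_count) for .py files above threshold.

    Hypothetical helper: results are sorted largest first so the biggest
    split candidates appear at the top.
    """
    root = Path(root)
    results = []
    for path in root.rglob("*.py"):
        # Count lines without loading the whole file into memory.
        with path.open(encoding="utf-8", errors="replace") as handle:
            count = sum(1 for _ in handle)
        if count > threshold:
            results.append((path.relative_to(root).as_posix(), count))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Calling `find_large_files("inference")` from the package root would list the candidates for splitting in that module.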

Phase 2A: Inference Module Reorganization

Proposed Structure

inference/
├── __init__.py                    # Main exports
├── core/                          # Core AI/chatbot functionality
│   ├── __init__.py
│   ├── base.py                    # Base classes (from artificial.py)
│   ├── chat_model.py              # ChatArtificial class
│   ├── ai_interface.py            # AI class
│   ├── completion.py              # Completion logic
│   └── streaming.py               # Streaming support
├── generation/                    # Content generation
│   ├── __init__.py
│   ├── generator.py               # Main Generator class
│   ├── batch.py                   # Batch processing
│   └── media.py                   # Media processing
├── prompts/                       # Prompt management
│   ├── __init__.py
│   ├── templates.py               # Prompt templates (from prompt.py)
│   ├── system_prompts.py          # System prompts
│   └── manager.py                 # PromptManager
├── state/                         # State management
│   ├── __init__.py
│   ├── state.py                   # StateMachine
│   ├── session.py                 # Session handling
│   └── validator.py               # State validation
├── management/                    # Conversation management
│   ├── __init__.py
│   ├── chain_manager.py           # ChainManager (from manager.py)
│   ├── conversation.py            # ConversationManager
│   └── element.py                 # Element class
└── utils/                         # Utilities
    ├── __init__.py
    ├── cloud.py                   # CloudManager
    └── file.py                    # FileManager

### Migration Steps
1. Create new subfolder structure
2. Split artificial.py (3,692 lines) into 6 focused modules (five under core/ as shown above, plus embeddings.py, whose target folder is not yet assigned)
3. Split generator.py (1,433 lines) into 3 modules
4. Split prompt.py (2,153 lines) into 3 modules
5. Move files to appropriate subfolders
6. Update all imports
7. Test thoroughly
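
Step 1 can be scripted so the skeleton is created the same way every time. This is a sketch under the assumption that it runs against the `inference/` package root; `create_subpackages` is a hypothetical helper, and the folder names are taken from the proposed structure above.

```python
from pathlib import Path

# Subfolder names from the proposed inference/ structure.
INFERENCE_SUBPACKAGES = ["core", "generation", "prompts", "state", "management", "utils"]


def create_subpackages(package_root, names):
    """Create each subfolder with an empty __init__.py if it is missing.

    Returns the list of __init__.py files that were newly created, so a
    second run on the same tree returns an empty list.
    """
    created = []
    for name in names:
        folder = Path(package_root) / name
        folder.mkdir(parents=True, exist_ok=True)
        init_file = folder / "__init__.py"
        if not init_file.exists():
            init_file.write_text("", encoding="utf-8")
            created.append(init_file)
    return created
```

Existing files are never overwritten, so the script is safe to re-run partway through the migration.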

Phase 2B: Response Module Reorganization

Proposed Structure

response/
├── __init__.py                    # Main exports
├── system/                        # Core response system
│   ├── __init__.py
│   ├── core.py                    # Main system logic (from system.py)
│   ├── builder.py                 # ReplyChainBuilder
│   ├── director.py                # Director
│   └── factory.py                 # Factory
├── techniques/                    # Synthesis techniques (rename from vangaurd)
│   ├── __init__.py
│   ├── base.py                    # Base technique classes
│   ├── motion/                    # Motion techniques
│   │   ├── __init__.py
│   │   ├── core.py                # Split from motion.py
│   │   ├── sensors.py
│   │   ├── gestures.py
│   │   └── analysis.py
│   ├── creative/                  # Creative techniques
│   │   ├── __init__.py
│   │   ├── barista.py
│   │   ├── word_weaver/
│   │   └── ... (other creative)
│   └── synth/                     # Synthesis techniques
│       └── ... (existing synth files)
├── linking/                       # Link handling
│   ├── __init__.py
│   ├── core.py                    # Core link logic (from links.py)
│   ├── types.py                   # Link types
│   ├── validation.py              # Link validation
│   └── processing.py              # Link processing
├── cohort/                        # Technique cohort management
│   ├── __init__.py
│   └── cohort.py
├── providers/                     # External providers
│   ├── __init__.py
│   └── embedding_provider.py
└── utils/                         # Utilities
    ├── __init__.py
    ├── config.py
    ├── logging.py
    ├── validators.py
    └── utils.py

### Migration Steps
1. Rename `vangaurd/` → `techniques/`
2. Split motion.py (1,783 lines) into 4 modules
3. Split links.py (2,084 lines) into 4 modules
4. Split system.py (1,521 lines) into 4 modules
5. Organize existing files into subfolders
6. Update all imports
7. Test thoroughly
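
For steps 1 and 6, the rename from `vangaurd` to `techniques` can be applied to import statements mechanically. The sketch below is one possible approach, not the project's tooling: `rewrite_imports` is a hypothetical function, it only touches lines that begin with `from` or `import`, and it does not handle continuation lines of multi-line imports or module names passed as strings (for example to `importlib.import_module`).

```python
import re

# Lines that start an import statement (optionally indented).
IMPORT_LINE = re.compile(r"^\s*(from|import)\s")
# The old folder name as a whole dotted-path component.
OLD_NAME = re.compile(r"\bvangaurd\b")


def rewrite_imports(source, new_name="techniques"):
    """Replace the old package name on import lines only."""
    rewritten = []
    for line in source.splitlines(keepends=True):
        if IMPORT_LINE.match(line):
            line = OLD_NAME.sub(new_name, line)
        rewritten.append(line)
    return "".join(rewritten)
```

Restricting the substitution to import lines leaves string literals and comments untouched, so those still need a manual search after the rename.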

Phase 2C: Engine Module Cleanup

Proposed Structure

engine/
├── __init__.py
├── core/                          # Core engine utilities (KEEP)
│   ├── __init__.py
│   ├── dataframe_ops.py
│   ├── embedding_utils.py
│   ├── filters.py
│   ├── similarity.py
│   └── validators.py
├── legacy/                        # Legacy/deprecated (MOVE HERE)
│   ├── embedder.py                # Duplicate of core/embeddings.py
│   ├── loader.py                  # Duplicate of core/data_loader.py
│   └── ...
└── components/                    # Current root files (ORGANIZE)
    ├── aggregator.py
    ├── builder.py
    ├── engine.py
    ├── handler.py
    ├── match.py
    ├── relation.py
    ├── retriever.py
    ├── structure.py
    └── tuner.py

### Migration Steps
1. Move duplicates to `legacy/` folder
2. Move root files to `components/` folder
3. Add deprecation warnings to legacy files
4. Update imports to use core/embeddings.py and core/data_loader.py
5. Test thoroughly
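
Step 3 can rely on the standard `warnings` module. The helper below is a minimal sketch; `warn_deprecated` is a hypothetical name, and the module paths in the usage note are inferred from the structure above rather than confirmed against the codebase.

```python
import warnings


def warn_deprecated(old_module, new_module):
    """Emit a DeprecationWarning that names the replacement module."""
    warnings.warn(
        f"{old_module} is deprecated; import from {new_module} instead.",
        DeprecationWarning,
        # stacklevel=2 attributes the warning to the caller of this helper.
        stacklevel=2,
    )
```

A legacy file such as `engine/legacy/embedder.py` could then call `warn_deprecated("engine.legacy.embedder", "engine.core.embeddings")` at the top of the module. Python hides `DeprecationWarning` by default outside `__main__`, so running tests with `-W error::DeprecationWarning` is one way to surface remaining legacy imports.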

Phase 2D: File Splitting Details

artificial.py (3,692 lines) → 6 files

Split Plan:
1. base.py (~200 lines) - Base classes and utilities
2. chat_model.py (~800 lines) - BaseChatModel, ChatArtificial
3. ai_interface.py (~1,200 lines) - AI class with main interface
4. completion.py (~600 lines) - Completion logic
5. streaming.py (~400 lines) - Streaming functionality
6. embeddings.py (~400 lines) - Embedding cache and utilities

prompt.py (2,153 lines) → 3 files

Split Plan:
1. templates.py (~1,500 lines) - All prompt templates
2. system_prompts.py (~400 lines) - System-specific prompts
3. manager.py (~250 lines) - PromptManager class

links.py (2,084 lines) → 4 files

Split Plan:
1. core.py (~600 lines) - Core link logic
2. types.py (~700 lines) - Link type definitions
3. validation.py (~400 lines) - Link validation
4. processing.py (~400 lines) - Link processing

motion.py (1,783 lines) → 4 files

Split Plan:
1. core.py (~500 lines) - Core motion logic
2. sensors.py (~500 lines) - Sensor data types
3. gestures.py (~400 lines) - Gesture recognition
4. analysis.py (~400 lines) - Motion analysis
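
Before moving code, it helps to see how the lines in a large file are distributed across its top-level definitions. The sketch below uses the standard `ast` module (`end_lineno` requires Python 3.8 or later); `outline` is a hypothetical helper, and the counts exclude decorators and any module-level code between definitions.

```python
import ast


def outline(source):
    """Return (kind, name, line_count) for each top-level class or function."""
    tree = ast.parse(source)
    items = []
    for node in tree.body:
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            # Span from the def/class line to the last line of the body.
            length = node.end_lineno - node.lineno + 1
            items.append((kind, node.name, length))
    return items
```

Running `outline(Path("inference/artificial.py").read_text())` would give a per-class size list to compare against the approximate targets in the split plans above.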

Implementation Strategy

### Priority Order
1. Phase 2A: Inference (HIGHEST - core chatbot)
2. Phase 2B: Response (HIGH - chatbot responses)
3. Phase 2C: Engine (MEDIUM - utilities)
4. Phase 2D: File splitting (CONTINUOUS - during above)

### Testing Strategy
After each phase:
1. Run all existing tests (16 tests)
2. Test manual imports
3. Test chatbot functionality
4. Verify no regressions
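
Step 2 (manual import testing) can be partly automated with an import smoke test. This is a sketch; `check_imports` is a hypothetical helper, and the module names passed to it would need to match the final layout.

```python
import importlib


def check_imports(module_names):
    """Try to import each module and return {name: error} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # report any import-time error, not only missing modules
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures
```

For example, `check_imports(["inference.core.chat_model", "inference.core.ai_interface"])` should return an empty dict once Phase 2A is complete.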

### Rollback Plan
- Keep original files until all tests pass
- Use git to track changes
- Revert to the pre-reorganization state if a phase fails verification

Benefits

### Code Maintainability
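- ✅ Smaller, focused files (most under 500 lines; the largest planned modules, templates.py at ~1,500 and ai_interface.py at ~1,200, are exceptions)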
- ✅ Clear separation of concerns
- ✅ Easier to navigate and understand

### Development Velocity
- ✅ Faster to find relevant code
- ✅ Easier to modify without side effects
- ✅ Better IDE support

### Team Collaboration
- ✅ Fewer merge conflicts
- ✅ Clearer module ownership
- ✅ Easier code reviews

Risk Mitigation

### Import Breakage
- Risk: Changing file locations breaks imports
- Mitigation: Update all imports systematically
- Fallback: Keep `__init__.py` with backward-compatible imports
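
One way to implement the fallback is a module-level `__getattr__` (PEP 562, Python 3.7+), which resolves old names lazily from their new locations. The sketch below is an assumption about how this could look, not the project's existing code; `make_lazy_getattr` is a hypothetical helper.

```python
import importlib


def make_lazy_getattr(moved, package_name):
    """Build a module-level __getattr__ that resolves relocated names.

    `moved` maps an attribute name to the dotted path of the module that
    now defines it.
    """
    def __getattr__(name):
        if name in moved:
            # Import the new home only when the old name is first requested.
            module = importlib.import_module(moved[name])
            return getattr(module, name)
        raise AttributeError(f"module {package_name!r} has no attribute {name!r}")

    return __getattr__
```

In `inference/__init__.py` this could be wired up as `__getattr__ = make_lazy_getattr({"ChatArtificial": "inference.core.chat_model", "AI": "inference.core.ai_interface"}, __name__)`, so that `from inference import AI` keeps working after the split.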

### Functionality Loss
- Risk: Code split incorrectly
- Mitigation: Comprehensive testing after each phase
- Fallback: Revert to original structure

### Chatbot Disruption
- Risk: AI/response system stops working
- Mitigation: Test chatbot specifically after inference/response changes
- Fallback: Keep original files until verified

Timeline Estimate

- Phase 2A (Inference): 2-3 hours
- Phase 2B (Response): 2-3 hours
- Phase 2C (Engine): 1 hour
- Testing & Verification: 1 hour
- Total: 6-8 hours

Next Steps

1. Get approval for reorganization plan
2. Start with Phase 2A (Inference)
3. Proceed incrementally with testing
4. Update documentation after completion

---
Created: 2025-12-08
Status: READY FOR IMPLEMENTATION

Source Anchor

Comp-Core/backend/cc-trajectory/legacy/cc-tpo-original/cc-tpo/docs/refactoring/REORGANIZATION_PLAN.md
