# Training Directory
This directory contains all machine learning training artifacts, models, and evaluation results for the IRCP project.
## Structure

```
training/
└── ircp/
    ├── full_dataset/              # Full dataset training
    │   ├── best_model.pt          # Trained model checkpoint
    │   ├── inferred_config.json   # Model configuration
    │   └── [other training files]
    │
    ├── complete_training/         # Complete training run
    │
    ├── outputs/                   # Training outputs and logs
    │
    ├── evaluation/                # Evaluation results
    │
    └── backups/                   # Training backups
        └── ircp_training_backup_20250815_173556/
```

## Training Artifacts
### full_dataset/
Contains the production IRCP model trained on the full dataset.
Key Files:
- `best_model.pt` - PyTorch model checkpoint
- `inferred_config.json` - Model configuration
- Training hyperparameters and metrics
Used by:
- `apps/liquid-chat-backend/main.py` - Loads this model for embeddings
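Before loading, a consumer such as the backend may want to confirm that both key files are present. The following is a minimal sketch that assumes only the two file names listed above; the helper name `missing_artifacts` is hypothetical and not part of the project.

```python
from pathlib import Path

# File names taken from the "Key Files" list above.
REQUIRED_FILES = ("best_model.pt", "inferred_config.json")


def missing_artifacts(run_dir: Path) -> list[str]:
    """Return the required file names that are absent from run_dir."""
    return [name for name in REQUIRED_FILES if not (run_dir / name).is_file()]


if __name__ == "__main__":
    # Path is relative to the repository root; adjust to your checkout.
    run_dir = Path("training") / "ircp" / "full_dataset"
    missing = missing_artifacts(run_dir)
    if missing:
        print(f"Missing from {run_dir}: {', '.join(missing)}")
    else:
        print(f"All required artifacts found in {run_dir}")
```

Returning the list of missing names, rather than raising, lets the caller decide whether to fail hard or fall back to another run directory.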
### complete_training/
An additional training run on the complete dataset.
### outputs/
Training logs, metrics, and intermediate outputs.
### evaluation/
Evaluation results and performance metrics.
### backups/
Historical training checkpoints and backups.
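To pick the most recent backup programmatically, the directory names can be parsed. This sketch assumes that names follow the pattern of the one backup shown in the structure above (`ircp_training_backup_<YYYYMMDD>_<HHMMSS>`); the reading of the digits as a date and time is an assumption, and both function names are hypothetical.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional

# Assumption: backup directories are named ircp_training_backup_<YYYYMMDD>_<HHMMSS>.
BACKUP_PATTERN = re.compile(r"^ircp_training_backup_(\d{8}_\d{6})$")


def backup_timestamp(name: str) -> Optional[datetime]:
    """Parse the timestamp from a backup directory name, or return None."""
    match = BACKUP_PATTERN.match(name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")
    except ValueError:
        return None  # right number of digits, but not a valid date/time


def latest_backup(backups_dir: Path) -> Optional[Path]:
    """Return the most recent timestamped backup directory, if any."""
    dated = [
        (stamp, path)
        for path in backups_dir.iterdir()
        if path.is_dir() and (stamp := backup_timestamp(path.name)) is not None
    ]
    # Compare on the timestamp only, so ties never fall back to comparing paths.
    return max(dated, key=lambda item: item[0])[1] if dated else None
```

Directories that do not match the pattern are ignored, so unrelated folders under `backups/` do not affect the result.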
## Model Details

- Architecture: Custom SentenceTransformer with IRCP (Inverse Ring Contextual Propagation)
- Embedding Dimension: 384
- Training Data: 277 conversations from Claude AI
- Purpose:
  - Semantic embedding generation
  - Conversation similarity search
  - DLM coordinate calculation
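As an illustration of the similarity-search use case, the sketch below ranks stored embeddings against a query by cosine similarity. Cosine similarity is an assumption here, since this README does not state which metric the project uses, and the function names are hypothetical. It is written in plain Python so it runs without dependencies; with real 384-dimensional embeddings, NumPy or torch would be the more usual choice.

```python
import math
from typing import List, Sequence, Tuple

EMBEDDING_DIM = 384  # embedding dimension from the model details above


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float], corpus: Sequence[Sequence[float]], k: int = 3
) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k corpus vectors most similar to query."""
    scored = [(i, cosine_similarity(query, vec)) for i, vec in enumerate(corpus)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

Here `corpus` would hold one embedding per stored conversation, and the returned indices point back into that list.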
## Usage

### Loading the Model

```python
import json
from pathlib import Path

import torch

from ircp.models.sentence_transformer_icp import SentenceTransformerICP

# Paths (PROJECT_ROOT is the parent of the directory containing this file)
PROJECT_ROOT = Path(__file__).parent.parent
model_path = PROJECT_ROOT / "training" / "ircp" / "full_dataset" / "best_model.pt"
config_path = PROJECT_ROOT / "training" / "ircp" / "full_dataset" / "inferred_config.json"

# Load configuration
with open(config_path, "r") as f:
    config = json.load(f)

# Initialize and load model
model = SentenceTransformerICP(config)
checkpoint = torch.load(model_path, map_location="cpu")
# The weights may be nested under "model_state_dict"; otherwise the
# checkpoint itself is used as the state dict.
model.load_state_dict(checkpoint.get("model_state_dict", checkpoint))
model.eval()

# Generate embeddings
embedding = model.sentence_transformer.encode(["Your text here"])
```

## Training New Models
Place new training scripts and data in this directory structure:
```
training/
└── ircp/
    └── your_training_run/
        ├── train.py
        ├── data/
        └── outputs/
```

## Notes
- Model files can be large (100MB+)
- Ensure sufficient disk space for training
- Backups are kept for disaster recovery
- Evaluation results help track model performance over time
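The run layout and the disk-space note can be combined in a small setup helper. This is a sketch only: the function names and the 1 GiB threshold are hypothetical, and `train.py` is left for you to add.

```python
import shutil
from pathlib import Path

MIN_FREE_BYTES = 1024**3  # hypothetical threshold (1 GiB); tune per run


def create_run_dir(training_root: Path, run_name: str) -> Path:
    """Create <training_root>/ircp/<run_name>/ with data/ and outputs/ subfolders."""
    run_dir = training_root / "ircp" / run_name
    for sub in ("data", "outputs"):
        (run_dir / sub).mkdir(parents=True, exist_ok=True)
    return run_dir


def has_free_space(path: Path, min_free_bytes: int = MIN_FREE_BYTES) -> bool:
    """Report whether the filesystem holding path has at least min_free_bytes free."""
    return shutil.disk_usage(path).free >= min_free_bytes


if __name__ == "__main__":
    run_dir = create_run_dir(Path("training"), "your_training_run")
    if not has_free_space(run_dir):
        print(f"Warning: less than {MIN_FREE_BYTES} bytes free for {run_dir}")
```

Because `mkdir` is called with `exist_ok=True`, rerunning the helper on an existing run directory leaves its contents untouched.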
Promotion Decision
Attach run IDs, datasets, metrics, and reproduction commands.
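One way to keep those four items together is a small record serialized to JSON next to the run's outputs. The class and field names below are hypothetical; the README does not prescribe a format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class PromotionRecord:
    """Hypothetical bundle of the evidence attached to a promotion decision."""

    run_id: str
    datasets: List[str]
    metrics: Dict[str, float]
    reproduce: List[str] = field(default_factory=list)  # shell commands, in order

    def to_json(self) -> str:
        """Serialize the record with stable key order for easy diffing."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

Sorting the keys keeps successive records comparable in version control.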
Source Anchor
Comp-Core/backend/cc-trajectory/legacy/cc-tpo-original/cc-tpo/training/README.md