Deployment Guide
Overview
This guide covers deploying TrajectoryOS to production. We'll use Fly.io as the primary example, but principles apply to other platforms (AWS, Railway, etc.).
Architecture
Production Stack:
- Web Dashboard (Next.js) → Fly.io
- API Gateway → Fly.io
- Trajectory Core → Fly.io
- Python Models → Fly.io (separate apps)
- PostgreSQL → Fly.io Postgres
- Redis → Upstash
- Vector Store → Pinecone
---
Prerequisites
1. Fly.io Account: Sign up at https://fly.io
2. flyctl CLI: Install via `curl -L https://fly.io/install.sh | sh`
3. Docker: For building containers
4. Environment Variables: Prepare production secrets
---
Step 1: Prepare Dockerfiles
Trajectory Core
File: `services/trajectory-core/Dockerfile`
FROM node:18-alpine AS builder
WORKDIR /app
# Copy package files
COPY package.json pnpm-lock.yaml ./
COPY services/trajectory-core/package.json ./services/trajectory-core/
# Install dependencies
RUN npm install -g pnpm
RUN pnpm install --frozen-lockfile
# Copy source
COPY services/trajectory-core ./services/trajectory-core
COPY prisma ./prisma
# Generate Prisma client
RUN cd services/trajectory-core && pnpm prisma generate
# Build
RUN cd services/trajectory-core && pnpm build
# Production image
FROM node:18-alpine
WORKDIR /app
RUN npm install -g pnpm
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/services/trajectory-core/dist ./dist
COPY --from=builder /app/services/trajectory-core/package.json ./
COPY --from=builder /app/prisma ./prisma
# Run migrations on start
COPY services/trajectory-core/docker-entrypoint.sh ./
RUN chmod +x docker-entrypoint.sh
EXPOSE 3003
CMD ["./docker-entrypoint.sh"]
Entrypoint: `services/trajectory-core/docker-entrypoint.sh`
#!/bin/sh
set -e
# Run migrations
pnpm prisma migrate deploy
# Start server
node dist/index.js
Web Dashboard
File: `apps/web-dashboard/Dockerfile`
FROM node:18-alpine AS builder
WORKDIR /app
COPY apps/web-dashboard/package.json apps/web-dashboard/package-lock.json ./
RUN npm install
COPY apps/web-dashboard ./
RUN npm run build
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/.next ./.next
COPY --from=builder /app/public ./public
COPY --from=builder /app/package.json ./
COPY --from=builder /app/node_modules ./node_modules
EXPOSE 3000
CMD ["npm", "start"]
Python Model (Example: Skill Graph)
File: `models/skill_graph/Dockerfile`
FROM python:3.10-slim
WORKDIR /app
COPY models/skill_graph/requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
COPY models/skill_graph ./
EXPOSE 5001
CMD ["uvicorn", "main:app", "--host", "[ip]", "--port", "5001"]
---
Step 2: Create Fly.io Apps
Initialize Trajectory Core
cd services/trajectory-core
fly launch --name trajectoryos-core --region sjc --no-deploy
# Edit fly.toml as needed
fly.toml:
app = "trajectoryos-core"
primary_region = "sjc"
[build]
dockerfile = "Dockerfile"
[env]
PORT = "3003"
NODE_ENV = "production"
[[services]]
internal_port = 3003
protocol = "tcp"
[[services.ports]]
handlers = ["http"]
port = 80
[[services.ports]]
handlers = ["tls", "http"]
port = 443
[[services.http_checks]]
interval = "10s"
timeout = "2s"
grace_period = "5s"
method = "GET"
path = "/health"
Initialize Web Dashboard
cd apps/web-dashboard
fly launch --name trajectoryos-web --region sjc --no-deploy
Initialize Python Models
cd models/skill_graph
fly launch --name trajectoryos-skill-model --region sjc --no-deploy
# Repeat for other models
---
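The apps created in Step 2 can be tracked in one place so that later steps (secrets, deploys, health checks) refer to consistent names. The sketch below is illustrative only: the app names, source directories, and ports come from this guide, while the `FlyApp` type and the `publicUrl` helper are hypothetical names introduced here.

```typescript
// Hypothetical registry of the Fly.io apps created in Step 2.
// Names, directories, and ports are copied from this guide.
interface FlyApp {
  name: string;         // value passed to `fly launch --name`
  sourceDir: string;    // directory used in the `cd` command before `fly launch`
  internalPort: number; // port from the Dockerfile's EXPOSE line
}

const flyApps: FlyApp[] = [
  { name: "trajectoryos-core", sourceDir: "services/trajectory-core", internalPort: 3003 },
  { name: "trajectoryos-web", sourceDir: "apps/web-dashboard", internalPort: 3000 },
  { name: "trajectoryos-skill-model", sourceDir: "models/skill_graph", internalPort: 5001 },
];

// Builds the default hostname form used elsewhere in this guide
// (for example https://trajectoryos-core.fly.dev).
function publicUrl(app: FlyApp): string {
  return `https://${app.name}.fly.dev`;
}

for (const app of flyApps) {
  console.log(`${app.name} -> ${publicUrl(app)} (internal port ${app.internalPort})`);
}
```

A list like this is only a convenience for scripts and documentation; `fly.toml` in each app directory remains the actual source of configuration.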
Step 3: Set Up PostgreSQL
# Create Postgres cluster
fly postgres create --name trajectoryos-db --region sjc
# Attach to trajectory-core
fly postgres attach trajectoryos-db --app trajectoryos-core
# This sets DATABASE_URL automatically
---
Step 4: Set Up Redis
Using Upstash (serverless Redis):
1. Go to https://upstash.com
2. Create Redis database
3. Copy connection URL
# Set secret
fly secrets set REDIS_URL="redis://..." --app trajectoryos-core
---
Step 5: Configure Secrets
# Set environment variables
fly secrets set \
OPENAI_API_KEY="sk-..." \
JWT_SECRET="your-secret-key" \
--app trajectoryos-core
# For dashboard
fly secrets set \
NEXT_PUBLIC_API_URL="https://trajectoryos-core.fly.dev" \
--app trajectoryos-web
---
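A missing secret from Steps 3 to 5 usually surfaces as a confusing runtime error, so it can help to fail fast at startup. The following is a minimal sketch, not part of TrajectoryOS itself: the secret names are the ones set in this guide, while `findMissingSecrets` and `assertSecrets` are hypothetical helpers. The environment is passed in as a plain object so the check is easy to test; in the service you would presumably pass `process.env`.

```typescript
// Secret names taken from Steps 3-5 of this guide.
const REQUIRED_SECRETS = ["DATABASE_URL", "REDIS_URL", "OPENAI_API_KEY", "JWT_SECRET"] as const;

type Env = Record<string, string | undefined>;

// Returns the names of required secrets that are unset or blank.
function findMissingSecrets(env: Env, required: readonly string[] = REQUIRED_SECRETS): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Throws one error that lists every missing secret.
function assertSecrets(env: Env): void {
  const missing = findMissingSecrets(env);
  if (missing.length > 0) {
    throw new Error(`Missing required secrets: ${missing.join(", ")}`);
  }
}

// Example with a deliberately incomplete environment.
const exampleEnv: Env = { DATABASE_URL: "postgres://example", JWT_SECRET: "" };
console.log(findMissingSecrets(exampleEnv));
```

Calling `assertSecrets` before the server starts listening means a misconfigured deploy fails its first health check instead of serving partial functionality.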
Step 6: Deploy
# Run from the repository root; each subshell returns to the root afterwards
# Deploy trajectory-core
(cd services/trajectory-core && fly deploy)
# Deploy web dashboard
(cd apps/web-dashboard && fly deploy)
# Deploy Python models
(cd models/skill_graph && fly deploy --app trajectoryos-skill-model)
---
Step 7: Run Migrations
Migrations run automatically via `docker-entrypoint.sh`, but you can also run them manually:
fly ssh console --app trajectoryos-core
> cd /app
> pnpm prisma migrate deploy
---
Step 8: Verify Deployment
# Check health
curl https://trajectoryos-core.fly.dev/health
# Check logs
fly logs --app trajectoryos-core
---
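Both the `curl` check in Step 8 and the `http_checks` block in `fly.toml` depend on a `/health` route, which this guide does not show. One possible shape for its response is sketched below, assuming the handler aggregates per-dependency checks and returns a non-2xx status when any of them fails. The `CheckResult` and `HealthReport` types, the `buildHealthReport` function, and the check names are all hypothetical.

```typescript
// Result of probing one dependency (for example Postgres or Redis).
type CheckResult = { name: string; ok: boolean; detail?: string };

interface HealthReport {
  httpStatus: 200 | 503;
  body: { status: "ok" | "degraded"; checks: CheckResult[] };
}

// Combines individual checks into a single response:
// 200 only when every check passed, otherwise 503.
function buildHealthReport(checks: CheckResult[]): HealthReport {
  const allOk = checks.every((check) => check.ok);
  return {
    httpStatus: allOk ? 200 : 503,
    body: { status: allOk ? "ok" : "degraded", checks },
  };
}

// Example: Redis is unreachable, so the report is degraded.
const report = buildHealthReport([
  { name: "postgres", ok: true },
  { name: "redis", ok: false, detail: "connection refused" },
]);
console.log(report.httpStatus, JSON.stringify(report.body));
```

Keeping the aggregation in a pure function makes it straightforward to unit test; the route handler then only needs to run the probes and write `httpStatus` and `body` to the response.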
Monitoring
Logs
# Real-time logs
fly logs --app trajectoryos-core
# Historical logs
fly logs --app trajectoryos-core --since 1h
Metrics
Fly.io provides built-in metrics:
fly dashboard --app trajectoryos-core
Custom Metrics (Prometheus)
Add a Prometheus exporter to your apps:
// services/trajectory-core/src/metrics.ts
import promClient from 'prom-client';
export const register = new promClient.Registry();
export const httpRequestDuration = new promClient.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status_code'],
  registers: [register]
});
// Expose /metrics endpoint (assumes `app` is the service's existing Express instance)
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType);
  res.end(await register.metrics());
});
---
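When recording values for the `route` label of `http_request_duration_seconds`, raw request paths that contain IDs create a new time series per ID. A common mitigation is to normalize the path first. The helper below is a rough sketch under that assumption: `normalizeRoute` is a hypothetical name, the example paths are invented, and a framework-provided route pattern is a better label source where one is available.

```typescript
// Replaces ID-like path segments with placeholders so the `route`
// label keeps a small, bounded set of values.
function normalizeRoute(path: string): string {
  // Drop the query string, if any.
  const withoutQuery = path.split("?")[0] ?? "";
  const uuidPattern = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
  const segments = withoutQuery.split("/").map((segment) => {
    if (/^\d+$/.test(segment)) return ":id";       // purely numeric segment
    if (uuidPattern.test(segment)) return ":uuid"; // UUID-shaped segment
    return segment;
  });
  return segments.join("/");
}

// Hypothetical paths, for illustration only.
console.log(normalizeRoute("/users/42/trajectories?limit=10"));
console.log(normalizeRoute("/runs/123e4567-e89b-12d3-a456-426614174000"));
console.log(normalizeRoute("/health"));
```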
Scaling
Horizontal Scaling
# Scale to 3 instances
fly scale count 3 --app trajectoryos-core
# Auto-scaling
fly autoscale set min=2 max=10 --app trajectoryos-core
Vertical Scaling
# Increase CPU/RAM
fly scale vm shared-cpu-2x --app trajectoryos-core
fly scale memory 1024 --app trajectoryos-core
---
CI/CD with GitHub Actions
.github/workflows/deploy.yml:
name: Deploy to Fly.io
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Setup Fly
        uses: superfly/flyctl-actions/setup-flyctl@master
      - name: Deploy Trajectory Core
        run: |
          cd services/trajectory-core
          flyctl deploy --remote-only
        env:
          FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}
      - name: Deploy Web Dashboard
        run: |
          cd apps/web-dashboard
          flyctl deploy --remote-only
        env:
          FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}
---
Backup & Recovery
Database Backups
Fly.io Postgres is backed up automatically every day. To create a backup manually:
# Create snapshot
fly postgres db backup --app trajectoryos-db
# Restore from snapshot
fly postgres db restore --app trajectoryos-db <snapshot-id>
Application State
For critical data:
# Export user data
fly ssh console --app trajectoryos-core
> pnpm prisma db execute --sql "COPY users TO '/tmp/users.csv' CSV HEADER"
> scp /tmp/users.csv local-backup/
---
Rollback
Quick Rollback
# List recent releases
fly releases --app trajectoryos-core
# Rollback to previous version
fly releases rollback v42 --app trajectoryos-core
Database Rollback
If a migration fails:
# SSH into app
fly ssh console --app trajectoryos-core
# Rollback migration
pnpm prisma migrate resolve --rolled-back 20251126_migration_name
---
Custom Domain
# Add domain
fly certs create trajectoryos.com --app trajectoryos-web
# Verify DNS
fly certs show trajectoryos.com --app trajectoryos-web
Add DNS records:
A @ 66.241.124.x
AAAA @ 2a09:8280:1::x:x
---
Alternative Platforms
Railway
Similar workflow:
1. Connect GitHub repo
2. Railway auto-detects Dockerfiles
3. Set environment variables
4. Deploy
AWS (ECS + RDS)
More complex but highly scalable:
1. Push Docker images to ECR
2. Create ECS service definitions
3. Set up Application Load Balancer
4. Configure auto-scaling
DigitalOcean App Platform
Simpler alternative:
1. Connect repo
2. Select services to deploy
3. Configure build/run commands
4. Deploy
---
Troubleshooting
App Won't Start
# Check logs
fly logs --app trajectoryos-core
# SSH into container
fly ssh console --app trajectoryos-core
# Inspect environment
> env | grep DATABASE_URL
Database Connection Errors
# Verify DATABASE_URL secret
fly secrets list --app trajectoryos-core
# Test connection
fly ssh console --app trajectoryos-core
> pnpm prisma db execute --sql "SELECT 1"
High Memory Usage
# Check current usage
fly status --app trajectoryos-core
# Scale up if needed
fly scale memory 2048 --app trajectoryos-core
---
Cost Optimization
Fly.io Pricing (estimate):
- Shared CPU (256MB): ~$2/month per instance
- PostgreSQL (1GB): ~$5/month
- Bandwidth: Usually free within limits
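To turn the estimates above into a rough monthly figure, multiply the instance count by the per-instance price and add the database. The sketch below does exactly that, using only the approximate prices listed in this guide. The instance counts are an invented example, `estimateMonthlyCost` is a hypothetical helper, and Upstash, Pinecone, and bandwidth are not included.

```typescript
// Approximate prices from this guide, in USD per month; actual pricing may differ.
const SHARED_CPU_INSTANCE_USD = 2; // shared CPU, 256MB
const POSTGRES_1GB_USD = 5;

interface AppInstances {
  app: string;
  count: number;
}

// Sums all app instances at the shared-CPU price and adds one 1GB Postgres.
function estimateMonthlyCost(apps: AppInstances[]): number {
  const totalInstances = apps.reduce((sum, entry) => sum + entry.count, 0);
  return totalInstances * SHARED_CPU_INSTANCE_USD + POSTGRES_1GB_USD;
}

// Example layout: 3 core instances, 1 dashboard, 1 model app.
const estimate = estimateMonthlyCost([
  { app: "trajectoryos-core", count: 3 },
  { app: "trajectoryos-web", count: 1 },
  { app: "trajectoryos-skill-model", count: 1 },
]);
console.log(`~$${estimate}/month`); // 5 instances * $2 + $5 = ~$15/month
```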
Tips:
- Use shared CPU for non-critical services
- Scale down dev/staging environments when not in use
- Enable auto-scaling to match demand
---
Next Steps: Monitoring and observability documentation is coming soon. For now, see [Architecture Overview](../architecture/overview.md) for the system design.
Source Anchor
Comp-Core/backend/cc-trajectory/docs/ops/deployment.md