Production Deployment Guide¶
This guide covers deploying RAG Modulo in production environments with security, scalability, and reliability considerations.
Table of Contents¶
- Production Overview
- Infrastructure Requirements
- Security Configuration
- Deployment Steps
- Monitoring & Logging
- Backup & Recovery
- Scaling
- Maintenance
- Troubleshooting
Production Overview¶
Production Architecture¶
A RAG Modulo production deployment includes:
- Load Balancer: Nginx or cloud load balancer
- Application Servers: Multiple backend instances
- Database Cluster: PostgreSQL with replication
- Vector Database: Milvus cluster
- Object Storage: MinIO or cloud storage
- Monitoring: Prometheus + Grafana
- Logging: Centralized logging system
Production Principles¶
- Security First: All communications encrypted, secure defaults
- High Availability: Redundant components, failover mechanisms
- Scalability: Horizontal scaling capabilities
- Observability: Comprehensive monitoring and logging
- Disaster Recovery: Backup and recovery procedures
Infrastructure Requirements¶
Minimum Requirements¶
- CPU: 8 cores per application server
- RAM: 32GB per application server
- Storage: 500GB SSD per server
- Network: 1Gbps bandwidth
- Load Balancer: 2+ instances for HA
Recommended Requirements¶
- CPU: 16+ cores per application server
- RAM: 64GB+ per application server
- Storage: 1TB+ NVMe SSD per server
- Network: 10Gbps bandwidth
- Load Balancer: 3+ instances for HA
External Dependencies¶
- IBM WatsonX: AI/ML services
- SSL Certificates: For HTTPS
- DNS: Domain name resolution
- CDN: For static assets (optional)
Security Configuration¶
Environment Security¶
Production Environment Variables¶
# Production settings
PRODUCTION_MODE=true
DEBUG=false
LOG_LEVEL=INFO
SECURITY_SCAN=true
# Security
JWT_SECRET_KEY=your-secure-random-secret-key-256-bits
SKIP_AUTH=false
CORS_ORIGINS=https://yourdomain.com,https://www.yourdomain.com
# Database security
COLLECTIONDB_SSL_MODE=require
COLLECTIONDB_SSL_CERT=/path/to/client-cert.pem
COLLECTIONDB_SSL_KEY=/path/to/client-key.pem
COLLECTIONDB_SSL_ROOT_CERT=/path/to/ca-cert.pem
# AI Services (use production credentials)
WATSONX_APIKEY=your-production-watsonx-api-key
WATSONX_URL=https://us-south.ml.cloud.ibm.com
WATSONX_INSTANCE_ID=your-production-instance-id
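These settings are easy to get subtly wrong in a real `.env.prod`. The following is a minimal sketch of a pre-deploy check, not part of RAG Modulo: the function names `generate_jwt_secret` and `find_env_problems` and the 32-character minimum for the secret are assumptions made for illustration.

```python
import secrets
from urllib.parse import urlparse


def generate_jwt_secret() -> str:
    """Return 32 random bytes (256 bits) as a 64-character hex string."""
    return secrets.token_hex(32)


def find_env_problems(env: dict) -> list:
    """Return human-readable problems; an empty list means every check passed."""
    problems = []
    if env.get("DEBUG", "").lower() != "false":
        problems.append("DEBUG must be false")
    if env.get("SKIP_AUTH", "").lower() != "false":
        problems.append("SKIP_AUTH must be false")
    # Assumed threshold for illustration only, not a project rule.
    if len(env.get("JWT_SECRET_KEY", "")) < 32:
        problems.append("JWT_SECRET_KEY is too short")
    for origin in filter(None, env.get("CORS_ORIGINS", "").split(",")):
        if urlparse(origin.strip()).scheme != "https":
            problems.append(f"CORS origin is not HTTPS: {origin.strip()}")
    return problems
```

Running `find_env_problems(dict(os.environ))` before starting the stack would surface a leftover `DEBUG=true` or a plain-HTTP CORS origin early.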
SSL/TLS Configuration¶
# nginx.conf
server {
listen 443 ssl http2;
server_name yourdomain.com;
ssl_certificate /path/to/certificate.crt;
ssl_certificate_key /path/to/private.key;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384;
ssl_prefer_server_ciphers off;
ssl_session_cache shared:SSL:10m;
ssl_session_timeout 10m;
# Security headers
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
add_header X-Content-Type-Options nosniff;
add_header X-Frame-Options DENY;
add_header X-XSS-Protection "1; mode=block";
add_header Referrer-Policy "strict-origin-when-cross-origin";
location / {
proxy_pass http://backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
Application Security¶
Authentication & Authorization¶
# backend/core/security.py
import os

import jwt  # PyJWT
from fastapi import HTTPException
from fastapi.security import HTTPBearer

security = HTTPBearer()


class SecurityConfig:
    """Production security configuration."""

    # Read the signing key from the environment; a missing key fails at import time.
    JWT_SECRET_KEY: str = os.environ["JWT_SECRET_KEY"]
    JWT_ALGORITHM: str = "HS256"
    JWT_EXPIRATION_HOURS: int = 24
    MAX_LOGIN_ATTEMPTS: int = 5
    LOCKOUT_DURATION_MINUTES: int = 30

    def verify_token(self, token: str) -> dict:
        """Decode a JWT and return its payload, or raise HTTP 401."""
        try:
            return jwt.decode(token, self.JWT_SECRET_KEY, algorithms=[self.JWT_ALGORITHM])
        except jwt.ExpiredSignatureError:
            raise HTTPException(status_code=401, detail="Token expired")
        except jwt.InvalidTokenError:
            raise HTTPException(status_code=401, detail="Invalid token")
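The configuration declares `MAX_LOGIN_ATTEMPTS` and `LOCKOUT_DURATION_MINUTES` but does not show how they are enforced. One possible approach is sketched below, assuming a single process with in-memory state; the class `LoginAttemptTracker` is hypothetical, and a multi-instance deployment would need shared storage instead.

```python
import time
from typing import Optional


class LoginAttemptTracker:
    """In-memory failed-login tracker (hypothetical helper, single process only)."""

    def __init__(self, max_attempts: int = 5, lockout_minutes: int = 30):
        self.max_attempts = max_attempts
        self.lockout_seconds = lockout_minutes * 60
        self._failures = {}  # username -> list of failure timestamps

    def record_failure(self, username: str, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._failures.setdefault(username, []).append(now)

    def record_success(self, username: str) -> None:
        # A successful login clears the failure history.
        self._failures.pop(username, None)

    def is_locked(self, username: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Only failures inside the lockout window count toward the limit.
        recent = [t for t in self._failures.get(username, []) if now - t < self.lockout_seconds]
        return len(recent) >= self.max_attempts
```

A login handler would call `is_locked` before checking credentials and `record_failure` or `record_success` afterwards.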
Input Validation¶
# backend/core/validation.py
import re

# `validator` is the Pydantic v1 API; Pydantic v2 deprecates it in favour of `field_validator`.
from pydantic import BaseModel, validator


class ProductionInputValidator(BaseModel):
    """Production input validation."""

    # Fields must be declared for the validators below to attach to them.
    email: str
    password: str

    @validator('email')
    def validate_email(cls, v):
        """Validate email format."""
        if not re.match(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$', v):
            raise ValueError('Invalid email format')
        return v.lower()

    @validator('password')
    def validate_password(cls, v):
        """Validate password strength."""
        if len(v) < 8:
            raise ValueError('Password must be at least 8 characters')
        if not re.search(r'[A-Z]', v):
            raise ValueError('Password must contain uppercase letter')
        if not re.search(r'[a-z]', v):
            raise ValueError('Password must contain lowercase letter')
        if not re.search(r'\d', v):
            raise ValueError('Password must contain digit')
        return v
Deployment Steps¶
1. Infrastructure Setup¶
Server Preparation¶
# Update system
sudo apt update && sudo apt upgrade -y
# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
# Install the Docker Compose plugin (provides the "docker compose" command used in this guide)
sudo apt install -y docker-compose-plugin
# Configure Docker for production
sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json > /dev/null <<EOF
{
"log-driver": "json-file",
"log-opts": {
"max-size": "10m",
"max-file": "3"
},
"storage-driver": "overlay2"
}
EOF
sudo systemctl restart docker
SSL Certificate Setup¶
# Using Let's Encrypt
sudo apt install certbot
# Generate certificate
sudo certbot certonly --standalone -d yourdomain.com
# Auto-renewal
sudo crontab -e
# Add: 0 12 * * * /usr/bin/certbot renew --quiet
2. Application Deployment¶
Production Configuration¶
# Clone repository
git clone https://github.com/manavgup/rag_modulo.git
cd rag_modulo
# Create production environment
cp env.example .env.prod
# Edit production configuration
nano .env.prod
Production Docker Compose¶
# docker-compose.prod.yml
version: '3.8'

services:
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - /etc/letsencrypt:/etc/letsencrypt
    depends_on:
      - backend
    restart: unless-stopped

  backend:
    build:
      context: ./backend
      dockerfile: Dockerfile.backend
    environment:
      - PRODUCTION_MODE=true
      - DEBUG=false
      - LOG_LEVEL=INFO
    env_file:
      - .env.prod
    volumes:
      - backend_data:/mnt/data
      - ./logs:/app/logs
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "python", "healthcheck.py"]
      interval: 30s
      timeout: 10s
      retries: 3

  frontend:
    build:
      context: ./webui
      dockerfile: Dockerfile.frontend
    restart: unless-stopped

  postgres:
    image: postgres:15
    environment:
      - POSTGRES_DB=rag_modulo_prod
      - POSTGRES_USER=rag_user
      - POSTGRES_PASSWORD=${COLLECTIONDB_PASS}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    restart: unless-stopped

  milvus-standalone:
    image: milvusdb/milvus:v2.3.0
    environment:
      - ETCD_ENDPOINTS=milvus-etcd:2379
    volumes:
      - milvus_data:/var/lib/milvus
    restart: unless-stopped

volumes:
  backend_data:
  postgres_data:
  milvus_data:
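The backend healthcheck runs `python healthcheck.py`, but the guide does not show that script. A minimal sketch of what it could contain follows, assuming the backend serves an HTTP health endpoint; the URL `http://localhost:8000/health` and the function names are placeholders, not the project's actual script.

```python
import urllib.error
import urllib.request

# Assumed endpoint: path and port are placeholders for illustration.
HEALTH_URL = "http://localhost:8000/health"


def check_health(url: str = HEALTH_URL, timeout: float = 5.0,
                 opener=urllib.request.urlopen) -> bool:
    """Return True if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


def main() -> int:
    """Exit code 0 marks the container healthy; 1 marks it unhealthy."""
    return 0 if check_health() else 1
```

Ending the file with `raise SystemExit(main())` turns the result into the exit code Docker inspects. The `opener` parameter exists only so the check can be exercised without a running server.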
Deploy Application¶
# Build production images
make build-all
# Deploy with production configuration (--env-file lets Compose resolve ${COLLECTIONDB_PASS})
docker compose --env-file .env.prod -f docker-compose.prod.yml up -d
# Verify deployment
make health-check
3. Database Setup¶
PostgreSQL Configuration¶
-- Create production database
CREATE DATABASE rag_modulo_prod;
CREATE USER rag_user WITH PASSWORD 'secure-password';
GRANT ALL PRIVILEGES ON DATABASE rag_modulo_prod TO rag_user;
-- Configure for production
ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';
ALTER SYSTEM SET max_connections = 200;
ALTER SYSTEM SET shared_buffers = '256MB';
ALTER SYSTEM SET effective_cache_size = '1GB';
ALTER SYSTEM SET maintenance_work_mem = '64MB';
ALTER SYSTEM SET checkpoint_completion_target = 0.9;
ALTER SYSTEM SET wal_buffers = '16MB';
ALTER SYSTEM SET default_statistics_target = 100;
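-- Apply the changes: pg_reload_conf() covers reloadable settings only.
-- shared_preload_libraries, max_connections, shared_buffers, and wal_buffers need a server restart.
SELECT pg_reload_conf();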
Database Backup Setup¶
# Create backup script
cat > backup-db.sh << 'EOF'
#!/bin/bash
set -euo pipefail  # Abort on errors so a failed dump is not mistaken for a backup
BACKUP_DIR="/backups/postgres"
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p "$BACKUP_DIR"
docker exec rag_modulo-postgres-1 pg_dump -U rag_user rag_modulo_prod > "$BACKUP_DIR/backup_$DATE.sql"
# Delete backups last modified more than 7 days ago
find "$BACKUP_DIR" -name "backup_*.sql" -mtime +7 -delete
EOF
chmod +x backup-db.sh
# Schedule daily backups
crontab -e
# Add: 0 2 * * * /path/to/backup-db.sh
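The retention rule in the backup script can be previewed before anything is deleted. The sketch below approximates it by parsing the timestamp embedded in each filename, whereas `find -mtime` looks at file modification time; the function name `backups_to_delete` is hypothetical.

```python
from datetime import datetime, timedelta


def backups_to_delete(filenames, now: datetime, keep_days: int = 7) -> list:
    """Return backup files whose embedded timestamp is older than keep_days."""
    cutoff = now - timedelta(days=keep_days)
    expired = []
    for name in filenames:
        try:
            # Matches the backup_YYYYmmdd_HHMMSS.sql names written by backup-db.sh.
            stamp = datetime.strptime(name, "backup_%Y%m%d_%H%M%S.sql")
        except ValueError:
            continue  # Not a backup file in the expected format; leave it alone.
        if stamp < cutoff:
            expired.append(name)
    return sorted(expired)
```

Passing `os.listdir("/backups/postgres")` and `datetime.now()` lists the files that would be dropped under a seven-day policy.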
Monitoring & Logging¶
Prometheus Configuration¶
# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "alerts.yml"

scrape_configs:
  - job_name: 'rag-modulo-backend'
    static_configs:
      - targets: ['backend:8000']
    metrics_path: '/metrics'
    scrape_interval: 5s

  - job_name: 'rag-modulo-postgres'
    static_configs:
      # PostgreSQL does not serve Prometheus metrics on 5432; scrape an exporter instead.
      # Assumes a postgres_exporter container named postgres-exporter on its default port 9187.
      - targets: ['postgres-exporter:9187']

  - job_name: 'rag-modulo-milvus'
    static_configs:
      - targets: ['milvus-standalone:9091']

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093
Grafana Dashboard¶
{
"dashboard": {
"title": "RAG Modulo Production Dashboard",
"panels": [
{
"title": "API Response Time",
"type": "graph",
"targets": [
{
"expr": "histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))",
"legendFormat": "95th percentile"
}
]
},
{
"title": "Error Rate",
"type": "graph",
"targets": [
{
"expr": "rate(http_requests_total{status=~\"5..\"}[5m])",
"legendFormat": "5xx errors"
}
]
},
{
"title": "Database Connections",
"type": "graph",
"targets": [
{
"expr": "pg_stat_database_numbackends",
"legendFormat": "Active connections"
}
]
}
]
}
}
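The "API Response Time" panel relies on `histogram_quantile`, which estimates a percentile from cumulative bucket counts instead of from raw latencies. The sketch below is a simplified Python rendering of that linear interpolation, meant to show why the result depends on bucket boundaries; it is not Prometheus's implementation, and the sample bucket values in the usage note are invented.

```python
import math


def histogram_quantile(q: float, buckets) -> float:
    """Estimate a quantile from cumulative (upper_bound, count) buckets.

    Simplified sketch of the interpolation used for classic histograms;
    the last bucket is expected to be the +Inf bucket holding the total count.
    """
    buckets = sorted(buckets)
    if len(buckets) < 2 or not 0 <= q <= 1 or buckets[-1][1] == 0:
        return math.nan
    rank = q * buckets[-1][1]
    # Find the first bucket whose cumulative count reaches the rank.
    for i, (upper, count) in enumerate(buckets):
        if count >= rank:
            break
    if math.isinf(upper):
        # The quantile falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    lower, below = (0.0, 0.0) if i == 0 else buckets[i - 1]
    if count == below:
        return lower
    # Assume observations are spread evenly inside the bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)
```

With buckets `[(0.1, 50), (0.5, 90), (1.0, 98), (math.inf, 100)]`, the 95th percentile lands between 0.5 s and 1.0 s, so finer buckets in that range would give a tighter estimate.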
Log Management¶
# docker-compose.logging.yml
version: '3.8'

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.8.0
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
    volumes:
      - elasticsearch_data:/usr/share/elasticsearch/data

  kibana:
    image: docker.elastic.co/kibana/kibana:8.8.0
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch

  logstash:
    image: docker.elastic.co/logstash/logstash:8.8.0
    volumes:
      - ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf
    depends_on:
      - elasticsearch

volumes:
  elasticsearch_data:
Backup & Recovery¶
Backup Strategy¶
Database Backups¶
# Daily full backup
pg_dump -h postgres -U rag_user rag_modulo_prod > backup_$(date +%Y%m%d).sql
# Physical base backup (a full copy that serves as the starting point for WAL-based recovery)
pg_basebackup -h postgres -U rag_user -D /backups/incremental/$(date +%Y%m%d_%H%M%S)
Application Data Backups¶
# Backup application data
tar -czf app_data_$(date +%Y%m%d).tar.gz volumes/
# Backup configuration
cp -r .env.prod nginx.conf /backups/config/
Recovery Procedures¶
Database Recovery¶
# Restore from backup
psql -h postgres -U rag_user rag_modulo_prod < backup_20240101.sql
# Restore a custom-format dump (created with pg_dump -Fc)
pg_restore -h postgres -U rag_user -d rag_modulo_prod backup_20240101.dump
Application Recovery¶
# Restore application data
tar -xzf app_data_20240101.tar.gz
# Restore configuration
cp /backups/config/.env.prod .
cp /backups/config/nginx.conf .
# Restart services
docker compose -f docker-compose.prod.yml restart
Scaling¶
Horizontal Scaling¶
Load Balancer Configuration¶
# nginx.conf for multiple backend instances
upstream backend {
server backend1:8000;
server backend2:8000;
server backend3:8000;
}
server {
location / {
proxy_pass http://backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
}
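With no balancing directive in the `upstream` block, nginx distributes requests round-robin, and all three servers here carry the default weight of 1. The sketch below models that rotation to show how load spreads across instances; the helper names are hypothetical, and real nginx also accounts for weights and failed servers.

```python
from collections import Counter
from itertools import cycle, islice

BACKENDS = ["backend1:8000", "backend2:8000", "backend3:8000"]


def assign_requests(backends, request_count: int) -> list:
    """Assign each incoming request to the next backend in rotation."""
    return list(islice(cycle(backends), request_count))


def load_per_backend(backends, request_count: int) -> Counter:
    """Count how many requests each backend receives."""
    return Counter(assign_requests(backends, request_count))
```

Calling `load_per_backend(BACKENDS, 9)` gives each instance three requests, which is why adding a fourth server lowers per-instance load by roughly a quarter under uniform traffic.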
Database Scaling¶
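# PostgreSQL with read replicas
services:
  postgres-master:
    image: postgres:15
    environment:
      - POSTGRES_DB=rag_modulo_prod
      - POSTGRES_USER=rag_user
      - POSTGRES_PASSWORD=${COLLECTIONDB_PASS}
    volumes:
      - postgres_master_data:/var/lib/postgresql/data

  postgres-replica:
    image: postgres:15
    # Run as the postgres user: the server refuses to start as root.
    user: postgres
    environment:
      - POSTGRES_DB=rag_modulo_prod
      - POSTGRES_USER=rag_user
      - POSTGRES_PASSWORD=${COLLECTIONDB_PASS}
      - PGUSER=rag_user
      # Read by pg_basebackup; the -W flag would block on a password prompt.
      - PGPASSWORD=${COLLECTIONDB_PASS}
    # The primary's pg_hba.conf must allow replication connections from rag_user.
    # -R writes standby.signal and the connection settings so the copy starts as a replica.
    command: |
      bash -c "
      until pg_basebackup -h postgres-master -D /var/lib/postgresql/data -U rag_user -R -v -P; do
        echo 'Waiting for master to be available...'
        sleep 1s
      done
      echo 'Backup done, starting replica...'
      chmod 0700 /var/lib/postgresql/data
      postgres
      "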
Vertical Scaling¶
Resource Optimization¶
# Monitor resource usage
docker stats
# Optimize container resources
docker update --cpus="2.0" --memory="4g" rag_modulo-backend-1
Maintenance¶
Regular Maintenance Tasks¶
Daily Tasks¶
- Check system health: make health-check
- Review logs: docker logs rag_modulo-backend-1
- Monitor metrics: Check Grafana dashboard
- Verify backups: Ensure backups completed successfully
Weekly Tasks¶
- Update dependencies: make check-deps
- Security scan: make security-check
- Performance review: Analyze metrics trends
- Clean up unused Docker resources (stopped containers, dangling images, build cache): docker system prune
Monthly Tasks¶
- Update system packages: sudo apt update && sudo apt upgrade
- Review security patches: Check for vulnerabilities
- Capacity planning: Analyze resource usage trends
- Disaster recovery test: Test backup restoration
Update Procedures¶
Application Updates¶
# Pull latest changes
git pull origin main
# Build new images
make build-all
# Recreate only the backend container (expect a brief interruption unless other instances stay behind the load balancer)
docker compose -f docker-compose.prod.yml up -d --no-deps backend
# Verify deployment
make health-check
System Updates¶
# Update system packages
sudo apt update && sudo apt upgrade
# Update Docker
sudo apt install docker-ce docker-ce-cli containerd.io
# Restart services
sudo systemctl restart docker
docker compose -f docker-compose.prod.yml restart
Troubleshooting¶
Common Production Issues¶
High Memory Usage¶
# Check memory usage
docker stats
# Optimize application
# - Reduce cache sizes
# - Optimize database queries
# - Implement connection pooling
Database Performance¶
-- Check slow queries
SELECT query, mean_exec_time, calls
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
-- Check connection usage
SELECT count(*) FROM pg_stat_activity;
SSL Certificate Issues¶
# Check certificate expiration
openssl x509 -in /etc/letsencrypt/live/yourdomain.com/cert.pem -noout -dates
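# Test renewal without changing the installed certificates
sudo certbot renew --dry-run
# Renew certificates that are close to expiry
sudo certbot renew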
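The expiry date printed by `openssl x509 -noout -dates` can be turned into a number suitable for alerting. This is a minimal sketch using the standard library's `ssl.cert_time_to_seconds`; the function name `days_until_expiry` and any alert threshold are assumptions for illustration.

```python
import ssl


def days_until_expiry(openssl_line: str, now: float) -> float:
    """Days between `now` (epoch seconds) and the certificate's notAfter date.

    `openssl_line` is the `notAfter=...` line printed by `openssl x509 -noout -dates`.
    """
    not_after = openssl_line.strip()
    if not_after.startswith("notAfter="):
        not_after = not_after[len("notAfter="):]
    # Parses timestamps such as "Jan  1 00:00:00 2025 GMT".
    expires_at = ssl.cert_time_to_seconds(not_after)
    return (expires_at - now) / 86400
```

A monitoring job could pass `time.time()` as `now` and raise an alert when the result drops below a chosen number of days.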
Emergency Procedures¶
Service Outage¶
- Check service status: docker compose -f docker-compose.prod.yml ps
- Check logs: docker logs rag_modulo-backend-1
- Restart services: docker compose -f docker-compose.prod.yml restart
- Failover: Switch to backup servers if available
- Notify stakeholders: Send incident notification
Data Corruption¶
- Stop services: docker compose -f docker-compose.prod.yml stop
- Restore from backup: Use latest known good backup
- Verify data integrity: Run data validation checks
- Restart services: docker compose -f docker-compose.prod.yml start
- Monitor closely: Watch for recurring issues