# IT Request: OpenWebUI Horizontal Scaling Migration ## 📋 **Request Summary** **Simple DNS endpoint change for OpenWebUI horizontal scaling implementation** --- ## 🎯 **Required Change** **Update Microsoft Entra App Proxy backend configuration:** | Field | Current Value | New Value | |-------|---------------|-----------| | Backend URL | `http://ai.ggai:8080` | `http://ai-scaled.ggai:8080` | --- ## ⏱️ **Impact & Timing** - **Downtime**: < 30 seconds (DNS propagation) - **Risk Level**: **Very Low** (simple DNS change) - **Rollback Time**: < 30 seconds (revert DNS change) - **User Impact**: Minimal (brief reconnection required) --- ## ✅ **Pre-Migration Validation** **All testing completed successfully:** ```bash # DNS Resolution Test ✅ nslookup ai-scaled.ggai # Result: Points to load balancer (192.168.144.126, 192.168.144.75) # HTTP Health Check Test ✅ curl -I http://ai-scaled.ggai:8080/health # Result: HTTP 200 OK # Load Balancing Test ✅ # Result: Traffic distributed across multiple healthy instances ``` --- ## 🎯 **Business Justification** ### **💰 Cost Impact: 40% SAVINGS ($1,644/year)** - **Current Cost:** $340.22/month (over-provisioned single instance) - **New Cost:** $203.15/month (right-sized horizontal scaling) - **Monthly Savings:** $137.07/month = **$1,644.84 annual savings** ### **Current Limitations (Single Instance):** - ❌ No redundancy - single point of failure - ❌ No load distribution - performance bottlenecks - ❌ No auto-scaling - cannot handle traffic spikes - ❌ Manual intervention required for outages - ❌ Over-provisioned resources (8 vCPU, 32 GB unused capacity) ### **New Benefits (Horizontal Scaling):** - ✅ **High Availability**: 2-10 instances with automatic failover - ✅ **Auto-Scaling**: Automatic scaling based on CPU utilization - ✅ **Load Distribution**: Traffic balanced across healthy instances - ✅ **Session Management**: Redis-based session sharing - ✅ **Health Monitoring**: Automatic unhealthy instance removal - ✅ **Resource Optimization**: Right-sized instances (50% CPU/Memory reduction) --- ## 🛡️ **Risk Mitigation** ### **Rollback Plan** **If any issues occur, immediately revert:** ``` Backend URL: http://ai-scaled.ggai:8080 → http://ai.ggai:8080 Rollback Time: < 30 seconds ``` ### **Monitoring During Migration** - **Health Dashboard**: OpenWebUI-HorizontalScaling (CloudWatch) - **Instance Count**: 2 healthy instances minimum - **Response Time**: < 2 seconds average - **Error Rate**: < 0.1% --- ## 📞 **Technical Contacts** - **Primary**: [Your Name] - [Your Email] - **Infrastructure Team**: Available for support during migration - **Escalation**: [Manager Name] - [Manager Email] --- ## 📅 **Proposed Timeline** 1. **Pre-Change**: Validate all endpoints (COMPLETED) 2. **Change Window**: [Specify preferred maintenance window] 3. **Post-Change**: Monitor for 30 minutes 4. **Sign-off**: Confirm successful migration --- ## 🔍 **Technical Details** ### **Current Architecture:** ``` Entra App Proxy → ai.ggai:8080 → Single OpenWebUI Instance ``` ### **New Architecture:** ``` Entra App Proxy → ai-scaled.ggai:8080 → Load Balancer → Multiple OpenWebUI Instances ↓ Redis Session Sharing ``` ### **Infrastructure Components:** - **Load Balancer**: AWS Application Load Balancer (internal) - **Compute**: 2-10 AWS Fargate instances (auto-scaling) - **Session Store**: AWS ElastiCache Redis - **Database**: Existing Aurora PostgreSQL (no changes) - **Storage**: Existing EFS (no changes) --- ## ✅ **Approval & Sign-off** **Requested by:** [Your Name] **Date:** [Current Date] **Priority:** Normal **IT Team Approval:** ☐ Infrastructure Team Lead ☐ Network Team Lead ☐ Security Team (if required) **Execution Authorization:** ☐ Change Management Approval ☐ Business Owner Sign-off --- **Questions? Contact [Your Email] or [Your Phone]**