ti.sil/open-webui

mirror of https://github.com/open-webui/open-webui.git synced 2025-12-14 21:35:19 +00:00

LoiTra 3210b2fa45

feat: add terraform project

2025-12-08 11:27:59 +07:00


IT Request: OpenWebUI Horizontal Scaling Migration

📋 Request Summary

Single-field hostname change: point the Microsoft Entra App Proxy backend at the new load-balanced OpenWebUI endpoint to enable horizontal scaling

🎯 Required Change

Update Microsoft Entra App Proxy backend configuration:

| Field | Current Value | New Value |
|-------|---------------|-----------|
| Backend URL | `http://ai.ggai:8080` | `http://ai-scaled.ggai:8080` |

⏱️ Impact & Timing

Downtime: < 30 seconds (while the App Proxy switches to the new backend URL)
Risk Level: Very Low (single-field backend URL change)
Rollback Time: < 30 seconds (revert the backend URL)
User Impact: Minimal (brief reconnection required)

✅ Pre-Migration Validation

All testing completed successfully:

# DNS Resolution Test ✅
nslookup ai-scaled.ggai
# Result: Points to load balancer (192.168.144.126, 192.168.144.75)

# HTTP Health Check Test ✅  
curl -I http://ai-scaled.ggai:8080/health
# Result: HTTP 200 OK

# Load Balancing Test ✅
# Result: Traffic distributed across multiple healthy instances
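
The load balancing test above lists a result but no command. One way to gather comparable evidence is sketched below, assuming curl is available on the test host. The function names and the request count are illustrative, not part of the original test run. Note that curl's `%{remote_ip}` reports the load balancer address that answered, not the OpenWebUI instance behind it, so this only shows that both load balancer addresses are serving traffic.

```shell
# Sketch only: tally which load balancer address answers each health check.
# tally_ips and probe_health are hypothetical helper names.

# Read one IP per line on stdin and print "<count> <ip>" per distinct IP.
tally_ips() {
  sort | uniq -c | awk '{print $1, $2}'
}

# Send N health checks to URL and print the IP curl connected to each time.
probe_health() (
  url="$1"; n="$2"; i=0
  while [ "$i" -lt "$n" ]; do
    curl -s -o /dev/null --max-time 5 -w '%{remote_ip}\n' "$url"
    i=$((i + 1))
  done
)

# Example (not run automatically; 20 requests is an arbitrary choice):
#   probe_health http://ai-scaled.ggai:8080/health 20 | tally_ips
```

Confirming distribution across individual instances would need something the instances expose themselves (for example per-instance request counts on the CloudWatch dashboard), which this sketch does not assume.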

🎯 Business Justification

💰 Cost Impact: 40% SAVINGS ($1,644/year)

Current Cost: $340.22/month (over-provisioned single instance)
New Cost: $203.15/month (right-sized horizontal scaling)
Monthly Savings: $137.07/month = $1,644.84 annual savings
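
The figures above can be re-derived from the two monthly costs. The helper below is a small sketch (the function name is illustrative) that recomputes the monthly saving, the annual saving, and the percentage.

```shell
# Sketch: recompute the savings from the current and new monthly costs.
# savings_report is a hypothetical helper name.
savings_report() {
  awk -v cur="$1" -v new="$2" 'BEGIN {
    monthly = cur - new
    printf "monthly=%.2f annual=%.2f percent=%.1f\n", monthly, monthly * 12, 100 * monthly / cur
  }'
}

savings_report 340.22 203.15
# prints: monthly=137.07 annual=1644.84 percent=40.3
```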

Current Limitations (Single Instance):

❌ No redundancy - single point of failure
❌ No load distribution - performance bottlenecks
❌ No auto-scaling - cannot handle traffic spikes
❌ Manual intervention required for outages
❌ Over-provisioned resources (8 vCPU, 32 GB unused capacity)

New Benefits (Horizontal Scaling):

✅ High Availability: 2-10 instances with automatic failover
✅ Auto-Scaling: Automatic scaling based on CPU utilization
✅ Load Distribution: Traffic balanced across healthy instances
✅ Session Management: Redis-based session sharing
✅ Health Monitoring: Automatic unhealthy instance removal
✅ Resource Optimization: Right-sized instances (50% CPU/Memory reduction)

🛡️ Risk Mitigation

Rollback Plan

If any issues occur, immediately revert:

Backend URL: http://ai-scaled.ggai:8080 → http://ai.ggai:8080
Rollback Time: < 30 seconds

Monitoring During Migration

Health Dashboard: OpenWebUI-HorizontalScaling (CloudWatch)
Instance Count: 2 healthy instances minimum
Response Time: < 2 seconds average
Error Rate: < 0.1%
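
The three targets above can be turned into a simple go/no-go check. The sketch below assumes the readings are taken by hand from the CloudWatch dashboard and passed in as arguments; the function name is illustrative and no CloudWatch API call is made.

```shell
# Sketch: succeed only when all three readings meet the targets listed above.
# Arguments: healthy instance count, average response time in seconds,
# error rate in percent. within_targets is a hypothetical helper name.
within_targets() {
  awk -v h="$1" -v r="$2" -v e="$3" 'BEGIN {
    ok = (h >= 2) && (r < 2) && (e < 0.1)
    exit (ok ? 0 : 1)
  }'
}

# Example: 3 healthy instances, 1.4 s average response, 0.05% errors.
if within_targets 3 1.4 0.05; then
  echo "within targets"
else
  echo "outside targets: consider rollback"
fi
```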

📞 Technical Contacts

Primary: [Your Name] - [Your Email]
Infrastructure Team: Available for support during migration
Escalation: [Manager Name] - [Manager Email]

📅 Proposed Timeline

Pre-Change: Validate all endpoints (COMPLETED)
Change Window: [Specify preferred maintenance window]
Post-Change: Monitor for 30 minutes
Sign-off: Confirm successful migration
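
For the 30-minute post-change monitoring step, a polling loop along these lines could complement the dashboard. This is a sketch under assumptions: the function names and the 30-second interval are illustrative, and the probe reuses the `/health` endpoint from the pre-migration validation.

```shell
# Sketch: run a probe command COUNT times, INTERVAL seconds apart,
# and print how many probes failed. watch_health is a hypothetical name.
watch_health() (
  probe="$1"; count="$2"; interval="$3"
  failures=0; i=0
  while [ "$i" -lt "$count" ]; do
    "$probe" || failures=$((failures + 1))
    sleep "$interval"
    i=$((i + 1))
  done
  echo "$failures"
)

# Probe: succeeds when the health endpoint returns HTTP 200.
probe_scaled() {
  [ "$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' http://ai-scaled.ggai:8080/health)" = "200" ]
}

# Example (not run automatically): 60 probes at 30-second intervals
# covers the 30-minute window.
#   watch_health probe_scaled 60 30
```

Passing the probe as a command name keeps the loop independent of the endpoint, so the same loop can watch the old URL after a rollback.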

🔍 Technical Details

Current Architecture:

Entra App Proxy → ai.ggai:8080 → Single OpenWebUI Instance

New Architecture:

Entra App Proxy → ai-scaled.ggai:8080 → Load Balancer → Multiple OpenWebUI Instances
                                                        ↓
                                                   Redis Session Sharing

Infrastructure Components:

Load Balancer: AWS Application Load Balancer (internal)
Compute: 2-10 OpenWebUI instances running on AWS Fargate (auto-scaling)
Session Store: AWS ElastiCache Redis
Database: Existing Aurora PostgreSQL (no changes)
Storage: Existing EFS (no changes)

✅ Approval & Sign-off

Requested by: [Your Name]
Date: [Current Date]
Priority: Normal

IT Team Approval:
☐ Infrastructure Team Lead
☐ Network Team Lead
☐ Security Team (if required)

Execution Authorization:
☐ Change Management Approval
☐ Business Owner Sign-off

Questions? Contact [Your Email] or [Your Phone]
