open-webui/iac/IT_REQUEST.md

# IT Request: OpenWebUI Horizontal Scaling Migration

## 📋 **Request Summary**
**Simple DNS endpoint change for OpenWebUI horizontal scaling implementation**

---

## 🎯 **Required Change**
**Update Microsoft Entra App Proxy backend configuration:**

| Field | Current Value | New Value |
|-------|---------------|-----------|
| Backend URL | `http://ai.ggai:8080` | `http://ai-scaled.ggai:8080` |

---

## ⏱️ **Impact & Timing**
- **Downtime**: < 30 seconds (DNS propagation)
- **Risk Level**: **Very Low** (simple DNS change)
- **Rollback Time**: < 30 seconds (revert DNS change)
- **User Impact**: Minimal (brief reconnection required)

---

## ✅ **Pre-Migration Validation**
**All testing completed successfully:**

```bash
# DNS Resolution Test ✅
nslookup ai-scaled.ggai
# Result: Points to load balancer (192.168.144.126, 192.168.144.75)

# HTTP Health Check Test ✅  
curl -I http://ai-scaled.ggai:8080/health
# Result: HTTP 200 OK

# Load Balancing Test ✅
# Result: Traffic distributed across multiple healthy instances
```

---

## 🎯 **Business Justification**

### **💰 Cost Impact: 40% SAVINGS ($1,644/year)**
- **Current Cost:** $340.22/month (over-provisioned single instance)
- **New Cost:** $203.15/month (right-sized horizontal scaling)
- **Monthly Savings:** $137.07/month = **$1,644.84 annual savings**

### **Current Limitations (Single Instance):**
- ❌ No redundancy - single point of failure
- ❌ No load distribution - performance bottlenecks
- ❌ No auto-scaling - cannot handle traffic spikes
- ❌ Manual intervention required for outages
- ❌ Over-provisioned resources (8 vCPU, 32 GB unused capacity)

### **New Benefits (Horizontal Scaling):**
- ✅ **High Availability**: 2-10 instances with automatic failover
- ✅ **Auto-Scaling**: Automatic scaling based on CPU utilization
- ✅ **Load Distribution**: Traffic balanced across healthy instances
- ✅ **Session Management**: Redis-based session sharing
- ✅ **Health Monitoring**: Automatic unhealthy instance removal
- ✅ **Resource Optimization**: Right-sized instances (50% CPU/Memory reduction)

---

## 🛡️ **Risk Mitigation**

### **Rollback Plan**
**If any issues occur, immediately revert:**
```
Backend URL: http://ai-scaled.ggai:8080 → http://ai.ggai:8080
Rollback Time: < 30 seconds
```

### **Monitoring During Migration**
- **Health Dashboard**: OpenWebUI-HorizontalScaling (CloudWatch)
- **Instance Count**: 2 healthy instances minimum
- **Response Time**: < 2 seconds average
- **Error Rate**: < 0.1%

---

## 📞 **Technical Contacts**
- **Primary**: [Your Name] - [Your Email]
- **Infrastructure Team**: Available for support during migration
- **Escalation**: [Manager Name] - [Manager Email]

---

## 📅 **Proposed Timeline**
1. **Pre-Change**: Validate all endpoints (COMPLETED)
2. **Change Window**: [Specify preferred maintenance window]
3. **Post-Change**: Monitor for 30 minutes
4. **Sign-off**: Confirm successful migration

---

## 🔍 **Technical Details**
### **Current Architecture:**
```
Entra App Proxy → ai.ggai:8080 → Single OpenWebUI Instance
```

### **New Architecture:**
```
Entra App Proxy → ai-scaled.ggai:8080 → Load Balancer → Multiple OpenWebUI Instances
                                                        ↓
                                                   Redis Session Sharing
```

### **Infrastructure Components:**
- **Load Balancer**: AWS Application Load Balancer (internal)
- **Compute**: 2-10 AWS Fargate instances (auto-scaling)
- **Session Store**: AWS ElastiCache Redis
- **Database**: Existing Aurora PostgreSQL (no changes)
- **Storage**: Existing EFS (no changes)

---

## ✅ **Approval & Sign-off**

**Requested by:** [Your Name]  
**Date:** [Current Date]  
**Priority:** Normal  

**IT Team Approval:**  
☐ Infrastructure Team Lead  
☐ Network Team Lead  
☐ Security Team (if required)  

**Execution Authorization:**  
☐ Change Management Approval  
☐ Business Owner Sign-off  

---

**Questions? Contact [Your Email] or [Your Phone]**
feat: add terraform project 2025-07-24 17:39:52 +00:00			`# IT Request: OpenWebUI Horizontal Scaling Migration`

			`## 📋 Request Summary`
			`Simple DNS endpoint change for OpenWebUI horizontal scaling implementation`

			`---`

			`## 🎯 Required Change`
			`Update Microsoft Entra App Proxy backend configuration:`

			`\| Field \| Current Value \| New Value \|`
			`\|-------\|---------------\|-----------\|`
			\| Backend URL \| `http://ai.ggai:8080` \| `http://ai-scaled.ggai:8080` \|

			`---`

			`## ⏱️ Impact & Timing`
			`- Downtime: < 30 seconds (DNS propagation)`
			`- Risk Level: Very Low (simple DNS change)`
			`- Rollback Time: < 30 seconds (revert DNS change)`
			`- User Impact: Minimal (brief reconnection required)`

			`---`

			`## ✅ Pre-Migration Validation`
			`All testing completed successfully:`

			```bash
			`# DNS Resolution Test ✅`
			`nslookup ai-scaled.ggai`
			`# Result: Points to load balancer (192.168.144.126, 192.168.144.75)`

			`# HTTP Health Check Test ✅`
			`curl -I http://ai-scaled.ggai:8080/health`
			`# Result: HTTP 200 OK`

			`# Load Balancing Test ✅`
			`# Result: Traffic distributed across multiple healthy instances`
			```

			`---`

			`## 🎯 Business Justification`

			`### 💰 Cost Impact: 40% SAVINGS ($1,644/year)`
			`- Current Cost: $340.22/month (over-provisioned single instance)`
			`- New Cost: $203.15/month (right-sized horizontal scaling)`
			`- Monthly Savings: $137.07/month = $1,644.84 annual savings`

			`### Current Limitations (Single Instance):`
			`- ❌ No redundancy - single point of failure`
			`- ❌ No load distribution - performance bottlenecks`
			`- ❌ No auto-scaling - cannot handle traffic spikes`
			`- ❌ Manual intervention required for outages`
			`- ❌ Over-provisioned resources (8 vCPU, 32 GB unused capacity)`

			`### New Benefits (Horizontal Scaling):`
			`- ✅ High Availability: 2-10 instances with automatic failover`
			`- ✅ Auto-Scaling: Automatic scaling based on CPU utilization`
			`- ✅ Load Distribution: Traffic balanced across healthy instances`
			`- ✅ Session Management: Redis-based session sharing`
			`- ✅ Health Monitoring: Automatic unhealthy instance removal`
			`- ✅ Resource Optimization: Right-sized instances (50% CPU/Memory reduction)`

			`---`

			`## 🛡️ Risk Mitigation`

			`### Rollback Plan`
			`If any issues occur, immediately revert:`
			```
			`Backend URL: http://ai-scaled.ggai:8080 → http://ai.ggai:8080`
			`Rollback Time: < 30 seconds`
			```

			`### Monitoring During Migration`
			`- Health Dashboard: OpenWebUI-HorizontalScaling (CloudWatch)`
			`- Instance Count: 2 healthy instances minimum`
			`- Response Time: < 2 seconds average`
			`- Error Rate: < 0.1%`

			`---`

			`## 📞 Technical Contacts`
			`- Primary: [Your Name] - [Your Email]`
			`- Infrastructure Team: Available for support during migration`
			`- Escalation: [Manager Name] - [Manager Email]`

			`---`

			`## 📅 Proposed Timeline`
			`1. Pre-Change: Validate all endpoints (COMPLETED)`
			`2. Change Window: [Specify preferred maintenance window]`
			`3. Post-Change: Monitor for 30 minutes`
			`4. Sign-off: Confirm successful migration`

			`---`

			`## 🔍 Technical Details`
			`### Current Architecture:`
			```
			`Entra App Proxy → ai.ggai:8080 → Single OpenWebUI Instance`
			```

			`### New Architecture:`
			```
			`Entra App Proxy → ai-scaled.ggai:8080 → Load Balancer → Multiple OpenWebUI Instances`
			`↓`
			`Redis Session Sharing`
			```

			`### Infrastructure Components:`
			`- Load Balancer: AWS Application Load Balancer (internal)`
			`- Compute: 2-10 AWS Fargate instances (auto-scaling)`
			`- Session Store: AWS ElastiCache Redis`
			`- Database: Existing Aurora PostgreSQL (no changes)`
			`- Storage: Existing EFS (no changes)`

			`---`

			`## ✅ Approval & Sign-off`

			`Requested by: [Your Name]`
			`Date: [Current Date]`
			`Priority: Normal`

			`IT Team Approval:`
			`☐ Infrastructure Team Lead`
			`☐ Network Team Lead`
			`☐ Security Team (if required)`

			`Execution Authorization:`
			`☐ Change Management Approval`
			`☐ Business Owner Sign-off`

			`---`

			`Questions? Contact [Your Email] or [Your Phone]`