GKE Cluster Analysis: camarades¶
Executive Summary¶
The camarades GKE cluster is significantly overprovisioned with 50 active cost optimization recommendations from GCP Recommender API. Workloads are requesting 10-100x more CPU resources than they actually use, causing the cluster to maintain 10 nodes when 3-4 would suffice.
Estimated savings: $150-200/month (a 34-46% reduction from the ~$435/month baseline) by right-sizing workload resource requests and enabling autoscaling.
Key Findings:
- Current CPU utilization: 1-4% across all nodes
- Production API requesting 1500m CPU, using 20m (98.7% waste)
- 168 pods distributed across 10 nodes (could fit on 3-4 nodes)
- 7/10 nodes already using cost-effective preemptible instances
Cluster Configuration¶
Overview¶
- Cluster Name: camarades
- Project: camarades-net
- Location: europe-west2-a (London)
- Status: RUNNING ✅
- Tier: STANDARD
- Created: January 23, 2021
- Control Plane Version: 1.33.5-gke.1125000
- Node Version: 1.33.5-gke.1080000 (some pools at 1.33.5-gke.1125000)
- Total Nodes: 10
- Total Pods: 168 (~17 pods/node average)
Network Configuration¶
- Public Endpoint: 34.89.63.71
- Private Endpoint: 10.154.15.240
- Network: default
- Subnetwork: default
- Cluster IP Range: 10.68.0.0/14
- Services IP Range: 10.71.240.0/20
- Pod CIDR Size: /24 per node
Security Features¶
- Shielded Nodes: ✅ Enabled
- Workload Identity: ✅ Enabled (camarades-net.svc.id.goog)
- Binary Authorization: Configured
- Master Authorized Networks: GCP Public CIDRs access enabled
Maintenance¶
- Daily Window: 03:00 UTC (4-hour duration)
- Auto-upgrade: ✅ Enabled on all pools
- Auto-repair: ✅ Enabled on all pools
Node Pools¶
pool-1 (4 nodes) - Compute Optimized¶
| Specification | Value |
|---|---|
| Machine Type | c2-standard-4 |
| vCPUs | 4 per node (16 total) |
| Memory | 16 GB per node (64 GB total) |
| Disk | 100 GB pd-standard |
| Preemptible | ✅ Yes (~70% cost savings) |
| Location | europe-west2-a |
| Status | RUNNING ✅ |
Current Utilization:
- CPU: 1-2% per node
- Memory: 11-42% per node (highest: 5.6 GB / 16 GB)
main-pool (3 nodes) - General Purpose¶
| Specification | Value |
|---|---|
| Machine Type | e2-standard-2 |
| vCPUs | 2 per node (6 total) |
| Memory | 8 GB per node (24 GB total) |
| Disk | 100 GB pd-standard |
| Preemptible | ✅ Yes (~70% cost savings) |
| Location | europe-west2-a |
| Status | RUNNING ✅ |
Current Utilization:
- CPU: 3-4% per node
- Memory: 15-24% per node (highest: 1.9 GB / 8 GB)
pool-2 (3 nodes) - High CPU¶
| Specification | Value |
|---|---|
| Machine Type | e2-highcpu-4 |
| vCPUs | 4 per node (12 total) |
| Memory | 4 GB per node (12 GB total) |
| Disk | 100 GB pd-standard |
| Preemptible | ❌ No (standard pricing) |
| Location | europe-west2-a |
| Status | RUNNING ✅ |
Current Utilization:
- CPU: 2-3% per node
- Memory: 45-62% per node (highest: 2.5 GB / 4 GB)
Note: This pool has the highest memory utilization and should be kept as-is.
Resource Utilization Analysis¶
Cluster-Wide Summary¶
Total Capacity:
- 34 vCPUs
- 104 GB RAM
- 10 nodes
Actual Usage (Real-time from kubectl top nodes):
```
NODE                                    CPU(cores)   CPU(%)   MEMORY(bytes)   MEMORY(%)
gke-camarades-main-pool-0e899852-0gfa   93m          4%       1471Mi          24%
gke-camarades-main-pool-0e899852-5x55   73m          3%       928Mi           15%
gke-camarades-main-pool-0e899852-dnck   86m          4%       1458Mi          24%
gke-camarades-pool-1-97eaf9ea-1lr6      53m          1%       1539Mi          11%
gke-camarades-pool-1-97eaf9ea-dpxn      108m         2%       5673Mi          42%
gke-camarades-pool-1-97eaf9ea-gk9e      49m          1%       2061Mi          15%
gke-camarades-pool-1-97eaf9ea-ydao      86m          2%       3459Mi          26%
gke-camarades-pool-2-d9833c38-iiwa      135m         3%       1757Mi          62%
gke-camarades-pool-2-d9833c38-w9o4      149m         3%       1518Mi          54%
gke-camarades-pool-2-d9833c38-xg1q      94m          2%       1286Mi          45%
```
Key Metrics:
- Average CPU Usage: 1-4% per node
- Average Memory Usage: 11-62% per node (highly variable)
- Total CPU Usage: ~926m out of 34,000m available (2.7%)
- Total Memory Usage: ~20 GB out of 104 GB available (19%)
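As a sanity check, the totals can be recomputed from the `kubectl top nodes` snapshot above (rows embedded verbatim; `awk` coerces the `m`/`Mi` suffixes to numbers). The memory sum works out to 21,150 Mi, i.e. roughly 20 GB:

```shell
totals=$(awk '{cpu += $2+0; mem += $4+0} END {printf "CPU: %dm, Memory: %dMi", cpu, mem}' <<'EOF'
gke-camarades-main-pool-0e899852-0gfa 93m 4% 1471Mi 24%
gke-camarades-main-pool-0e899852-5x55 73m 3% 928Mi 15%
gke-camarades-main-pool-0e899852-dnck 86m 4% 1458Mi 24%
gke-camarades-pool-1-97eaf9ea-1lr6 53m 1% 1539Mi 11%
gke-camarades-pool-1-97eaf9ea-dpxn 108m 2% 5673Mi 42%
gke-camarades-pool-1-97eaf9ea-gk9e 49m 1% 2061Mi 15%
gke-camarades-pool-1-97eaf9ea-ydao 86m 2% 3459Mi 26%
gke-camarades-pool-2-d9833c38-iiwa 135m 3% 1757Mi 62%
gke-camarades-pool-2-d9833c38-w9o4 149m 3% 1518Mi 54%
gke-camarades-pool-2-d9833c38-xg1q 94m 2% 1286Mi 45%
EOF
)
echo "$totals"   # CPU: 926m, Memory: 21150Mi
```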
Utilization by Pool¶
| Pool | CPU Utilization | Memory Utilization | Assessment |
|---|---|---|---|
| pool-1 | 1-2% | 11-42% | 🔴 Massively underutilized |
| main-pool | 3-4% | 15-24% | 🔴 Significantly underutilized |
| pool-2 | 2-3% | 45-62% | 🟡 CPU underutilized, memory OK |
GCP Recommender Analysis¶
Recommendation Summary¶
- Date: 2025-11-11
- API: Recommender API (`google.container.DiagnosisRecommender`)
| Category | Count | Priority |
|---|---|---|
| COST Optimization | 50 | 🔥 High |
| RELIABILITY | 33 | 🟡 Medium |
| API Deprecation | 1 | ⚠️ Critical |
Root Cause: Massively Over-Requested Resources¶
Kubernetes pods are requesting 10-100x more CPU and memory than they actually consume. This causes:
- Scheduler Perspective: Nodes appear "full" based on resource requests
- GKE Response: Provisions more nodes to accommodate new pods
- Reality: Nodes are 96-99% idle while appearing fully allocated
- Cost Impact: Paying for 10 nodes to run workloads that fit on 3-4 nodes
Top Overprovisioned Workloads¶
1. Production API (jx-production/syrf-api) 🔴 CRITICAL¶
Container: syrf-api
| Metric | Current Request | Actual Usage | Recommended | Waste |
|---|---|---|---|---|
| CPU | 1500m (1.5 cores) | ~20m | 20m | 98.7% |
| Memory | 3 GiB (3221 MiB) | ~795 MiB | 795 MiB | 75% |
Impact Analysis:
- Each pod reserves 1.5 CPU cores but uses only 0.02 cores
- This single deployment blocks resources equivalent to ~4 e2-standard-2 nodes
- Requesting 75x more CPU than needed
Recommended Configuration:
```yaml
resources:
  requests:
    cpu: 20m        # was: 1500m
    memory: 795Mi   # was: 3Gi
  limits:
    cpu: 100m       # allow 5x burst
    memory: 795Mi   # limit equals request to cap memory at the request
```
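The waste figures in the table can be reproduced with a quick shell check (inputs taken from the table; the memory figure in the table is rounded — the exact value is 75.3%):

```shell
# CPU: requested 1500m, actual usage ~20m
cpu_waste=$(awk 'BEGIN { printf "%.1f", (1500 - 20) / 1500 * 100 }')
# Memory: requested 3221Mi, actual usage ~795Mi
mem_waste=$(awk 'BEGIN { printf "%.1f", (3221 - 795) / 3221 * 100 }')
echo "CPU waste: ${cpu_waste}%  Memory waste: ${mem_waste}%"
```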
2. Staging/Dev API (jx/syrf-api) 🔴 HIGH¶
Container: syrf-api
| Metric | Current Request | Actual Usage | Recommended | Waste |
|---|---|---|---|---|
| CPU | 200m | ~15m | 15m | 92.5% |
| Memory | 3 GiB (3221 MiB) | ~239 MiB | 239 MiB | 92% |
Impact Analysis:
- Requesting 13x more CPU than needed
- Requesting 13x more memory than needed
- Could free up significant node capacity
Recommended Configuration:
```yaml
resources:
  requests:
    cpu: 15m        # was: 200m
    memory: 239Mi   # was: 3Gi
  limits:
    cpu: 50m        # allow 3x burst
    memory: 239Mi
```
3. Staging Web (jx-staging/syrf-web) 🔴 HIGH¶
Container: syrf-web
| Metric | Current Request | Actual Usage | Recommended | Waste |
|---|---|---|---|---|
| CPU | 200m | ~2m | 2m | 99% |
| Memory | 128 MiB | ~7 MiB | 7 MiB | 94.5% |
Impact Analysis:
- Requesting 100x more CPU than needed (most extreme case)
- Requesting 18x more memory than needed
- Angular static site with NGINX has minimal resource needs
Recommended Configuration:
```yaml
resources:
  requests:
    cpu: 2m       # was: 200m
    memory: 7Mi   # was: 128Mi
  limits:
    cpu: 10m      # allow 5x burst
    memory: 14Mi  # 2x request
```
4. Health Check (kuberhealthy/check-reaper) 🟡 MEDIUM¶
Container: check-reaper
| Metric | Current Request | Actual Usage | Recommended | Waste |
|---|---|---|---|---|
| CPU | 20m | ~4m | 4m | 80% |
| Memory | 100 MiB | ~54 MiB | 54 MiB | 46% |
Recommended Configuration:
```yaml
resources:
  requests:
    cpu: 4m        # was: 20m
    memory: 54Mi   # was: 100Mi
  limits:
    cpu: 20m       # keep existing burst capacity
    memory: 54Mi
```
Root Cause Analysis¶
Why Is This Happening?¶
1. Default Resource Requests Never Tuned¶
- Resources were set during initial deployment (likely conservative estimates)
- No tuning performed based on actual production usage
- Development and production use same resource values
2. No Automatic Right-Sizing¶
- Vertical Pod Autoscaler (VPA) not enabled
- Manual review of resource usage not part of workflow
- No monitoring alerts for over-provisioned workloads
3. Kubernetes Scheduler Uses Requests, Not Actual Usage¶
- Scheduler makes placement decisions based on the `requests` field
- Actual usage (1-4% CPU) is invisible to scheduling logic
- Nodes appear "full" when they're 96% idle
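The effect is easy to quantify. A sketch with illustrative numbers — the allocatable-CPU figure is an assumption for a 4-vCPU node after system reservations; the pod figures come from the syrf-api analysis above:

```shell
node_allocatable_m=3920   # assumed allocatable millicores on a 4-vCPU node
pod_request_m=1500        # syrf-api CPU request
pod_usage_m=20            # syrf-api actual CPU usage
echo "Pods the scheduler will pack per node: $(( node_allocatable_m / pod_request_m ))"
echo "Pods the node could actually sustain:  $(( node_allocatable_m / pod_usage_m ))"
```

With requests as written, the scheduler packs 2 pods per node onto hardware that could sustain nearly 200 at observed usage.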
4. No Cost Optimization Feedback Loop¶
- Team unaware of GCP Recommender findings (API wasn't enabled)
- No regular review of cluster efficiency metrics
- Cost monitoring not connected to resource utilization
Cascading Effects¶
```
Over-Requested Resources
          ↓
Scheduler Thinks Nodes Full
          ↓
GKE Maintains 10 Nodes
          ↓
1-4% CPU Utilization
          ↓
Unnecessary Cloud Costs
```
Cost Impact Analysis¶
Current Monthly Costs (Estimate)¶
| Pool | Nodes | Machine Type | Pricing Model | Cost/Node/Month | Total/Month |
|---|---|---|---|---|---|
| pool-1 | 4 | c2-standard-4 | Preemptible | ~$30 | ~$120 |
| main-pool | 3 | e2-standard-2 | Preemptible | ~$15 | ~$45 |
| pool-2 | 3 | e2-highcpu-4 | Standard | ~$90 | ~$270 |
| **Total** | **10** | — | — | — | **~$435** |
Note: Preemptible instances already provide ~70% savings vs standard pricing. Without preemptible, cost would be ~$900/month.
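The table's total is consistent with the per-pool estimates:

```shell
# 4 × $30 (pool-1) + 3 × $15 (main-pool) + 3 × $90 (pool-2)
total=$(( 4 * 30 + 3 * 15 + 3 * 90 ))
echo "Estimated monthly cost: \$${total}"   # Estimated monthly cost: $435
```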
Projected Costs After Optimization¶
Scenario 1: Right-size + Manual Autoscale (Conservative)¶
Approach: Update resource requests, manually reduce node counts
| Pool | Current Nodes | Target Nodes | Monthly Savings |
|---|---|---|---|
| pool-1 | 4 | 2 | ~$60 |
| main-pool | 3 | 2 | ~$15 |
| pool-2 | 3 | 3 | $0 (keep as-is) |
| Total | 10 | 7 | ~$75/month (17% reduction) |
New Monthly Cost: ~$360/month
Scenario 2: Right-size + Cluster Autoscaler (Recommended)¶
Approach: Update resource requests, enable cluster autoscaler with min=1-2 per pool
| Pool | Current | Min Nodes | Max Nodes | Expected Steady State | Savings |
|---|---|---|---|---|---|
| pool-1 | 4 | 1 | 4 | 2 | ~$60/month |
| main-pool | 3 | 1 | 3 | 2 | ~$15/month |
| pool-2 | 3 | 1 | 3 | 2-3 | ~$0-90/month |
| Total | 10 | 3 | 10 | 6-7 | ~$75-165/month (17-38%) |
New Monthly Cost: ~$270-360/month
Scenario 3: Right-size + Autoscaler + VPA (Aggressive)¶
Approach: Full automation with VPA continuously optimizing requests
| Metric | Value |
|---|---|
| Expected Node Count | 4-5 nodes |
| Monthly Savings | ~$150-200/month |
| Reduction | 34-46% |
New Monthly Cost: ~$235-285/month
Cost Savings Summary¶
| Scenario | Action Required | Monthly Savings | One-Time Effort | Ongoing Maintenance |
|---|---|---|---|---|
| Do Nothing | None | $0 | 0 hours | High (manual scaling) |
| Manual Right-size | Update manifests | $75 | 2-4 hours | High (manual monitoring) |
| + Autoscaler | Enable autoscaling | $75-165 | 3-5 hours | Medium (occasional tuning) |
| + VPA | Enable VPA | $150-200 | 4-6 hours | Low (automated) |
Recommendation: Scenario 3 (Full Automation) provides best long-term value.
Recommendations¶
🔥 Priority 1: Right-Size Critical Workloads (Immediate)¶
- Timeline: Week 1
- Effort: 2-4 hours
- Risk: Low (testing in staging first)
- Impact: High (enables all other optimizations)
Step 1: Update Production API (Highest Impact)¶
File: Kubernetes manifest for jx-production/syrf-api
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: syrf-api
  namespace: jx-production
spec:
  template:
    spec:
      containers:
        - name: syrf-api
          resources:
            requests:
              cpu: 20m        # was: 1500m (-98.7%)
              memory: 795Mi   # was: 3Gi (-74%)
            limits:
              cpu: 100m       # allow 5x burst for peak loads
              memory: 795Mi   # limit equals request to cap memory at the request
```
Expected Impact:
- Frees up 1.48 CPU cores per pod
- Could eliminate 1-2 nodes from cluster
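Assuming the updated manifest lives in a file (filename hypothetical), the change can be rolled out and watched with standard kubectl:

```shell
kubectl apply -f syrf-api-deployment.yaml   # hypothetical filename
kubectl rollout status deployment/syrf-api -n jx-production
```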
Step 2: Update Staging/Dev Workloads¶
Files:
- `jx/syrf-api` deployment
- `jx-staging/syrf-web` deployment

```yaml
# jx/syrf-api
resources:
  requests:
    cpu: 15m        # was: 200m
    memory: 239Mi   # was: 3Gi
  limits:
    cpu: 50m
    memory: 239Mi
```

```yaml
# jx-staging/syrf-web
resources:
  requests:
    cpu: 2m       # was: 200m
    memory: 7Mi   # was: 128Mi
  limits:
    cpu: 10m
    memory: 14Mi
```
Step 3: Testing Protocol¶
1. Apply changes to staging first
2. Monitor for 24-48 hours:
   ```shell
   # Watch pod metrics
   kubectl top pods -n jx-staging --watch

   # Check for OOMKilled or CPU throttling
   kubectl get events -n jx-staging --watch
   ```
3. Verify application health:
   - Check application logs for errors
   - Run integration tests
   - Monitor response times
4. If stable, promote to production
5. Monitor production for 48 hours:
   - Same monitoring as staging
   - Be ready to roll back if issues arise
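A quick way to spot-check for OOM-killed containers after a change (standard `kubectl` jsonpath output; adjust the namespace as needed):

```shell
kubectl get pods -n jx-staging \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}{end}' \
  | grep OOMKilled
```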
🎯 Priority 2: Enable Cluster Autoscaler (Short-term)¶
- Timeline: Week 2
- Effort: 1-2 hours
- Risk: Low (requires right-sizing first)
- Impact: Medium (automatic cost savings)
Enable Autoscaling on Node Pools¶
```shell
# pool-1: Scale 1-4 nodes
gcloud container clusters update camarades \
  --enable-autoscaling \
  --node-pool=pool-1 \
  --min-nodes=1 \
  --max-nodes=4 \
  --location=europe-west2-a \
  --project=camarades-net

# main-pool: Scale 1-3 nodes
gcloud container clusters update camarades \
  --enable-autoscaling \
  --node-pool=main-pool \
  --min-nodes=1 \
  --max-nodes=3 \
  --location=europe-west2-a \
  --project=camarades-net

# pool-2: Scale 1-3 nodes
gcloud container clusters update camarades \
  --enable-autoscaling \
  --node-pool=pool-2 \
  --min-nodes=1 \
  --max-nodes=3 \
  --location=europe-west2-a \
  --project=camarades-net
```
Expected Behavior¶
After right-sizing resource requests:
- Cluster autoscaler recognizes excess capacity
- Begins draining underutilized nodes
- Scales down to minimum nodes (3 total)
- Scales up automatically when pods are pending
- Scales down during off-peak hours
Monitoring Autoscaling¶
```shell
# View autoscaler status
kubectl get configmap cluster-autoscaler-status \
  -n kube-system \
  -o yaml

# View autoscaler decisions. GKE's cluster autoscaler is managed on the
# control plane (there is no kube-system deployment to read logs from),
# so its visibility events live in Cloud Logging:
gcloud logging read \
  'logName:"container.googleapis.com%2Fcluster-autoscaler-visibility"' \
  --project=camarades-net \
  --limit=20

# Check node pool sizes
gcloud container node-pools list \
  --cluster=camarades \
  --location=europe-west2-a
```
🤖 Priority 3: Enable Vertical Pod Autoscaler (Short-term)¶
- Timeline: Week 2-3
- Effort: 2-3 hours
- Risk: Low (recommendation mode first)
- Impact: High (automated right-sizing)
Enable VPA on Cluster¶
```shell
gcloud container clusters update camarades \
  --enable-vertical-pod-autoscaling \
  --location=europe-west2-a \
  --project=camarades-net
```
Apply VPA to Key Workloads¶
Start with recommendation mode (doesn't auto-apply changes):
```yaml
# vpa-syrf-api-production.yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: syrf-api-vpa
  namespace: jx-production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: syrf-api
  updatePolicy:
    updateMode: "Off"   # Start with recommendations only
  resourcePolicy:
    containerPolicies:
      - containerName: syrf-api
        minAllowed:
          cpu: 10m
          memory: 100Mi
        maxAllowed:
          cpu: 500m
          memory: 2Gi
```
Apply VPAs:
```shell
kubectl apply -f vpa-syrf-api-production.yaml
kubectl apply -f vpa-syrf-api-staging.yaml
kubectl apply -f vpa-syrf-web-staging.yaml
```
Review VPA Recommendations¶
After 24 hours, check recommendations:
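The recommendations can be read with standard kubectl (object names match the VPA defined above):

```shell
kubectl get vpa -n jx-production
kubectl describe vpa syrf-api-vpa -n jx-production
```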
Enable Auto-Updates (After Testing)¶
Once confident in recommendations, switch to auto mode:
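One way to flip the mode without editing files — a `kubectl patch` mirroring the rollback command later in this document, with `"Auto"` in place of `"Off"`:

```shell
kubectl patch vpa syrf-api-vpa -n jx-production \
  --type='json' \
  -p='[{"op": "replace", "path": "/spec/updatePolicy/updateMode", "value": "Auto"}]'
```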
Note: VPA will evict and recreate pods to apply new resource requests. Consider using updateMode: "Recreate" for more control.
📊 Priority 4: Implement Monitoring & Alerting (Ongoing)¶
- Timeline: Week 3-4
- Effort: 3-4 hours setup, then ongoing monitoring
- Risk: None (observability only)
- Impact: Medium (prevents future over-provisioning)
Enable GKE Cost Allocation¶
```shell
gcloud container clusters update camarades \
  --enable-cost-allocation \
  --location=europe-west2-a \
  --project=camarades-net
```
Benefits:
- Track costs per namespace, workload, and label
- Identify cost trends over time
- Export to BigQuery for analysis
Set Up Monitoring Dashboards¶
Create custom dashboard in Google Cloud Console:
Metrics to Track:
- Node CPU/Memory utilization (target: 40-70%)
- Pod CPU/Memory requests vs usage
- Cluster autoscaler events
- VPA recommendation application rate
- Cost per namespace
Configure Alerts¶
Alert 1: Low CPU Utilization
```yaml
# Alert when node CPU < 20% for 1 hour
condition:
  metric: kubernetes.io/node/cpu/allocatable_utilization
  threshold: 0.2
  duration: 3600s
  comparison: LESS_THAN
```
Alert 2: Pending Pods (Autoscaler Issue)
```yaml
# Alert when pods pending > 5 minutes
condition:
  metric: kubernetes.io/pod/status/phase
  value: "Pending"
  duration: 300s
```
Weekly Review Process¶
- Review GCP Recommender output (can be automated):
  ```shell
  gcloud recommender recommendations list \
    --project=camarades-net \
    --location=europe-west2-a \
    --recommender=google.container.DiagnosisRecommender \
    --filter="primaryImpact.category=COST AND stateInfo.state=ACTIVE"
  ```
- Check VPA recommendations (if not in auto mode)
- Review cost trends in GCP Cost Explorer
- Adjust autoscaler thresholds if needed
Implementation Roadmap¶
Week 1: Immediate Actions (Priority 1)¶
- Update production API resource requests (jx-production/syrf-api)
- Update staging API resource requests (jx/syrf-api)
- Update staging web resource requests (jx-staging/syrf-web)
- Monitor for 48 hours
- Apply to remaining overprovisioned workloads (top 10)
Deliverables: Updated Kubernetes manifests, monitoring evidence
Week 2: Enable Autoscaling (Priority 2)¶
- Enable cluster autoscaler on pool-1
- Enable cluster autoscaler on main-pool
- Enable cluster autoscaler on pool-2
- Monitor scale-down events
- Verify application stability during scaling
Deliverables: Autoscaling enabled, initial cost savings visible
Week 3: Enable VPA (Priority 3)¶
- Enable VPA on cluster
- Deploy VPA objects in "Off" mode (recommendations only)
- Review recommendations after 24-48 hours
- Switch to "Auto" mode for non-critical workloads
- Monitor VPA behavior for 1 week
Deliverables: VPA enabled and actively managing resources
Week 4: Monitoring & Optimization (Priority 4)¶
- Enable GKE Cost Allocation
- Create monitoring dashboards
- Configure alerting rules
- Document weekly review process
- Train team on new monitoring tools
Deliverables: Monitoring infrastructure, runbook documentation
Ongoing: Continuous Optimization¶
- Weekly review of GCP Recommender
- Monthly cost analysis and trend review
- Quarterly node pool optimization review
- Update this document with new findings
Critical Warnings & Risks¶
⚠️ API Deprecation (Kubernetes 1.25)¶
Issue: Cluster uses APIs deprecated in Kubernetes 1.25+
Action Required:
- Review deprecated APIs: https://cloud.google.com/kubernetes-engine/docs/deprecations/apis-1-25
- Identify affected manifests:
- Update manifests before upgrading control plane
Timeline: Before next major GKE upgrade
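One way to identify affected manifests (step 2 above) is to query the API server's built-in deprecated-API counter, a standard Kubernetes metric:

```shell
kubectl get --raw /metrics | grep apiserver_requested_deprecated_apis
```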
⚠️ Memory Limits Best Practice¶
Current State: Some workloads have no memory limits set
Risk: Pods can OOM the node, affecting other workloads
Recommendation:
- Set limits equal to requests (for both CPU and memory) to achieve Guaranteed QoS
- For Burstable QoS, set limits 1.5-2x requests
- Never run without limits in production
Example:
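A minimal sketch of the pattern, reusing the right-sized syrf-api values from earlier in this document:

```yaml
resources:
  requests:
    cpu: 20m
    memory: 795Mi
  limits:
    cpu: 20m        # limits equal to requests on every resource → Guaranteed QoS
    memory: 795Mi
```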
⚠️ Preemptible Node Disruption¶
Current State: 7/10 nodes use preemptible instances
Risk: Google can terminate preemptible VMs with 30s notice
Mitigation:
- Ensure applications handle graceful shutdowns
- Use PodDisruptionBudgets for critical workloads
- Consider mixing preemptible and standard nodes for high-availability
Example PDB:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: syrf-api-pdb
  namespace: jx-production
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: syrf-api
```
Rollback Plans¶
If Resource Changes Cause Issues¶
Symptoms of Under-Provisioning¶
- OOMKilled pods (check `kubectl get events`)
- CPU throttling warnings in logs
- Increased response times
- 502/503 errors
Quick Rollback Steps¶
- Revert to the previous manifest
- Or manually increase resources:
  ```shell
  kubectl set resources deployment/syrf-api -n jx-production \
    --containers=syrf-api \
    --requests=cpu=100m,memory=1Gi \
    --limits=cpu=200m,memory=1Gi
  ```
- Monitor recovery
If Autoscaler Scales Down Too Aggressively¶
- Adjust minimum nodes:
  ```shell
  gcloud container clusters update camarades \
    --node-pool=pool-1 \
    --min-nodes=2 \
    --location=europe-west2-a
  ```
- Or disable autoscaling temporarily:
  ```shell
  gcloud container clusters update camarades \
    --no-enable-autoscaling \
    --node-pool=pool-1 \
    --location=europe-west2-a
  ```
If VPA Makes Incorrect Recommendations¶
- Switch to recommendation-only mode:
  ```shell
  kubectl patch vpa syrf-api-vpa -n jx-production \
    --type='json' \
    -p='[{"op": "replace", "path": "/spec/updatePolicy/updateMode", "value": "Off"}]'
  ```
- Or adjust min/max bounds:
  ```yaml
  resourcePolicy:
    containerPolicies:
      - containerName: syrf-api
        minAllowed:
          cpu: 50m        # Increase minimum
          memory: 500Mi
  ```
Success Metrics¶
Short-term (1-2 weeks)¶
- Node CPU utilization increases from 1-4% to 30-50%
- Node count reduces from 10 to 6-7 nodes
- No increase in pod restarts or OOMKilled events
- Application response times remain stable
Medium-term (1 month)¶
- Monthly GCP bill decreases by $75-150
- Cluster autoscaler successfully scales down during off-peak
- VPA recommendations align with actual usage (validation)
- Zero production incidents related to resource constraints
Long-term (3 months)¶
- Sustained 40-60% node utilization
- Node count stabilizes at 4-5 nodes average
- Cost savings of $150-200/month achieved
- Automated optimization reduces manual intervention
Related Documentation¶
Internal Documentation¶
External Resources¶
- GKE Best Practices: Resource Requests and Limits
- Vertical Pod Autoscaling
- Cluster Autoscaling
- GKE Recommender API
- GKE Cost Allocation
Appendix: Detailed Metrics¶
Full Node Utilization Table¶
| Node Name | Pool | CPU Cores | CPU Usage | CPU % | Memory | Memory Usage | Memory % |
|---|---|---|---|---|---|---|---|
| gke-camarades-main-pool-0e899852-0gfa | main-pool | 2 | 93m | 4% | 8 GB | 1471 Mi | 24% |
| gke-camarades-main-pool-0e899852-5x55 | main-pool | 2 | 73m | 3% | 8 GB | 928 Mi | 15% |
| gke-camarades-main-pool-0e899852-dnck | main-pool | 2 | 86m | 4% | 8 GB | 1458 Mi | 24% |
| gke-camarades-pool-1-97eaf9ea-1lr6 | pool-1 | 4 | 53m | 1% | 16 GB | 1539 Mi | 11% |
| gke-camarades-pool-1-97eaf9ea-dpxn | pool-1 | 4 | 108m | 2% | 16 GB | 5673 Mi | 42% |
| gke-camarades-pool-1-97eaf9ea-gk9e | pool-1 | 4 | 49m | 1% | 16 GB | 2061 Mi | 15% |
| gke-camarades-pool-1-97eaf9ea-ydao | pool-1 | 4 | 86m | 2% | 16 GB | 3459 Mi | 26% |
| gke-camarades-pool-2-d9833c38-iiwa | pool-2 | 4 | 135m | 3% | 4 GB | 1757 Mi | 62% |
| gke-camarades-pool-2-d9833c38-w9o4 | pool-2 | 4 | 149m | 3% | 4 GB | 1518 Mi | 54% |
| gke-camarades-pool-2-d9833c38-xg1q | pool-2 | 4 | 94m | 2% | 4 GB | 1286 Mi | 45% |
Totals:
- CPU: 926m / 34,000m (2.7% utilization)
- Memory: 21,150 Mi / 106,496 Mi (~20% utilization)
Change Log¶
| Date | Author | Changes |
|---|---|---|
| 2025-11-11 | Claude (AI Assistant) | Initial analysis and recommendations |
- Document Status: Approved for implementation
- Next Review: 2025-12-11 (1 month after implementation)
- Owner: DevOps Team