GKE Cluster Analysis: camarades

Executive Summary

The camarades GKE cluster is significantly overprovisioned: the GCP Recommender API reports 50 active cost-optimization recommendations. Workloads request 10-100x more CPU than they actually use, so the cluster keeps 10 nodes running when 3-4 would suffice.

Estimated savings: $150-200/month (roughly 35-46% reduction) by right-sizing workload resource requests and enabling autoscaling.

Key Findings:

  • Current CPU utilization: 1-4% across all nodes
  • Production API requesting 1500m CPU, using 20m (98.7% waste)
  • 168 pods distributed across 10 nodes (could fit on 3-4 nodes)
  • 7/10 nodes already using cost-effective preemptible instances

Cluster Configuration

Overview

  • Cluster Name: camarades
  • Project: camarades-net
  • Location: europe-west2-a (London)
  • Status: RUNNING ✅
  • Tier: STANDARD
  • Created: January 23, 2021
  • Control Plane Version: 1.33.5-gke.1125000
  • Node Version: 1.33.5-gke.1080000 (some pools at 1.33.5-gke.1125000)
  • Total Nodes: 10
  • Total Pods: 168 (~17 pods/node average)

Network Configuration

  • Public Endpoint: 34.89.63.71
  • Private Endpoint: 10.154.15.240
  • Network: default
  • Subnetwork: default
  • Cluster IP Range: 10.68.0.0/14
  • Services IP Range: 10.71.240.0/20
  • Pod CIDR Size: /24 per node

Security Features

  • Shielded Nodes: ✅ Enabled
  • Workload Identity: ✅ Enabled (camarades-net.svc.id.goog)
  • Binary Authorization: Configured
  • Master Authorized Networks: Enabled (GCP public CIDR ranges allowed)

Maintenance

  • Daily Window: 03:00 UTC (4 hour duration)
  • Auto-upgrade: ✅ Enabled on all pools
  • Auto-repair: ✅ Enabled on all pools

Node Pools

pool-1 (4 nodes) - Compute Optimized

  • Machine Type: c2-standard-4
  • vCPUs: 4 per node (16 total)
  • Memory: 16 GB per node (64 GB total)
  • Disk: 100 GB pd-standard
  • Preemptible: ✅ Yes (~70% cost savings)
  • Location: europe-west2-a
  • Status: RUNNING ✅

Current Utilization:

  • CPU: 1-2% per node
  • Memory: 11-42% per node (highest: 5.6 GB / 16 GB)

main-pool (3 nodes) - General Purpose

  • Machine Type: e2-standard-2
  • vCPUs: 2 per node (6 total)
  • Memory: 8 GB per node (24 GB total)
  • Disk: 100 GB pd-standard
  • Preemptible: ✅ Yes (~70% cost savings)
  • Location: europe-west2-a
  • Status: RUNNING ✅

Current Utilization:

  • CPU: 3-4% per node
  • Memory: 15-24% per node (highest: 1.9 GB / 8 GB)

pool-2 (3 nodes) - High CPU

  • Machine Type: e2-highcpu-4
  • vCPUs: 4 per node (12 total)
  • Memory: 4 GB per node (12 GB total)
  • Disk: 100 GB pd-standard
  • Preemptible: ❌ No (standard pricing)
  • Location: europe-west2-a
  • Status: RUNNING ✅

Current Utilization:

  • CPU: 2-3% per node
  • Memory: 45-62% per node (highest: 2.5 GB / 4 GB)

Note: This pool has the highest memory utilization and should be kept as-is.


Resource Utilization Analysis

Cluster-Wide Summary

Total Capacity:

  • 34 vCPUs
  • 104 GB RAM
  • 10 nodes

Actual Usage (Real-time from kubectl top nodes):

NODE                                    CPU(cores)   CPU(%)   MEMORY(bytes)   MEMORY(%)
gke-camarades-main-pool-0e899852-0gfa   93m          4%       1471Mi          24%
gke-camarades-main-pool-0e899852-5x55   73m          3%       928Mi           15%
gke-camarades-main-pool-0e899852-dnck   86m          4%       1458Mi          24%
gke-camarades-pool-1-97eaf9ea-1lr6      53m          1%       1539Mi          11%
gke-camarades-pool-1-97eaf9ea-dpxn      108m         2%       5673Mi          42%
gke-camarades-pool-1-97eaf9ea-gk9e      49m          1%       2061Mi          15%
gke-camarades-pool-1-97eaf9ea-ydao      86m          2%       3459Mi          26%
gke-camarades-pool-2-d9833c38-iiwa      135m         3%       1757Mi          62%
gke-camarades-pool-2-d9833c38-w9o4      149m         3%       1518Mi          54%
gke-camarades-pool-2-d9833c38-xg1q      94m          2%       1286Mi          45%

Key Metrics:

  • Average CPU Usage: 1-4% per node
  • Average Memory Usage: 11-62% per node (highly variable)
  • Total CPU Usage: ~926m out of 34,000m available (2.7%)
  • Total Memory Usage: ~21 GB (21,150 Mi) out of 104 GB available (~20%)
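These cluster-wide figures can be reproduced directly from the per-node table above; a quick sanity check in Python (the node lists are copied from that output):

```python
# Per-node CPU (millicores) and memory (Mi) usage, copied from the
# `kubectl top nodes` output above.
cpu_millicores = [93, 73, 86, 53, 108, 49, 86, 135, 149, 94]
memory_mib = [1471, 928, 1458, 1539, 5673, 2061, 3459, 1757, 1518, 1286]

total_cpu_m = sum(cpu_millicores)   # total CPU actually in use
total_mem_mi = sum(memory_mib)      # total memory actually in use
cpu_capacity_m = 34 * 1000          # 34 vCPUs across the cluster

print(f"CPU:    {total_cpu_m}m / {cpu_capacity_m}m ({total_cpu_m / cpu_capacity_m:.1%})")
print(f"Memory: {total_mem_mi} Mi in use")
```

This yields 926m (2.7%) CPU and 21,150 Mi memory in use.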

Utilization by Pool

Pool        CPU Utilization   Memory Utilization   Assessment
pool-1      1-2%              11-42%               🔴 Massively underutilized
main-pool   3-4%              15-24%               🔴 Significantly underutilized
pool-2      2-3%              45-62%               🟡 CPU underutilized, memory OK

GCP Recommender Analysis

Recommendation Summary

Date: 2025-11-11
API: Recommender API (google.container.DiagnosisRecommender)

Category            Count   Priority
COST Optimization   50      🔥 High
RELIABILITY         33      🟡 Medium
API Deprecation     1       ⚠️ Critical

Root Cause: Massively Over-Requested Resources

Kubernetes pods are requesting 10-100x more CPU and memory than they actually consume. This causes:

  1. Scheduler Perspective: Nodes appear "full" based on resource requests
  2. GKE Response: Provisions more nodes to accommodate new pods
  3. Reality: Nodes are 96-99% idle while appearing fully allocated
  4. Cost Impact: Paying for 10 nodes to run workloads that fit on 3-4 nodes
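The request-versus-usage gap can be made concrete with a toy packing calculation. The allocatable figure below is an assumption (roughly what an e2-standard-2 node offers after system reservations); the request and usage values come from the production API example later in this report:

```python
# The scheduler bin-packs pods by their *requests*; actual usage is invisible.
node_allocatable_m = 1930   # assumed allocatable millicores on an e2-standard-2
request_m = 1500            # what each syrf-api pod reserves
usage_m = 20                # what each pod actually consumes

pods_by_request = node_allocatable_m // request_m   # how many the scheduler places
pods_by_usage = node_allocatable_m // usage_m       # how many the node could really run

print(f"scheduler fits {pods_by_request} pod(s); real capacity ~{pods_by_usage}")
```

One pod "fills" the node in the scheduler's view, while actual usage would allow nearly a hundred.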

Top Overprovisioned Workloads

1. Production API (jx-production/syrf-api) 🔴 CRITICAL

Container: syrf-api

Metric   Current Request     Actual Usage   Recommended   Waste
CPU      1500m (1.5 cores)   ~20m           20m           98.7%
Memory   3 GiB (3072 MiB)    ~795 MiB       795 MiB       74%

Impact Analysis:

  • Each pod reserves 1.5 CPU cores but uses only 0.02 cores
  • This single deployment blocks resources equivalent to ~4 e2-standard-2 nodes
  • Requesting 75x more CPU than needed

Recommended Configuration:

resources:
  requests:
    cpu: 20m        # was: 1500m
    memory: 795Mi   # was: 3Gi
  limits:
    cpu: 100m       # allow 5x burst
    memory: 795Mi   # match request so memory is never overcommitted
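The waste percentages quoted in these tables follow from a simple formula; a minimal check (using 3 GiB = 3072 MiB for the memory request):

```python
def waste_pct(request, usage):
    """Percentage of a requested resource that goes unused."""
    return round((request - usage) / request * 100, 1)

print(waste_pct(1500, 20))    # CPU: request 1500m, usage ~20m  -> 98.7
print(waste_pct(3072, 795))   # memory: request 3 GiB, usage ~795 MiB -> 74.1
```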

2. Staging/Dev API (jx/syrf-api) 🔴 HIGH

Container: syrf-api

Metric   Current Request    Actual Usage   Recommended   Waste
CPU      200m               ~15m           15m           92.5%
Memory   3 GiB (3072 MiB)   ~239 MiB       239 MiB       92%

Impact Analysis:

  • Requesting 13x more CPU than needed
  • Requesting 13x more memory than needed
  • Could free up significant node capacity

Recommended Configuration:

resources:
  requests:
    cpu: 15m        # was: 200m
    memory: 239Mi   # was: 3Gi
  limits:
    cpu: 50m        # allow 3x burst
    memory: 239Mi

3. Staging Web (jx-staging/syrf-web) 🔴 HIGH

Container: syrf-web

Metric   Current Request   Actual Usage   Recommended   Waste
CPU      200m              ~2m            2m            99%
Memory   128 MiB           ~7 MiB         7 MiB         94.5%

Impact Analysis:

  • Requesting 100x more CPU than needed (most extreme case)
  • Requesting 18x more memory than needed
  • Angular static site with NGINX has minimal resource needs

Recommended Configuration:

resources:
  requests:
    cpu: 2m         # was: 200m
    memory: 7Mi     # was: 128Mi
  limits:
    cpu: 10m        # allow 5x burst
    memory: 14Mi    # 2x request

4. Health Check (kuberhealthy/check-reaper) 🟡 MEDIUM

Container: check-reaper

Metric   Current Request   Actual Usage   Recommended   Waste
CPU      20m               ~4m            4m            80%
Memory   100 MiB           ~54 MiB        54 MiB        46%

Recommended Configuration:

resources:
  requests:
    cpu: 4m         # was: 20m
    memory: 54Mi    # was: 100Mi
  limits:
    cpu: 20m        # keep existing burst capacity
    memory: 54Mi

Root Cause Analysis

Why Is This Happening?

1. Default Resource Requests Never Tuned

  • Resources were set during initial deployment (likely conservative estimates)
  • No tuning performed based on actual production usage
  • Development and production use same resource values

2. No Automatic Right-Sizing

  • Vertical Pod Autoscaler (VPA) not enabled
  • Manual review of resource usage not part of workflow
  • No monitoring alerts for over-provisioned workloads

3. Kubernetes Scheduler Uses Requests, Not Actual Usage

  • Scheduler makes placement decisions based on requests field
  • Actual usage (1-4% CPU) is invisible to scheduling logic
  • Nodes appear "full" when they're 96% idle

4. No Cost Optimization Feedback Loop

  • Team unaware of GCP Recommender findings (API wasn't enabled)
  • No regular review of cluster efficiency metrics
  • Cost monitoring not connected to resource utilization

Cascading Effects

Over-Requested Resources → Scheduler Thinks Nodes Are Full → GKE Maintains 10 Nodes → 1-4% CPU Utilization → Unnecessary Cloud Costs

Cost Impact Analysis

Current Monthly Costs (Estimate)

Pool        Nodes   Machine Type    Pricing Model   Cost/Node/Month   Total/Month
pool-1      4       c2-standard-4   Preemptible     ~$30              ~$120
main-pool   3       e2-standard-2   Preemptible     ~$15              ~$45
pool-2      3       e2-highcpu-4    Standard        ~$90              ~$270
Total       10                                                        ~$435/month

Note: Preemptible instances already provide ~70% savings vs standard pricing. Without preemptible, cost would be ~$900/month.
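The arithmetic behind the table, for reference; the per-node prices are this report's rough estimates, not quoted GCP list prices:

```python
# Monthly cost estimate: nodes x estimated cost per node (the report's figures).
pools = {
    "pool-1":    {"nodes": 4, "usd_per_node": 30},  # c2-standard-4, preemptible
    "main-pool": {"nodes": 3, "usd_per_node": 15},  # e2-standard-2, preemptible
    "pool-2":    {"nodes": 3, "usd_per_node": 90},  # e2-highcpu-4, on-demand
}

total = sum(p["nodes"] * p["usd_per_node"] for p in pools.values())
print(f"~${total}/month")
```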

Projected Costs After Optimization

Scenario 1: Right-size + Manual Autoscale (Conservative)

Approach: Update resource requests, manually reduce node counts

Pool        Current Nodes   Target Nodes   Monthly Savings
pool-1      4               2              ~$60
main-pool   3               2              ~$15
pool-2      3               3              $0 (keep as-is)
Total       10              7              ~$75/month (17% reduction)

New Monthly Cost: ~$360/month

Scenario 2: Right-size + Cluster Autoscaler (Balanced)

Approach: Update resource requests, enable cluster autoscaler with min=1-2 per pool

Pool        Current   Min Nodes   Max Nodes   Expected Steady State   Savings
pool-1      4         1           4           2                       ~$60/month
main-pool   3         1           3           2                       ~$15/month
pool-2      3         1           3           2-3                     ~$0-90/month
Total       10        3           10          6-7                     ~$75-165/month (17-38%)

New Monthly Cost: ~$270-360/month

Scenario 3: Right-size + Autoscaler + VPA (Aggressive)

Approach: Full automation with VPA continuously optimizing requests

Metric                Value
Expected Node Count   4-5 nodes
Monthly Savings       ~$150-200/month
Reduction             ~35-46%

New Monthly Cost: ~$235-285/month

Cost Savings Summary

Scenario            Action Required      Monthly Savings   One-Time Effort   Ongoing Maintenance
Do Nothing          None                 $0                0 hours           High (manual scaling)
Manual Right-size   Update manifests     ~$75              2-4 hours         High (manual monitoring)
+ Autoscaler        Enable autoscaling   ~$75-165          3-5 hours         Medium (occasional tuning)
+ VPA               Enable VPA           ~$150-200         4-6 hours         Low (automated)

Recommendation: Scenario 3 (Full Automation) provides best long-term value.


Recommendations

🔥 Priority 1: Right-Size Critical Workloads (Immediate)

Timeline: Week 1
Effort: 2-4 hours
Risk: Low (testing in staging first)
Impact: High (enables all other optimizations)

Step 1: Update Production API (Highest Impact)

File: Kubernetes manifest for jx-production/syrf-api

apiVersion: apps/v1
kind: Deployment
metadata:
  name: syrf-api
  namespace: jx-production
spec:
  template:
    spec:
      containers:
      - name: syrf-api
        resources:
          requests:
            cpu: 20m        # was: 1500m (-98.7%)
            memory: 795Mi   # was: 3Gi (-74%)
          limits:
            cpu: 100m       # allow 5x burst for peak loads
            memory: 795Mi   # equal to request: memory never overcommitted

Expected Impact:

  • Frees up 1.48 CPU cores per pod
  • Could eliminate 1-2 nodes from cluster

Step 2: Update Staging/Dev Workloads

Files:

  • jx/syrf-api deployment
  • jx-staging/syrf-web deployment
# jx/syrf-api
resources:
  requests:
    cpu: 15m        # was: 200m
    memory: 239Mi   # was: 3Gi
  limits:
    cpu: 50m
    memory: 239Mi

# jx-staging/syrf-web
resources:
  requests:
    cpu: 2m         # was: 200m
    memory: 7Mi     # was: 128Mi
  limits:
    cpu: 10m
    memory: 14Mi

Step 3: Testing Protocol

  1. Apply changes to staging first:

kubectl apply -f deployments/staging/

  2. Monitor for 24-48 hours:

# Watch pod metrics
kubectl top pods -n jx-staging --watch

# Check for OOMKilled or CPU throttling
kubectl get events -n jx-staging --watch

  3. Verify application health:
     • Check application logs for errors
     • Run integration tests
     • Monitor response times

  4. If stable, promote to production:

kubectl apply -f deployments/production/

  5. Monitor production for 48 hours:
     • Same monitoring as staging
     • Be ready to roll back if issues appear

🎯 Priority 2: Enable Cluster Autoscaler (Short-term)

Timeline: Week 2
Effort: 1-2 hours
Risk: Low (requires right-sizing first)
Impact: Medium (automatic cost savings)

Enable Autoscaling on Node Pools

# pool-1: Scale 1-4 nodes
gcloud container clusters update camarades \
  --enable-autoscaling \
  --node-pool=pool-1 \
  --min-nodes=1 \
  --max-nodes=4 \
  --location=europe-west2-a \
  --project=camarades-net

# main-pool: Scale 1-3 nodes
gcloud container clusters update camarades \
  --enable-autoscaling \
  --node-pool=main-pool \
  --min-nodes=1 \
  --max-nodes=3 \
  --location=europe-west2-a \
  --project=camarades-net

# pool-2: Scale 1-3 nodes
gcloud container clusters update camarades \
  --enable-autoscaling \
  --node-pool=pool-2 \
  --min-nodes=1 \
  --max-nodes=3 \
  --location=europe-west2-a \
  --project=camarades-net

Expected Behavior

After right-sizing resource requests:

  1. Cluster autoscaler recognizes excess capacity
  2. Begins draining underutilized nodes
  3. Scales down toward the configured minimums (floor of 3 nodes cluster-wide; steady state expected at 6-7)
  4. Scales up automatically when pods are pending
  5. Scales down during off-peak hours
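Why the steady state lands well below 10 nodes can be sketched with a back-of-the-envelope sizing model. Every input here is an illustrative assumption (right-sized request totals and per-node allocatable); real scheduling also depends on DaemonSets, affinity rules, and per-pool placement:

```python
import math

# Assumed cluster-wide totals *after* right-sizing (illustrative only):
total_cpu_requests_m = 1500     # sum of right-sized CPU requests
total_mem_requests_mi = 24000   # sum of right-sized memory requests

# Assumed per-node allocatable (e2-standard-2 after system reservations):
node_cpu_m = 1930
node_mem_mi = 6100

nodes_for_cpu = math.ceil(total_cpu_requests_m / node_cpu_m)
nodes_for_mem = math.ceil(total_mem_requests_mi / node_mem_mi)

# Memory, not CPU, sets the floor under these assumptions.
print(max(nodes_for_cpu, nodes_for_mem), "node minimum")
```

Under these assumptions the binding constraint is memory, which is consistent with pool-2 being the only pool with healthy memory utilization today.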

Monitoring Autoscaling

# View autoscaler status
kubectl get configmap cluster-autoscaler-status \
  -n kube-system \
  -o yaml

# View autoscaler logs
kubectl logs -f deployment/cluster-autoscaler \
  -n kube-system

# Check node pool sizes
gcloud container node-pools list \
  --cluster=camarades \
  --location=europe-west2-a

🤖 Priority 3: Enable Vertical Pod Autoscaler (Short-term)

Timeline: Week 2-3
Effort: 2-3 hours
Risk: Low (recommendation mode first)
Impact: High (automated right-sizing)

Enable VPA on Cluster

gcloud container clusters update camarades \
  --enable-vertical-pod-autoscaling \
  --location=europe-west2-a \
  --project=camarades-net

Apply VPA to Key Workloads

Start with recommendation mode (doesn't auto-apply changes):

# vpa-syrf-api-production.yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: syrf-api-vpa
  namespace: jx-production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: syrf-api
  updatePolicy:
    updateMode: "Off"  # Start with recommendations only
  resourcePolicy:
    containerPolicies:
    - containerName: syrf-api
      minAllowed:
        cpu: 10m
        memory: 100Mi
      maxAllowed:
        cpu: 500m
        memory: 2Gi

Apply VPAs:

kubectl apply -f vpa-syrf-api-production.yaml
kubectl apply -f vpa-syrf-api-staging.yaml
kubectl apply -f vpa-syrf-web-staging.yaml

Review VPA Recommendations

After 24 hours, check recommendations:

kubectl describe vpa syrf-api-vpa -n jx-production

Enable Auto-Updates (After Testing)

Once confident in recommendations, switch to auto mode:

updatePolicy:
  updateMode: "Auto"  # was: "Off"

Note: In "Auto" mode, VPA evicts and recreates pods to apply new resource requests. For less disruption, consider updateMode: "Initial", which applies recommendations only when pods are created.


📊 Priority 4: Implement Monitoring & Alerting (Ongoing)

Timeline: Week 3-4
Effort: 3-4 hours setup, then ongoing monitoring
Risk: None (observability only)
Impact: Medium (prevents future over-provisioning)

Enable GKE Cost Allocation

gcloud container clusters update camarades \
  --enable-cost-allocation \
  --location=europe-west2-a \
  --project=camarades-net

Benefits:

  • Track costs per namespace, workload, and label
  • Identify cost trends over time
  • Export to BigQuery for analysis

Set Up Monitoring Dashboards

Create custom dashboard in Google Cloud Console:

Metrics to Track:

  • Node CPU/Memory utilization (target: 40-70%)
  • Pod CPU/Memory requests vs usage
  • Cluster autoscaler events
  • VPA recommendation application rate
  • Cost per namespace

Configure Alerts

Alert 1: Low CPU Utilization

# Alert when node CPU < 20% for 1 hour
condition:
  metric: kubernetes.io/node/cpu/allocatable_utilization
  threshold: 0.2
  duration: 3600s
  comparison: LESS_THAN

Alert 2: Pending Pods (Autoscaler Issue)

# Alert when pods pending > 5 minutes
condition:
  metric: kubernetes.io/pod/status/phase
  value: "Pending"
  duration: 300s

Weekly Review Process

  1. Review GCP Recommender (automated)
gcloud recommender recommendations list \
  --project=camarades-net \
  --location=europe-west2-a \
  --recommender=google.container.DiagnosisRecommender \
  --filter="primaryImpact.category=COST AND stateInfo.state=ACTIVE"
  2. Check VPA recommendations (if not in auto mode)
  3. Review cost trends in Cloud Billing reports
  4. Adjust autoscaler thresholds if needed

Implementation Roadmap

Week 1: Immediate Actions (Priority 1)

  • Update production API resource requests (jx-production/syrf-api)
  • Update staging API resource requests (jx/syrf-api)
  • Update staging web resource requests (jx-staging/syrf-web)
  • Monitor for 48 hours
  • Apply to remaining overprovisioned workloads (top 10)

Deliverables: Updated Kubernetes manifests, monitoring evidence

Week 2: Enable Autoscaling (Priority 2)

  • Enable cluster autoscaler on pool-1
  • Enable cluster autoscaler on main-pool
  • Enable cluster autoscaler on pool-2
  • Monitor scale-down events
  • Verify application stability during scaling

Deliverables: Autoscaling enabled, initial cost savings visible

Week 3: Enable VPA (Priority 3)

  • Enable VPA on cluster
  • Deploy VPA objects in "Off" mode (recommendations only)
  • Review recommendations after 24-48 hours
  • Switch to "Auto" mode for non-critical workloads
  • Monitor VPA behavior for 1 week

Deliverables: VPA enabled and actively managing resources

Week 4: Monitoring & Optimization (Priority 4)

  • Enable GKE Cost Allocation
  • Create monitoring dashboards
  • Configure alerting rules
  • Document weekly review process
  • Train team on new monitoring tools

Deliverables: Monitoring infrastructure, runbook documentation

Ongoing: Continuous Optimization

  • Weekly review of GCP Recommender
  • Monthly cost analysis and trend review
  • Quarterly node pool optimization review
  • Update this document with new findings

Critical Warnings & Risks

⚠️ API Deprecation (Kubernetes 1.25)

Issue: Cluster uses APIs deprecated in Kubernetes 1.25+

Action Required:

  1. Review deprecated APIs: https://cloud.google.com/kubernetes-engine/docs/deprecations/apis-1-25
  2. Identify affected manifests:
kubectl get events --all-namespaces | grep -i deprecated
  3. Update manifests before upgrading the control plane

Timeline: Before next major GKE upgrade


⚠️ Memory Limits Best Practice

Current State: Some workloads have no memory limits set

Risk: Pods can OOM the node, affecting other workloads

Recommendation:

  • Set memory limits equal to requests so memory is never overcommitted (Guaranteed QoS additionally requires CPU limits equal to CPU requests)
  • For Burstable workloads, set memory limits at 1.5-2x requests
  • Never run without memory limits in production

Example:

resources:
  requests:
    memory: 795Mi
  limits:
    memory: 795Mi  # same as the request: memory is never overcommitted
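For context, the QoS class rules can be sketched as follows (simplified to a single-container pod and only cpu/memory). A pod is Guaranteed only when every container's CPU and memory limits equal its requests; matching memory alone, as in the example above, yields Burstable:

```python
# Simplified Kubernetes QoS classification for a single-container pod.
def qos_class(requests: dict, limits: dict) -> str:
    if not requests and not limits:
        return "BestEffort"
    if all(res in requests and res in limits and requests[res] == limits[res]
           for res in ("cpu", "memory")):
        return "Guaranteed"
    return "Burstable"

print(qos_class({"cpu": "20m", "memory": "795Mi"},
                {"cpu": "20m", "memory": "795Mi"}))          # Guaranteed
print(qos_class({"memory": "795Mi"}, {"memory": "795Mi"}))   # Burstable: no cpu pair
print(qos_class({}, {}))                                     # BestEffort
```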

⚠️ Preemptible Node Disruption

Current State: 7/10 nodes use preemptible instances

Risk: Google can terminate preemptible VMs with 30s notice

Mitigation:

  • Ensure applications handle graceful shutdowns
  • Use PodDisruptionBudgets for critical workloads
  • Consider mixing preemptible and standard nodes for high-availability

Example PDB:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: syrf-api-pdb
  namespace: jx-production
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: syrf-api

Rollback Plans

If Resource Changes Cause Issues

Symptoms of Under-Provisioning

  • OOMKilled pods (check kubectl get events)
  • CPU throttling warnings in logs
  • Increased response times
  • 502/503 errors

Quick Rollback Steps

  1. Revert to the previous manifest:

kubectl rollout undo deployment/syrf-api -n jx-production

  2. Or manually increase resources:

kubectl set resources deployment/syrf-api -n jx-production \
  --containers=syrf-api \
  --requests=cpu=100m,memory=1Gi \
  --limits=cpu=200m,memory=1Gi

  3. Monitor recovery:

kubectl rollout status deployment/syrf-api -n jx-production
kubectl top pods -n jx-production

If Autoscaler Scales Down Too Aggressively

  1. Adjust minimum nodes:

gcloud container clusters update camarades \
  --node-pool=pool-1 \
  --min-nodes=2 \
  --location=europe-west2-a

  2. Or disable autoscaling temporarily:

gcloud container clusters update camarades \
  --no-enable-autoscaling \
  --node-pool=pool-1 \
  --location=europe-west2-a

If VPA Makes Incorrect Recommendations

  1. Switch to recommendation-only mode
kubectl patch vpa syrf-api-vpa -n jx-production \
  --type='json' \
  -p='[{"op": "replace", "path": "/spec/updatePolicy/updateMode", "value": "Off"}]'
  2. Or adjust min/max bounds:
resourcePolicy:
  containerPolicies:
  - containerName: syrf-api
    minAllowed:
      cpu: 50m      # Increase minimum
      memory: 500Mi

Success Metrics

Short-term (1-2 weeks)

  • Node CPU utilization increases from 1-4% to 30-50%
  • Node count reduces from 10 to 6-7 nodes
  • No increase in pod restarts or OOMKilled events
  • Application response times remain stable

Medium-term (1 month)

  • Monthly GCP bill decreases by $75-150
  • Cluster autoscaler successfully scales down during off-peak
  • VPA recommendations align with actual usage (validation)
  • Zero production incidents related to resource constraints

Long-term (3 months)

  • Sustained 40-60% node utilization
  • Node count stabilizes at 4-5 nodes average
  • Cost savings of $150-200/month achieved
  • Automated optimization reduces manual intervention


Appendix: Detailed Metrics

Full Node Utilization Table

Node Name                               Pool        CPU Cores   CPU Usage   CPU %   Memory   Memory Usage   Memory %
gke-camarades-main-pool-0e899852-0gfa   main-pool   2           93m         4%      8 GB     1471 Mi        24%
gke-camarades-main-pool-0e899852-5x55   main-pool   2           73m         3%      8 GB     928 Mi         15%
gke-camarades-main-pool-0e899852-dnck   main-pool   2           86m         4%      8 GB     1458 Mi        24%
gke-camarades-pool-1-97eaf9ea-1lr6      pool-1      4           53m         1%      16 GB    1539 Mi        11%
gke-camarades-pool-1-97eaf9ea-dpxn      pool-1      4           108m        2%      16 GB    5673 Mi        42%
gke-camarades-pool-1-97eaf9ea-gk9e      pool-1      4           49m         1%      16 GB    2061 Mi        15%
gke-camarades-pool-1-97eaf9ea-ydao      pool-1      4           86m         2%      16 GB    3459 Mi        26%
gke-camarades-pool-2-d9833c38-iiwa      pool-2      4           135m        3%      4 GB     1757 Mi        62%
gke-camarades-pool-2-d9833c38-w9o4      pool-2      4           149m        3%      4 GB     1518 Mi        54%
gke-camarades-pool-2-d9833c38-xg1q      pool-2      4           94m         2%      4 GB     1286 Mi        45%

Totals:

  • CPU: 926m / 34,000m (2.7% utilization)
  • Memory: 21,150 Mi / 106,496 Mi (~20% utilization)

Change Log

Date         Author                  Changes
2025-11-11   Claude (AI Assistant)   Initial analysis and recommendations

Document Status: Approved for implementation
Next Review: 2025-12-11 (1 month after implementation)
Owner: DevOps Team