
Deploying Services with GitOps

Complete guide to deploying and updating SyRF services using ArgoCD and GitOps.

Overview

The SyRF platform uses a fully automated GitOps deployment model where:

  1. CI/CD Pipeline (in syrf monorepo) builds Docker images, tags them, and creates git tags
  2. CI/CD automatically updates cluster-gitops repository with new versions
  3. ArgoCD continuously watches cluster-gitops and syncs changes to Kubernetes
  4. Staging promotion is fully automated via pull requests
  5. Production promotion requires manual PR approval for safety

Prerequisites

  • Access to cluster-gitops repository
  • Access to camaradesuk GKE cluster (via kubectl)
  • ArgoCD CLI installed (optional, for advanced operations)
  • Docker images already built and pushed to GHCR (via CI/CD)

Deployment Workflow

1. Image Build (Automatic)

When code is merged to master in the syrf monorepo:

1. GitHub Actions detects changes (path-based filters)
2. GitVersion calculates semantic version
3. Docker image built and pushed to GHCR
   ├─ ghcr.io/camaradesuk/syrf-api:1.2.3
   ├─ ghcr.io/camaradesuk/syrf-api:1.2.3-sha.abc1234
   └─ ghcr.io/camaradesuk/syrf-api:latest
4. Git tag created: api-v1.2.3

Image Naming Convention:

  • {service}:{version} - semantic version (e.g., 1.2.3)
  • {service}:{version}-sha.{shortsha} - version plus commit SHA (e.g., 1.2.3-sha.abc1234)
  • {service}:latest - latest build (not recommended for production)
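As a quick sanity check, the convention above can be parsed with plain shell parameter expansion. The image reference below is illustrative:

```shell
image="ghcr.io/camaradesuk/syrf-api:1.2.3-sha.abc1234"

tag="${image##*:}"            # strip everything up to the last ':' -> 1.2.3-sha.abc1234
service="${image%:*}"         # drop the tag -> ghcr.io/camaradesuk/syrf-api
service="${service##*/}"      # keep the last path segment -> syrf-api
version="${tag%%-sha.*}"      # semantic version -> 1.2.3
sha="${tag#*-sha.}"           # short commit SHA -> abc1234 (equals tag if no -sha suffix)

echo "service=$service version=$version sha=$sha"
```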

2. Update Staging Environment (Automatic)

This step is fully automated by CI/CD.

After a successful build, the CI/CD workflow:

  1. Creates a pull request to cluster-gitops
  2. Updates syrf/environments/staging/{service}/config.yaml
  3. Sets the service.chartTag field to the new version
  4. Validates YAML syntax
  5. Auto-merges the PR after validation passes
  6. Waits for merge completion before marking deployment as successful

Example update:

# syrf/environments/staging/api/config.yaml
serviceName: api
envName: staging
service:
  enabled: true
  chartTag: api-v9.1.1  # ← Updated by CI/CD

No manual intervention required for staging deployments.
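Mechanically, the chartTag bump in steps 2 and 3 is a one-line edit. A minimal sketch with sed (the real workflow may use yq or a templating step), run against a throwaway copy of the config:

```shell
config=$(mktemp)   # stands in for syrf/environments/staging/api/config.yaml
new_tag="api-v9.1.1"

# Sample of the file the workflow edits (it already exists in cluster-gitops).
cat > "$config" <<'EOF'
serviceName: api
envName: staging
service:
  enabled: true
  chartTag: api-v9.1.0
EOF

# Bump chartTag to the freshly built version, preserving indentation.
sed -i "s/^\([[:space:]]*chartTag:[[:space:]]*\).*/\1${new_tag}/" "$config"

grep chartTag "$config"
```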

3. ArgoCD Auto-Sync

ArgoCD detects the change immediately via GitHub webhook and automatically syncs:

1. GitHub webhook notifies ArgoCD of repository change
2. ApplicationSet regenerates Application with new chartTag
3. ArgoCD syncs Application to cluster
4. Helm renders chart from git tag (e.g., api-v9.1.1)
5. Kubernetes pulls new image and updates deployment

Monitor Sync Status:

# Via kubectl
kubectl get pods -n syrf-staging -l app=api -w

# Via ArgoCD UI
# Navigate to: https://argocd.camarades.net

# Via ArgoCD CLI
argocd app get api-staging
argocd app sync api-staging  # Force immediate sync if needed

4. Verify Deployment

Check deployment status:

# Check pod status
kubectl get pods -n syrf-staging -l app=api

# Check pod logs
kubectl logs -n syrf-staging -l app=api --tail=100 -f

# Check service health (if health endpoint exists)
kubectl exec -n syrf-staging deployment/api -- curl -f http://localhost:8080/health

Production Deployment

Automated PR Creation

After successful staging deployment, CI/CD automatically creates a production promotion PR:

  1. Copies all service configurations from staging to production
  2. Updates envName from "staging" to "production"
  3. Creates PR with "requires-review" label
  4. Includes review checklist in PR description
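Steps 1 and 2 amount to copying each staging config and rewriting envName. A minimal sketch under temporary directories (the real workflow operates on the cluster-gitops checkout):

```shell
# Throwaway stand-ins for syrf/environments/{staging,production}/api/.
workdir=$(mktemp -d)
mkdir -p "$workdir/staging/api" "$workdir/production/api"

cat > "$workdir/staging/api/config.yaml" <<'EOF'
serviceName: api
envName: staging
service:
  enabled: true
  chartTag: api-v9.1.1
EOF

# Copy the staging config to production, rewriting only envName.
sed 's/^envName: staging$/envName: production/' \
  "$workdir/staging/api/config.yaml" > "$workdir/production/api/config.yaml"

grep -E 'envName|chartTag' "$workdir/production/api/config.yaml"
```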

Example PR:

# syrf/environments/production/api/config.yaml
serviceName: api
envName: production  # ← Changed from staging
service:
  enabled: true
  chartTag: api-v9.1.1  # ← Same version as staging

Manual Review and Approval

Platform team must manually review and merge the PR:

  1. Review staging performance and stability
  2. Check for any critical bugs or issues
  3. Verify database migrations (if applicable)
  4. Approve and merge the PR

No automatic merge for production - deliberate manual gate for safety.

ArgoCD Sync to Production

After PR merge, ArgoCD syncs to production namespace:

# Monitor production deployment
kubectl get pods -n syrf-production -l app=api -w

# Check ArgoCD sync status
argocd app get api-production

Checking Current Versions

Via cluster-gitops Repository

cd cluster-gitops

# Check staging versions
for service in api web project-management quartz docs user-guide; do
  echo -n "$service: "
  yq eval '.service.chartTag' syrf/environments/staging/$service/config.yaml
done

# Check production versions
for service in api web project-management quartz docs user-guide; do
  echo -n "$service: "
  yq eval '.service.chartTag' syrf/environments/production/$service/config.yaml
done
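The two loops above can be folded into a single drift report that flags services whose staging and production tags differ. The sketch below extracts chartTag with sed instead of yq so it has no extra dependencies, and runs against throwaway config copies (in the repo you would point it at syrf/environments/):

```shell
# Throwaway configs; replace $root/{staging,production} with
# syrf/environments/{staging,production} when run in cluster-gitops.
root=$(mktemp -d)
mkdir -p "$root/staging/api" "$root/production/api"
printf 'service:\n  chartTag: api-v9.1.1\n' > "$root/staging/api/config.yaml"
printf 'service:\n  chartTag: api-v9.1.0\n' > "$root/production/api/config.yaml"

drift=""
for service in api; do   # extend to: api web project-management quartz docs user-guide
  s=$(sed -n 's/.*chartTag:[[:space:]]*//p' "$root/staging/$service/config.yaml")
  p=$(sed -n 's/.*chartTag:[[:space:]]*//p' "$root/production/$service/config.yaml")
  if [ "$s" != "$p" ]; then
    drift="$drift $service(staging=$s,production=$p)"
  fi
done
echo "drift:$drift"
```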

Via kubectl

# Check staging image versions
kubectl get deployments -n syrf-staging -o json | \
  jq -r '.items[] | "\(.metadata.name): \(.spec.template.spec.containers[0].image)"'

# Check production image versions
kubectl get deployments -n syrf-production -o json | \
  jq -r '.items[] | "\(.metadata.name): \(.spec.template.spec.containers[0].image)"'

Deploying Specific Services

All Services Follow the Same Pattern

Services:

  • api - Main API service
  • web - Angular frontend
  • project-management - Project management service
  • quartz - Background job scheduler
  • docs - Team documentation (MkDocs)
  • user-guide - User documentation (Jekyll)

Deployment Process:

  1. Push code changes to master in the syrf monorepo
  2. CI/CD builds and tags the image (e.g., api-v9.1.1)
  3. CI/CD auto-updates staging via PR
  4. ArgoCD syncs to the staging namespace
  5. CI/CD creates a production promotion PR
  6. Team reviews and manually merges
  7. ArgoCD syncs to the production namespace

Chart Locations (in syrf monorepo):

  • src/services/{service}/.chart/ - Helm charts for each service

Configuration Files (in cluster-gitops):

  • Base: syrf/services/{service}/config.yaml
  • Staging: syrf/environments/staging/{service}/config.yaml
  • Production: syrf/environments/production/{service}/config.yaml

Advanced Operations

Force Immediate Sync

If the webhook was missed and you don't want to wait for ArgoCD's polling interval:

# Via ArgoCD CLI
argocd app sync api-staging

# Via kubectl (patch the Application resource)
kubectl patch application api-staging -n argocd --type merge -p '{"operation": {"initiatedBy": {"automated": true}}}'

Rollback to Previous Version

To rollback a service:

cd cluster-gitops

# 1. Check git history to find previous version
git log -- syrf/environments/staging/api/config.yaml

# 2. Manually update to previous chartTag
yq eval '.service.chartTag = "api-v9.1.0"' -i syrf/environments/staging/api/config.yaml

# 3. Commit and push
git checkout -b rollback-api-v9.1.0
git add syrf/environments/staging/api/config.yaml
git commit -m "rollback(api): revert staging to v9.1.0"
git push -u origin rollback-api-v9.1.0

# 4. Create PR for review
gh pr create --title "Rollback API to v9.1.0" --body "Rolling back due to issue X"

Deploy with Environment-Specific Helm Values

You can customize service behavior per environment using values files:

# Edit environment-specific Helm values
nano syrf/environments/staging/api/values.yaml

Example with custom configuration:

replicaCount: 3  # Scale up in staging

env:
  - name: LOG_LEVEL
    value: "DEBUG"  # More verbose logging in staging

resources:
  requests:
    cpu: "200m"
    memory: "512Mi"
  limits:
    cpu: "1000m"
    memory: "1Gi"

Troubleshooting

Deployment Not Syncing

Symptom: ArgoCD shows "OutOfSync" but doesn't sync

Solutions:

# 1. Check ArgoCD application status
argocd app get api-staging

# 2. Check for sync errors
kubectl logs -n argocd -l app.kubernetes.io/name=argocd-application-controller

# 3. Force sync
argocd app sync api-staging --force

# 4. Check Helm chart validity (chart is in the syrf monorepo, values in
#    cluster-gitops; paths below assume sibling checkouts, run from cluster-gitops)
helm template ../syrf/src/services/api/.chart/ \
  -f syrf/environments/staging/api/values.yaml

Image Pull Errors

Symptom: Pods stuck in ImagePullBackOff or ErrImagePull

Solutions:

# 1. Check pod events
kubectl describe pod -n syrf-staging <pod-name>

# 2. Verify image exists in GHCR
# Check: https://github.com/orgs/camaradesuk/packages

# 3. Verify image tag matches values file
kubectl get deployment -n syrf-staging api -o jsonpath='{.spec.template.spec.containers[0].image}'

# 4. Check image pull secrets (if private repo)
kubectl get secret -n syrf-staging regcred

Pod CrashLoopBackOff

Symptom: Pods restart repeatedly

Solutions:

# 1. Check pod logs
kubectl logs -n syrf-staging -l app=api --tail=200

# 2. Check previous container logs
kubectl logs -n syrf-staging <pod-name> --previous

# 3. Check liveness/readiness probes
kubectl describe pod -n syrf-staging <pod-name> | grep -A 5 Liveness
kubectl describe pod -n syrf-staging <pod-name> | grep -A 5 Readiness

# 4. Verify environment variables and secrets
kubectl exec -n syrf-staging deployment/api -- env | grep -v SECRET

External Resources

  • ArgoCD Documentation: https://argo-cd.readthedocs.io/
  • Helm Documentation: https://helm.sh/docs/
  • Kubernetes Documentation: https://kubernetes.io/docs/