Production Promotion and Deployment Notifications

This guide explains how to use the manual production promotion process and deployment success notifications in the CI/CD pipeline.

Overview

The CI/CD pipeline includes two key features for production deployments:

  1. Manual Production Promotion: Creates a PR requiring manual review before promoting services from staging to production
  2. Deployment Success Notifications: Automatically notifies the source repository when ArgoCD successfully deploys services

Manual Production Promotion

How It Works

After staging deployment succeeds:

  1. Automatic Trigger: The promote-to-production job automatically starts after successful staging promotion
  2. Create PR: Job copies service versions from staging to production and creates a PR in cluster-gitops
  3. Workflow Completes: CI/CD workflow finishes successfully after PR creation
  4. Manual Review: Administrator reviews the PR in the cluster-gitops repository
  5. Manual Merge: Administrator approves and merges the PR manually
  6. ArgoCD Sync: Production cluster automatically syncs with new versions after merge

Key Point: No GitHub Environment configuration required. The manual gate happens at the PR merge step in cluster-gitops.
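
The copy in step 2 can be pictured as a comparison of two version maps. The following is a minimal sketch under stated assumptions, not the pipeline's actual code: the function name `plan_promotion`, the dictionary layout (service name mapped to chart tag), and the production tag shown for api are all hypothetical.

```python
from __future__ import annotations


def plan_promotion(
    staging: dict[str, str], production: dict[str, str]
) -> dict[str, tuple[str | None, str]]:
    """Map each service whose staging tag differs from production to (old, new)."""
    changes = {}
    for service, staging_tag in staging.items():
        current = production.get(service)  # None if the service is not in production yet
        if current != staging_tag:
            changes[service] = (current, staging_tag)
    return changes


# Illustrative data only; the production tags here are made up.
staging = {"api": "api-v8.21.0", "web": "web-v5.0.1", "docs": "docs-v1.2.3"}
production = {"api": "api-v8.20.0", "web": "web-v5.0.1"}

for service, (old, new) in plan_promotion(staging, production).items():
    print(f"{service}: {old or 'not deployed'} -> {new}")
```

Only services whose versions differ end up in the plan, which is what a reviewer would expect to see in the resulting PR diff.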

Workflow Overview

CI/CD Build → Staging Promotion → Production PR Created → Workflow Complete
                                          ↓
                         (Manual Review in cluster-gitops)
                                          ↓
                              PR Merged → ArgoCD Syncs Production

Production Promotion Process

Step 1: Automatic PR Creation

When code is pushed to main and staging deployment succeeds:

  1. CI/CD runs: All services build and deploy to staging
  2. Production job triggers: promote-to-production job starts automatically
  3. PR created: A new PR is created in cluster-gitops with the label requires-review
  4. Workflow completes: CI/CD workflow shows success

Step 2: Review the Production PR

  1. Navigate to cluster-gitops:
     • Go to https://github.com/camaradesuk/cluster-gitops/pulls
     • Look for PRs with label requires-review

  2. Review the changes:
     • Check which services are being updated
     • Verify versions match what's in staging
     • Review the checklist in PR description

  3. Checklist to verify (included in PR):
     • All services have been tested in staging
     • No critical issues reported in staging
     • Release notes reviewed (if applicable)
     • Stakeholders notified of production deployment

Step 3: Approve and Merge

Option 1: Merge via GitHub UI

  1. Click "Merge pull request" button
  2. Confirm merge
  3. PR is merged to main
  4. ArgoCD detects changes and syncs production

Option 2: Merge via CLI

cd /path/to/cluster-gitops
gh pr list --label requires-review
gh pr view <PR_NUMBER>  # Review changes
gh pr merge <PR_NUMBER> --squash --delete-branch

Step 4: Monitor Deployment

After merging:

  1. Check ArgoCD sync:

kubectl get applications -n argocd | grep production

  2. Watch deployment progress:

# For specific service (e.g., API)
kubectl get pods -n syrf-production -l app.kubernetes.io/name=syrf-api -w

  3. Verify new version:

kubectl get deployment api-production-syrf-api -n syrf-production \
  -o jsonpath='{.spec.template.spec.containers[0].image}'
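
The last command prints a full image reference, so the version still has to be read off the end of it. The sketch below shows one way to extract the tag for comparison against the promoted version. The helper name `image_tag` and the registry path in the example are hypothetical; the guide does not say which registry or tag format the images use.

```python
from __future__ import annotations


def image_tag(image: str) -> str | None:
    """Return the tag of a container image reference, or None if it has no tag."""
    name = image.split("@", 1)[0]           # drop any @sha256:... digest suffix
    last_segment = name.rsplit("/", 1)[-1]  # skip a registry host:port prefix
    if ":" not in last_segment:
        return None
    return last_segment.rsplit(":", 1)[1]


# Hypothetical image reference, as the jsonpath query might print it.
deployed = "registry.example.com:5000/syrf/syrf-api:8.21.0"
print(image_tag(deployed) == "8.21.0")
```

Splitting on the last path segment avoids mistaking a registry port for the tag.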

Example Production PR

The automated PR will look like this:

## Production Promotion - Manual Review Required

This PR promotes services from staging to production.

**⚠️ MANUAL REVIEW REQUIRED**: This PR requires manual approval and merge by an administrator.

**Source Run**: [42](https://github.com/camaradesuk/syrf-test/actions/runs/19340633801)
**Source Commit**: 6b47f42a
**Created By**: platform-bot

### Services Updated

- API: `api-v8.21.0`
- Web: `web-v5.0.1`
- Docs: `docs-v1.2.3`

### Review Checklist

Before merging, please verify:

- [ ] All services have been tested in staging
- [ ] No critical issues reported in staging
- [ ] Release notes reviewed (if applicable)
- [ ] Stakeholders notified of production deployment

### Deployment

Once merged, ArgoCD will automatically sync the changes to the production cluster.

Deployment Success Notifications

How It Works

After ArgoCD successfully syncs a service to staging or production:

  1. PostSync Hook: ArgoCD triggers a Kubernetes Job after successful sync
  2. GitHub App Authentication: Job uses GitHub App credentials from cluster secrets
  3. API Call: Creates a commit status on the source repository
  4. Status Details:
     • Context: argocd/deploy-{environment} (e.g., argocd/deploy-staging)
     • State: success
     • Description: "Deployed {service}:{version} to {environment}"
     • Target URL: Link to deployed service
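
For illustration, the status fields above map onto the request body of GitHub's "create a commit status" endpoint (POST /repos/{owner}/{repo}/statuses/{sha}). This is a sketch of how such a payload could be assembled, not the job's actual implementation. The function names, the target URL, and the placeholder SHA are assumptions.

```python
import json


def build_commit_status(service: str, version: str, environment: str, target_url: str) -> dict:
    """Assemble the JSON body for a successful-deployment commit status."""
    return {
        "state": "success",
        "context": f"argocd/deploy-{environment}",
        "description": f"Deployed {service}:{version} to {environment}",
        "target_url": target_url,
    }


def status_endpoint(org: str, repo: str, sha: str) -> str:
    """URL of the commit status endpoint for one commit."""
    return f"https://api.github.com/repos/{org}/{repo}/statuses/{sha}"


# Placeholder values: the URL and SHA below are not real.
payload = build_commit_status("api", "8.21.0", "staging", "https://staging.example.org")
print(status_endpoint("camaradesuk", "syrf-test", "0" * 40))
print(json.dumps(payload, indent=2))
```

Using a distinct context per environment means a staging status and a production status can coexist on the same commit.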

Enabling Deployment Notifications

Prerequisites:

  • GitHub App credentials must be available in the cluster
  • Secret github-app-credentials must exist in the service namespace

Configuration Overview

Deployment notification configuration follows the DRY principle: shared defaults are defined once per environment, and each service overrides only what differs. The levels described here are:

  1. Environment Shared Values (cluster-gitops/environments/{env}/shared-values.yaml):
     • Contains common configuration for ALL services in that environment
     • Already configured with defaults (githubOrg, githubRepo, credentials, etc.)
     • Services inherit these values automatically

  2. Service-Specific Values (if needed):
     • Override only what's unique to the service
     • Typically only need to set enabled: true
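
The inheritance can be pictured as a recursive merge in which service values win over shared values. The sketch below assumes Helm-style merging (nested maps are merged key by key, scalars are replaced). The function name `merge_values`, the nesting under `service`, and the shared default `enabled: False` are assumptions made for the example, not taken from the actual files.

```python
def merge_values(base: dict, override: dict) -> dict:
    """Return base with override applied; nested dicts are merged recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = value  # scalars (and lists) replace the inherited value
    return merged


# Assumed layout of shared-values.yaml for staging.
shared = {
    "service": {
        "deploymentNotification": {
            "enabled": False,
            "githubOrg": "camaradesuk",
            "githubRepo": "syrf-test",
            "credentialsSecret": "github-app-credentials",
            "serviceAccount": "default",
            "createReleaseNote": False,
        }
    }
}
# The service file only switches the feature on.
service = {"service": {"name": "api", "deploymentNotification": {"enabled": True}}}

effective = merge_values(shared, service)
print(effective["service"]["deploymentNotification"])
```

The service file stays a single line of configuration while the effective values still carry the organisation, repository and credentials settings.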

Step 1: Enable Notifications for a Service

Edit the service's environment-specific values file in cluster-gitops/environments/{env}/services/{service}.yaml:

# Example: environments/staging/services/api.yaml
service:
  name: api
  chartTag: api-v8.21.0

  # Enable deployment notifications - inherits config from shared-values.yaml
  deploymentNotification:
    enabled: true  # This is all you need!
    # commitSha will be populated by CI/CD during promotion

That's it! The service inherits all other configuration from environments/staging/shared-values.yaml:

  • githubOrg: camaradesuk
  • githubRepo: syrf-test
  • credentialsSecret: github-app-credentials
  • serviceAccount: default
  • createReleaseNote: false (staging) or true (production)

Step 2: Ensure GitHub App Credentials Exist

Verify the secret exists:

kubectl get secret github-app-credentials -n syrf-staging

Expected output:

NAME                      TYPE     DATA   AGE
github-app-credentials    Opaque   2      10d

If missing, the extra-secrets chart should create it automatically from Google Secret Manager.

Step 3: (Optional) Override Configuration

If a service needs configuration that differs from the shared values, override it in the service file:

# Example: environments/staging/services/api.yaml
service:
  name: api
  chartTag: api-v8.21.0

  deploymentNotification:
    enabled: true
    # Override only if needed (inherits from shared-values.yaml by default)
    githubRepo: different-repo  # Only if this service uses a different repo
    createReleaseNote: true     # Only if you want releases in staging (unusual)

Note: In most cases, you won't need any overrides - shared values cover all services.

Viewing Deployment Notifications

Commit Statuses

  1. Go to the commit in GitHub (e.g., https://github.com/camaradesuk/syrf-test/commit/{sha})
  2. Scroll to Checks section
  3. Look for statuses like:
     • argocd/deploy-staging - Deployed api:8.21.0 to staging
     • argocd/deploy-production - Deployed api:8.21.0 to production
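
The same statuses can be read programmatically from GitHub's "list commit statuses for a reference" endpoint (GET /repos/{owner}/{repo}/commits/{ref}/statuses). The sketch below filters an already-fetched response down to the newest status per deployment context. The function name and the sample data are illustrative assumptions.

```python
from __future__ import annotations


def latest_deploy_statuses(statuses: list[dict]) -> dict[str, dict]:
    """Keep the most recent status for each argocd/deploy-* context."""
    latest: dict[str, dict] = {}
    for status in statuses:
        context = status.get("context", "")
        if not context.startswith("argocd/deploy-"):
            continue  # ignore CI checks and other integrations
        seen = latest.get(context)
        # created_at is an ISO 8601 UTC timestamp, so string order is time order.
        if seen is None or status["created_at"] > seen["created_at"]:
            latest[context] = status
    return latest


# Made-up sample shaped like the API response.
sample = [
    {"context": "argocd/deploy-staging", "state": "success",
     "description": "Deployed api:8.21.0 to staging", "created_at": "2024-01-02T10:00:00Z"},
    {"context": "argocd/deploy-staging", "state": "success",
     "description": "Deployed api:8.20.0 to staging", "created_at": "2024-01-01T10:00:00Z"},
    {"context": "ci/build", "state": "success",
     "description": "Build passed", "created_at": "2024-01-02T09:00:00Z"},
]

for context, status in latest_deploy_statuses(sample).items():
    print(f"{context}: {status['description']}")
```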

GitHub Releases (Optional)

If createReleaseNote: true:

  1. Go to Releases tab in GitHub
  2. Look for releases named like: api deployed to staging
  3. Each release includes:
     • Service name and version
     • Environment (staging/production)
     • Deployment timestamp
     • URL to deployed service

Troubleshooting Deployment Notifications

PostSync Job Not Running

Check if the job exists:

kubectl get jobs -n syrf-staging -l argocd.argoproj.io/hook=PostSync

If not running, check ArgoCD Application status:

kubectl get application api-staging -n argocd -o yaml | grep -A10 status

Job Fails to Authenticate

Check job logs:

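# Pick the most recently created pod that belongs to a Job
POD=$(kubectl get pods -n syrf-staging -l job-name \
  --sort-by=.metadata.creationTimestamp -o name | tail -1)
kubectl logs "$POD" -n syrf-staging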

Common issues:

  • Missing github-app-credentials secret
  • Invalid GitHub App private key
  • Insufficient GitHub App permissions

Status Not Appearing on GitHub

Verify the commit SHA is correct:

kubectl get application api-staging -n argocd -o yaml | grep commitSha

Check GitHub App installation:

# Get installation ID
curl -H "Authorization: Bearer {JWT}" \
  -H "Accept: application/vnd.github+json" \
  "https://api.github.com/orgs/camaradesuk/installation"

Best Practices

Production Promotion

  1. Always test in staging first: Never bypass staging for production deployments
  2. Review PR changes: Check the cluster-gitops PR before merging to production
  3. Monitor deployments: Watch ArgoCD sync status after merging
  4. Rollback plan: Know how to revert production deployments quickly
  5. Communication: Notify stakeholders before merging production PRs

Deployment Notifications

  1. Start disabled: Enable notifications after testing in staging
  2. Use commit statuses: Don't create releases for every deployment (noise)
  3. Monitor job failures: Set up alerts for failed PostSync jobs
  4. Clean up old jobs: Jobs auto-delete after 5 minutes (TTL)

Security Considerations

GitHub App Permissions

The GitHub App needs these permissions:

  • Repository permissions:
     • statuses: write - Create commit statuses
     • contents: write - Create releases (if enabled)

  • Organization permissions:
     • members: read - Verify installation

Production PR Protection

Best practices for cluster-gitops repository:

  • Branch protection on main: Require PR reviews before merging
  • CODEOWNERS file: Automatically request reviews from platform team
  • Status checks: Require YAML validation to pass
  • Audit logs: Review who merged production PRs

Troubleshooting

Production Promotion Fails

Symptom: promote-to-production job fails after staging success

Solutions:

  1. Check PR creation errors:

gh run view {run_id} --log-failed

  2. Verify GitHub App token:
     • App ID and private key are correct
     • App is installed on cluster-gitops repository
     • App has contents: write and pull_requests: write permissions

  3. Check YAML validation:
     • Ensure all service files are valid YAML
     • Check for syntax errors in updated files

Deployment Notification Job Timeout

Symptom: PostSync job runs longer than its 5-minute limit

Solutions:

  1. Increase job timeout (not recommended):

spec:
  activeDeadlineSeconds: 600  # 10 minutes

  2. Simplify notification logic:
     • Remove release creation
     • Use commit statuses only

  3. Check GitHub API rate limits:

curl -H "Authorization: Bearer {token}" \
  https://api.github.com/rate_limit
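
The rate limit response reports the remaining quota and the reset time (in epoch seconds) under `resources.core`. As a rough sketch, the helper below turns a parsed response into "requests left" and "seconds until reset". The function name and the sample numbers are illustrative.

```python
from __future__ import annotations


def core_rate_limit(payload: dict, now: float) -> tuple[int, int]:
    """Return (remaining requests, seconds until the quota resets) for the core API."""
    core = payload["resources"]["core"]
    wait_seconds = max(0, int(core["reset"] - now))  # never negative once the reset has passed
    return core["remaining"], wait_seconds


# Made-up response body in the shape returned by /rate_limit.
sample = {"resources": {"core": {"limit": 5000, "used": 5000, "remaining": 0, "reset": 1700000600}}}

remaining, wait_seconds = core_rate_limit(sample, now=1700000000.0)
print(f"{remaining} requests left, quota resets in {wait_seconds}s")
```

In a live check, `now` would come from `time.time()`. A remaining count of zero while the job is running would suggest the job is stalled on rate limiting, not on the notification logic itself.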

Future Enhancements

Planned improvements:

  1. Slack notifications: Send deployment notifications to Slack channels
  2. Deployment metrics: Track deployment frequency and success rates
  3. Automated rollback: Trigger rollback on failed health checks
  4. Progressive delivery: Canary deployments with automatic promotion