
Runbook: Renaming syrftest to syrf-prod

Status: DEFERRED - This runbook is preserved for future use. The current decision is to keep syrftest as the production database name to avoid migration complexity. Execute this runbook when/if the team decides to rename the production database.

Objective: Rename the production logical database from syrftest to syrf-prod with minimal downtime.

Primary Constraint: The pmStudy collection carries 8GB of indexes against the 4GB of RAM on the current M20 tier. Rebuilding these indexes without scaling up will cause severe latency.

Estimated Maintenance Window: 30–45 minutes (including buffer).


✅ Phase 1: Pre-Flight Checklist

Perform these steps between 24 hours and 1 hour before the maintenance window.

  1. Provision "Jumpbox" VM:

    • Create a small Ubuntu VM (e.g., AWS t3.medium) in AWS Ireland (eu-west-1), the same region as the Atlas cluster.
    • Why: Keeps traffic inside AWS's regional network for maximum speed.
    • Install Tools on VM:

      wget https://fastdl.mongodb.org/tools/db/mongodb-database-tools-ubuntu2204-x86_64-100.9.4.tgz
      tar -zxvf mongodb-database-tools-*.tgz
      sudo cp mongodb-database-tools-*/bin/* /usr/local/bin/
      
  2. Verify Access:

    • Allow the VM's private IP in the MongoDB Atlas Network Access whitelist.
    • Test connection from VM: mongosh "mongodb+srv://<CLUSTER_HOST>" --username admin
  3. Prepare Application:
    • Prepare the new environment variable string: .../syrf-prod?retryWrites=true...
    • Ensure you have a way to stop/start the app services quickly.
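
For reference, the only configuration change at cutover is the database name in the connection string. A sketch, assuming the app reads MONGODB_URI from a .env file (the variable name, host, and options here are illustrative):

```shell
# Hypothetical .env entries -- adapt the key name and host to your setup.
# Before (current production):
MONGODB_URI="mongodb+srv://admin:<PASSWORD>@cluster0.abcde.mongodb.net/syrftest?retryWrites=true&w=majority"

# After (staged now, swapped in during the window):
MONGODB_URI="mongodb+srv://admin:<PASSWORD>@cluster0.abcde.mongodb.net/syrf-prod?retryWrites=true&w=majority"
```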

🚀 Phase 2: Execution (Maintenance Window)

Step 1: Scale Up Cluster (Critical)

Time required: ~5-10 mins (Rolling update)

  1. Log in to MongoDB Atlas.
  2. Navigate to Database > Cluster0 > Configuration.
  3. Change Tier from M20 to M40 (16GB RAM).
    • Reasoning: You have 8GB of indexes. M20 (4GB RAM) will crash/thrash during index rebuilds. M40 ensures indexes fit in RAM for a fast rebuild.
  4. Wait for the cluster status to return to Green/Active before proceeding.
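
If you'd rather script the scale-up than click through the UI, the Atlas CLI can do the same thing. A sketch, assuming the CLI is installed and authenticated (`atlas auth login`) and the cluster is named Cluster0 as above:

```shell
# Request the tier change (a rolling update, same as the UI flow).
atlas clusters update Cluster0 --tier M40

# Poll until stateName returns to IDLE before proceeding.
atlas clusters describe Cluster0 --output json | grep stateName
```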

Step 2: Stop Traffic

  1. Stop your application containers/services.
    • Ensure no writes are hitting syrftest to prevent data loss.
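
To confirm that writes have genuinely stopped before dumping, you can inspect in-flight operations from the Jumpbox. A sketch using the $currentOp aggregation stage (assumes an admin-level connection string in $ATLAS_URI; the namespace filter is illustrative):

```shell
# List active write operations against syrftest; empty output means traffic has stopped.
mongosh "$ATLAS_URI" --quiet --eval '
  db.getSiblingDB("admin").aggregate([
    { $currentOp: {} },
    { $match: { active: true,
                op: { $in: ["insert", "update", "remove"] },
                ns: /^syrftest\./ } }
  ]).forEach(op => printjson(op))
'
```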

Step 3: Run the Migration Pipeline

Run this from the Jumpbox VM, not your local machine.

# 1. Export Connection Variables
# Replace with your actual credentials
export ATLAS_URI="mongodb+srv://admin:YOUR_PASSWORD_HERE@cluster0.abcde.mongodb.net"
export OLD_DB="syrftest"
export NEW_DB="syrf-prod"

# 2. Run the Streaming Migration
# This pipes dump directly to restore without saving to disk.
# --numParallelCollections=4: Optimises throughput
# --stopOnError: Halts immediately if a critical issue occurs

echo "Starting migration from $OLD_DB to $NEW_DB..."

# Note: mongorestore gets the base URI only; the target database comes from --nsTo.
mongodump --uri="$ATLAS_URI/$OLD_DB" --archive | \
mongorestore --uri="$ATLAS_URI" --archive \
  --nsFrom="$OLD_DB.*" \
  --nsTo="$NEW_DB.*" \
  --numParallelCollections=4 \
  --stopOnError
  What to expect:

    • You will see a flurry of activity as documents are copied.
    • The Pause: The terminal will pause at the end. Do not panic. This is the M40 cluster rebuilding the 31 indexes for pmStudy. It may take 5–15 minutes.
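
Rather than staring at a silent terminal during the pause, you can watch the index builds from a second shell. A sketch (assumes the same $ATLAS_URI as the migration; the fields shown follow the $currentOp output, which can vary by server version):

```shell
# Show in-progress index builds on the cluster; re-run to watch progress.
mongosh "$ATLAS_URI" --quiet --eval '
  db.getSiblingDB("admin").aggregate([
    { $currentOp: {} },
    { $match: { "command.createIndexes": { $exists: true } } }
  ]).forEach(op => print(op.ns + " :: " + (op.msg || "building")))
'
```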

🔍 Phase 3: Verification

Run these commands in mongosh or check via the Atlas UI Data Explorer.

1. Check Document Counts:

use syrf-prod
// Should match ~3.1M
db.pmStudy.countDocuments()
// Should match ~4.6k
db.pmInvestigator.countDocuments()

2. Verify Views: Check that the view pmInvestigatorEmail exists and returns data. mongorestore normally recreates views at the end of the restore, but they can be skipped if their underlying collections aren't found, so confirm explicitly:

db.pmInvestigatorEmail.findOne()

3. Verify Indexes: Ensure pmStudy has 31 indexes.

db.pmStudy.getIndexes().length
// Expected Output: 31
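
The three checks above can also be wrapped in a script so nothing is missed under time pressure. A sketch: `check_count` is a hypothetical helper, the floors are rounded down from the counts quoted above, and the live queries only run when $ATLAS_URI is exported:

```shell
#!/usr/bin/env bash
# Hypothetical helper: fail loudly when a count falls below its floor.
check_count() {
  local label="$1" actual="$2" minimum="$3"
  if [ "$actual" -lt "$minimum" ]; then
    echo "FAIL: $label has $actual, expected at least $minimum" >&2
    return 1
  fi
  echo "OK: $label ($actual)"
}

if [ -n "${ATLAS_URI:-}" ]; then
  studies=$(mongosh "$ATLAS_URI/syrf-prod" --quiet --eval 'db.pmStudy.countDocuments()')
  investigators=$(mongosh "$ATLAS_URI/syrf-prod" --quiet --eval 'db.pmInvestigator.countDocuments()')
  indexes=$(mongosh "$ATLAS_URI/syrf-prod" --quiet --eval 'db.pmStudy.getIndexes().length')

  check_count "pmStudy docs"        "$studies"       3000000
  check_count "pmInvestigator docs" "$investigators" 4500
  check_count "pmStudy indexes"     "$indexes"       31
fi
```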

🟢 Phase 4: Cutover & Cleanup

  1. Update Application Config:
    • Change MONGODB_URI (or equivalent) in your .env or secrets manager to point to syrf-prod.
  2. Start Application:
    • Boot up your services.
    • Tail logs to ensure successful connection and no Unauthorized or NamespaceNotFound errors.
  3. Scale Down:
    • Once the app is stable (e.g., after 1 hour of monitoring), go back to Atlas.
    • Scale Cluster0 back down to M20.
    • Note: Using M40 for <1 hour will cost pennies.
  4. Drop Old DB (Optional but Recommended later):
    • Keep syrftest for 24-48 hours as a backup.
    • Delete it when confident, either via the Atlas UI or from mongosh: use syrftest followed by db.dropDatabase()

⚠️ Rollback Plan

If the migration fails or takes too long:

  1. Abort Command: Ctrl+C the migration script on the VM.
  2. Revert App Config: Ensure app config still points to syrftest.
  3. Restart App: Bring services back online pointing to the old DB.
  4. Cleanup: Drop the partially created syrf-prod database to avoid confusion.
  5. Scale Down: Return cluster to M20.
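
The rollback's cleanup and scale-down steps can be run from the same Jumpbox session. A sketch (assumes the Step 3 variables are still exported; the Atlas CLI command is optional, the UI works equally well):

```shell
# Drop the partially restored target so nothing points at it by mistake.
mongosh "$ATLAS_URI" --quiet --eval 'db.getSiblingDB("syrf-prod").dropDatabase()'

# Return the cluster to its normal tier.
atlas clusters update Cluster0 --tier M20
```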