Runbook: Renaming syrftest to syrf-prod
Status: DEFERRED - This runbook is preserved for future use. The current decision is to keep syrftest as the production database name to avoid migration complexity. Execute this runbook if and when the team decides to rename the production database.
Objective: Rename the production logical database from syrftest to syrf-prod with minimal downtime.
Primary Constraint: The pmStudy collection has 8GB of indexes, but the current M20 tier has only 4GB of RAM. Rebuilding these indexes without scaling up will cause severe latency.
Estimated Maintenance Window: 30–45 minutes (including buffer).
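The index-versus-RAM figures above can be re-checked before the window is scheduled. The following is a minimal sketch, not part of the original procedure: index_bytes and fits_in_ram are hypothetical helper names, and it assumes mongosh is installed and that the connection string passed in has no database path.

```shell
# Sketch only: index_bytes and fits_in_ram are hypothetical helper names.

# Print the total index size of a collection in bytes.
# Args: connection string (no database path), database, collection.
index_bytes() {
  # String(...) prints a plain integer whether mongosh returns a number or a Long.
  mongosh "$1/$2" --quiet --eval "print(String(db.getCollection('$3').totalIndexSize()))"
}

# Succeed if a byte count fits in the given amount of RAM (whole GB).
# The comparison is done in whole MiB. Args: bytes, ram_gb.
fits_in_ram() {
  [ $(( $1 / 1048576 )) -le $(( $2 * 1024 )) ]
}

# Example (not run automatically):
#   bytes=$(index_bytes "$ATLAS_URI" syrftest pmStudy)
#   fits_in_ram "$bytes" 4  || echo "indexes exceed M20 RAM (4GB)"
#   fits_in_ram "$bytes" 16 && echo "indexes fit in M40 RAM (16GB)"
```

If the reported size has grown well beyond 8GB, the M40 target in Phase 2 may need revisiting.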
✅ Phase 1: Pre-Flight Checklist
Perform these steps 24 hours to 1 hour before the maintenance window.
- Provision "Jumpbox" VM:
  - Create a small Ubuntu VM (e.g., AWS t3.medium) in AWS Ireland (eu-west-1).
  - Why: Keeps traffic inside the AWS internal network for maximum speed.
- Install Tools on VM:
  - Install mongosh and the MongoDB Database Tools (mongodump, mongorestore), which the later steps rely on.
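- Verify Access:
  - Allow the VM's IP address in the MongoDB Atlas Network Access IP access list. Use the private IP only if VPC peering or a private endpoint is set up; otherwise Atlas sees the VM's public IP.
  - Test connection from VM:
    mongosh "mongodb+srv://<CLUSTER_HOST>" --username admin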
- Prepare Application:
  - Prepare the new environment variable string: .../syrf-prod?retryWrites=true...
  - Ensure you have a way to stop/start the app services quickly.
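One way to prepare the new connection string without retyping it is to derive it from the current one. This is a hedged sketch: rename_db_in_uri is a hypothetical helper, the host and credentials in the example are placeholders, and it assumes the database name appears as the URI path either at the end of the string or directly before the "?" options.

```shell
# Sketch only: rename_db_in_uri is a hypothetical helper name.

# Replace the database path segment of a MongoDB connection string.
# Args: uri, old_db, new_db. The database names are used as sed patterns,
# so this assumes they contain no regex metacharacters.
rename_db_in_uri() {
  printf '%s\n' "$1" | sed -e "s#/$2?#/$3?#" -e "s#/$2\$#/$3#"
}

# Example (placeholder values, not run automatically):
#   rename_db_in_uri "mongodb+srv://user:pw@host.example.net/syrftest?retryWrites=true" syrftest syrf-prod
```

Deriving the string this way keeps every other option in the URI unchanged, which reduces the chance of a typo during cutover.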
🚀 Phase 2: Execution (Maintenance Window)
Step 1: Scale Up Cluster (Critical)
Time required: ~5-10 mins (Rolling update)
- Log in to MongoDB Atlas.
- Navigate to Database > Cluster0 > Configuration.
- Change Tier from M20 to M40 (16GB RAM).
- Reasoning: You have 8GB of indexes. M20 (4GB RAM) will crash/thrash during index rebuilds. M40 ensures indexes fit in RAM for a fast rebuild.
- Wait until the rolling update has finished and the cluster status is fully Green/Active before continuing.
Step 2: Stop Traffic
- Stop your application containers/services.
  - Ensure no writes are hitting syrftest to prevent data loss.
Step 3: Run the Migration Pipeline
Run this from the Jumpbox VM, not your local machine.
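# 1. Export Connection Variables
# Replace with your actual credentials (URL-encode any special characters in the password)
export ATLAS_URI="mongodb+srv://admin:YOUR_PASSWORD_HERE@cluster0.abcde.mongodb.net"
export OLD_DB="syrftest"
export NEW_DB="syrf-prod"
# Make the pipeline fail if mongodump fails, not only if mongorestore does
set -o pipefail
# 2. Run the Streaming Migration
# This pipes the dump directly into the restore without saving to disk.
# --nsFrom/--nsTo: rename every namespace from $OLD_DB to $NEW_DB during the restore
# --numParallelCollections=4: restores up to 4 collections in parallel (the default value)
# --stopOnError: halts immediately if a critical issue occurs
# The restore URI deliberately names no database; --nsFrom/--nsTo do the renaming.
echo "Starting migration from $OLD_DB to $NEW_DB..."
mongodump --uri="$ATLAS_URI/$OLD_DB" --archive | \
mongorestore --uri="$ATLAS_URI" --archive \
--nsFrom="$OLD_DB.*" \
--nsTo="$NEW_DB.*" \
--numParallelCollections=4 \
--stopOnError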
- What to expect:
  - You will see a flurry of activity as documents are copied.
  - The Pause: The terminal will pause at the end. Do not panic. This is the M40 cluster rebuilding the 31 indexes for pmStudy. It may take 5–15 minutes.
🔍 Phase 3: Verification
Run these commands in mongosh or check via the Atlas UI Data Explorer.
1. Check Document Counts:
use syrf-prod
// Should match ~3.1M
db.pmStudy.countDocuments()
// Should match ~4.6k
db.pmInvestigator.countDocuments()
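Rather than comparing against approximate figures, the counts in both databases can be listed and diffed. The sketch below is an illustration under assumptions: list_counts and compare_counts are hypothetical helpers, the /tmp file names are placeholders, and ATLAS_URI, OLD_DB and NEW_DB are assumed to be set as in Step 3.

```shell
# Sketch only: list_counts and compare_counts are hypothetical helper names.

# Print "<collection> <count>" for every regular collection in a database.
# Views and system.* collections are skipped. Args: connection string, database.
list_counts() {
  mongosh "$1/$2" --quiet --eval '
    db.getCollectionInfos({ type: "collection" })
      .map(c => c.name)
      .filter(n => !n.startsWith("system."))
      .sort()
      .forEach(n => print(n + " " + db.getCollection(n).countDocuments({})));
  '
}

# Diff two count listings; print the differing lines and fail on any mismatch.
# Args: old_listing_file, new_listing_file.
compare_counts() {
  if diff "$1" "$2"; then
    echo "counts match"
  else
    echo "counts differ" >&2
    return 1
  fi
}

# Example (not run automatically):
#   list_counts "$ATLAS_URI" "$OLD_DB" > /tmp/old_counts.txt
#   list_counts "$ATLAS_URI" "$NEW_DB" > /tmp/new_counts.txt
#   compare_counts /tmp/old_counts.txt /tmp/new_counts.txt
```

This only gives a meaningful result while application traffic is still stopped, since any write to syrftest after the dump would show up as a mismatch.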
2. Verify Views:
Check that the view pmInvestigatorEmail exists and returns data (views are sometimes skipped if dependencies aren't found, though mongorestore usually handles them at the end).
3. Verify Indexes:
Ensure pmStudy has 31 indexes.
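The index check can also be scripted. This is a sketch under assumptions: list_indexes and expect_index_count are hypothetical helper names, and the expected figure of 31 is taken from this runbook rather than read from the cluster.

```shell
# Sketch only: list_indexes and expect_index_count are hypothetical helper names.

# Print the index names of a collection, one per line, sorted.
# Args: connection string (no database path), database, collection.
list_indexes() {
  mongosh "$1/$2" --quiet --eval "
    db.getCollection('$3').getIndexes().map(i => i.name).sort().forEach(n => print(n));
  "
}

# Read a listing on stdin and succeed only if it has exactly N lines.
# Args: expected line count.
expect_index_count() {
  actual=$(wc -l | tr -d ' ')
  [ "$actual" -eq "$1" ]
}

# Example (not run automatically):
#   list_indexes "$ATLAS_URI" syrf-prod pmStudy | expect_index_count 31 \
#     || echo "pmStudy index count is not 31"
```

Running list_indexes against both syrftest and syrf-prod and diffing the two outputs would additionally confirm that the index names match, not just their number.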
🟢 Phase 4: Cutover & Cleanup
- Update Application Config:
  - Change MONGODB_URI (or equivalent) in your .env or secrets manager to point to syrf-prod.
- Start Application:
  - Boot up your services.
  - Tail logs to ensure successful connection and no Unauthorized or NamespaceNotFound errors.
- Scale Down:
- Once the app is stable (e.g., after 1 hour of monitoring), go back to Atlas.
- Scale Cluster0 back down to M20.
- Note: Using M40 for <1 hour will cost pennies.
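- Drop Old DB (Optional but Recommended later):
  - Keep syrftest for 24-48 hours as a backup.
  - When confident, delete it via the Atlas UI, or from mongosh. Switch to the old database first, because db.dropDatabase() drops whichever database is currently selected:
    use syrftest
    db.dropDatabase()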
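The log check in the Start Application step can be automated with a small filter. This is a sketch: scan_for_db_errors is a hypothetical helper, and the log path in the example is a placeholder, since this runbook does not say where the application logs live.

```shell
# Sketch only: scan_for_db_errors is a hypothetical helper name.

# Read log lines on stdin, print any that mention the two error names,
# and return non-zero if at least one such line was found.
scan_for_db_errors() {
  if grep -E 'Unauthorized|NamespaceNotFound'; then
    return 1
  fi
  return 0
}

# Example (placeholder path, not run automatically):
#   tail -n 500 /path/to/app.log | scan_for_db_errors && echo "no database errors in recent logs"
```

A clean result only shows that these two error names are absent; it does not replace watching the logs during the monitoring period before scaling down.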
⚠️ Rollback Plan
Follow these steps if the migration fails or takes too long.
- Abort Command: Ctrl+C the migration script on the VM.
- Revert App Config: Ensure app config still points to syrftest.
- Restart App: Bring services back online pointing to the old DB.
- Cleanup: Drop the partially created syrf-prod database to avoid confusion.
- Scale Down: Return cluster to M20.
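For the Cleanup step, a guarded helper may reduce the risk of dropping the wrong database under pressure. This is a sketch under assumptions: drop_partial_db is a hypothetical name, and the guard simply hard-codes syrftest as the database that must never be dropped during a rollback.

```shell
# Sketch only: drop_partial_db is a hypothetical helper name.

# Drop a partially restored database, refusing to touch the live one.
# Args: connection string (no database path), database to drop.
drop_partial_db() {
  if [ "$2" = "syrftest" ]; then
    echo "refusing to drop the live database syrftest" >&2
    return 1
  fi
  mongosh "$1" --quiet --eval "db.getSiblingDB('$2').dropDatabase()"
}

# Example (not run automatically):
#   drop_partial_db "$ATLAS_URI" syrf-prod
```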