Feature: MongoDB Testing and Development Isolation Strategy
Documentation
| Document | Description |
|---|---|
| README.md (this file) | Feature brief and high-level strategy |
| Technical Plan | Implementation details and code examples |
| Atlas Manual Setup | Production/staging user creation guide |
| Atlas Operator Setup | PR preview automation with K8s Operator |
| Security Model | Defense-in-depth architecture |
| Future DB Rename Runbook | Deferred runbook for production database rename |
Overview
Implement a robust MongoDB testing and development strategy that provides isolated databases for development, testing, and preview environments while maintaining a reliable promotion path to production.
Implementation Status
Environment isolation is implemented (as of 2026-03-02). Staging uses syrf_staging and PR previews use syrf_pr_{n}, both on a separate preview Atlas cluster. See MongoDB Reference for the current cluster topology.
Remaining work: TestContainers integration testing, automated data snapshots for seeding preview DBs.
Problem Statement
Previous State (resolved):
- All environments (staging, production, PR previews) shared the same MongoDB database (syrftest)
- No integration test infrastructure exists for MongoDB (only in-memory mocks)
- Schema/domain model changes cannot be safely tested in isolation
- PR preview environments can interfere with each other's data
- Risk of test data corruption affecting production data
⚠️ Counterintuitive Naming: The production database is named syrftest (not syrfdev). This historical naming is confusing, but changing it would require significant migration effort. See MongoDB Reference for details.
Impact:
- Features requiring schema changes are risky to develop
- Testing is limited to unit tests with mocks
- Preview environments are unreliable for data-dependent features
- No safe way to test data migrations before deployment
Goals
- Environment Isolation: Separate databases for staging, production, and PR previews
- Integration Testing: TestContainers-based testing for CI pipelines
- Schema Management: Safe process for domain model changes with migration support
- Developer Experience: Easy local development with isolated databases
- PR Preview Isolation: Each PR preview gets its own database (or database prefix)
Non-Goals
- Full MongoDB Atlas multi-tenancy (cost prohibitive)
- Real-time data replication between environments
- Automated data anonymization (future enhancement)
Solution Architecture
Database Isolation Strategy
┌─────────────────────────────────────────────────────────────────────┐
│                        MongoDB Atlas Cluster                        │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────────┐  │
│  │   Production    │  │     Staging     │  │  Development Pool   │  │
│  │    Database     │  │    Database     │  │                     │  │
│  │                 │  │                 │  │  ┌───────────────┐  │  │
│  │    syrftest     │  │  syrf_staging   │  │  │  syrf_pr_123  │  │  │
│  │  (keep as-is)   │  │                 │  │  ├───────────────┤  │  │
│  └─────────────────┘  └─────────────────┘  │  │  syrf_pr_456  │  │  │
│                                            │  ├───────────────┤  │  │
│                                            │  │ syrf_local_*  │  │  │
│                                            │  └───────────────┘  │  │
│                                            └─────────────────────┘  │
└─────────────────────────────────────────────────────────────────────┘
Database Naming Convention
| Environment | Database Name | Purpose |
|---|---|---|
| Production | syrftest | Live production data (keep existing name) |
| Staging | syrf_staging | Pre-production testing (new isolated database) |
| PR Preview | syrf_pr_{number} | Ephemeral PR-specific database |
| Local Dev | syrf_local_{username} | Developer workstation |
| CI Integration | syrf_ci_{run_id} | Ephemeral CI test database |
Note: Production keeps the existing syrftest database name to avoid migration complexity. Only non-production environments get new isolated databases.
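The naming convention above can be expressed as a small lookup. The following is a minimal sketch in Python (chosen only for brevity; the project itself is .NET). The helper name database_name and the suffix sanitising rule are assumptions for illustration, not part of the existing codebase.

```python
import re

# Fixed names taken from the naming-convention table.
FIXED_NAMES = {"production": "syrftest", "staging": "syrf_staging"}

# Prefixes for environments that take a suffix (PR number, username, run id).
PREFIXES = {"pr": "syrf_pr_", "local": "syrf_local_", "ci": "syrf_ci_"}


def database_name(environment, suffix=None):
    """Return the database name for an environment (hypothetical helper)."""
    if environment in FIXED_NAMES:
        return FIXED_NAMES[environment]
    if environment not in PREFIXES:
        raise ValueError(f"unknown environment: {environment!r}")
    if suffix is None or str(suffix) == "":
        raise ValueError(f"{environment!r} requires a suffix")
    # Assumption: replace anything other than letters, digits, and underscores
    # so the result stays a valid, predictable database name.
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", str(suffix))
    return PREFIXES[environment] + cleaned


print(database_name("pr", 123))
print(database_name("local", "jane.doe"))
```

Keeping the two fixed names in a separate table means a typo in an environment name raises an error instead of silently producing a new database.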
Component Overview
┌──────────────────────────────────────────────────────────────────┐
│                         Development Flow                         │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│  1. LOCAL DEVELOPMENT                                            │
│     ┌─────────────────┐         ┌─────────────────┐              │
│     │ TestContainers  │         │  MongoDB Atlas  │              │
│     │   (Ephemeral)   │   OR    │ (syrf_local_*)  │              │
│     └─────────────────┘         └─────────────────┘              │
│                                                                  │
│  2. CI PIPELINE (Integration Tests)                              │
│     ┌─────────────────┐                                          │
│     │ TestContainers  │  → Ephemeral per test run                │
│     │   MongoDB 7.0   │                                          │
│     └─────────────────┘                                          │
│                                                                  │
│  3. PR PREVIEW ENVIRONMENT                                       │
│     ┌─────────────────┐         ┌─────────────────┐              │
│     │ ArgoCD deploys  │    →    │  MongoDB Atlas  │              │
│     │  to pr-{num}    │         │  syrf_pr_{num}  │              │
│     └─────────────────┘         └─────────────────┘              │
│                                                                  │
│  4. STAGING → PRODUCTION                                         │
│     ┌─────────────────┐         ┌─────────────────┐              │
│     │  syrf_staging   │    →    │    syrftest     │              │
│     │  (Schema test)  │         │  (Production)   │              │
│     └─────────────────┘         └─────────────────┘              │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
Implementation Plan
Phase 1: Integration Test Infrastructure (Foundation)
Goal: Enable reliable integration tests with real MongoDB
1.1 TestContainers Setup
Add the Testcontainers.MongoDb NuGet package to test projects for ephemeral database testing:
// Example test fixture
using MongoDB.Driver;
using Testcontainers.MongoDb;
using Xunit;

public class MongoDbFixture : IAsyncLifetime
{
    private readonly MongoDbContainer _container;

    // Assigned in InitializeAsync, which xUnit calls before any test runs.
    public IMongoDatabase Database { get; private set; } = null!;

    public MongoDbFixture()
    {
        _container = new MongoDbBuilder()
            .WithImage("mongo:7.0")
            .Build();
    }

    public async Task InitializeAsync()
    {
        await _container.StartAsync();
        var client = new MongoClient(_container.GetConnectionString());
        Database = client.GetDatabase("test");
    }

    public async Task DisposeAsync() => await _container.DisposeAsync();
}
1.2 Test Data Builders
Create fluent builders for test data:
public class ProjectBuilder
{
    private readonly Project _project = new();

    public ProjectBuilder WithName(string name) { _project.Name = name; return this; }
    public ProjectBuilder WithOwner(Guid investigatorId) { /* ... */ return this; }
    public ProjectBuilder WithStage(Stage stage) { /* ... */ return this; }

    public Project Build() => _project;

    // Builds the project and persists it through the given repository.
    public async Task<Project> BuildAndSave(IProjectRepository repo)
    {
        await repo.SaveAsync(_project);
        return _project;
    }
}
1.3 Repository Integration Tests
Create integration tests for each repository:
[Collection("MongoDB")]
public class ProjectRepositoryTests : IClassFixture<MongoDbFixture>
{
    private readonly IProjectRepository _repository;

    // Assumption: ProjectRepository can be constructed from an IMongoDatabase;
    // adjust this to the repository's real constructor.
    public ProjectRepositoryTests(MongoDbFixture fixture)
    {
        _repository = new ProjectRepository(fixture.Database);
    }

    [Fact]
    public async Task Save_NewProject_PersistsCorrectly()
    {
        var project = new ProjectBuilder()
            .WithName("Test Project")
            .Build();

        await _repository.SaveAsync(project);
        var retrieved = await _repository.GetAsync(project.Id);

        retrieved.Name.Should().Be("Test Project");
    }
}
Phase 2: Environment Database Isolation
Goal: Separate databases for staging and production
2.1 Configuration Changes
Update Helm values to support environment-specific database names:
# values.yaml (base)
mongoDb:
  authSecretName: mongo-db
  clusterAddress: cluster0.siwfo.mongodb.net/admin?retryWrites=true&w=majority
  databaseName: "" # Set per environment
  authDb: admin
  ssl: true

# values-staging.yaml
mongoDb:
  databaseName: syrf_staging

# values-production.yaml
mongoDb:
  databaseName: syrftest # Keep existing production database name
2.2 Database User Separation
Create environment-specific MongoDB users with appropriate permissions:
// Production user (readWrite on syrftest only - existing production database)
db.createUser({
  user: "syrf_prod_app",
  pwd: "<secure-password>",
  roles: [{ role: "readWrite", db: "syrftest" }]
});

// Staging user (readWrite on syrf_staging only)
db.createUser({
  user: "syrf_staging_app",
  pwd: "<secure-password>",
  roles: [{ role: "readWrite", db: "syrf_staging" }]
});
2.3 Kubernetes Secrets
Update secret management for per-environment credentials:
# cluster-gitops/environments/staging/secrets.yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: mongo-db
spec:
  secretStoreRef:
    name: gcp-secret-manager
    kind: ClusterSecretStore
  target:
    name: mongo-db
  data:
    - secretKey: username
      remoteRef:
        key: syrf-staging-mongo-username
    - secretKey: password
      remoteRef:
        key: syrf-staging-mongo-password
Phase 3: PR Preview Database Isolation
Goal: Each PR preview gets an isolated database
3.1 Dynamic Database Creation
Option A: MongoDB Atlas API (Recommended)
Give each PR its own database. MongoDB creates a database implicitly on first write, so the workflow only needs to assign a unique database name while the PR is open and drop that database when the PR closes:
# .github/workflows/pr-preview.yml additions
- name: Create PR Database
  if: github.event.action != 'closed'
  run: |
    # MongoDB Atlas doesn't require explicit database creation
    # Just configure the app to use a unique database name
    echo "Using database: syrf_pr_${{ github.event.number }}"

- name: Cleanup PR Database
  if: github.event.action == 'closed'
  run: |
    # Drop the PR-specific database
    mongosh "$MONGO_URI" --eval "db.getSiblingDB('syrf_pr_${{ github.event.number }}').dropDatabase()"
Option B: Collection Prefix (Simpler)
Use collection prefixes within a shared database:
// Modified MongoContext to support prefixes
// (existing members such as Database and GetCollectionName<T> are omitted here)
public class MongoContext
{
    private readonly string _collectionPrefix;

    public IMongoCollection<T> GetCollection<T>()
    {
        var baseName = GetCollectionName<T>();
        // With no prefix configured, fall back to the plain collection name.
        var prefixedName = string.IsNullOrEmpty(_collectionPrefix)
            ? baseName
            : $"{_collectionPrefix}_{baseName}";
        return Database.GetCollection<T>(prefixedName);
    }
}
3.2 PR Preview Workflow Updates
# pr-preview.yml - Add database configuration
- name: Write PR Database Config
  run: |
    cat >> "$VERSIONS_DIR/database.yaml" << EOF
    mongoDb:
      databaseName: syrf_pr_${{ github.event.number }}
    EOF
3.3 Database Seeding for Previews
Create a seed data job that runs after preview deployment:
# Helm hook for database seeding
apiVersion: batch/v1
kind: Job
metadata:
  name: seed-database
  annotations:
    helm.sh/hook: post-install
    helm.sh/hook-weight: "0"
spec:
  template:
    spec:
      restartPolicy: Never # Job pods must use Never or OnFailure
      containers:
        - name: seeder
          image: ghcr.io/camaradesuk/syrf-db-seeder:latest
          env:
            - name: MONGO_URI
              valueFrom:
                secretKeyRef:
                  name: mongo-db
                  key: connection-string
            - name: DATABASE_NAME
              value: "syrf_pr_{{ .Values.prNumber }}"
Phase 4: Schema Migration Strategy
Goal: Safe process for domain model changes
4.1 Schema Version Tracking
The existing SchemaVersion property on aggregate roots supports migrations:
public abstract class AggregateRoot<TId>
{
    public int SchemaVersion => Audit?.SchemaVersion ?? DefaultSchemaVersion;
    protected virtual int DefaultSchemaVersion => 1;
}
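4.2 Migration Runner
Create a migration framework for schema changes:
public interface IMigration
{
    int FromVersion { get; }
    int ToVersion { get; }
    string CollectionName { get; }
    Task<long> MigrateAsync(IMongoDatabase database, CancellationToken ct);
}

public class MigrationRunner
{
    private readonly IMongoDatabase _database;
    private readonly ILogger<MigrationRunner> _logger;

    public MigrationRunner(IMongoDatabase database, ILogger<MigrationRunner> logger)
    {
        _database = database;
        _logger = logger;
    }

    // Runs migrations in ascending FromVersion order and logs how many documents each one changed.
    public async Task RunMigrationsAsync(
        IEnumerable<IMigration> migrations, CancellationToken ct = default)
    {
        foreach (var migration in migrations.OrderBy(m => m.FromVersion))
        {
            var affected = await migration.MigrateAsync(_database, ct);
            _logger.LogInformation(
                "Migration {From}→{To} on {Collection}: {Count} documents",
                migration.FromVersion, migration.ToVersion,
                migration.CollectionName, affected);
        }
    }
}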
4.3 Example Migration
public class ProjectMembershipsMigration : IMigration
{
    public int FromVersion => 0;
    public int ToVersion => 1;
    public string CollectionName => "pmProject";

    public async Task<long> MigrateAsync(IMongoDatabase db, CancellationToken ct)
    {
        var collection = db.GetCollection<BsonDocument>(CollectionName);

        // Find documents with old schema
        var filter = Builders<BsonDocument>.Filter.Or(
            Builders<BsonDocument>.Filter.Exists("Audit.SchemaVersion", false),
            Builders<BsonDocument>.Filter.Eq("Audit.SchemaVersion", 0)
        );

        // Update to new schema
        var update = Builders<BsonDocument>.Update
            .Rename("Registrations", "Memberships")
            .Set("Audit.SchemaVersion", 1);

        var result = await collection.UpdateManyAsync(filter, update, cancellationToken: ct);
        return result.ModifiedCount;
    }
}
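To make the effect of this migration concrete, here is a minimal in-memory sketch in Python that applies the same rule to plain dictionaries: documents with a missing or zero Audit.SchemaVersion get Registrations renamed to Memberships and the version set to 1. The function name migrate_project is hypothetical, and the sketch does not talk to MongoDB; it only illustrates why re-running the migration is harmless.

```python
import copy


def migrate_project(doc):
    """Apply the Registrations -> Memberships rename to one document."""
    audit = doc.get("Audit") or {}
    version = audit.get("SchemaVersion") or 0
    if version >= 1:
        # Already on the new schema: the filter would not match this document.
        return doc

    migrated = copy.deepcopy(doc)  # leave the caller's document untouched
    if "Registrations" in migrated:
        migrated["Memberships"] = migrated.pop("Registrations")

    new_audit = dict(migrated.get("Audit") or {})
    new_audit["SchemaVersion"] = 1
    migrated["Audit"] = new_audit
    return migrated


old = {"_id": 1, "Registrations": ["alice"]}
new = migrate_project(old)
print(new)
```

Because the version check comes first, a second pass over already-migrated documents changes nothing, which mirrors the filter in the C# example.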
4.4 Migration as Init Container
Run migrations before the application starts:
# Helm template
initContainers:
  - name: migrations
    image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
    command: ["dotnet", "SyRF.Migrations.dll"]
    env:
      - name: MONGO_CONNECTION
        valueFrom:
          secretKeyRef:
            name: mongo-db
            key: connection-string
Phase 5: CI Pipeline Integration
Goal: Automated testing in CI with isolated databases
5.1 GitHub Actions Workflow
# .github/workflows/ci-cd.yml additions
test-integration:
  runs-on: ubuntu-latest
  services:
    mongodb:
      image: mongo:7.0
      ports:
        - 27017:27017
  steps:
    - uses: actions/checkout@v4
    - name: Setup .NET
      uses: actions/setup-dotnet@v4
      with:
        dotnet-version: '8.0.x'
    - name: Run Integration Tests
      run: dotnet test --filter Category=Integration
      env:
        MONGO_CONNECTION: mongodb://localhost:27017
        MONGO_DATABASE: syrf_ci_${{ github.run_id }}
5.2 Test Categories
[Trait("Category", "Unit")]
public class ProjectTests { }
[Trait("Category", "Integration")]
public class ProjectRepositoryIntegrationTests { }
[Trait("Category", "E2E")]
public class ProjectWorkflowE2ETests { }
Data Flow Diagram
┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Feature   │    │     PR      │    │   Staging   │    │ Production  │
│   Branch    │───▶│   Preview   │───▶│   Deploy    │───▶│   Deploy    │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
       │                  │                  │                  │
       ▼                  ▼                  ▼                  ▼
┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│  Local Dev  │    │ syrf_pr_123 │    │syrf_staging │    │  syrftest   │
│ TestContain │    │ (Ephemeral) │    │ (Persistent)│    │ (Protected) │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
       │                  │                  │                  │
       │                  │                  │                  │
       └──────────────────┴────────ISOLATED──┴──────────────────┘
Rollout Strategy
Phase 1 (Week 1-2): Foundation ✅
- Add the Testcontainers.MongoDb package to test projects
- Create base test fixtures and builders
- Write initial repository integration tests
- Document local development setup
Phase 2 (Week 3-4): Environment Isolation ✅
- Create separate MongoDB users for staging/production - See Atlas Manual Setup
- Update Helm values for environment-specific database names (cluster-gitops)
- Update External Secrets for environment-specific credentials
- Deploy to staging with new database
- Data strategy decided: Minimal seed data (no migration from production)
Phase 3 (Week 5-6): PR Preview Isolation ✅
- Install Atlas Operator via GitOps (cluster-gitops/plugins/helm/mongodb-atlas-operator/)
- Deploy Kyverno policy engine (cluster-gitops/plugins/helm/kyverno/)
- Deploy policy to block production/staging access (atlas-pr-user-policy.yaml)
- Update pr-preview workflow to generate AtlasDatabaseUser CR
- Configure ApplicationSet for per-PR database names (mongoDb.databaseName)
- Configure ApplicationSet for Atlas operator secret naming pattern (mongoDb.authSecretName)
- MANUAL: Create Atlas API Key with Project Database Access Admin role
- MANUAL: Store API key in GCP Secret Manager as atlas-operator-api-key
- MANUAL: Add ATLAS_PROJECT_ID to GitHub repository secrets
- Test with actual PR (PR #2234 - services running, MongoDB connectivity verified)
- Create seed data mechanism (DatabaseSeeder + reset-on-rebuild feature)
Phase 4 (Week 7-8): Migration Framework
- Implement MigrationRunner
- Add migration init container to Helm charts
- Create example migration
- Document migration authoring process
Phase 5 (Ongoing): CI Integration
- Add integration test job to CI workflow
- Create test coverage reporting
- Add E2E test suite
Success Metrics
| Metric | Current | Target |
|---|---|---|
| Integration test coverage | 0% | 60% |
| Environment isolation | None | Full |
| PR preview data conflicts | Common | Zero |
| Schema change deployment confidence | Low | High |
| Time to test data model changes | Hours | Minutes |
Risks and Mitigations
| Risk | Impact | Mitigation |
|---|---|---|
| MongoDB Atlas costs increase | Medium | Use collection prefixes instead of separate databases for previews |
| Migration failures in production | High | Test all migrations in staging first; maintain backward compatibility |
| TestContainers slow in CI | Low | Use parallel test execution; optimize container startup |
| Orphaned preview databases | Low | Scheduled cleanup job; PR close webhook |
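The scheduled cleanup mitigation for orphaned preview databases could be driven by a comparison like the one below. This is a hedged sketch in Python, not the project's cleanup job: the function name is hypothetical, and the two inputs (the database names on the cluster and the numbers of open PRs) are assumed to be fetched elsewhere.

```python
import re

# Matches only PR preview databases such as syrf_pr_123.
PR_DB_PATTERN = re.compile(r"^syrf_pr_(\d+)$")


def orphaned_preview_databases(database_names, open_pr_numbers):
    """Return PR preview databases whose PR is no longer open."""
    open_prs = set(open_pr_numbers)
    orphans = []
    for name in database_names:
        match = PR_DB_PATTERN.match(name)
        # Anything that is not a syrf_pr_<number> database is never a candidate,
        # so production, staging, and local databases are left alone.
        if match and int(match.group(1)) not in open_prs:
            orphans.append(name)
    return sorted(orphans)


names = ["syrftest", "syrf_staging", "syrf_pr_123", "syrf_pr_456", "syrf_local_jane"]
print(orphaned_preview_databases(names, open_pr_numbers=[456]))
```

Anchoring the pattern at both ends is deliberate: a name that merely contains syrf_pr_ is not treated as a preview database.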
Dependencies
- MongoDB Atlas M10+ cluster (supports multiple databases)
- GitHub Actions runners with Docker support
- ArgoCD ApplicationSet for preview environments
- External Secrets Operator for credential management
Resolved Questions
- Database cost: ✅ Separate databases on same cluster - simpler code, Atlas handles it fine
- Data seeding: ✅ Minimal seed - same for staging and PR previews initially (can expand later if needed)
- Migration timing: ✅ Init containers - safer, app won't start if migration fails (DEFERRED to follow-up PR)
- Backward compatibility: ✅ Deferred - only relevant when migration framework is implemented
Database Reset on Rebuild
Overview
By default, when a MongoDB-using service (API, Project Management, or Quartz) is rebuilt in a PR preview environment, the preview database is automatically reset (all collections dropped and re-seeded). This ensures each rebuild starts with a clean, consistent database state.
Behavior Matrix
| Condition | Result |
|---|---|
| PR push rebuilds api/pm/quartz, no persist-db label | Database reset, then re-seeded |
| PR push rebuilds api/pm/quartz, HAS persist-db label | Database preserved |
| PR push only rebuilds web/docs/user-guide | Database preserved (no MongoDB services) |
| ArgoCD manual sync (no code change) | Database preserved (same SHA) |
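The matrix above reduces to a two-input decision: was a MongoDB-using service rebuilt, and is the persist-db label present? A minimal Python sketch of that rule follows; the function name and the way rebuilt services are passed in are assumptions for illustration, not the workflow's actual implementation.

```python
# Services that use MongoDB, per the matrix above.
MONGO_SERVICES = {"api", "pm", "quartz"}


def should_reset_database(rebuilt_services, has_persist_db_label):
    """Return True when the preview database should be dropped and re-seeded."""
    if has_persist_db_label:
        return False  # the label always wins
    # Reset only if at least one MongoDB-using service was rebuilt.
    return bool(MONGO_SERVICES & set(rebuilt_services))


print(should_reset_database({"api", "web"}, has_persist_db_label=False))
print(should_reset_database({"web", "docs"}, has_persist_db_label=False))
```

A manual ArgoCD sync rebuilds nothing, so it corresponds to an empty set of rebuilt services and never triggers a reset.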
Preserving Data with persist-db Label
Add the persist-db label to your PR to preserve database contents across rebuilds:
- Navigate to your PR on GitHub
- Add the label persist-db in the right sidebar
- The workflow triggers automatically and removes any pending reset job
Label changes take effect immediately - no push required. The workflow listens for labeled and unlabeled events on the persist-db label.
Use cases for preserving data:
- Testing data migration scenarios
- Accumulating test data across multiple commits
- Debugging issues that require specific data state
Label Interaction Matrix
| Action | Result |
|---|---|
| Remove persist-db, no push | No immediate reset (waits for next rebuild) |
| Remove persist-db, then push API changes | Reset (rebuild + no label) |
| Add persist-db, no push | Reset job removed, DB protected immediately |
| Add persist-db, then push API changes | No reset (label present) |
How It Works
- Trigger: Workflow runs on push OR persist-db label changes
- Label Check: Workflow checks for persist-db label on PR
- Service Detection: Checks if any MongoDB service (api, pm, quartz) was rebuilt (action=build)
- Reset Job Generation: If reset needed, generates db-reset-job.yaml in cluster-gitops
- Marker Check: Job checks ConfigMap db-reset-marker for previous reset SHA
- Skip if Already Done: If marker SHA matches current SHA, skip reset
- Collection Drop: Job connects to MongoDB and drops all collections in syrf_pr_{number}
- Update Marker: Job updates ConfigMap with current SHA
- Re-seeding: After services deploy, DatabaseSeeder (IRunAtInit) populates seed data
Implementation Files
| File | Purpose |
|---|---|
| .github/workflows/pr-preview.yml | Label triggers, reset determination, job generation |
| cluster-gitops/argocd/applicationsets/syrf-previews.yaml | Includes db-reset-job.yaml in namespace app |
| pr-{N}/db-reset-job.yaml | PreSync Kubernetes Job with RBAC (auto-generated) |
| ConfigMap db-reset-marker | Tracks last reset SHA (created by job) |
Idempotency
The reset job uses a completion marker (ConfigMap) to track the last reset SHA:
- First reset for SHA: Job runs, drops collections, creates marker with SHA
- Manual ArgoCD sync (same SHA): Job checks marker, sees match, skips reset
- New push (different SHA): Job checks marker, SHA differs, runs reset, updates marker
This ensures:
- Same commit won't re-run reset on ArgoCD manual syncs
- New rebuilds always trigger reset (unless persist-db label present)
- No duplicate resets on retry scenarios
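The marker check described above can be sketched in a few lines of Python. Here a plain dictionary stands in for the db-reset-marker ConfigMap and a callback stands in for the collection drop. The key name "sha" and the function name are assumptions; the real job's ConfigMap layout is not shown in this document.

```python
def run_reset_job(marker, current_sha, drop_collections):
    """Reset the database at most once per commit SHA.

    marker: dict standing in for the db-reset-marker ConfigMap data.
    drop_collections: callable that performs the actual drop.
    Returns True if a reset ran, False if it was skipped.
    """
    if marker.get("sha") == current_sha:
        return False  # same SHA as the last reset: manual sync or retry
    drop_collections()
    # Record the SHA only after the drop succeeded, so a failed job retries.
    marker["sha"] = current_sha
    return True


marker = {}
drops = []
print(run_reset_job(marker, "abc123", lambda: drops.append("drop")))
print(run_reset_job(marker, "abc123", lambda: drops.append("drop")))
```

Writing the marker after the drop, not before, is what makes retries safe: a job that fails midway leaves the old SHA in place and runs again.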
Technical Notes
Atlas Operator Connection Secret Naming
The MongoDB Atlas Operator creates connection secrets with a fixed naming pattern that cannot be customized:
For example, a user syrf_pr_2234_app connecting to cluster cluster0 in project syrfdb creates:
Key Points:
- The connectionSecretRef field does not exist in the AtlasDatabaseUser CRD
- When using externalProjectRef (existing project, not operator-managed), you cannot override the secret name
- The ApplicationSet passes mongoDb.authSecretName with the predictable pattern to services
ApplicationSet Configuration (cluster-gitops/argocd/applicationsets/syrf-previews.yaml):
MongoDB Atlas Operator Namespace Watching Limitation
Important: The MongoDB Atlas Operator does not support pattern-based namespace watching (e.g., pr-*). The operator can only be configured to:
- Watch all namespaces (empty watchNamespaces: [])
- Watch only its installation namespace
Security Enforcement via Kyverno
Since the operator must watch all namespaces, security is enforced by Kyverno policies:
- Policy: atlas-block-production-access (ClusterPolicy)
- Location: cluster-gitops/plugins/helm/kyverno/resources/atlas-pr-user-policy.yaml
- Enforcement: Blocks AtlasDatabaseUser resources in pr-* namespaces from:
  - Using broad-access roles (readWriteAnyDatabase, root, etc.)
  - Accessing production database (syrftest)
  - Accessing staging database (syrf_staging)
  - Accessing admin database
This implements defense-in-depth - even if an AtlasDatabaseUser is created in a PR namespace, Kyverno will block it if it attempts to access protected databases.
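The real enforcement lives in the Kyverno ClusterPolicy, which is YAML. Purely to illustrate the checks listed above, here is a Python sketch of the same rules applied to the role entries of an AtlasDatabaseUser. It assumes each role is a mapping with roleName and databaseName keys, and it lists only the two broad roles named in this document (the policy's full list is not shown here).

```python
# Only the roles named explicitly above; the actual policy blocks more ("etc.").
BROAD_ROLES = {"readWriteAnyDatabase", "root"}
PROTECTED_DATABASES = {"syrftest", "syrf_staging", "admin"}


def policy_violations(namespace, roles):
    """Return human-readable violations for a database user in a namespace."""
    if not namespace.startswith("pr-"):
        return []  # the policy only targets PR preview namespaces
    problems = []
    for role in roles:
        role_name = role.get("roleName")
        database = role.get("databaseName")
        if role_name in BROAD_ROLES:
            problems.append(f"broad-access role not allowed: {role_name}")
        if database in PROTECTED_DATABASES:
            problems.append(f"protected database not allowed: {database}")
    return problems


allowed = [{"roleName": "readWrite", "databaseName": "syrf_pr_2234"}]
blocked = [{"roleName": "readWrite", "databaseName": "syrftest"}]
print(policy_violations("pr-2234", allowed))
print(policy_violations("pr-2234", blocked))
```

A role can trip both checks at once, for example root on admin, which is why the function collects violations instead of stopping at the first one.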
References
- MongoDB Atlas Documentation
- MongoDB Atlas Kubernetes Operator
- TestContainers for .NET
- SyRF Mongo.Common Library
- MongoDB Reference - CSUUID format and collection naming
- PR Preview Environment Guide