MongoDB Database Reference

This document provides detailed reference information about SyRF's MongoDB database architecture.

Database Instances

Atlas Clusters

SyRF uses two MongoDB Atlas clusters:

| Cluster | Environments | MCP Server (for querying) |
|---|---|---|
| Prod cluster | Production | mcp__mongodb-syrf-prod__* |
| Preview cluster | Staging + PR previews | mcp__mongodb-syrf-preview__* |

A local standalone container (mcp__mongodb-syrf-local__*) holds a backup snapshot of the prod cluster taken 2026-02-21. It is useful for safe exploration without touching live data.

Environment Connections (Isolated)

| Environment | Database | Cluster |
|---|---|---|
| Production | syrftest ⚠️ | Prod |
| Staging | syrf_staging | Preview |
| PR Preview {n} | syrf_pr_{n} | Preview |
| Local development | Configurable | Local or local container |

Warning: The production database is named syrftest. Despite the "test" in its name, it holds all live data, so treat it with care at all times.
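The environment-to-database mapping above can be sketched as a small lookup. This is an illustrative sketch only: the function name databaseNameFor and the environment keys ("production", "staging", "pr-preview") are hypothetical and are not taken from the SyRF codebase. Local development is left out because its database name is configurable.

```javascript
// Hypothetical helper: maps an environment to the database name listed in
// the table above. The function name and the environment keys are assumptions.
function databaseNameFor(environment, prNumber) {
  switch (environment) {
    case "production":
      return "syrftest"; // holds live data, despite the name
    case "staging":
      return "syrf_staging";
    case "pr-preview":
      if (!Number.isInteger(prNumber) || prNumber <= 0) {
        throw new Error("pr-preview requires a positive integer PR number");
      }
      return `syrf_pr_${prNumber}`;
    default:
      // Local development is configurable, so no fixed name is assumed here.
      throw new Error(`No fixed database name for environment: ${environment}`);
  }
}
```

For example, databaseNameFor("pr-preview", 42) yields syrf_pr_42, which lives on the Preview cluster alongside syrf_staging.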

Legacy Databases (Prod Cluster)

| Database Name | Purpose |
|---|---|
| syrftest | PRODUCTION database: all live data |
| syrfdev | Old development snapshot, mostly unused |

Collection Architecture

Bounded Context Prefixes

Collections are named using a bounded context prefix derived from the entity's namespace:

// From MongoContext.cs
public static string GetBoundedContextCode(string? namespaceName)
    => namespaceName switch
    {
        null => throw new ArgumentException("AggregateRoot has no namespace"),
        _ when namespaceName.StartsWith("SyRF.ProjectManagement") => "pm",
        _ when namespaceName.StartsWith("SyRF.FileListings") => "as",
        _ when namespaceName.StartsWith("SyRF.LiteratureSearch") => "ls",
        _ => ""
    };

Collection naming formula: {prefix}{EntityClassName}
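When querying from mongosh or a script, it can help to derive a collection name from a namespace and class name. The following is a minimal JavaScript sketch that mirrors the C# switch above. The function names boundedContextCode and collectionName are hypothetical, and the example namespace SyRF.ProjectManagement.Core.Model is an assumption inferred from the project path rather than confirmed from the source.

```javascript
// Prefix rules copied from the C# switch above. The names below are hypothetical.
const PREFIX_RULES = [
  ["SyRF.ProjectManagement", "pm"],
  ["SyRF.FileListings", "as"],
  ["SyRF.LiteratureSearch", "ls"],
];

// Returns the bounded context prefix, or "" when no rule matches.
function boundedContextCode(namespaceName) {
  if (namespaceName == null) {
    throw new Error("AggregateRoot has no namespace");
  }
  const rule = PREFIX_RULES.find(([ns]) => namespaceName.startsWith(ns));
  return rule ? rule[1] : "";
}

// Applies the naming formula {prefix}{EntityClassName}.
function collectionName(namespaceName, entityClassName) {
  return boundedContextCode(namespaceName) + entityClassName;
}
```

Under these assumptions, collectionName("SyRF.ProjectManagement.Core.Model", "Study") gives pmStudy, and an entity from an unlisted namespace keeps its class name unchanged.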

Project Management Collections (pm prefix)

The project management domain contains the core business entities:

| Collection | Entity Class | Description |
|---|---|---|
| pmProject | Project | Systematic review projects with stages, memberships, questions |
| pmStudy | Study | Individual studies with screening, extraction, annotations |
| pmInvestigator | Investigator | User accounts and profile information |
| pmSystematicSearch | SystematicSearch | Literature search definitions linked to projects |
| pmDataExportJob | DataExportJob | Background export job tracking |
| pmStudyCorrection | StudyCorrection | PDF metadata correction requests |
| pmInvestigatorUsage | InvestigatorUsage | User activity and usage statistics |
| pmRiskOfBiasAiJob | RiskOfBiasAiJob | AI-assisted risk of bias analysis jobs |

Source code locations:

  • Domain models: src/libs/project-management/SyRF.ProjectManagement.Core/Model/
  • Repositories: src/libs/project-management/SyRF.ProjectManagement.Mongo.Data/Repositories/

Other Bounded Contexts

| Prefix | Namespace | Example Collections |
|---|---|---|
| as | SyRF.FileListings | File attachment storage |
| ls | SyRF.LiteratureSearch | Search configuration |
| (none) | Other | Uses entity name directly |

GUID Representation

CSUUID (C# Legacy) Format

All document IDs in SyRF use CSUUID (C# Legacy GUID) format, which stores GUIDs as BinData subtype 3.

Configuration (from MongoUtils.cs):

public static void EnsureLegacyGuidSerializer()
{
    try
    {
        BsonSerializer.RegisterSerializer(new GuidSerializer(GuidRepresentation.CSharpLegacy));
    }
    catch (BsonSerializationException e)
    {
        if (BsonSerializer.LookupSerializer<Guid>() is not GuidSerializer
            {
                GuidRepresentation: GuidRepresentation.CSharpLegacy
            })
        {
            throw new InvalidOperationException(
                $"The Guid serializer is not the expected representation...", e);
        }
    }
}

Why This Matters

| Format | BinData Subtype | Byte Order | Use in SyRF |
|---|---|---|---|
| CSUUID (Legacy) | 3 | Little-endian first 3 groups | Yes - all IDs |
| Standard UUID | 4 | Big-endian throughout | No |

The byte order difference means the same GUID string produces different binary representations:

GUID string: 550e8400-e29b-41d4-a716-446655440000

CSUUID binary:  00 84 0e 55  9b e2  d4 41  a7 16 44 66 55 44 00 00
                [reversed]   [rev]  [rev]  [preserved as-is]

UUID binary:    55 0e 84 00  e2 9b  41 d4  a7 16 44 66 55 44 00 00
                [as-is throughout]
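The reordering shown above can be reproduced in a few lines of JavaScript. This is a minimal sketch for Node.js (it relies on the global Buffer). The function name guidToCsuuidBytes is hypothetical and is not part of SyRF or of any MongoDB tool.

```javascript
// Hypothetical helper: converts a GUID string to the 16 bytes stored under
// BinData subtype 3 with the C# legacy layout.
function guidToCsuuidBytes(guid) {
  const hex = guid.replace(/-/g, "");
  if (!/^[0-9a-fA-F]{32}$/.test(hex)) {
    throw new Error(`Not a GUID: ${guid}`);
  }
  const bytes = Buffer.from(hex, "hex");
  // Reverse the first three groups (4, 2 and 2 bytes); keep the last 8 as-is.
  // Buffer.from(buffer) copies, so the in-place reverse does not touch `bytes`.
  return Buffer.concat([
    Buffer.from(bytes.subarray(0, 4)).reverse(),
    Buffer.from(bytes.subarray(4, 6)).reverse(),
    Buffer.from(bytes.subarray(6, 8)).reverse(),
    bytes.subarray(8, 16),
  ]);
}

const csuuid = guidToCsuuidBytes("550e8400-e29b-41d4-a716-446655440000");
console.log(csuuid.toString("hex"));    // 00840e559be2d441a716446655440000
console.log(csuuid.toString("base64")); // AIQOVZvi1EGnFkRmVUQAAA==
```

The base64 form is the payload that a BinData(3, ...) literal expects.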

Querying with CSUUID

When using MCP MongoDB tools, mongosh, or Compass:

// ❌ WRONG - Standard UUID won't match existing documents
db.pmStudy.find({ _id: UUID("550e8400-e29b-41d4-a716-446655440000") })

// ✅ CORRECT - Use CSUUID function
db.pmStudy.find({ _id: CSUUID("550e8400-e29b-41d4-a716-446655440000") })

// ✅ CORRECT - Direct BinData with subtype 3
db.pmStudy.find({ _id: BinData(3, "AIQOVZvi1EGnFkRmVUQAAA==") })
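Going the other way is useful when a tool prints an _id as raw subtype 3 base64 and you need the GUID string the application would show. The sketch below assumes Node.js and the C# legacy byte layout described earlier. The function name csuuidBase64ToGuid is hypothetical.

```javascript
// Hypothetical helper: decodes the base64 payload of a BinData subtype 3
// value (C# legacy layout) back into the canonical GUID string.
function csuuidBase64ToGuid(base64) {
  const bytes = Buffer.from(base64, "base64");
  if (bytes.length !== 16) {
    throw new Error(`Expected 16 bytes, got ${bytes.length}`);
  }
  // Undo the little-endian ordering of the first three groups.
  const hex = Buffer.concat([
    Buffer.from(bytes.subarray(0, 4)).reverse(),
    Buffer.from(bytes.subarray(4, 6)).reverse(),
    Buffer.from(bytes.subarray(6, 8)).reverse(),
    bytes.subarray(8, 16),
  ]).toString("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}

console.log(csuuidBase64ToGuid("AIQOVZvi1EGnFkRmVUQAAA=="));
// 550e8400-e29b-41d4-a716-446655440000
```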

Working with GUIDs in Code

Always let the MongoDB C# driver handle serialization:

// ✅ CORRECT - Let the driver serialize
var studyId = Guid.Parse("550e8400-e29b-41d4-a716-446655440000");
var study = await collection.Find(s => s.Id == studyId).FirstOrDefaultAsync();

// ❌ AVOID - Manual binary conversion bypasses the registered serializer; byte order and subtype are easy to get wrong
var bytes = studyId.ToByteArray();
var bsonBinary = new BsonBinaryData(bytes, BsonBinarySubType.UuidLegacy);