Quartz Service Data Isolation Analysis¶

Executive Summary¶

Current State: Quartz data is fully isolated between environments. Both MassTransit saga tables and native Quartz scheduler tables now use environment-specific SQL schemas.

Implementation Status: ✅ COMPLETED (2025-12-30)

The isolation was implemented by:

Adding dynamic TablePrefix in Program.cs using syrfEnvironment.RuntimeEnvironment
Removing hardcoded prefix from appsettings.json
Creating SQL migration script for new schemas (staging, production, preview)

See Migration Script for database setup.

Previous State (Before Fix)¶

Risk Level: MEDIUM-HIGH - The native Quartz scheduler tables were shared across ALL environments due to a hardcoded table prefix, creating potential for:

Job interference between environments
Debugging complexity
Security concerns with production/staging sharing tables
All PR previews competing for the same job slots

Current Architecture¶

Database Connection¶

All environments connect to the same Azure SQL Server database:

Component	Value
Server	`syrf.database.windows.net`
Database	`syrf-quartz`
GCP Secret	`camarades-quartz-sql-server-credentials`

Both syrf-staging and syrf-production namespaces use the same GCP secret, meaning they connect to the identical database instance.

Schema Isolation (MassTransit Saga Tables)¶

The MassTransit job saga tables ARE properly isolated using environment-specific schemas:

Environment	Schema Name	Tables
Production	`production`	`production.JobSaga`, `production.JobTypeSaga`, `production.JobAttemptSaga`
Staging	`staging`	`staging.JobSaga`, `staging.JobTypeSaga`, `staging.JobAttemptSaga`
PR Previews	`preview`	`preview.JobSaga`, `preview.JobTypeSaga`, `preview.JobAttemptSaga`

How it works: The SYRF__RuntimeEnvironment environment variable (set per namespace) determines the schema name at runtime via SyrfEnvironment.RuntimeEnvironment.

// Program.cs line 29
var syrfEnvironment = new SyrfEnvironment(runtimeEnvironment);

// Used in EF Core migration
m.MigrationsHistoryTable($"__{nameof(JobServiceSagaOverrideDbContext)}", syrfEnvironment.RuntimeEnvironment);

Quartz Native Tables (NOW ISOLATED ✅)¶

RESOLVED: The native Quartz scheduler tables are now properly isolated between environments.

The table prefix is dynamically set in Program.cs:

q.UsePersistentStore(s =>
{
    s.TablePrefix = $"[{syrfEnvironment.RuntimeEnvironment}].QRTZ_";
    // ...
});

Each environment now uses its own set of tables:

Environment	Tables
Production	`[production].QRTZ_JOB_DETAILS`, `[production].QRTZ_TRIGGERS`, etc.
Staging	`[staging].QRTZ_JOB_DETAILS`, `[staging].QRTZ_TRIGGERS`, etc.
Preview	`[preview].QRTZ_JOB_DETAILS`, `[preview].QRTZ_TRIGGERS`, etc.

What Data is Affected¶

Properly Isolated (Per Environment Schema)¶

Table	Purpose	Isolation
`JobSaga`	MassTransit job lifecycle state (submitted, started, completed, faulted)	Schema-based
`JobTypeSaga`	Job type concurrency limits and active job counts	Schema-based
`JobAttemptSaga`	Individual retry attempts with status	Schema-based
`__JobServiceSagaOverrideDbContext`	EF Core migration history	Schema-based

Now Isolated (Per Environment Schema) ✅¶

Table	Purpose	Status
`QRTZ_JOB_DETAILS`	Registered job definitions	✅ Isolated per environment
`QRTZ_TRIGGERS`	Trigger definitions for scheduled jobs	✅ Isolated per environment
`QRTZ_FIRED_TRIGGERS`	Currently executing triggers	✅ Isolated per environment
`QRTZ_CRON_TRIGGERS`	Cron-based schedules	✅ Isolated per environment
`QRTZ_LOCKS`	Scheduler cluster locks	✅ Isolated per environment
`QRTZ_SCHEDULER_STATE`	Scheduler instance tracking	✅ Isolated per environment

Previous Implications (Now Resolved)¶

Note: These issues have been resolved with the implementation of environment-specific table prefixes.

1. Job Interference (RESOLVED ✅)¶

~~Quartz schedulers from different environments register as nodes in the same "cluster".~~

Now each environment has its own set of QRTZ_* tables, so:

Trigger definitions are isolated per environment
Job group names only need to be unique within an environment
Scheduler state queries return data only from that environment

2. Security Concerns (RESOLVED ✅)¶

~~Staging environment has access to production job metadata.~~

Now each environment has complete isolation:

Staging cannot see production job metadata
PR preview environments are isolated from production
Job parameters in QRTZ_JOB_DETAILS are environment-specific

3. Debugging Complexity (RESOLVED ✅)¶

~~When investigating job issues, data from all environments is intermixed.~~

Now debugging is cleaner:

Each environment's scheduler state is isolated
Lock contention metrics are per-environment
Node tracking is separated by environment

4. PR Preview Impact (RESOLVED ✅)¶

Each PR preview environment now gets its own isolated schema:

Schema naming: preview_{prNumber} (e.g., preview_2247)
Complete isolation between PR environments
No job interference between PRs

How it works: The SYRF__PrNumber environment variable (set by Helm for preview deployments) is used to create per-PR schemas. If a PR number is available in a preview environment, the schema becomes preview_{prNumber} instead of just preview.

Automatic Cleanup: When a PR is closed, the cleanup-tags job in .github/workflows/pr-preview.yml automatically drops the corresponding preview_{prNumber} schema and all its QRTZ_* tables from the SQL database.

Recommendations¶

Option 1: Environment-Specific Table Prefix (Recommended)¶

Effort: Low | Risk: Low | Impact: High

Add environment-specific Quartz table prefix via Helm values:

# In _env-blocks.tpl, add to quartz env block:
- name: SYRF__Quartz__quartz.jobStore.tablePrefix
  value: "[{{ .Values.environment.name }}].QRTZ_"

Pros:

Simple configuration change
No data migration required (new schema, existing remains untouched)
Each environment gets its own Quartz tables
Backward compatible

Cons:

Need to create new QRTZ_* tables for each environment
PR previews still share preview prefix (but isolated from staging/production)

Implementation Steps:

Add quartz.jobStore.tablePrefix to Helm templates
Create QRTZ_* tables in staging and production schemas
Deploy Quartz with new config (will auto-create tables if missing)
Optionally clean up old [development].QRTZ_* tables

Option 2: Separate Database Per Environment¶

Effort: Medium | Risk: Medium | Impact: Highest

Create environment-specific SQL databases:

syrf-quartz-staging
syrf-quartz-production
syrf-quartz-preview

Pros:

Complete isolation at database level
Clearer security boundaries
Independent backups and recovery
Can use different performance tiers

Cons:

Requires new GCP secrets per environment
Azure SQL billing increase (3x databases)
Migration effort for existing data
More infrastructure to manage

Option 3: PR-Specific Preview Isolation¶

Effort: Medium | Risk: Low | Impact: Medium

Extend schema isolation to use PR number for previews:

# For preview environments
environment:
  name: preview-{{ .prNumber }}  # e.g., preview-2246

Pros:

Each PR gets fully isolated job storage
No interference between PR previews
Better debugging experience

Cons:

Requires PR number injection into Helm values
More schemas to manage
Need cleanup job for orphaned preview schemas

Option 4: Disable Quartz in Previews¶

Effort: Low | Risk: Low | Impact: Low

For PR previews, disable the persistent job store:

# In preview.values.yaml
quartz:
  useInMemoryStore: true

Pros:

No database pollution from PRs
Simpler preview infrastructure
PRs don't need SQL Server access

Cons:

Jobs don't survive pod restarts in previews
Limited testing of scheduling functionality
May miss scheduling-related bugs

Priority Recommendation¶

Immediate (Option 1): Implement environment-specific table prefix

Fixes the staging/production sharing issue
Low risk, high value

Short-term (Option 4): Disable persistent store for previews

Reduces database pollution
Simplifies preview infrastructure

Future consideration (Option 3): PR-specific isolation

Only if debugging across PRs becomes problematic

Current Environment Configuration¶

Verified Configuration Per Environment¶

Environment	RuntimeEnvironment	Saga Schema	Quartz Prefix	MongoDB
Production	`production`	`production`	`[production].QRTZ_`	`syrftest`
Staging	`staging`	`staging`	`[staging].QRTZ_`	`syrf_staging`
PR Previews	`preview`	`preview`	`[preview_{n}].QRTZ_`	`syrf_pr_{n}`

Note: Quartz prefix is dynamically set in Program.cs based on syrfEnvironment.RuntimeEnvironment. For preview environments, if SYRF__PrNumber is set, the schema becomes preview_{prNumber}. Schemas and tables are automatically created at startup via QuartzSchemaInitializer.

Jobs Managed by Quartz¶

Search Import Jobs - Literature search workflow orchestration
Reference File Parsing - BibTeX, Endnote, PubMed parsing (5 min timeout, 5 concurrent)
Risk of Bias Processing - AI-assisted RoB calculations (2 hour timeout, 3 concurrent)
Bulk Study Updates - Batch study data imports
Screening Stats Update - Agreement threshold calculations

Schema vs Database Isolation: Trade-offs Analysis¶

Why Use Same Database with Different Schemas?¶

Benefits of Schema-Based Isolation:

Benefit	Description
Lower Cost	Single database = single billing unit. No per-database overhead costs.
Simpler Operations	Single backup strategy, single connection pool, unified monitoring
Easier Cross-Environment Queries	Can join across schemas for debugging/reporting if needed
Atomic Schema Changes	Migrations can be coordinated in a single transaction
Resource Efficiency	Schemas share memory, cache, and compute resources

Trade-offs:

Concern	Schema Isolation	Separate Databases
Security Boundary	Application-level (RBAC)	Database-level (connection string)
Resource Contention	Possible "noisy neighbor"	Complete isolation
Compliance Requirements	May not satisfy strict audits	Better for SOC2/HIPAA
Independent Scaling	Not possible	Per-database performance tiers
Backup/Restore Granularity	Whole database only	Per-environment recovery

Clarification: Security Concern Details¶

The security concern in this document specifically refers to the Quartz native tables (QRTZ_*) which are NOT schema-isolated - they all use the hardcoded [development] schema prefix.

What IS properly secured (schema-isolated):

staging.JobSaga, production.JobSaga - Different schemas, proper isolation
With proper SQL Server RBAC, staging connection could be restricted to only staging.* objects

What is NOT secured (shared tables):

[development].QRTZ_JOB_DETAILS - Contains job metadata from ALL environments
[development].QRTZ_TRIGGERS - Trigger definitions from ALL environments
Any environment's Quartz instance can read/write these tables

Why This Matters:

Job parameters in QRTZ_JOB_DETAILS may contain business data (project IDs, user IDs, etc.)
A staging environment breach could expose production job scheduling patterns
No audit trail separation between environments

When Schema Isolation is Sufficient¶

According to Microsoft's multi-tenant SaaS patterns documentation, schema-based isolation is appropriate when:

Environments have similar trust levels (both internal)
No regulatory requirement for physical data separation
Cost optimization is a priority
Operational simplicity outweighs isolation benefits

For SyRF's Quartz use case, schema isolation is sufficient for staging vs production because:

Both are internal environments managed by the same team
No sensitive PII is stored in Quartz tables (just job metadata)
The risk is operational (debugging confusion) rather than data breach

Azure SQL Pricing: Schemas vs Databases¶

Current Architecture Cost Model¶

SyRF currently uses a single Azure SQL Database (syrf-quartz), billed as one unit.

Cost Comparison¶

Based on Azure SQL Database pricing:

Approach	Billing Model	Approximate Monthly Cost*
Single DB, Multiple Schemas (current)	1x database	~$75-150 (S2/S3 tier)
Separate DBs, Same Server	3x database	~$225-450 (3x S2/S3)
Elastic Pool	Pool resources	~$150-200 (100 eDTUs)

*Estimates based on General Purpose tier, Europe West region

Key Azure Pricing Facts¶

Databases are billed individually - Creating 3 databases means 3x the per-database charges
Schemas have zero additional cost - They're just logical namespaces within a database
Elastic Pools can reduce multi-DB costs - Up to 13x reduction for databases with variable workloads
Storage is per-database - Multiple databases duplicate base storage allocation

Recommendation: Elastic Pool for Multi-Database Approach¶

If choosing separate databases (Option 2), use an Azure SQL Elastic Pool:

Configuration	Standard Elastic Pool
Pool Size	100 eDTUs
Max DBs	3 (staging, production, preview)
Approx. Monthly Cost	~$150-200
Savings vs 3 Single DBs	~40-60%

When Elastic Pool Makes Sense:

Databases have different peak usage times (staging used during dev, production at business hours)
Total combined utilization < pool capacity most of the time
At least 2-3 databases with similar resource needs

When to Avoid Elastic Pool:

All databases have sustained high utilization
Need serverless auto-pause (not supported in pools)
Databases need significantly different performance tiers

Decision Matrix¶

Criterion	Schema Isolation	Separate DBs	Elastic Pool
Monthly Cost	$ (lowest)	$$$	$$
Security Boundary	Application-level	Connection-string	Connection-string
Operational Complexity	Low	High	Medium
Independent Scaling	No	Yes	Limited
Backup Granularity	Whole DB	Per-environment	Per-pool
Best For	Internal envs, cost-sensitive	Strict compliance	Multiple DBs, variable load

For SyRF: Schema Isolation is Optimal¶

Given:

Internal-only environments (no external tenant isolation needed)
Low Quartz database load (mostly metadata, not high-throughput)
Cost-conscious infrastructure approach
Operational simplicity preference

Recommendation: Stay with single database, fix the table prefix (Option 1)

MongoDB Reference - Database architecture for MongoDB collections
Dependency Map - Service dependencies including Quartz
ADR-003: Cluster Architecture - Overall cluster design

Appendix: Verification Commands¶

# Check environment variables in Quartz pods
kubectl get deployment syrf-quartz -n syrf-staging -o jsonpath='{.spec.template.spec.containers[0].env}' | jq '.[] | select(.name | test("Runtime|Environment"; "i"))'

# Verify current Quartz table prefix (from running pod)
kubectl exec -n syrf-staging deploy/syrf-quartz -- cat /app/appsettings.json | jq '.Quartz'

# Check which secrets are used
kubectl get externalsecret quartz-sql-server-credentials -n syrf-staging -o yaml | grep -A5 "remoteRef:"

# Verify saga schema in SQL Server (requires SQL access)
SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA WHERE SCHEMA_NAME IN ('staging', 'production', 'preview', 'development');

Quartz Service Data Isolation Analysis¶

Executive Summary¶

Previous State (Before Fix)¶

Current Architecture¶

Database Connection¶

Schema Isolation (MassTransit Saga Tables)¶

Quartz Native Tables (NOW ISOLATED ✅)¶

What Data is Affected¶

Properly Isolated (Per Environment Schema)¶

Now Isolated (Per Environment Schema) ✅¶

Previous Implications (Now Resolved)¶

1. Job Interference (RESOLVED ✅)¶

2. Security Concerns (RESOLVED ✅)¶

3. Debugging Complexity (RESOLVED ✅)¶

4. PR Preview Impact (RESOLVED ✅)¶

Recommendations¶

Option 1: Environment-Specific Table Prefix (Recommended)¶

Option 2: Separate Database Per Environment¶

Option 3: PR-Specific Preview Isolation¶

Option 4: Disable Quartz in Previews¶

Priority Recommendation¶

Current Environment Configuration¶

Verified Configuration Per Environment¶

Jobs Managed by Quartz¶

Schema vs Database Isolation: Trade-offs Analysis¶

Why Use Same Database with Different Schemas?¶

Clarification: Security Concern Details¶

When Schema Isolation is Sufficient¶

Azure SQL Pricing: Schemas vs Databases¶

Current Architecture Cost Model¶

Cost Comparison¶

Key Azure Pricing Facts¶

Recommendation: Elastic Pool for Multi-Database Approach¶

Decision Matrix¶

For SyRF: Schema Isolation is Optimal¶

Related Documentation¶

Appendix: Verification Commands¶