Backend Annotation Validation Service

Executive Summary

This document describes a comprehensive backend validation service for annotation submissions. The current architecture has a critical gap: all conditionality and visibility logic resides exclusively in the Angular frontend, meaning API clients can bypass validation rules entirely.

Objectives:

Create a shared QuestionVisibilityService that replicates frontend conditionality logic
Create an AnnotationValidationService for comprehensive submission validation
Use these services for both API validation AND seeding annotation generation
Follow DRY principles - one backend source of truth for business rules, shared by API validation and seeding

Why This Matters:

Security: API clients can currently submit invalid data
Data Integrity: No backend enforcement of required fields, conditionality, or type matching
Seeding: Cannot generate valid annotations without understanding visibility rules
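Maintainability: Business rules live only in frontend code, so every other consumer (seeding, future clients) would have to reimplement them and risk inconsistency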

Problem Analysis

Current State

Component	What Exists	What's Missing
Backend AnnotationQuestion	Stores `Target.ConditionalParentAnswers` metadata	No logic to evaluate conditionality
Backend Validation	Session uniqueness, investigator ID matching	Conditionality, required fields, type matching, hierarchy validation
Frontend	Full `initSubQuestions()`, `getOptions()`, `onOptionChange()`	N/A (working correctly)
Seeding	Creates studies, screening decisions	Cannot create valid annotations without visibility logic

Validation Gaps (Security Risk)

The following validations exist ONLY in the frontend:

Validation	Frontend Location	Backend	Risk Level
TimePoint values not null	`stage-review.component.ts:578-590`	❌ Not validated	HIGH
TimePoint array not empty	Component logic	❌ Not validated	HIGH
Conditionality enforcement	`initSubQuestions()`	❌ Not validated	HIGH
Required field validation	Form validators	❌ Not validated	HIGH
Array duplicate detection	`annotation-form.service.ts:527-537`	❌ Not validated	MEDIUM
Selected options exist	Async validator	❌ Not validated	MEDIUM
Answer type matches question type	Form construction	❌ Only at deserialization	MEDIUM

Why Backend Validation is Required

API Security: Any HTTP client can POST to the API, bypassing Angular validation
Data Integrity: Invalid data can corrupt the database and break exports
Seeding: Cannot generate test data without understanding what "valid" means
Future Clients: Mobile apps, CLI tools, integrations all need consistent rules

Architectural Approach

Target Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                    QuestionVisibilityService                        │
│  (Shared logic for determining which questions are visible/required)│
├─────────────────────────────────────────────────────────────────────┤
│  - GetVisibleQuestionIds(stage, annotations) → visible question IDs │
│  - IsQuestionVisible(question, parentAnnotation) → bool             │
│  - GetVisibleOptions(question, parentAnswer) → filtered options     │
│  - GetRequiredQuestionIds(stage, annotations) → required IDs        │
└─────────────────────────────────────────────────────────────────────┘
                              ▲
           ┌──────────────────┼───────────────────┐
           │                  │                   │
    ┌──────┴──────┐    ┌──────┴──────┐    ┌──────┴──────┐
    │ API         │    │ Seeding     │    │ Future      │
    │ Validation  │    │ Generation  │    │ API Clients │
    └─────────────┘    └─────────────┘    └─────────────┘

Key Design Decisions

Decision 1: Application Layer Validation (Not Domain)

Problem: The current plan in the temporary plan file injects services into domain entity methods such as ExtractionInfo.AddAnnotations(). This violates DDD principles, which keep domain entities free of service dependencies.

Decision: Validation happens at the application layer (controller/handler), BEFORE calling domain methods.

Rationale:

Domain entities should remain service-free
Validation is a cross-cutting concern, not domain logic
Easier to test validation independently
Clearer separation of concerns

Implementation:

// In ReviewController.SubmitSession():
var validationResult = _validationService.ValidateSubmission(stage, dto);
if (!validationResult.IsValid)
{
    return BadRequest(validationResult.Errors);
}

// Only after validation passes:
study.AddSessionData(investigatorId, dto, questionIds, extraction);

Decision 2: Validation Levels (Warn vs Reject)

Problem: Adding strict validation could break existing frontend clients if there are edge cases where frontend sends data that looks invalid but is actually acceptable.

Decision: Implement two validation levels:

Strict Mode (default for new code): Rejects invalid submissions
Compatibility Mode (opt-in): Logs warnings but allows submission

Implementation:

public class AnnotationValidationOptions
{
    public ValidationMode Mode { get; set; } = ValidationMode.Strict;
    public bool LogViolations { get; set; } = true;
}

public enum ValidationMode
{
    Strict,       // Reject invalid submissions
    Compatibility // Log warnings only
}
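
One way the two modes could be applied is to collect rule violations first and only then decide whether they count as errors or warnings. The sketch below is an illustration under that assumption; `Violation`, `ModeAwareResult`, and `ValidationModePolicy` are hypothetical names, not types defined elsewhere in this document.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum ValidationMode { Strict, Compatibility }

// Hypothetical minimal shape for a rule violation (illustration only).
public record Violation(string Code, string Message);

// Hypothetical result type that separates blocking errors from warnings.
public record ModeAwareResult(
    IReadOnlyList<Violation> Errors,
    IReadOnlyList<Violation> Warnings)
{
    public bool IsValid => Errors.Count == 0;
}

public static class ValidationModePolicy
{
    // Strict: every violation is a blocking error.
    // Compatibility: every violation is downgraded to a warning.
    public static ModeAwareResult Apply(
        ValidationMode mode, IEnumerable<Violation> violations)
    {
        var list = violations.ToList();
        return mode == ValidationMode.Strict
            ? new ModeAwareResult(list, Array.Empty<Violation>())
            : new ModeAwareResult(Array.Empty<Violation>(), list);
    }
}
```

Keeping the rules themselves mode-agnostic means the same rule code runs in both modes, so the logging-only phase exercises exactly what strict mode will later enforce.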

Decision 3: Recursive Tree Traversal for Visibility

Problem: Question visibility depends on parent answer, which depends on grandparent answer, etc.

Decision: Use recursive tree traversal starting from root questions, building visible set incrementally.

Algorithm:

1. Start with root questions (target == null or root == true)
2. For each root question:
   a. Add to visible set
   b. For each subquestion:
      i. Check if conditionalParentAnswers matches current annotation
      ii. If visible, add to set and recurse
3. Required questions = visible questions where optional == false
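
The traversal above can be sketched as follows. This is a simplified illustration, not the planned implementation: `Question` is a hypothetical stand-in for AnnotationQuestion, only boolean conditions are modelled, string IDs replace GUIDs for readability, and an unanswered parent is assumed to hide its children.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for AnnotationQuestion.
public record Question(
    string Id,
    bool Optional,
    bool? ShowWhenParentIs,              // null = no condition on the parent answer
    IReadOnlyList<string> SubquestionIds);

public static class VisibilityTraversal
{
    public static HashSet<string> GetVisible(
        IReadOnlyDictionary<string, Question> questions,
        IEnumerable<string> rootIds,
        IReadOnlyDictionary<string, bool> answers)
    {
        var visible = new HashSet<string>();
        foreach (var rootId in rootIds)
            Visit(rootId);
        return visible;

        void Visit(string id)
        {
            // Add returns false for repeats, which also guards against cycles.
            if (!visible.Add(id)) return;

            foreach (var childId in questions[id].SubquestionIds)
            {
                if (IsVisible(questions[childId], id))
                    Visit(childId);
            }
        }

        bool IsVisible(Question child, string parentId)
        {
            // Assumption: children of an unanswered parent stay hidden.
            if (!answers.TryGetValue(parentId, out var parentAnswer))
                return false;

            return child.ShowWhenParentIs is null
                || child.ShowWhenParentIs == parentAnswer;
        }
    }

    // Step 3: required = visible and not optional.
    public static HashSet<string> GetRequired(
        IReadOnlyDictionary<string, Question> questions,
        IEnumerable<string> visible) =>
        visible.Where(id => !questions[id].Optional).ToHashSet();
}
```

Because a child is only visited after its parent has been judged visible, a hidden parent hides its whole subtree without any extra bookkeeping.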

Risk Assessment

Risk 1: Breaking Existing Frontend Submissions

Severity: HIGH
Likelihood: MEDIUM

Scenario: Frontend currently sends data that passes frontend validation but would fail stricter backend validation. Users get errors on data that "worked before".

Mitigations:

Audit Phase: Deploy validation in logging-only mode first
Analysis: Review logs for violations before enabling strict mode
Compatibility Mode: Allow gradual migration
Feature Flag: SYRF_STRICT_ANNOTATION_VALIDATION=true/false
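
A minimal sketch of how the feature flag might map onto ValidationMode. `ValidationModeResolver` is a hypothetical helper, and treating an unset or unparseable flag as Compatibility is an assumption made here so that a misconfigured environment never starts rejecting submissions.

```csharp
using System;

public enum ValidationMode { Strict, Compatibility }

public static class ValidationModeResolver
{
    public const string FlagName = "SYRF_STRICT_ANNOTATION_VALIDATION";

    // Assumption: only an explicit "true" (any casing) enables strict mode;
    // null, empty, or unrecognised values fall back to Compatibility.
    public static ValidationMode FromFlag(string? rawValue) =>
        bool.TryParse(rawValue, out var strict) && strict
            ? ValidationMode.Strict
            : ValidationMode.Compatibility;

    // Reads the flag from the process environment.
    public static ValidationMode FromEnvironment() =>
        FromFlag(Environment.GetEnvironmentVariable(FlagName));
}
```

Splitting the parsing (`FromFlag`) from the environment lookup keeps the mapping unit-testable without touching process state.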

Detection:

// Log violations without rejecting (compatibility mode)
if (validationResult.HasWarnings)
{
    _logger.LogWarning("Annotation validation warnings: {Warnings}",
        validationResult.Warnings);
}

Risk 2: Performance Impact

Severity: MEDIUM
Likelihood: LOW

Scenario: Evaluating visibility for 100+ questions on every submission adds noticeable latency.

Mitigations:

Lazy Evaluation: Only evaluate questions as needed
Question Index: Build lookup structures once per stage
Caching: Cache stage question hierarchy (immutable during request)
Benchmarking: Add performance tests to CI
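
The Question Index mitigation could look like the sketch below: build the lookup structures once per stage, then answer "find by ID" and "children of" queries without rescanning the question list. `IndexedQuestion` and `QuestionIndex` are hypothetical names, and a nullable ParentId is assumed as the way roots are identified.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified question shape: roots have no parent.
public record IndexedQuestion(Guid Id, Guid? ParentId);

public sealed class QuestionIndex
{
    private readonly Dictionary<Guid, IndexedQuestion> _byId;
    private readonly ILookup<Guid?, IndexedQuestion> _byParent;

    public QuestionIndex(IEnumerable<IndexedQuestion> questions)
    {
        // Materialize once so both structures see the same snapshot.
        var list = questions.ToList();
        _byId = list.ToDictionary(q => q.Id);
        _byParent = list.ToLookup(q => q.ParentId);
    }

    public IndexedQuestion? Find(Guid id) =>
        _byId.TryGetValue(id, out var q) ? q : null;

    // Root questions are grouped under the null parent key.
    public IEnumerable<IndexedQuestion> Roots => _byParent[null];

    // A lookup returns an empty sequence for unknown keys, so leaves need no special case.
    public IEnumerable<IndexedQuestion> ChildrenOf(Guid parentId) =>
        _byParent[parentId];
}
```

With an index like this, a traversal over N questions does N dictionary lookups instead of N linear scans.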

Measurement:

// _metrics.MeasureTime is assumed to return an IDisposable timing scope
using (_metrics.MeasureTime("annotation_validation_duration"))
{
    var result = _validationService.ValidateSubmission(stage, dto);
}

Risk 3: Divergence from Frontend Logic

Severity: HIGH
Likelihood: MEDIUM

Scenario: Backend implements visibility logic slightly differently than frontend, causing frontend to show questions that backend rejects (or vice versa).

Mitigations:

Specification Document: Use formal-specification.md as source of truth
Cross-Language Tests: Write test cases that run against both implementations
Frontend Sync: When backend changes, verify frontend still matches
Contract Tests: API contract tests verify expected behavior

Verification Process:

For each conditionality rule:
1. Document expected behavior in specification
2. Write backend unit test
3. Write frontend unit test (Jasmine/Vitest)
4. Both must pass for rule to be considered implemented

Risk 4: Reconciliation Session Complexity

Severity: MEDIUM
Likelihood: HIGH

Scenario: Reconciliation sessions have different visibility rules (can see both reviewers' annotations). If validation doesn't account for this, reconciliation breaks.

Mitigations:

Explicit Reconciliation Handling: Separate validation path for isReconciliation = true
Test Coverage: Dedicated test suite for reconciliation scenarios
Documentation: Document reconciliation differences explicitly

Key Differences:

Aspect	Normal Session	Reconciliation Session
Existing Annotations	Own only	Both reviewers'
Required Fields	All required	May differ
Visibility Context	Single reviewer	Merged view

Risk 5: System Questions Special Behavior

Severity: MEDIUM
Likelihood: MEDIUM

Scenario: System questions have hardcoded behaviors (auto-populated, default values, conditional options) that aren't handled correctly in validation.

Mitigations:

System Question Registry: Centralized handling of all 18 system questions
Special Case Tests: Each system question has dedicated tests
Default Value Handling: Explicit logic for checkbox defaults, etc.

System Questions Requiring Special Handling:

Question	Special Behavior
Error Type	Options filtered by Average Type selection
PDF Graphs	Hidden category, lookup to PDF References
Control Questions	Boolean with child conditionality
Lookup Questions	Options come from other category annotations

Risk 6: Existing Data Compatibility

Severity: MEDIUM
Likelihood: LOW

Scenario: Annotations already stored in database would fail new validation rules. This could break data retrieval or exports.

Decision: Validation applies to NEW submissions only, not historical data.

Rationale:

Historical data was accepted under previous rules
Retroactive validation would require data migration
Read operations should not fail due to validation

Implementation Plan

Phase 1: Question Visibility Service

Duration: ~2-3 days

Files:

Core/Services/IQuestionVisibilityService.cs (new)
Core/Services/QuestionVisibilityService.cs (new)
Core.Tests/Services/QuestionVisibilityServiceTests.cs (new)

Interface:

public interface IQuestionVisibilityService
{
    /// <summary>
    /// Determines which questions should be visible given current annotations.
    /// Replicates frontend initSubQuestions() logic.
    /// </summary>
    IReadOnlySet<Guid> GetVisibleQuestionIds(
        Stage stage,
        IEnumerable<Annotation> currentAnnotations);

    /// <summary>
    /// Checks if a specific question is visible based on parent's answer.
    /// </summary>
    bool IsQuestionVisible(
        AnnotationQuestion question,
        Annotation? parentAnnotation);

    /// <summary>
    /// Gets filtered options for a question based on parent answer.
    /// Replicates frontend getOptions() logic with ParentFilter.
    /// </summary>
    IEnumerable<string> GetVisibleOptions(
        AnnotationQuestion question,
        object? parentAnswer);

    /// <summary>
    /// Gets question IDs that are required (visible + not optional).
    /// </summary>
    IReadOnlySet<Guid> GetRequiredQuestionIds(
        Stage stage,
        IEnumerable<Annotation> currentAnnotations);
}

Key Implementation Details:

public class QuestionVisibilityService : IQuestionVisibilityService
{
    public bool IsQuestionVisible(
        AnnotationQuestion question,
        Annotation? parentAnnotation)
    {
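        // CASE 1: Root question (flagged as root, or no target) = always visible.
        // Matches the root definition used by the traversal and the seeding generator.
        if (question.Root || question.Target == null)
            return true;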

        var conditional = question.Target.ConditionalParentAnswers;

        // CASE 2: Has parent but no conditions = always visible when parent answered
        if (conditional == null)
            return parentAnnotation != null;

        // CASE 3: Parent not answered - check default behavior
        if (parentAnnotation == null)
        {
            // For checkbox questions, check DefaultCheckedStatus
            // This replicates frontend behavior where unchecked checkbox
            // with default=true would still show conditional children
            return GetDefaultVisibility(question, conditional);
        }

        // CASE 4: Evaluate condition against parent answer
        return conditional switch
        {
            BooleanConditionalTargetParentOptions boolCond =>
                EvaluateBooleanCondition(parentAnnotation, boolCond),

            OptionConditionalTargetParentOptions optCond =>
                EvaluateOptionCondition(parentAnnotation, optCond),

            _ => true // Unknown condition type = permissive
        };
    }

    private bool GetDefaultVisibility(
        AnnotationQuestion question,
        ConditionalTargetParentOptions conditional)
    {
        // Find parent question to check if it's a checkbox with default
        // This handles the edge case in frontend where checkbox defaults
        // affect child visibility even without explicit annotation

        // For now, return false (parent not answered = children hidden)
        // TODO: Implement DefaultCheckedStatus lookup if needed
        return false;
    }

    private bool EvaluateBooleanCondition(
        Annotation parentAnnotation,
        BooleanConditionalTargetParentOptions condition)
    {
        if (parentAnnotation is not BoolAnnotation boolAnn)
            return false;

        return boolAnn.Answer == condition.TargetParentBoolean;
    }

    private bool EvaluateOptionCondition(
        Annotation parentAnnotation,
        OptionConditionalTargetParentOptions condition)
    {
        // Handle array answers (checklist with multiple selections)
        // Frontend uses ANY match - if any selected option is in target, show
        if (parentAnnotation is StringArrayAnnotation arrAnn)
        {
            return arrAnn.Answer.Any(a =>
                condition.TargetParentOptions.Contains(a, StringComparer.OrdinalIgnoreCase));
        }

        // Handle single option answer
        if (parentAnnotation is StringAnnotation strAnn)
        {
            return condition.TargetParentOptions.Contains(
                strAnn.Answer, StringComparer.OrdinalIgnoreCase);
        }

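        // Handle integer option (numeric dropdown).
        // Format with CultureInfo.InvariantCulture (System.Globalization) so the
        // string form does not depend on the server's locale.
        if (parentAnnotation is IntAnnotation intAnn)
        {
            return condition.TargetParentOptions.Contains(
                intAnn.Answer.ToString(CultureInfo.InvariantCulture));
        }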

        return false;
    }
}

Phase 2: Annotation Validation Service

Duration: ~3-4 days

Files:

Core/Services/IAnnotationValidationService.cs (new)
Core/Services/AnnotationValidationService.cs (new)
Core/Services/ValidationResult.cs (new)
Core.Tests/Services/AnnotationValidationServiceTests.cs (new)

Interface:

public interface IAnnotationValidationService
{
    ValidationResult ValidateSubmission(
        Stage stage,
        SessionSubmissionDto submission,
        AnnotationValidationOptions? options = null);
}

public class ValidationResult
{
    public bool IsValid => !Errors.Any();
    public bool HasWarnings => Warnings.Any();
    public IReadOnlyList<ValidationError> Errors { get; init; } = [];
    public IReadOnlyList<ValidationWarning> Warnings { get; init; } = [];

    public static ValidationResult Success() => new();
    public static ValidationResult Failure(params ValidationError[] errors) =>
        new() { Errors = errors };
}

public record ValidationError(
    string Code,
    string Message,
    Guid? QuestionId = null,
    Guid? AnnotationId = null,
    string? Context = null  // Additional debug info
);

public record ValidationWarning(
    string Code,
    string Message,
    Guid? QuestionId = null
);

Validation Rules:

Rule	Error Code	Description
Required question missing	`MISSING_REQUIRED_QUESTION`	Visible, non-optional question has no annotation
Hidden question answered	`ANNOTATION_FOR_HIDDEN_QUESTION`	Annotation exists for question whose condition not met
Type mismatch	`TYPE_MISMATCH`	Annotation type doesn't match question type
Invalid option	`INVALID_OPTION`	Selected option not in visible options
Duplicate array values	`DUPLICATE_ARRAY_VALUES`	Array annotation contains duplicates
Empty TimePoints	`EMPTY_TIMEPOINTS`	OutcomeData has no timepoints
Invalid TimePoint values	`INVALID_TIMEPOINT_VALUES`	TimePoint has NaN, Infinity, or negative time
Missing OutcomeData reference	`MISSING_OUTCOME_REFERENCE`	OutcomeData references non-existent annotation
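
As an illustration of one of the simpler rules, the sketch below shows how the DUPLICATE_ARRAY_VALUES check might find repeated entries in an array answer. `ArrayAnswerChecks` is a hypothetical helper, and exact (ordinal, case-sensitive) comparison is an assumption; the comparison should follow whatever the frontend duplicate check does.

```csharp
using System;
using System.Collections.Generic;

public static class ArrayAnswerChecks
{
    // Returns each value that occurs more than once, in order of first repeat.
    // Assumption: values are compared exactly (ordinal, case-sensitive).
    public static IReadOnlyList<string> FindDuplicates(IEnumerable<string> answers)
    {
        var seen = new HashSet<string>(StringComparer.Ordinal);
        var duplicates = new List<string>();

        foreach (var answer in answers)
        {
            // Add returns false when the value was already seen.
            if (!seen.Add(answer) && !duplicates.Contains(answer))
                duplicates.Add(answer);
        }

        return duplicates;
    }
}
```

Returning the offending values, instead of a bare boolean, lets the validation error carry them in its Context field for debugging.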

Phase 3: API Integration

Duration: ~2 days

Files:

Endpoint/Controllers/ReviewController.cs (modify)
Endpoint/DependencyInjection/ServiceCollectionExtensions.cs (modify)

Integration Point:

[HttpPost("submit-session")]
public async Task<IActionResult> SubmitSession(
    [FromBody] SessionSubmissionDto dto)
{
    var stage = await _stageRepository.GetAsync(dto.StageId);

    // NEW: Validate before processing
    var validationResult = _annotationValidationService.ValidateSubmission(
        stage, dto, _validationOptions);

    if (!validationResult.IsValid)
    {
        _logger.LogWarning(
            "Annotation validation failed for session {SessionId}: {Errors}",
            dto.Id, validationResult.Errors);

        return BadRequest(new ProblemDetails
        {
            Title = "Annotation Validation Failed",
            Detail = "One or more annotations are invalid",
            Extensions = { ["errors"] = validationResult.Errors }
        });
    }

    if (validationResult.HasWarnings)
    {
        _logger.LogInformation(
            "Annotation validation warnings for session {SessionId}: {Warnings}",
            dto.Id, validationResult.Warnings);
    }

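    // Existing logic continues; investigatorId and questionIds are resolved by
    // the existing controller code and are not shown here.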
    var study = await _studyRepository.GetAsync(dto.StudyId);
    study.AddSessionData(investigatorId, dto, questionIds, stage.Extraction);
    await _unitOfWork.SaveAsync(study);

    return Ok();
}

Phase 4: Seeding Integration

Duration: ~2-3 days

Files:

Core/Seeding/AnnotationGenerator.cs (new)
Core/Seeding/SeedDataProfile.cs (new)
Core/Seeding/DatabaseSeeder.cs (modify)

AnnotationGenerator Design:

public class AnnotationGenerator
{
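    private readonly IQuestionVisibilityService _visibilityService;

    public AnnotationGenerator(IQuestionVisibilityService visibilityService)
    {
        _visibilityService = visibilityService;
    }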

    /// <summary>
    /// Generates a complete set of valid annotations for a stage.
    /// Uses the same visibility logic as the API/UI.
    /// </summary>
    public IEnumerable<Annotation> GenerateAnnotations(
        Stage stage,
        Study study,
        Guid investigatorId,
        SeedDataProfile profile)
    {
        var annotations = new List<Annotation>();
        var annotationsByQuestion = new Dictionary<Guid, Annotation>();

        // Get root questions (no parent)
        var rootQuestions = stage.AllStageAnnotationQuestions
            .Where(q => q.Root || q.Target == null);

        foreach (var rootQ in rootQuestions)
        {
            GenerateAnnotationTree(
                rootQ,
                parentAnnotation: null,
                annotations,
                annotationsByQuestion,
                stage,
                study,
                investigatorId,
                profile);
        }

        return annotations;
    }

    private void GenerateAnnotationTree(
        AnnotationQuestion question,
        Annotation? parentAnnotation,
        List<Annotation> annotations,
        Dictionary<Guid, Annotation> annotationsByQuestion,
        Stage stage,
        Study study,
        Guid investigatorId,
        SeedDataProfile profile)
    {
        // Check visibility using shared service
        if (!_visibilityService.IsQuestionVisible(question, parentAnnotation))
            return;

        // Generate annotation for this question
        var annotation = GenerateAnnotation(
            question, parentAnnotation, study, investigatorId, profile);

        annotations.Add(annotation);
        annotationsByQuestion[question.Id] = annotation;

        // Recursively generate for subquestions
        foreach (var subQId in question.SubquestionIds)
        {
            var subQ = stage.AllStageAnnotationQuestions
                .FirstOrDefault(q => q.Id == subQId);

            if (subQ != null)
            {
                GenerateAnnotationTree(
                    subQ,
                    annotation,  // Current becomes parent for children
                    annotations,
                    annotationsByQuestion,
                    stage,
                    study,
                    investigatorId,
                    profile);
            }
        }
    }
}

Phase 5: TimePoint Validation

Duration: ~1 day

Files:

Core/Model/StudyAggregate/Extraction/OutcomeData.cs (modify)
Core/Model/StudyAggregate/TimePoint.cs (modify)

TimePoint Validation:

public class TimePoint : ValueObject<TimePoint>
{
    public TimePoint(double time, double average, double error)
    {
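        // Validate on construction.
        // NaN compares false against 0, so it must be checked explicitly.
        if (double.IsNaN(time) || double.IsInfinity(time) || time < 0)
            throw new ArgumentException("Time must be a finite, non-negative number", nameof(time));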

        if (double.IsNaN(average) || double.IsInfinity(average))
            throw new ArgumentException("Average must be a valid number", nameof(average));

        if (double.IsNaN(error) || double.IsInfinity(error))
            throw new ArgumentException("Error must be a valid number", nameof(error));

        Time = time;
        Average = average;
        Error = error;
    }

    public double Time { get; }
    public double Average { get; }
    public double Error { get; }
}

Testing Strategy

Unit Test Categories

1. QuestionVisibilityService Tests

public class QuestionVisibilityServiceTests
{
    [Fact]
    public void IsQuestionVisible_UnconditionalQuestion_ReturnsTrue()
    {
        var question = CreateQuestion(target: null);
        Assert.True(_service.IsQuestionVisible(question, parentAnnotation: null));
    }

    [Fact]
    public void IsQuestionVisible_BooleanCondition_ParentMatchesTarget_ReturnsTrue()
    {
        var question = CreateQuestion(
            target: new Target
            {
                ConditionalParentAnswers = new BooleanConditionalTargetParentOptions
                {
                    TargetParentBoolean = true
                }
            });
        var parentAnn = new BoolAnnotation(answer: true, ...);

        Assert.True(_service.IsQuestionVisible(question, parentAnn));
    }

    [Fact]
    public void IsQuestionVisible_BooleanCondition_ParentDoesNotMatch_ReturnsFalse()
    {
        var question = CreateQuestion(
            target: new Target
            {
                ConditionalParentAnswers = new BooleanConditionalTargetParentOptions
                {
                    TargetParentBoolean = true
                }
            });
        var parentAnn = new BoolAnnotation(answer: false, ...);

        Assert.False(_service.IsQuestionVisible(question, parentAnn));
    }

    [Fact]
    public void IsQuestionVisible_OptionCondition_ArrayContainsMatch_ReturnsTrue()
    {
        var question = CreateQuestion(
            target: new Target
            {
                ConditionalParentAnswers = new OptionConditionalTargetParentOptions
                {
                    TargetParentOptions = ["Option A", "Option B"]
                }
            });
        var parentAnn = new StringArrayAnnotation(answer: ["Option A", "Option C"], ...);

        Assert.True(_service.IsQuestionVisible(question, parentAnn));
    }

    // ... 20+ more test cases covering all conditionality scenarios
}

2. AnnotationValidationService Tests

public class AnnotationValidationServiceTests
{
    [Fact]
    public void ValidateSubmission_MissingRequiredQuestion_ReturnsError()
    {
        var stage = CreateStageWithRequiredQuestion();
        var submission = CreateSubmissionWithoutRequiredAnnotation();

        var result = _service.ValidateSubmission(stage, submission);

        Assert.False(result.IsValid);
        Assert.Contains(result.Errors, e => e.Code == "MISSING_REQUIRED_QUESTION");
    }

    [Fact]
    public void ValidateSubmission_AnnotationForHiddenQuestion_ReturnsError()
    {
        var stage = CreateStageWithConditionalQuestion();
        var submission = CreateSubmissionWithHiddenQuestionAnswered();

        var result = _service.ValidateSubmission(stage, submission);

        Assert.False(result.IsValid);
        Assert.Contains(result.Errors, e => e.Code == "ANNOTATION_FOR_HIDDEN_QUESTION");
    }

    [Fact]
    public void ValidateSubmission_TypeMismatch_ReturnsError()
    {
        var stage = CreateStageWithBooleanQuestion();
        var submission = CreateSubmissionWithStringAnnotationForBooleanQuestion();

        var result = _service.ValidateSubmission(stage, submission);

        Assert.False(result.IsValid);
        Assert.Contains(result.Errors, e => e.Code == "TYPE_MISMATCH");
    }

    [Fact]
    public void ValidateSubmission_InvalidOption_ReturnsError()
    {
        var stage = CreateStageWithDropdownQuestion(options: ["A", "B", "C"]);
        var submission = CreateSubmissionWithOption("D"); // Not in list

        var result = _service.ValidateSubmission(stage, submission);

        Assert.False(result.IsValid);
        Assert.Contains(result.Errors, e => e.Code == "INVALID_OPTION");
    }

    [Fact]
    public void ValidateSubmission_CompatibilityMode_LogsWarningButPasses()
    {
        var options = new AnnotationValidationOptions
        {
            Mode = ValidationMode.Compatibility
        };
        var stage = CreateStageWithRequiredQuestion();
        var submission = CreateSubmissionWithMinorViolations();

        var result = _service.ValidateSubmission(stage, submission, options);

        Assert.True(result.IsValid); // Passes in compatibility mode
        Assert.True(result.HasWarnings);
    }
}

3. Integration Tests

public class AnnotationSubmissionIntegrationTests
{
    [Fact]
    public async Task SubmitSession_WithInvalidAnnotations_Returns400()
    {
        var client = _factory.CreateClient();
        var invalidSubmission = CreateInvalidSubmission();

        var response = await client.PostAsJsonAsync(
            "/api/review/submit-session",
            invalidSubmission);

        Assert.Equal(HttpStatusCode.BadRequest, response.StatusCode);

        var problem = await response.Content.ReadFromJsonAsync<ProblemDetails>();
        Assert.Contains("MISSING_REQUIRED_QUESTION", problem.Extensions["errors"].ToString());
    }

    [Fact]
    public async Task SeedData_GeneratedAnnotations_PassValidation()
    {
        // Ensure seeded data is valid according to our validation rules
        var stage = await GetCompleteReviewAnnotationStage();
        var study = await GetSeededStudy();

        var submission = new SessionSubmissionDto
        {
            Annotations = study.ExtractionInfo.Annotations.ToHashSet(),
            OutcomeData = study.ExtractionInfo.OutcomeData.ToList(),
            // ...
        };

        var result = _validationService.ValidateSubmission(stage, submission);

        Assert.True(result.IsValid,
            $"Seeded data should be valid. Errors: {string.Join(", ", result.Errors)}");
    }
}

Cross-Language Validation Tests

To ensure backend matches frontend, create test cases that can be verified in both:

## Test Case: TC-001 - Boolean Condition True Match

**Setup**:
- Parent question: Checkbox "Is this a control?"
- Child question: Text "Control description" with condition targetParentBoolean=true

**Input**:
- Parent annotation: true

**Expected**:
- Child question is VISIBLE

**Backend Test**: QuestionVisibilityServiceTests.IsQuestionVisible_BooleanCondition_ParentMatchesTarget_ReturnsTrue
**Frontend Test**: annotation-form.service.spec.ts → "should show conditional child when parent is true"
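
One possible way to keep both implementations aligned is to store cases like TC-001 in a shared JSON file that both test suites read. The sketch below shows only the backend loading side; the JSON shape, `VisibilityCase`, and `VisibilityCaseLoader` are assumptions for illustration, not an agreed format.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical shape of one shared test case (boolean conditions only).
public record VisibilityCase(
    string Id,
    bool ConditionTargetParentBoolean,
    bool? ParentAnswer,          // null = parent not answered
    bool ExpectedVisible);

public static class VisibilityCaseLoader
{
    // Case-insensitive matching lets the file use camelCase keys,
    // which is the more natural style for the TypeScript side.
    private static readonly JsonSerializerOptions Options =
        new() { PropertyNameCaseInsensitive = true };

    public static IReadOnlyList<VisibilityCase> Load(string json) =>
        JsonSerializer.Deserialize<List<VisibilityCase>>(json, Options)
            ?? new List<VisibilityCase>();
}
```

A backend test could then iterate over the loaded cases and assert that IsQuestionVisible returns ExpectedVisible for each one, while the frontend spec does the same against the Angular logic.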

Rollout Strategy

Phase 1: Logging Only (Week 1-2)

Deploy validation service with Mode = Compatibility
All submissions logged but never rejected
Monitor logs for violation patterns
Analyze: What percentage would fail strict validation?

Success Criteria:

< 5% of submissions have violations
No false positives identified

Phase 2: Soft Launch (Week 3)

Enable Mode = Strict for new seeded environments only
Continue Mode = Compatibility for production
Verify seeding works correctly with validation
Fix any issues discovered

Success Criteria:

All seeded data passes validation
No issues reported from preview environments

Phase 3: Production Rollout (Week 4)

Enable Mode = Strict for staging
Monitor for 1 week
If no issues, enable for production
Keep compatibility mode as feature flag fallback

Success Criteria:

No increase in support tickets
No data loss or submission failures
Logging shows only legitimate violations

Rollback Plan

If issues arise:

Set SYRF_STRICT_ANNOTATION_VALIDATION=false
Restart affected services
Validation returns to logging-only mode
Investigate and fix issues
Re-enable when confident

Files to Create/Modify

New Files

File	Purpose
`Core/Services/IQuestionVisibilityService.cs`	Interface for visibility logic
`Core/Services/QuestionVisibilityService.cs`	Shared visibility implementation
`Core/Services/IAnnotationValidationService.cs`	Interface for validation
`Core/Services/AnnotationValidationService.cs`	Comprehensive validation
`Core/Services/AnnotationValidationOptions.cs`	Validation configuration
`Core/Services/ValidationResult.cs`	Validation result types
`Core/Seeding/AnnotationGenerator.cs`	Uses visibility service for seeding
`Core/Seeding/SeedDataProfile.cs`	Configuration for seed data values
`Core.Tests/Services/QuestionVisibilityServiceTests.cs`	Unit tests
`Core.Tests/Services/AnnotationValidationServiceTests.cs`	Unit tests

Modified Files

File	Changes
`Endpoint/Controllers/ReviewController.cs`	Add validation before submission
`Endpoint/DependencyInjection/ServiceCollectionExtensions.cs`	Register services
`Core/Model/StudyAggregate/Extraction/TimePoint.cs`	Add validation in constructor
`Core/Seeding/DatabaseSeeder.cs`	Use AnnotationGenerator
`Core/Seeding/SeedDataConstants.cs`	Add session IDs for seeded annotations

Appendices

Appendix A: Data Extraction Hierarchy

From user-guide/data-extraction.md:

Experiment (groups cohorts being compared)
    └── Cohort (group of animals receiving same treatment)
            ├── Disease Model (what condition they have) + IsControl flag
            ├── Treatment (what intervention they receive) + IsControl flag
            └── Outcome (what is measured) + AverageType, ErrorType, Units, GreaterIsWorse
                    └── TimePoints (time, average, error values)

Appendix B: System Questions (18 total)

From AnnotationQuestion.cs - auto-added for extraction stages:

Category	Question GUID	Purpose
Labels	ExperimentLabelGuid	Experiment container
	CohortLabelQuestionGuid	Cohort container
	DiseaseModelInductionLabelGuid	DMI container
	TreatmentLabelGuid	Treatment container
	OutcomeAssessmentLabelGuid	Outcome container
Controls	DiseaseModelInductionControlQuestionGuid	"Is this a control?"
	TreatmentControlQuestionGuid	"Is this a control?"
Lookups	CohortModelInductionGuid	Link DM to Cohort
	CohortTreatmentGuid	Link Treatment to Cohort
	CohortOutcomeGuid	Link Outcome to Cohort
	ExperimentCohortGuid	Link Cohort to Experiment
	CohortNumberOfAnimalsGuid	Animals per cohort
Outcome	OutcomeAverageTypeGuid	Mean/Median
	OutcomeErrorTypeGuid	SD/SEM/IQR
	OutcomeUnitsGuid	Unit of measurement
	OutcomeGreaterIsWorseGuid	Interpretation flag
	PdfReferencesGuid	PDF references
	OutcomePdfRefGuid	PDF graph links

Appendix C: Summary of What Gets Seeded Per Study

Each fully extracted study will have:

1 Study-level annotation (Species)
2 Disease Model labels with IsControl flags and custom questions
2 Treatment labels with IsControl flags and custom questions
2 Outcome Assessment labels with AverageType, ErrorType, Units, GreaterIsWorse
2 Cohort labels linking to DM, Treatment, Outcome + NumberOfAnimals
1 Experiment label linking to both Cohorts
4 OutcomeData entities (2 cohorts × 2 outcomes) each with 3 TimePoints (24h, 48h, 72h)

Total annotations per study: ~35
Total OutcomeData per study: 4 (with 12 total TimePoints)

OutcomeData Validation (Detailed)

Overview

OutcomeData is the most complex entity in the annotation system. It represents quantitative outcome measurements extracted from studies, linking together experiments, cohorts, outcomes, and time-series data points.

OutcomeData Structure¶

public class OutcomeData
{
    public Guid Id { get; }
    public Guid StageId { get; }
    public Guid AnnotatorId { get; }
    public Guid ExperimentId { get; }      // → References Experiment Label annotation
    public Guid CohortId { get; }          // → References Cohort Label annotation
    public Guid OutcomeId { get; }         // → References Outcome Assessment annotation
    public int NumberOfAnimals { get; }
    public IReadOnlyList<TimePoint> TimePoints { get; }
}

public class TimePoint : ValueObject<TimePoint>
{
    public double Time { get; }      // Time in hours (e.g., 24, 48, 72)
    public double Average { get; }   // Mean/median value
    public double Error { get; }     // SD/SEM/IQR value
}

Validation Categories¶

Category 1: Structural Validation (TimePoints)¶

These validations ensure TimePoint data is mathematically valid.

Rule	Error Code	Description	Current State
TimePoints not empty	`EMPTY_TIMEPOINTS`	At least one TimePoint required	❌ Not validated
Time non-negative	`INVALID_TIME`	TimePoint.Time >= 0	❌ Not validated
Average valid number	`INVALID_AVERAGE`	Not NaN or Infinity	❌ Not validated
Error valid number	`INVALID_ERROR`	Not NaN or Infinity	❌ Not validated
Error non-negative	`NEGATIVE_ERROR`	Error values typically >= 0	❌ Not validated

Implementation:

public class TimePointValidationRule
{
    public ValidationResult Validate(TimePoint tp)
    {
        var errors = new List<ValidationError>();

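        // NaN fails every comparison, so check for it explicitly before the range check
        if (double.IsNaN(tp.Time) || double.IsInfinity(tp.Time) || tp.Time < 0)
            errors.Add(new ValidationError("INVALID_TIME",
                $"TimePoint.Time must be a finite, non-negative number, got {tp.Time}"));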

        if (double.IsNaN(tp.Average) || double.IsInfinity(tp.Average))
            errors.Add(new ValidationError("INVALID_AVERAGE",
                "TimePoint.Average must be a valid number"));

        if (double.IsNaN(tp.Error) || double.IsInfinity(tp.Error))
            errors.Add(new ValidationError("INVALID_ERROR",
                "TimePoint.Error must be a valid number"));

        // Note: Error can be 0 (e.g., n=1 or exact measurement)
        // but negative error is typically invalid
        if (tp.Error < 0)
            errors.Add(new ValidationError("NEGATIVE_ERROR",
                $"TimePoint.Error should be non-negative, got {tp.Error}"));

        return errors.Any()
            ? ValidationResult.Failure(errors.ToArray())
            : ValidationResult.Success();
    }
}

Category 2: Referential Integrity Validation¶

OutcomeData references three annotation IDs that must exist and be of the correct type.

Reference	Target Annotation Type	System Question GUID	Validation
`ExperimentId`	Experiment Label	`ExperimentLabelGuid`	Must exist, must be label type
`CohortId`	Cohort Label	`CohortLabelQuestionGuid`	Must exist, must be label type
`OutcomeId`	Outcome Assessment	`OutcomeAssessmentLabelGuid`	Must exist, must be label type

Implementation:

public class OutcomeDataReferentialIntegrityRule
{
    public ValidationResult Validate(
        OutcomeData outcomeData,
        IReadOnlyDictionary<Guid, Annotation> annotationsById,
        Stage stage)
    {
        var errors = new List<ValidationError>();

        // Validate ExperimentId references a valid Experiment Label annotation
        if (!annotationsById.TryGetValue(outcomeData.ExperimentId, out var experimentAnn))
        {
            errors.Add(new ValidationError("MISSING_EXPERIMENT_REFERENCE",
                $"OutcomeData references non-existent Experiment annotation {outcomeData.ExperimentId}",
                Context: $"OutcomeDataId: {outcomeData.Id}"));
        }
        else if (!IsExperimentLabel(experimentAnn, stage))
        {
            errors.Add(new ValidationError("INVALID_EXPERIMENT_REFERENCE",
                $"ExperimentId must reference an Experiment Label annotation",
                Context: $"Referenced annotation type: {experimentAnn.GetType().Name}"));
        }

        // Validate CohortId references a valid Cohort Label annotation
        if (!annotationsById.TryGetValue(outcomeData.CohortId, out var cohortAnn))
        {
            errors.Add(new ValidationError("MISSING_COHORT_REFERENCE",
                $"OutcomeData references non-existent Cohort annotation {outcomeData.CohortId}",
                Context: $"OutcomeDataId: {outcomeData.Id}"));
        }
        else if (!IsCohortLabel(cohortAnn, stage))
        {
            errors.Add(new ValidationError("INVALID_COHORT_REFERENCE",
                $"CohortId must reference a Cohort Label annotation"));
        }

        // Validate OutcomeId references a valid Outcome Assessment annotation
        if (!annotationsById.TryGetValue(outcomeData.OutcomeId, out var outcomeAnn))
        {
            errors.Add(new ValidationError("MISSING_OUTCOME_REFERENCE",
                $"OutcomeData references non-existent Outcome annotation {outcomeData.OutcomeId}",
                Context: $"OutcomeDataId: {outcomeData.Id}"));
        }
        else if (!IsOutcomeLabel(outcomeAnn, stage))
        {
            errors.Add(new ValidationError("INVALID_OUTCOME_REFERENCE",
                $"OutcomeId must reference an Outcome Assessment annotation"));
        }

        return errors.Any()
            ? ValidationResult.Failure(errors.ToArray())
            : ValidationResult.Success();
    }

    private bool IsExperimentLabel(Annotation ann, Stage stage)
    {
        var question = stage.AllStageAnnotationQuestions
            .FirstOrDefault(q => q.Id == ann.QuestionId);
        return question?.Id == AnnotationQuestion.ExperimentLabelGuid;
    }

    // Similar methods for IsCohortLabel, IsOutcomeLabel...
}

Category 3: Hierarchy Validation¶

The data extraction model enforces a strict hierarchy: Experiment → Cohort → (DM + Treatment + Outcome).

Validation Rules:

Cohort belongs to Experiment: The referenced Cohort must be linked to the referenced Experiment via ExperimentCohort lookup annotation
Outcome belongs to Cohort: The referenced Outcome must be linked to the referenced Cohort via CohortOutcome lookup annotation

Implementation:

public class HierarchyValidationRule
{
    public ValidationResult Validate(
        OutcomeData outcomeData,
        IReadOnlyDictionary<Guid, Annotation> annotationsById,
        Stage stage)
    {
        var errors = new List<ValidationError>();

        // Find the ExperimentCohort lookup annotation for this experiment
        var experimentCohortLookup = FindLookupAnnotation(
            annotationsById,
            stage,
            AnnotationQuestion.ExperimentCohortGuid,
            outcomeData.ExperimentId);

        if (experimentCohortLookup == null)
        {
            errors.Add(new ValidationError("MISSING_EXPERIMENT_COHORT_LINK",
                "No ExperimentCohort lookup found for referenced Experiment"));
        }
        else
        {
            // Check if this Cohort is in the Experiment's cohort list
            var linkedCohorts = GetLinkedAnnotationIds(experimentCohortLookup);
            if (!linkedCohorts.Contains(outcomeData.CohortId))
            {
                errors.Add(new ValidationError("COHORT_NOT_IN_EXPERIMENT",
                    $"Cohort {outcomeData.CohortId} is not linked to Experiment {outcomeData.ExperimentId}"));
            }
        }

        // Find the CohortOutcome lookup annotation for this cohort
        var cohortOutcomeLookup = FindLookupAnnotation(
            annotationsById,
            stage,
            AnnotationQuestion.CohortOutcomeGuid,
            outcomeData.CohortId);

        if (cohortOutcomeLookup == null)
        {
            errors.Add(new ValidationError("MISSING_COHORT_OUTCOME_LINK",
                "No CohortOutcome lookup found for referenced Cohort"));
        }
        else
        {
            // Check if this Outcome is linked to this Cohort
            var linkedOutcomes = GetLinkedAnnotationIds(cohortOutcomeLookup);
            if (!linkedOutcomes.Contains(outcomeData.OutcomeId))
            {
                errors.Add(new ValidationError("OUTCOME_NOT_IN_COHORT",
                    $"Outcome {outcomeData.OutcomeId} is not linked to Cohort {outcomeData.CohortId}"));
            }
        }

        return errors.Any()
            ? ValidationResult.Failure(errors.ToArray())
            : ValidationResult.Success();
    }

    private Annotation? FindLookupAnnotation(
        IReadOnlyDictionary<Guid, Annotation> annotations,
        Stage stage,
        Guid lookupQuestionGuid,
        Guid parentAnnotationId)
    {
        // Lookup annotations have Target.AnnotationId pointing to their parent
        return annotations.Values.FirstOrDefault(a =>
            a.QuestionId == lookupQuestionGuid &&
            GetParentAnnotationId(a, stage) == parentAnnotationId);
    }

    private IEnumerable<Guid> GetLinkedAnnotationIds(Annotation lookupAnnotation)
    {
        // Lookup annotations store AnnotationIDs as their answer
        if (lookupAnnotation is AnnotationIdArrayAnnotation arrAnn)
            return arrAnn.Answer;
        if (lookupAnnotation is AnnotationIdAnnotation idAnn)
            return [idAnn.Answer];
        return [];
    }
}

Category 4: Consistency Validation¶

OutcomeData should be consistent with the referenced Outcome annotation's metadata.

Field in OutcomeData	Source in Outcome Annotation	Validation
`Units` (if stored)	OutcomeUnits subquestion	Must match
`AverageType` (if stored)	OutcomeAverageType subquestion	Must match
`ErrorType` (if stored)	OutcomeErrorType subquestion	Must match

Note: These fields may be derived at runtime from the Outcome annotation rather than stored redundantly in OutcomeData. The validation approach depends on the data model.
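If these fields do turn out to be stored on OutcomeData, the check reduces to a field-by-field comparison that skips anything not stored. The following TypeScript sketch illustrates the idea; the `OutcomeMetadata` shape, the `findInconsistencies` helper, and the `INCONSISTENT_*` error codes are all hypothetical and not defined elsewhere in this document:

```typescript
// Hypothetical metadata shape; field names mirror the table above.
interface OutcomeMetadata {
  units?: string;
  averageType?: string;
  errorType?: string;
}

// Hypothetical error codes, one per comparable field.
const inconsistencyCodes: Record<keyof OutcomeMetadata, string> = {
  units: "INCONSISTENT_UNITS",
  averageType: "INCONSISTENT_AVERAGE_TYPE",
  errorType: "INCONSISTENT_ERROR_TYPE",
};

// Returns one error code per stored field that disagrees with the Outcome annotation.
// Fields that OutcomeData does not store (undefined) are skipped, matching "if stored".
function findInconsistencies(
  stored: OutcomeMetadata,
  fromOutcomeAnnotation: OutcomeMetadata,
): string[] {
  const errors: string[] = [];
  const fields = Object.keys(inconsistencyCodes) as (keyof OutcomeMetadata)[];
  for (const field of fields) {
    const storedValue = stored[field];
    if (storedValue !== undefined && storedValue !== fromOutcomeAnnotation[field]) {
      errors.push(inconsistencyCodes[field]);
    }
  }
  return errors;
}

const consistencyErrors = findInconsistencies(
  { units: "mg", averageType: "Mean" },
  { units: "g", averageType: "Mean", errorType: "SD" },
);
console.log(consistencyErrors); // ["INCONSISTENT_UNITS"]
```

If the fields are instead derived at runtime from the Outcome annotation, no comparison is needed and this rule can be dropped.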

OutcomeData Validation Integration¶

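public class OutcomeDataValidationService
{
    // Rule classes defined in the Validation Categories sections
    private readonly TimePointValidationRule _timePointValidator = new();
    private readonly OutcomeDataReferentialIntegrityRule _referentialIntegrityValidator = new();
    private readonly HierarchyValidationRule _hierarchyValidator = new();
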
    public ValidationResult ValidateOutcomeData(
        IEnumerable<OutcomeData> outcomeDataList,
        IEnumerable<Annotation> annotations,
        Stage stage)
    {
        var annotationsById = annotations.ToDictionary(a => a.Id);
        var allErrors = new List<ValidationError>();

        foreach (var outcomeData in outcomeDataList)
        {
            // 1. Structural validation
            if (!outcomeData.TimePoints.Any())
            {
                allErrors.Add(new ValidationError("EMPTY_TIMEPOINTS",
                    "OutcomeData must have at least one TimePoint",
                    Context: $"OutcomeDataId: {outcomeData.Id}"));
            }

            foreach (var tp in outcomeData.TimePoints)
            {
                var tpResult = _timePointValidator.Validate(tp);
                allErrors.AddRange(tpResult.Errors);
            }

            // 2. Referential integrity
            var refResult = _referentialIntegrityValidator.Validate(
                outcomeData, annotationsById, stage);
            allErrors.AddRange(refResult.Errors);

            // 3. Hierarchy validation
            var hierarchyResult = _hierarchyValidator.Validate(
                outcomeData, annotationsById, stage);
            allErrors.AddRange(hierarchyResult.Errors);

            // 4. NumberOfAnimals validation
            if (outcomeData.NumberOfAnimals < 0)
            {
                allErrors.Add(new ValidationError("INVALID_ANIMAL_COUNT",
                    $"NumberOfAnimals must be non-negative, got {outcomeData.NumberOfAnimals}"));
            }
        }

        return allErrors.Any()
            ? ValidationResult.Failure(allErrors.ToArray())
            : ValidationResult.Success();
    }
}

Frontend OutcomeData Behavior (Reference)¶

The frontend has specific handling for OutcomeData that must be understood:

// From stage-review.component.ts:578-590
// Frontend converts falsy timepoint values (undefined, null, NaN) to 0
timePoints.map((tp) => ({
  time: tp.time || 0,        // undefined/null/NaN → 0
  average: tp.average || 0,  // undefined/null/NaN → 0
  error: tp.error || 0,      // undefined/null/NaN → 0
}))

Backend Decision: The backend should:

NOT accept undefined/null values for TimePoint fields
Reject submissions where the frontend failed to transform values
Log a warning on receiving 0 values that might indicate transformation issues

This ensures data quality and catches frontend bugs early.
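To illustrate the difference between silently coercing a missing value and reporting it, here is a TypeScript sketch of a strict check on a raw payload. The `RawTimePoint` shape, the `checkRawTimePoint` function, and the `MISSING_*` and `SUSPICIOUS_ZERO` codes are assumptions made for illustration; the backend itself would implement the equivalent in C#.

```typescript
// Hypothetical raw payload shape: each field may be missing or null on the wire.
interface RawTimePoint {
  time?: number | null;
  average?: number | null;
  error?: number | null;
}

interface TimePointCheck {
  errors: string[];   // reasons to reject the submission
  warnings: string[]; // worth logging, but not a rejection
}

function checkRawTimePoint(tp: RawTimePoint): TimePointCheck {
  const result: TimePointCheck = { errors: [], warnings: [] };
  const fieldNames: (keyof RawTimePoint)[] = ["time", "average", "error"];
  for (const fieldName of fieldNames) {
    const value = tp[fieldName];
    if (value === undefined || value === null) {
      // Unlike `value || 0`, a missing value is reported instead of becoming 0.
      result.errors.push(`MISSING_${fieldName.toUpperCase()}`);
    }
  }
  // Average and error both exactly 0 may be a coerced missing value: warn, do not reject.
  if (tp.average === 0 && tp.error === 0) {
    result.warnings.push("SUSPICIOUS_ZERO");
  }
  return result;
}

const missingFields = checkRawTimePoint({ time: 24, average: null });
console.log(missingFields.errors); // ["MISSING_AVERAGE", "MISSING_ERROR"]

const allZero = checkRawTimePoint({ time: 0, average: 0, error: 0 });
console.log(allZero.warnings); // ["SUSPICIOUS_ZERO"]
```

Keeping zero values as warnings rather than errors reflects the earlier note that an error of 0 can be legitimate (for example, n=1 or an exact measurement).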

Keeping Frontend/Backend Logic in Sync¶

The Synchronization Challenge¶

The core problem: business logic for question visibility and annotation validation must execute identically in two different technology stacks (Angular/TypeScript and .NET/C#).

Consequences of divergence:

Frontend shows question as visible → Backend rejects annotation as "for hidden question"
Frontend allows submission → Backend rejects with validation error
Seeded data passes backend validation → Frontend displays incorrectly
User frustration, data loss, support tickets

Approach Analysis¶

Option 1: Code Generation from Shared Schema¶

Concept: Define validation rules in a technology-agnostic DSL (Domain-Specific Language) or schema, then generate both TypeScript and C# implementations.

Tools to Consider:

Tool	Approach	Pros	Cons
TypeSpec	Microsoft's API definition language	Generates OpenAPI + client code, strong typing	Learning curve, limited to API contracts
JSON Schema	Define rules in JSON, interpret at runtime	Universal, tool-rich ecosystem	Runtime interpretation slower than compiled code
Custom DSL	Create project-specific rule language	Perfectly tailored to needs	High development cost, maintenance burden
Protocol Buffers	Google's serialization format	Fast, cross-language	Primarily for data, not business logic

Code Generation Implementation Sketch:

# conditionality-rules.yaml (source of truth)
rules:
  - id: species-conditional-details
    parent_question: Species
    child_question: SpeciesDetails
    condition:
      type: option
      target_options: ["Other rodent", "Non-rodent"]

  - id: control-description-required
    parent_question: IsControl
    child_question: ControlDescription
    condition:
      type: boolean
      target_value: true

// Generated TypeScript
export function isQuestionVisible(
  questionId: string,
  parentAnnotation: Annotation | null
): boolean {
  const rule = RULES[questionId];
  if (!rule) return true;
  // ... generated evaluation logic
}

// Generated C#
public bool IsQuestionVisible(Guid questionId, Annotation? parentAnnotation)
{
    // TryGetValue avoids the KeyNotFoundException a dictionary indexer throws for unknown keys
    if (!_rules.TryGetValue(questionId, out var rule)) return true;
    // ... generated evaluation logic
}

Verdict: 🔴 NOT RECOMMENDED for this project

Rationale:

Complexity vs. Benefit: The visibility logic is ~200 lines of code. A code generator adds more complexity than it saves.
Build Pipeline Complication: Requires generator to run before both frontend and backend builds
Debugging Difficulty: Generated code is harder to debug than handwritten code
Edge Cases: Generators struggle with nuanced behavior (default checkbox status, reconciliation mode)
Team Familiarity: Team would need to learn and maintain the generator

Option 2: Shared JSON Configuration (Runtime Interpretation)¶

Concept: Store conditionality rules in JSON, load at runtime in both frontend and backend, interpret with thin platform-specific wrappers.

Implementation:

// conditionality-rules.json (shared, checked into repo)
{
  "version": "1.0.0",
  "rules": {
    "species-details": {
      "parentQuestionId": "00000000-0000-0000-0000-000000000300",
      "conditionType": "option",
      "targetOptions": ["Other rodent", "Non-rodent"]
    },
    "control-description": {
      "parentQuestionId": "...",
      "conditionType": "boolean",
      "targetValue": true
    }
  }
}

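// Frontend: rule-interpreter.service.ts
import { Injectable } from '@angular/core';
import { HttpClient } from '@angular/common/http';
import { firstValueFrom } from 'rxjs';

@Injectable()
export class RuleInterpreterService {
  // Populated by loadRules(); empty until the JSON has been fetched
  private rules: ConditionRules['rules'] = {};

  constructor(private http: HttpClient) {}

  async loadRules(): Promise<ConditionRules> {
    // HttpClient.get returns an Observable; firstValueFrom converts it to a Promise
    const config = await firstValueFrom(
      this.http.get<ConditionRules>('/assets/conditionality-rules.json'));
    this.rules = config.rules;
    return config;
  }

  isVisible(questionId: string, parentAnswer: unknown): boolean {
    const rule = this.rules[questionId];
    return this.evaluateCondition(rule, parentAnswer);
  }
}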

// Backend: RuleInterpreterService.cs
public class RuleInterpreterService : IQuestionVisibilityService
{
    private readonly ConditionRules _rules;

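    public RuleInterpreterService()
    {
        var json = File.ReadAllText("conditionality-rules.json");
        // Web defaults make property matching case-insensitive, so camelCase JSON
        // keys such as "rules" bind to PascalCase properties such as Rules
        var options = new JsonSerializerOptions(JsonSerializerDefaults.Web);
        _rules = JsonSerializer.Deserialize<ConditionRules>(json, options)
            ?? throw new InvalidOperationException("conditionality-rules.json deserialized to null");
    }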

    public bool IsQuestionVisible(Guid questionId, Annotation? parentAnnotation)
    {
        var rule = _rules.Rules.GetValueOrDefault(questionId.ToString());
        return EvaluateCondition(rule, parentAnnotation);
    }
}
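Both interpreters delegate to a condition-evaluation step that is not shown. The TypeScript sketch below suggests what it might look like for the two condition types in the JSON example. It rests on several assumptions: a hypothetical `ConditionRule` union, "any match" semantics for option lists (as in TC-003), a missing rule meaning "always visible", and an unanswered parent hiding the child.

```typescript
// Hypothetical rule shapes matching the fields in conditionality-rules.json.
type ConditionRule =
  | { conditionType: "boolean"; targetValue: boolean }
  | { conditionType: "option"; targetOptions: string[] };

function evaluateCondition(rule: ConditionRule | undefined, parentAnswer: unknown): boolean {
  // Assumption: a question with no rule is unconditional, hence visible.
  if (rule === undefined) return true;
  // Assumption: an unanswered parent hides the child (defaults are not modeled here).
  if (parentAnswer === null || parentAnswer === undefined) return false;

  if (rule.conditionType === "boolean") {
    return parentAnswer === rule.targetValue;
  }

  // Option rules: normalize single-select and multi-select answers to an array,
  // then require at least one overlap with the target options.
  const targetOptions = rule.targetOptions;
  const selected: unknown[] = Array.isArray(parentAnswer) ? parentAnswer : [parentAnswer];
  return selected.some(
    (option) => typeof option === "string" && targetOptions.includes(option),
  );
}

const speciesDetailsRule: ConditionRule = {
  conditionType: "option",
  targetOptions: ["Other rodent", "Non-rodent"],
};
console.log(evaluateCondition(speciesDetailsRule, "Non-rodent")); // true
console.log(evaluateCondition(speciesDetailsRule, ["Mouse"]));    // false
```

Dynamic behaviors such as reconciliation mode or lookup options would still need code outside this function, which is the limitation listed under Cons.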

Verdict: 🟡 PARTIALLY RECOMMENDED - Good for simple static rules

Pros:

Single source of truth for conditionality rules
Changes to rules don't require code changes
Easy to understand and maintain
Works well for static, declarative rules

Cons:

Dynamic logic (reconciliation mode, lookup options) can't be expressed in JSON
Thin interpreter must still be written and tested in both languages
Runtime parsing adds small overhead
Some behaviors depend on question type, not just rule configuration

Best For: System question conditionality rules that rarely change.

Option 3: Contract/Reference Tests (Recommended)¶

Concept: Accept that implementations will be independent, but enforce correctness through shared test specifications that both implementations must pass.

Implementation:

# conditionality-test-specification.md

## Test Case TC-001: Boolean Condition - True Match
**Setup**:
- Parent question type: Boolean (checkbox)
- Parent annotation: answer = true
- Child question condition: targetParentBoolean = true

**Expected Result**: Child question IS visible

---

## Test Case TC-002: Boolean Condition - False No Match
**Setup**:
- Parent question type: Boolean (checkbox)
- Parent annotation: answer = false
- Child question condition: targetParentBoolean = true

**Expected Result**: Child question IS NOT visible

---

## Test Case TC-003: Option Condition - Array Contains Match
**Setup**:
- Parent question type: Checklist (multiple selection)
- Parent annotation: answer = ["Option A", "Option C"]
- Child question condition: targetParentOptions = ["Option A", "Option B"]

**Expected Result**: Child question IS visible (at least one match)

---

## Test Case TC-015: Parent Not Answered - Default Hidden
**Setup**:
- Parent question type: Any
- Parent annotation: null (not answered)
- Child question condition: any

**Expected Result**: Child question IS NOT visible (unless parent has default)

// Frontend: conditionality.spec.ts
describe('Conditionality - TC-001', () => {
  it('Boolean Condition - True Match shows child', () => {
    const parentAnn = { answer: true };
    const childQuestion = {
      target: { conditionalParentAnswers: { targetParentBoolean: true } }
    };
    expect(service.isQuestionVisible(childQuestion, parentAnn)).toBe(true);
  });
});

describe('Conditionality - TC-002', () => {
  it('Boolean Condition - False No Match hides child', () => {
    const parentAnn = { answer: false };
    const childQuestion = {
      target: { conditionalParentAnswers: { targetParentBoolean: true } }
    };
    expect(service.isQuestionVisible(childQuestion, parentAnn)).toBe(false);
  });
});

// Backend: ConditionalityTests.cs
[Fact]
public void TC001_BooleanCondition_TrueMatch_ShowsChild()
{
    var parentAnn = new BoolAnnotation(answer: true, ...);
    var childQuestion = CreateQuestion(new Target {
        ConditionalParentAnswers = new BooleanConditionalTargetParentOptions {
            TargetParentBoolean = true
        }
    });
    Assert.True(_service.IsQuestionVisible(childQuestion, parentAnn));
}

[Fact]
public void TC002_BooleanCondition_FalseNoMatch_HidesChild()
{
    var parentAnn = new BoolAnnotation(answer: false, ...);
    var childQuestion = CreateQuestion(new Target {
        ConditionalParentAnswers = new BooleanConditionalTargetParentOptions {
            TargetParentBoolean = true
        }
    });
    Assert.False(_service.IsQuestionVisible(childQuestion, parentAnn));
}

Verdict: ✅ RECOMMENDED as primary approach

Pros:

No build-time dependencies or generators
Each platform uses idiomatic code (easier to debug and maintain)
Test specification serves as documentation
Captures nuanced edge cases that DSLs can't express
CI catches divergence immediately when tests fail
Works well with existing test infrastructure

Cons:

Requires discipline to maintain parallel test suites
Initial effort to write comprehensive test specification
Tests must be kept in sync manually

Implementation Strategy:

Create /docs/features/annotation-questions/conditionality-test-specification.md
Document every conditionality rule as a test case with ID (TC-XXX)
Write backend tests first (since that's the new code)
Verify frontend tests cover same cases (add if missing)
CI runs both test suites - divergence causes build failure
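One way to ease the manual-sync burden listed under Cons is to also keep the TC-XXX cases as data that each test suite iterates over. The sketch below is an assumption layered on top of the strategy, not part of it: the `ContractCase` shape, `runContractCases`, and the stand-in visibility function are hypothetical, and the cases are transcribed from TC-001, TC-002, TC-003, and TC-015.

```typescript
// Hypothetical fixture shape: one entry per TC-XXX case in the specification.
interface ContractCase {
  id: string;
  parentAnswer: boolean | string[] | null;
  condition: { targetParentBoolean?: boolean; targetParentOptions?: string[] };
  expectedVisible: boolean;
}

const contractCases: ContractCase[] = [
  { id: "TC-001", parentAnswer: true, condition: { targetParentBoolean: true }, expectedVisible: true },
  { id: "TC-002", parentAnswer: false, condition: { targetParentBoolean: true }, expectedVisible: false },
  {
    id: "TC-003",
    parentAnswer: ["Option A", "Option C"],
    condition: { targetParentOptions: ["Option A", "Option B"] },
    expectedVisible: true,
  },
  // TC-015 allows any condition; a boolean one is used here, and the
  // "unless parent has default" caveat is not modeled.
  { id: "TC-015", parentAnswer: null, condition: { targetParentBoolean: true }, expectedVisible: false },
];

type VisibilityFn = (c: ContractCase) => boolean;

// Returns the IDs of cases where the implementation disagrees with the specification.
function runContractCases(cases: ContractCase[], isVisible: VisibilityFn): string[] {
  return cases.filter((c) => isVisible(c) !== c.expectedVisible).map((c) => c.id);
}

// Stand-in implementation so the sketch runs; a real suite would call the
// platform's visibility service here instead.
const standInIsVisible: VisibilityFn = ({ parentAnswer, condition }) => {
  if (parentAnswer === null) return false;
  if (typeof parentAnswer === "boolean") return parentAnswer === condition.targetParentBoolean;
  const targets = condition.targetParentOptions ?? [];
  return parentAnswer.some((option) => targets.includes(option));
};

console.log(runContractCases(contractCases, standInIsVisible)); // []
```

The same fixture data could be serialized to JSON and loaded by the C# test suite, so a new case only has to be written once; whether that is worth the extra file is a judgment call for the team.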

Option 4: Backend as Source of Truth (API-Driven)¶

Concept: Frontend asks backend which questions are visible, eliminating duplicate logic.

Implementation:

// Frontend service
async getVisibleQuestions(
  stageId: string,
  currentAnnotations: Annotation[]
): Promise<QuestionVisibility[]> {
  // HttpClient.post returns an Observable; firstValueFrom (from 'rxjs') converts it to a Promise
  return firstValueFrom(
    this.http.post<QuestionVisibility[]>(
      `/api/stages/${stageId}/visible-questions`,
      { annotations: currentAnnotations }
    )
  );
}

// Backend endpoint
[HttpPost("stages/{stageId}/visible-questions")]
public IActionResult GetVisibleQuestions(
    Guid stageId,
    [FromBody] VisibilityRequest request)
{
    var stage = _stageRepository.Get(stageId);
    var visibleIds = _visibilityService.GetVisibleQuestionIds(
        stage, request.Annotations);

    return Ok(visibleIds.Select(id => new QuestionVisibility(id, true)));
}

Verdict: 🔴 NOT RECOMMENDED for this project

Rationale:

Performance: Every keystroke or option selection would require API call
User Experience: Noticeable latency when toggling checkboxes
Offline Degradation: Form becomes unusable without connectivity
Complexity: Requires optimistic UI updates with reconciliation
Existing Architecture: Frontend is designed for offline-first form editing

When This Works: Single-page apps with simple forms and always-online requirements.

Option 5: Shared Library via WebAssembly¶

Concept: Write visibility logic once in a language that compiles to both .NET and WebAssembly (WASM), share the same binary.

Options:

C# compiled to WASM (Blazor)
Rust compiled to WASM + linked from .NET
AssemblyScript (TypeScript-like → WASM)

Verdict: 🔴 NOT RECOMMENDED for this project

Rationale:

Massive Overhead: Blazor WASM bundles are 5-10MB
Build Complexity: Dual-target compilation is fragile
Angular Integration: WASM interop from Angular is awkward
Debugging Nightmare: WASM stack traces are nearly unreadable
Team Expertise: Requires specialized knowledge

When This Works: Projects already using Blazor WebAssembly or performance-critical algorithms.

Recommended Approach: Hybrid Strategy¶

Based on the analysis, the recommended approach combines Contract Tests as the primary strategy with JSON Configuration for stable system question rules:

Tier 1: Contract Tests (Primary)¶

For all conditionality logic:

Maintain /docs/features/annotation-questions/conditionality-test-specification.md
Backend: QuestionVisibilityServiceTests.cs with TC-XXX naming
Frontend: annotation-form.service.spec.ts with TC-XXX naming
CI validates both pass

Test Specification Coverage:

15-20 test cases covering all conditionality scenarios
Edge cases: null parents, default values, reconciliation mode
Array handling: any match, case sensitivity
System question special behaviors

Tier 2: JSON Configuration (Optional Enhancement)¶

For system question static rules only:

If the team later wants to reduce duplication for the 18 system questions:

Create system-question-rules.json
Both platforms load and interpret at startup
Custom questions still use code-based evaluation
Saves ~50 lines of duplicated rule definition

Not recommended initially - adds complexity for marginal benefit.

Tier 3: Validation at API Boundary¶

Catch divergence at submission time:

Backend validates all submissions
Detailed error messages help debug frontend issues
Logging captures patterns of failed validations
Alerts if frontend is sending invalid data
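The point about capturing patterns of failed validations can be made concrete by grouping errors by code before logging. A TypeScript sketch follows, assuming a hypothetical `LoggedValidationError` shape with a `code` field (the C# `ValidationError` used in this document appears to carry a code, a message, and an optional context) and a hypothetical `countByCode` helper:

```typescript
// Hypothetical, simplified view of a validation error for logging purposes.
interface LoggedValidationError {
  code: string;
  message: string;
}

// Counts errors per code so logs and alerts can surface the most frequent failures.
function countByCode(errors: LoggedValidationError[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const error of errors) {
    counts.set(error.code, (counts.get(error.code) ?? 0) + 1);
  }
  return counts;
}

const errorCounts = countByCode([
  { code: "EMPTY_TIMEPOINTS", message: "OutcomeData must have at least one TimePoint" },
  { code: "NEGATIVE_ERROR", message: "TimePoint.Error should be non-negative, got -1" },
  { code: "NEGATIVE_ERROR", message: "TimePoint.Error should be non-negative, got -2" },
]);
console.log(errorCounts.get("NEGATIVE_ERROR")); // 2
```

A sudden rise in a single code after a frontend release is the kind of signal the alerting bullet refers to.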

Sync Maintenance Process¶

When modifying conditionality logic:

## Pre-Implementation Checklist

- [ ] Update test specification document (TC-XXX)
- [ ] Write/update backend unit test
- [ ] Write/update frontend unit test
- [ ] Verify both tests pass locally
- [ ] PR includes all three changes together
- [ ] CI validates cross-platform consistency

## Post-Implementation Verification

- [ ] Deploy to staging
- [ ] Manual test: Create annotation matching the rule
- [ ] Verify visibility matches expectation in UI
- [ ] Verify no validation errors on submission

Summary: Recommended Sync Strategy¶

Aspect	Recommendation	Rationale
Primary Approach	Contract/Reference Tests	Proven, low overhead, catches divergence
Documentation	Test Specification Document	Serves as both spec and documentation
Build Integration	CI runs both test suites	Automatic divergence detection
Code Generation	NOT recommended	Overkill for ~200 lines of logic
Shared Runtime	NOT recommended	Performance and complexity concerns
API-Driven Visibility	NOT recommended	Latency unacceptable for form UX

Related Documentation¶

Annotation Questions Business Logic - Comprehensive domain analysis
Formal Specification - Precise rule definitions
Architecture Analysis - Where logic should live
Enhanced Database Seeding - Seeding strategy overview
