
Annotation Questions Formal Specification

Overview

This document provides a formal specification of annotation question business rules with precise definitions, cross-language validation strategies, and referential integrity requirements. It addresses the need for a single source of truth that can be enforced consistently across frontend (TypeScript), backend (C#), and database seeding.


1. Terminology Glossary

Entity Definitions

| Term | Definition | Example |
| --- | --- | --- |
| Annotation Question | A form field definition that collects data from reviewers during systematic review annotation | "What was the sample size?" |
| Annotation | A recorded answer to an annotation question for a specific study | Answer: "42" for study ABC |
| Unit | A named instance created by answering a Label Question | "Aspirin 100mg" (a Treatment unit) |
| Category | A logical grouping of annotation questions by experimental domain | "Treatment", "Outcome Assessment" |
| Stage | A review workflow phase where questions are presented. Has an Extraction boolean that controls whether system questions (data extraction questions) are shown. | "Screening" (extraction=false), "Data Extraction" (extraction=true) |
| ExtractionInfo | Container within a Study that holds all annotations, sessions, and outcome data | Study.ExtractionInfo.Annotations |
| OutcomeData | Quantitative data extracted from studies, stored per Experiment-Cohort-Outcome combination. Contains TimePoints, statistical metadata, and unit references. | Study.ExtractionInfo.OutcomeData |
| TimePoint | A measurement at a specific time, containing time, average (mean/median), and error (SD/SEM/IQR) values | { time: 24, average: 45.3, error: 12.1 } |

Question Types

| Term | Definition | Identifying Property |
| --- | --- | --- |
| System Question | Pre-defined, immutable question essential for domain structure | system: true |
| Custom Question | User-created question to capture project-specific data | system: false |
| Label Question | System question that creates named unit instances | labelQuestion: true |
| Control Question | Boolean question indicating whether a unit is a "control" | questionType: boolean, specific GUIDs |
| Lookup Question | Question that cross-references units from another category | annotationLookup: true |
| Root Question | A question with no parent (top-level in category) | target: null OR target.parentId: null |
| Child Question | A question nested under a parent question | target.parentId: {guid} |
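
The identifying properties in this table can be read as simple predicates. The following is a minimal sketch in TypeScript, assuming a simplified question shape; `QuestionShape`, `TargetShape`, and the helper names are hypothetical and are not the project's actual interfaces.

```typescript
// Hypothetical, simplified shapes for illustration; not the project's actual interfaces.
interface TargetShape {
  parentId: string | null;
}

interface QuestionShape {
  id: string;
  system: boolean;
  labelQuestion?: boolean;
  annotationLookup?: boolean;
  target: TargetShape | null;
}

// Root question: target is null, or target.parentId is null.
function isRootQuestion(q: QuestionShape): boolean {
  return q.target === null || q.target.parentId === null;
}

// Child question: target.parentId holds the parent's GUID.
function isChildQuestion(q: QuestionShape): boolean {
  return !isRootQuestion(q);
}

// Custom question: system is false.
function isCustomQuestion(q: QuestionShape): boolean {
  return !q.system;
}

// Label question: a system question flagged with labelQuestion: true.
function isLabelQuestion(q: QuestionShape): boolean {
  return q.system && q.labelQuestion === true;
}

// Lookup question: flagged with annotationLookup: true.
function isLookupQuestion(q: QuestionShape): boolean {
  return q.annotationLookup === true;
}
```

Control Questions are left out of the sketch because the table identifies them by specific GUIDs rather than by a single flag.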

Relationship Terms

| Term | Definition |
| --- | --- |
| Parent Question | The question under which a child question is nested |
| Conditional Parent | A parent question whose answer value determines whether a child question is displayed |
| Lookup Source | The Label Question whose annotations populate a Lookup Question |
| Lookup Target | The Lookup Question that displays annotations from a source |

2. Formal Rule Specification

2.1 Category Placement Rules

These rules define WHERE custom questions can be placed within each category's hierarchy.

Key Concept: Custom questions form trees within each category. The rules below apply to first-level custom questions (direct children of a system question, or root-level questions in Study). Nested custom questions (children of other custom questions) take that custom question as their parent instead.

# annotation-question-rules.yaml
# Single source of truth for category placement rules

categories:
  Study:
    # Study is unique: no system parent, custom questions are true root-level
    customQuestionPlacement:
      type: "root"
      constraint: "First-level custom questions MUST have target = null or target.parentId = null"
      systemParent: null
      nestedQuestions:
        allowed: true
        parentTo: "any custom question in Study category"
    hasLabelQuestion: false
    hasControlQuestion: false

  "Disease Model Induction":
    customQuestionPlacement:
      type: "control-descendant"  # descendants, not just direct children
      constraint: "First-level questions parent to modelControl; nested parent to custom questions"
      systemParent: "b18aa936-a4c6-446b-ac98-88ac38930878"  # modelControl
      nestedQuestions:
        allowed: true
        parentTo: "any custom question in Disease Model Induction category"
    controlParameterRule:
      type: "required-for-first-level"
      constraint: "First-level questions MUST specify conditionalParentAnswers (true/false/null)"
      allowedValues: [true, false, null]
      semantics:
        true: "Show only when parent checkbox is CHECKED (is control procedure)"
        false: "Show only when parent checkbox is UNCHECKED (is not control)"
        null: "Show regardless of parent checkbox value"
    hasLabelQuestion: true
    labelQuestionGuid: "bdb6e257-5a08-42ef-aad0-829668679b0e"
    hasControlQuestion: true
    controlQuestionGuid: "b18aa936-a4c6-446b-ac98-88ac38930878"

  Treatment:
    customQuestionPlacement:
      type: "control-descendant"
      constraint: "First-level questions parent to treatmentControl; nested parent to custom questions"
      systemParent: "d04ec2d7-3e10-4847-9999-befe7ee4c454"  # treatmentControl
      nestedQuestions:
        allowed: true
        parentTo: "any custom question in Treatment category"
    controlParameterRule:
      type: "required-for-first-level"
      constraint: "First-level questions MUST specify conditionalParentAnswers (true/false/null)"
      allowedValues: [true, false, null]
      semantics:
        true: "Show only when parent checkbox is CHECKED (is control procedure)"
        false: "Show only when parent checkbox is UNCHECKED (is not control)"
        null: "Show regardless of parent checkbox value"
    hasLabelQuestion: true
    labelQuestionGuid: "b02e3072-74f0-44e0-a468-f472b3b09991"
    hasControlQuestion: true
    controlQuestionGuid: "d04ec2d7-3e10-4847-9999-befe7ee4c454"

  "Outcome Assessment":
    customQuestionPlacement:
      type: "label-descendant"
      constraint: "First-level questions parent to outcomeLabel; nested parent to custom questions"
      systemParent: "dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f"  # outcomeLabel
      nestedQuestions:
        allowed: true
        parentTo: "any custom question in Outcome Assessment category"
    controlParameterRule:
      type: "forbidden"
      constraint: "MUST NOT specify control-based conditionalParentAnswers"
    hasLabelQuestion: true
    labelQuestionGuid: "dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f"
    hasControlQuestion: false

  Cohort:
    customQuestionPlacement:
      type: "label-descendant"
      constraint: "First-level questions parent to cohortLabel; nested parent to custom questions"
      systemParent: "62c852ad-3390-48a4-ac13-439bf6b6587f"  # cohortLabel
      nestedQuestions:
        allowed: true
        parentTo: "any custom question in Cohort category"
    controlParameterRule:
      type: "forbidden"
      constraint: "MUST NOT specify control-based conditionalParentAnswers"
    hasLabelQuestion: true
    labelQuestionGuid: "62c852ad-3390-48a4-ac13-439bf6b6587f"
    hasControlQuestion: false

  Experiment:
    customQuestionPlacement:
      type: "label-descendant"
      constraint: "First-level questions parent to experimentLabel; nested parent to custom questions"
      systemParent: "7c555b6e-1fb6-4036-9982-c09a5db82ace"  # experimentLabel
      nestedQuestions:
        allowed: true
        parentTo: "any custom question in Experiment category"
    controlParameterRule:
      type: "forbidden"
      constraint: "MUST NOT specify control-based conditionalParentAnswers"
    hasLabelQuestion: true
    labelQuestionGuid: "7c555b6e-1fb6-4036-9982-c09a5db82ace"
    hasControlQuestion: false

  Hidden:
    customQuestionPlacement:
      type: "none"
      constraint: "Custom questions NOT ALLOWED in Hidden category"
      nestedQuestions:
        allowed: false
    hasLabelQuestion: true
    labelQuestionGuid: "7ee21ff9-e309-4387-8d30-719201497682"
    hasControlQuestion: false
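
On the TypeScript side, the same rules could be mirrored as a typed constant so that the system parent for a first-level question is looked up rather than hard-coded at each call site. The sketch below is only illustrative: the names `CATEGORY_PLACEMENT_RULES`, `CategoryPlacementRule`, and `firstLevelParentFor` are hypothetical, and the value `"unspecified"` is an assumption for categories where the YAML defines no `controlParameterRule`. The GUIDs are copied from the YAML above.

```typescript
type PlacementType = "root" | "control-descendant" | "label-descendant" | "none";
// "unspecified" is an assumption for categories with no controlParameterRule in the YAML.
type ControlParameterRule = "required-for-first-level" | "forbidden" | "unspecified";

interface CategoryPlacementRule {
  placement: PlacementType;
  systemParent: string | null; // GUID that first-level custom questions parent to
  controlParameterRule: ControlParameterRule;
}

const CATEGORY_PLACEMENT_RULES: Record<string, CategoryPlacementRule> = {
  "Study": { placement: "root", systemParent: null, controlParameterRule: "unspecified" },
  "Disease Model Induction": {
    placement: "control-descendant",
    systemParent: "b18aa936-a4c6-446b-ac98-88ac38930878", // modelControl
    controlParameterRule: "required-for-first-level",
  },
  "Treatment": {
    placement: "control-descendant",
    systemParent: "d04ec2d7-3e10-4847-9999-befe7ee4c454", // treatmentControl
    controlParameterRule: "required-for-first-level",
  },
  "Outcome Assessment": {
    placement: "label-descendant",
    systemParent: "dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f", // outcomeLabel
    controlParameterRule: "forbidden",
  },
  "Cohort": {
    placement: "label-descendant",
    systemParent: "62c852ad-3390-48a4-ac13-439bf6b6587f", // cohortLabel
    controlParameterRule: "forbidden",
  },
  "Experiment": {
    placement: "label-descendant",
    systemParent: "7c555b6e-1fb6-4036-9982-c09a5db82ace", // experimentLabel
    controlParameterRule: "forbidden",
  },
  "Hidden": { placement: "none", systemParent: null, controlParameterRule: "unspecified" },
};

// Returns the GUID a first-level custom question must parent to (null for Study).
// Throws for unknown categories and for categories that forbid custom questions.
function firstLevelParentFor(category: string): string | null {
  const rule: CategoryPlacementRule | undefined =
    Object.prototype.hasOwnProperty.call(CATEGORY_PLACEMENT_RULES, category)
      ? CATEGORY_PLACEMENT_RULES[category]
      : undefined;
  if (rule === undefined) {
    throw new Error("Unknown category: " + category);
  }
  if (rule.placement === "none") {
    throw new Error("Custom questions are not allowed in " + category);
  }
  return rule.systemParent;
}
```

For example, `firstLevelParentFor("Treatment")` returns the treatmentControl GUID, while `firstLevelParentFor("Hidden")` throws.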

2.2 Custom Question Placement Rules (Precise)

CRITICAL DISTINCTION: First-Level vs Nested Custom Questions

Custom questions in each category form trees that must be rooted at specific system questions. The rules differ depending on whether you're creating a first-level custom question (direct child of system question) or a nested custom question (child of another custom question).

┌─────────────────────────────────────────────────────────────────────────────┐
│                    CUSTOM QUESTION TREE STRUCTURE                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  System Question (e.g., Treatment Control)                                   │
│       │                                                                      │
│       ├── First-Level Custom Question A  ← Created via category header button│
│       │       │                                                              │
│       │       ├── Nested Custom Question A1  ← Created via "Add Related"    │
│       │       │       │                                                      │
│       │       │       └── Deeply Nested A1a  ← Created via "Add Related"    │
│       │       │                                                              │
│       │       └── Nested Custom Question A2                                  │
│       │                                                                      │
│       └── First-Level Custom Question B                                      │
│               │                                                              │
│               └── Nested Custom Question B1                                  │
│                                                                              │
│  KEY INSIGHT: Custom questions can nest ARBITRARILY DEEP within a category. │
│  The only constraint is that the ROOT of each tree is a system question.     │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Rule Summary:

  • First-level custom questions (created via category header "Add Question" button): target.parentId MUST equal the category's system question GUID
  • Nested custom questions (created via "Add Related" button on any custom question): target.parentId equals the parent custom question's ID - NOT the system question GUID
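  • All custom questions must be descendants of their category's system question (transitively through the parent chain), but only first-level questions reference the system question directly. Study is the exception: it has no system parent, so its first-level custom questions are root-level.
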
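
The transitive "descendant" requirement amounts to walking the parent chain until the system question is found. The sketch below shows one way to do that, assuming a hypothetical minimal shape (`TreeQuestion`) and a lookup table keyed by question ID; it is not taken from the codebase.

```typescript
// Hypothetical minimal shape: each question knows only its own id and its parent's id.
interface TreeQuestion {
  id: string;
  parentId: string | null;
}

// Returns true when `ancestorId` appears somewhere on the parent chain of `questionId`.
// A missing parent or a circular reference ends the walk with false.
function isDescendantOf(
  questionId: string,
  ancestorId: string,
  byId: Record<string, TreeQuestion | undefined>
): boolean {
  const visited: Record<string, boolean> = {};
  let current: TreeQuestion | undefined = byId[questionId];
  while (current !== undefined && current.parentId !== null) {
    if (current.parentId === ancestorId) return true;
    if (visited[current.id] === true) return false; // circular reference
    visited[current.id] = true;
    current = byId[current.parentId];
  }
  return false;
}

// Example tree from the diagram above: treatmentControl -> A -> A1 -> A1a.
const TREATMENT_CONTROL = "d04ec2d7-3e10-4847-9999-befe7ee4c454";
const exampleTree: Record<string, TreeQuestion | undefined> = {
  A: { id: "A", parentId: TREATMENT_CONTROL },
  A1: { id: "A1", parentId: "A" },
  A1a: { id: "A1a", parentId: "A1" },
};
```

Here `isDescendantOf("A1a", TREATMENT_CONTROL, exampleTree)` holds even though only question A references the system GUID directly.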
╔═══════════════════════════════════════════════════════════════════════════════╗
║                    FIRST-LEVEL CUSTOM QUESTION RULES                           ║
║          (Questions created via category header "Add Question" button)         ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  CATEGORY: Study                                                               ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  RULE: target MUST be null OR target.parentId MUST be null                     ║
║  WHY:  Study has no system parent - custom questions are root-level            ║
║                                                                                ║
║  VALID:   { target: null }                                                     ║
║  VALID:   { target: { parentId: null } }                                       ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  CATEGORY: Disease Model Induction                                             ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  FIRST-LEVEL RULE:                                                             ║
║    target.parentId MUST equal "b18aa936-a4c6-446b-ac98-88ac38930878"           ║
║    (modelControl GUID)                                                         ║
║    AND conditionalParentAnswers specifies control visibility                   ║
║                                                                                ║
║  VALID:   { target: { parentId: "b18aa936...", conditionalParentAnswers:       ║
║              { conditionType: 0, targetParentBoolean: true } } }  # Control    ║
║  VALID:   { target: { parentId: "b18aa936...", conditionalParentAnswers:       ║
║              { conditionType: 0, targetParentBoolean: false } } } # Non-control║
║  VALID:   { target: { parentId: "b18aa936...", conditionalParentAnswers:       ║
║              null } }  # Both                                                  ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  CATEGORY: Treatment                                                           ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  FIRST-LEVEL RULE:                                                             ║
║    target.parentId MUST equal "d04ec2d7-3e10-4847-9999-befe7ee4c454"           ║
║    (treatmentControl GUID)                                                     ║
║    AND conditionalParentAnswers specifies control visibility                   ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  CATEGORY: Outcome Assessment                                                  ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  FIRST-LEVEL RULE:                                                             ║
║    target.parentId MUST equal "dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f"           ║
║    (outcomeLabel GUID)                                                         ║
║    AND conditionalParentAnswers MUST be null (no control parameter)            ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  CATEGORY: Cohort                                                              ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  FIRST-LEVEL RULE:                                                             ║
║    target.parentId MUST equal "62c852ad-3390-48a4-ac13-439bf6b6587f"           ║
║    (cohortLabel GUID)                                                          ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  CATEGORY: Experiment                                                          ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  FIRST-LEVEL RULE:                                                             ║
║    target.parentId MUST equal "7c555b6e-1fb6-4036-9982-c09a5db82ace"           ║
║    (experimentLabel GUID)                                                      ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  CATEGORY: Hidden                                                              ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  RULE: Custom questions NOT ALLOWED                                            ║
║                                                                                ║
╚═══════════════════════════════════════════════════════════════════════════════╝
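
The first-level rules in the box reduce to two checks: the target's parent must match the category's system parent, and a control condition may only be present where the category allows one. The following is a hedged sketch of such a check. `FirstLevelRule`, `FirstLevelTarget`, and `validateFirstLevelTarget` are hypothetical names, and the error strings are illustrative rather than taken from the application.

```typescript
// Hypothetical shapes mirroring the examples in the box above.
interface ConditionSketch {
  conditionType: number;
  targetParentBoolean: boolean;
}

interface FirstLevelTarget {
  parentId: string | null;
  conditionalParentAnswers?: ConditionSketch | null;
}

interface FirstLevelRule {
  customAllowed: boolean;           // false for Hidden
  systemParent: string | null;      // null for Study
  controlConditionAllowed: boolean; // true for Disease Model Induction and Treatment
}

// Returns a list of violations; an empty list means the target is valid.
function validateFirstLevelTarget(rule: FirstLevelRule, target: FirstLevelTarget | null): string[] {
  if (!rule.customAllowed) {
    return ["Custom questions are not allowed in this category"];
  }
  const errors: string[] = [];
  // A null target is treated the same as a target with a null parentId.
  const parentId = target === null ? null : target.parentId;
  if (parentId !== rule.systemParent) {
    errors.push(
      rule.systemParent === null
        ? "target must be null or target.parentId must be null"
        : "target.parentId must equal " + rule.systemParent
    );
  }
  const condition = target === null ? undefined : target.conditionalParentAnswers;
  if (!rule.controlConditionAllowed && condition !== null && condition !== undefined) {
    errors.push("conditionalParentAnswers must be null in this category");
  }
  return errors;
}

const treatmentRule: FirstLevelRule = {
  customAllowed: true,
  systemParent: "d04ec2d7-3e10-4847-9999-befe7ee4c454", // treatmentControl
  controlConditionAllowed: true,
};
const outcomeRule: FirstLevelRule = {
  customAllowed: true,
  systemParent: "dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f", // outcomeLabel
  controlConditionAllowed: false,
};
```

Returning a list of messages instead of throwing lets the frontend show all violations at once while the backend can still reject on any non-empty result.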

2.3 Nested Custom Question Rules

Nested custom questions (created via "Add Related" button on an existing question) follow a different set of rules:

╔═══════════════════════════════════════════════════════════════════════════════╗
║  CRITICAL: NESTED QUESTIONS DO NOT PARENT TO SYSTEM QUESTIONS                 ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  When a custom question is created via "Add Related" on another custom        ║
║  question, it parents DIRECTLY to that custom question.                       ║
║                                                                                ║
║  The "required system parent" rules in section 2.2 DO NOT APPLY to nested     ║
║  questions. Those rules only apply to FIRST-LEVEL custom questions.           ║
║                                                                                ║
║  Example (Treatment category):                                                 ║
║  ─────────────────────────────                                                 ║
║  • First-level: target.parentId = treatmentControl (system GUID)              ║
║  • Nested:      target.parentId = {custom-question-guid} (NOT system GUID)    ║
║                                                                                ║
╚═══════════════════════════════════════════════════════════════════════════════╝

# Nested custom question rules
nestedQuestionRules:
  # Key rule: Nested questions parent to CUSTOM QUESTIONS, not system questions
  parentingRule:
    constraint: "target.parentId references the parent custom question's ID (NOT the system question)"
    validation:
      - "Parent question MUST exist"
      - "Parent question MUST be in the SAME category"
      - "Parent question MUST be a custom question (system: false)"
      - "Parent question MUST NOT be a system question"
      - "Circular references MUST NOT exist"
      - "Category of child MUST match category of parent"

  # Category is INHERITED from parent, not specified independently
  categoryInheritance:
    constraint: "Child question's category MUST equal parent question's category"
    reason: "Questions cannot cross category boundaries"

  # Conditional display based on parent answer
  conditionalDisplay:
    forBooleanParent:
      constraint: "conditionalParentAnswers.targetParentBoolean = true | false | null"
      semantics:
        true: "Show when parent checkbox is checked"
        false: "Show when parent checkbox is unchecked"
        null: "Show always regardless of parent answer"

    forDropdownParent:
      constraint: "conditionalParentAnswers.targetParentOptions = [option-values]"
      semantics: "Show when parent answer matches one of the specified options"

    forTextParent:
      constraint: "conditionalParentAnswers typically null or omitted"
      semantics: "Text questions generally don't support conditional children"
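
The conditional display semantics can be expressed as a single visibility function. The sketch below is an illustration under assumptions: `ConditionalAnswersSketch`, `ParentAnswerSketch`, and `shouldShowChild` are hypothetical names, and the answer representation (a boolean for checkboxes, a string or string array for option-based parents) is assumed rather than taken from the data model.

```typescript
// Hypothetical shape covering the two documented condition fields.
interface ConditionalAnswersSketch {
  targetParentBoolean?: boolean | null;
  targetParentOptions?: string[];
}

// Assumed answer representation: boolean for checkbox parents,
// string or string[] for option-based parents, null when unanswered.
type ParentAnswerSketch = boolean | string | string[] | null;

function shouldShowChild(
  condition: ConditionalAnswersSketch | null,
  parentAnswer: ParentAnswerSketch
): boolean {
  // No condition at all (typical for text parents): always show.
  if (condition === null) return true;

  if (typeof parentAnswer === "boolean") {
    const expected = condition.targetParentBoolean;
    // null (or omitted) means "show regardless of parent answer".
    return expected === null || expected === undefined || expected === parentAnswer;
  }

  const wanted = condition.targetParentOptions;
  if (wanted === undefined || wanted.length === 0) return true;

  // Normalise single and multiple selections to an array.
  const given: string[] =
    parentAnswer === null ? [] : typeof parentAnswer === "string" ? [parentAnswer] : parentAnswer;
  // Show when the parent answer matches one of the specified options.
  return given.some((answer) => wanted.indexOf(answer) !== -1);
}
```

For a boolean parent, `shouldShowChild({ targetParentBoolean: false }, false)` shows the child only while the checkbox is unchecked; for a dropdown parent, the child appears once any selected option is in `targetParentOptions`.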

Frontend Implementation Reference:

The dialog data structure in create-question.component.ts controls this behavior:

// CreateQuestionDialogData interface (lines 87-91)
export interface CreateQuestionDialogData {
  parentQuestion: IAnnotationQuestion | null;  // <-- KEY: null = first-level, non-null = nested
  category: Category | null;                    // Only needed when parentQuestion is null
  control: boolean | null;                      // Only for Treatment/ModelInduction first-level
}

Parent Assignment Logic (from getCreateQuestion() method, lines 370-402):

const target: Target | null = this.data.parentQuestion
  ? createTarget(
      this.data.parentQuestion.id,  // ← Nested: parent is the selected custom question
      // ... conditional logic based on parent's question type
    )
  : category === categories.cohort  // ← First-level: use category-specific system parent
      ? createTarget(systemAnnotationQuestionGuids.cohortLabel)
      : category === categories.modelInduction
        ? createTarget(systemAnnotationQuestionGuids.modelControl, this.data.control)
        : category === categories.treatment
          ? createTarget(systemAnnotationQuestionGuids.treatmentControl, this.data.control)
        // ... other categories

2.4 Frontend UI Validation Rules

The following validation rules are enforced in the frontend create-question.component.ts:

# Frontend UI validation rules
# Source: create-question.component.ts (lines 742-779, 580-712)

fieldValidation:
  questionText:
    maxLength: 80  # Line 745: Validators.maxLength(80)
    required: true
    errorMessage: "Please enter a maximum of 80 characters"

  description:
    required: false
    maxLength: null  # No explicit limit

  options:
    required: "When controlType is dropdown, radio, checklist, or autocomplete"
    unique: true  # Lines 640-712: duplicateValidator
    errorMessages:
      empty: "Required. Please enter at least one value"
      duplicate: "Duplicate values. Please enter unique values"
      duplicateFilter: "Same option cannot both be always shown and shown for filtered options"

  numericOptions:
    # When questionType is 'integer' or 'decimal', options must be valid numbers
    integer:
      constraint: "All option values must be parseable as integers"
      errorMessage: "Please enter an integer value"
    decimal:
      constraint: "All option values must be parseable as numbers"
      errorMessage: "Please enter a number"

controlTypeBehavior:
  checkbox:
    # Lines 481-492: Boolean questions become checkbox with multiple=false
    questionType: "boolean"
    multipleForced: false
    optionalForced: false  # Lines 421-423: checkbox questions are always required
    defaultStatus: "Unchecked"  # Line 770

  dropdown:
    requiresOptions: true  # Lines 184-199: groupValidate
    supportsMultiple: true
    supportsConditionalChildren: true

  radio:
    requiresOptions: true
    multipleForced: false  # Radio implies single selection
    supportsConditionalChildren: true

  autocomplete:
    requiresOptions: true
    supportsMultiple: true
    supportsConditionalChildren: true

  checklist:
    requiresOptions: true
    supportsMultiple: true
    supportsConditionalChildren: true

  textbox:
    requiresOptions: false
    supportsConditionalChildren: false  # Lines 337-338: typically no conditional

categoryInheritance:
  # Lines 748-751: Category is inherited from parent for nested questions
  nested:
    constraint: "Category is inherited from parent and disabled (read-only)"
    source: "this.parentQuestion.category"
  firstLevel:
    constraint: "Category is provided via dialog data"
    source: "this.data.category"

conditionalParentAnswers:
  # Lines 723-740: validateConditionalParentAnswers
  schemaVersion0:
    type: "string | null"
    constraint: "Must be null or match one of parent's answers"
  schemaVersion1Plus:
    type: "string[]"
    constraint: "Must contain at least one answer when conditional is true"
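
The field-level rules above can be approximated with plain functions, independent of any form library. The sketch below is illustrative only and is not the component's actual validators: the function names, the `"string"` question type, and the "Question text is required" message are assumptions, while the other error messages are copied from the rules above.

```typescript
type ControlTypeSketch = "checkbox" | "dropdown" | "radio" | "autocomplete" | "checklist" | "textbox";
// "string" is an assumed name for non-numeric, non-boolean questions.
type QuestionTypeSketch = "string" | "boolean" | "integer" | "decimal";

const CONTROL_TYPES_REQUIRING_OPTIONS: ControlTypeSketch[] = [
  "dropdown", "radio", "checklist", "autocomplete",
];

function validateQuestionText(text: string): string[] {
  // The "required" message is an assumption; the source only gives the maxLength message.
  if (text.trim().length === 0) return ["Question text is required"];
  if (text.length > 80) return ["Please enter a maximum of 80 characters"];
  return [];
}

function validateOptions(
  controlType: ControlTypeSketch,
  questionType: QuestionTypeSketch,
  options: string[]
): string[] {
  const errors: string[] = [];

  const requiresOptions = CONTROL_TYPES_REQUIRING_OPTIONS.indexOf(controlType) !== -1;
  if (requiresOptions && options.length === 0) {
    errors.push("Required. Please enter at least one value");
  }

  // A value is a duplicate when its first occurrence is at an earlier index.
  const hasDuplicates = options.some((value, index) => options.indexOf(value) !== index);
  if (hasDuplicates) {
    errors.push("Duplicate values. Please enter unique values");
  }

  if (questionType === "integer" && !options.every((v) => /^[+-]?\d+$/.test(v.trim()))) {
    errors.push("Please enter an integer value");
  }
  if (questionType === "decimal" && !options.every((v) => v.trim() !== "" && isFinite(Number(v)))) {
    errors.push("Please enter a number");
  }
  return errors;
}
```

Keeping these checks as pure functions makes it easier to compare their behaviour against the backend rules with shared test cases.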

2.5 Stage Context Rules

The Stage.Extraction boolean is a critical control that determines which questions are available for annotation in a given stage.

The Master Switch: Stage.Extraction

# Stage extraction rules
stage:
  extraction:
    type: boolean
    default: false
    semantics:
      true: "Enable data extraction - show system questions + custom questions"
      false: "Disable data extraction - show only custom questions"

  availableQuestions:
    rule: |
      IF stage.extraction = true:
        questions = SystemQuestionIds ∪ stage.annotationQuestions
      ELSE:
        questions = stage.annotationQuestions

Backend Implementation

Location: Stage.cs:93-96

public ImmutableHashSet<Guid> AllStageAnnotationQuestions =>
    ImmutableHashSet.CreateRange(AnnotationQuestions).Union(Extraction
        ? AnnotationQuestion.SystemQuestionIds
        : ImmutableHashSet<Guid>.Empty);

Frontend Implementation

Location: annotation-question.selectors.ts:133-154

// Parallel logic in NgRx selector
return stage.extraction
  ? _.union(sysAnnotationQuestions, stageQuestions)
  : stageQuestions;

Category Visibility Rules

| Category | extraction=false | extraction=true |
| --- | --- | --- |
| Study | Custom questions only | Custom questions only |
| Disease Model Induction | Hidden | Label + Control + Custom |
| Treatment | Hidden | Label + Control + Custom |
| Outcome Assessment | Hidden | Label + Outcome Qs + Custom |
| Cohort | Hidden | Label + Lookups + Custom |
| Experiment | Hidden | Label + Lookups + Custom |
| Hidden | Hidden | System questions only |

Validation Rule

stageContextValidation:
  rule: |
    WHEN validating an AnnotationQuestion for a Stage:
      IF question.system = true:
        REQUIRE stage.extraction = true
        REASON: "System questions are only available when data extraction is enabled"
      IF question.category IN ["Disease Model Induction", "Treatment", "Outcome Assessment", "Cohort", "Experiment"]:
        REQUIRE stage.extraction = true
        REASON: "Unit-based categories are only available when data extraction is enabled"
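
The stage context rule translates directly into a small check. The following sketch assumes hypothetical shapes (`StageContextSketch`, `StageQuestionSketch`) that carry only the fields the rule needs; the reason strings are the ones given in the rule above.

```typescript
// Hypothetical minimal shapes carrying only what the rule needs.
interface StageContextSketch {
  extraction: boolean;
}

interface StageQuestionSketch {
  id: string;
  system: boolean;
  category: string;
}

const UNIT_BASED_CATEGORIES: string[] = [
  "Disease Model Induction",
  "Treatment",
  "Outcome Assessment",
  "Cohort",
  "Experiment",
];

// Returns the reasons a question is not valid for the stage; empty means valid.
function validateQuestionForStage(
  question: StageQuestionSketch,
  stage: StageContextSketch
): string[] {
  const errors: string[] = [];
  // Both requirements are "stage.extraction = true", so an extraction stage always passes.
  if (stage.extraction) return errors;

  if (question.system) {
    errors.push("System questions are only available when data extraction is enabled");
  }
  if (UNIT_BASED_CATEGORIES.indexOf(question.category) !== -1) {
    errors.push("Unit-based categories are only available when data extraction is enabled");
  }
  return errors;
}
```

A custom Study question passes in either kind of stage, whereas a custom Cohort question is rejected when `extraction` is false.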

2.6 OutcomeData Field Mapping

This section documents how system questions populate OutcomeData fields. Understanding this mapping is essential for data extraction workflows.

Field Mapping Table

| OutcomeData Field | Source System Question | Category | Question GUID / Notes |
| --- | --- | --- | --- |
| units | Outcome Units | Outcome Assessment | 66eb1736-a838-4692-a78b-96b0671a377c |
| averageType | Average Type | Outcome Assessment | 3a287115-5000-4d3f-8c41-7c46fae9adcf |
| errorType | Error Type | Outcome Assessment | 8dbea59f-54d2-4e41-87e7-fde9e73a72d5 |
| greaterIsWorse | Greater Is Worse | Outcome Assessment | 45351e04-47b2-4785-9a72-713284e917b8 |
| numberOfAnimals | Number of Animals | Cohort | 83caa64f-86a1-4f6e-a278-ebbd25297677 |
| graphId | PDF Graphs (lookup) | Outcome Assessment | 016278e8-7e60-40d4-9568-d7fa42670c32 |
| experimentId | (from annotation form) | Experiment | Links to Experiment unit |
| cohortId | (from annotation form) | Cohort | Links to Cohort unit |
| outcomeId | (from annotation form) | Outcome Assessment | Links to Outcome unit |

Note: numberOfAnimals comes from the Cohort category, not Outcome Assessment. This allows different animal numbers per cohort.

TimePoint Field Semantics

| Field | Meaning | Interpretation |
| --- | --- | --- |
| time | Time point of measurement | Units depend on study (e.g., 24 = 24 hours post-treatment) |
| average | Central tendency value | Mean if averageType = "Mean", Median if averageType = "Median" |
| error | Variability measure | SD/SEM if averageType = "Mean", IQR if averageType = "Median" |

Formal Mapping Rule

outcomeDataFieldMapping:
  rule: |
    FOR each OutcomeData entry:
      # Unit references (from annotation linking)
      outcomeData.experimentId = annotationForm.selectedExperiment.id
      outcomeData.cohortId = annotationForm.selectedCohort.id
      outcomeData.outcomeId = annotationForm.selectedOutcome.id

      # Statistical metadata (from system question answers)
      outcomeData.units = answerFor(GUID: "66eb1736-a838-4692-a78b-96b0671a377c")
      outcomeData.averageType = answerFor(GUID: "3a287115-5000-4d3f-8c41-7c46fae9adcf")
      outcomeData.errorType = answerFor(GUID: "8dbea59f-54d2-4e41-87e7-fde9e73a72d5")
      outcomeData.greaterIsWorse = answerFor(GUID: "45351e04-47b2-4785-9a72-713284e917b8")
      outcomeData.numberOfAnimals = answerFor(GUID: "83caa64f-86a1-4f6e-a278-ebbd25297677")
      outcomeData.graphId = answerFor(GUID: "016278e8-7e60-40d4-9568-d7fa42670c32")  # Optional

      # TimePoints (entered directly in OutcomeData UI)
      outcomeData.timePoints = userEnteredTimePoints[]
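
The mapping rule can be sketched as a function that assembles an OutcomeData entry from the selected units, the system question answers, and the entered time points. Everything below is an assumption made for illustration: the type names, the `AnswerValue` union, and the convention that a missing answer becomes `null` are not taken from the codebase. Only the GUIDs come from the mapping table.

```typescript
// Assumed answer representation; a missing answer is mapped to null.
type AnswerValue = string | number | boolean | null;

interface TimePointSketch {
  time: number;
  average: number;
  error: number;
}

interface UnitSelection {
  experimentId: string;
  cohortId: string;
  outcomeId: string;
}

interface OutcomeDataSketch extends UnitSelection {
  units: AnswerValue;
  averageType: AnswerValue;
  errorType: AnswerValue;
  greaterIsWorse: AnswerValue;
  numberOfAnimals: AnswerValue;
  graphId: AnswerValue; // optional in the mapping rule, so often null
  timePoints: TimePointSketch[];
}

// GUIDs copied from the field mapping table.
const OUTCOME_QUESTION_GUIDS = {
  units: "66eb1736-a838-4692-a78b-96b0671a377c",
  averageType: "3a287115-5000-4d3f-8c41-7c46fae9adcf",
  errorType: "8dbea59f-54d2-4e41-87e7-fde9e73a72d5",
  greaterIsWorse: "45351e04-47b2-4785-9a72-713284e917b8",
  numberOfAnimals: "83caa64f-86a1-4f6e-a278-ebbd25297677",
  graphId: "016278e8-7e60-40d4-9568-d7fa42670c32",
};

function buildOutcomeData(
  selection: UnitSelection,
  answers: Record<string, AnswerValue | undefined>,
  timePoints: TimePointSketch[]
): OutcomeDataSketch {
  // Looks up the answer recorded against a system question GUID.
  const answerFor = (guid: string): AnswerValue => {
    const value = answers[guid];
    return value === undefined ? null : value;
  };
  return {
    experimentId: selection.experimentId,
    cohortId: selection.cohortId,
    outcomeId: selection.outcomeId,
    units: answerFor(OUTCOME_QUESTION_GUIDS.units),
    averageType: answerFor(OUTCOME_QUESTION_GUIDS.averageType),
    errorType: answerFor(OUTCOME_QUESTION_GUIDS.errorType),
    greaterIsWorse: answerFor(OUTCOME_QUESTION_GUIDS.greaterIsWorse),
    numberOfAnimals: answerFor(OUTCOME_QUESTION_GUIDS.numberOfAnimals),
    graphId: answerFor(OUTCOME_QUESTION_GUIDS.graphId),
    timePoints: timePoints.slice(), // copy so later edits in the UI do not mutate the entry
  };
}
```

Note that `numberOfAnimals` is read from a Cohort question while the other statistical fields come from Outcome Assessment questions, matching the table.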

3. Cross-Language Validation Strategies

3.1 The Challenge

┌─────────────────────────────────────────────────────────────────────────────┐
│                    THE CROSS-LANGUAGE CHALLENGE                              │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  FRONTEND (TypeScript)                 BACKEND (C#)                         │
│  ══════════════════════               ════════════                          │
│  • Pre-validation for UX               • Authoritative validation            │
│  • Immediate feedback                  • Database seeding                    │
│  • UI constraints                      • API enforcement                     │
│                                                                              │
│                    ┌─────────────────────────┐                               │
│                    │   RULES MUST BE         │                               │
│                    │   IDENTICAL             │                               │
│                    │   IN BOTH LANGUAGES     │                               │
│                    └─────────────────────────┘                               │
│                                                                              │
│  OPTIONS:                                                                    │
│  ────────                                                                    │
│  1. Schema-First (JSON Schema) - Generate validators for both languages     │
│  2. TypeSpec - Microsoft's API design language with code generation         │
│  3. JSON Typedef (JTD) - Portable schemas with multi-language codegen       │
│  4. Manual Sync - Keep rules in both places manually (error-prone)          │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

3.2 Recommended Approach: Schema-First (JSON Schema)

Why JSON Schema?

  • Industry standard with excellent tooling
  • NJsonSchema for C# (generate validators and DTOs)
  • AJV for TypeScript (fast validation)
  • Human-readable as documentation
  • Supports custom validation keywords for complex rules

Implementation Architecture:

┌─────────────────────────────────────────────────────────────────────────────┐
│                    JSON SCHEMA CODE GENERATION PIPELINE                      │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│                   ┌──────────────────────────────────┐                       │
│                   │  annotation-question.schema.json │ ◄── Single Source     │
│                   │  ─────────────────────────────── │     of Truth          │
│                   │  • Type definitions               │                       │
│                   │  • Category rules                 │                       │
│                   │  • GUID constants                 │                       │
│                   │  • Validation constraints         │                       │
│                   └───────────────┬──────────────────┘                       │
│                                   │                                          │
│               ┌───────────────────┼───────────────────┐                      │
│               │                   │                   │                      │
│               ▼                   ▼                   ▼                      │
│  ┌────────────────────┐ ┌────────────────────┐ ┌────────────────────┐       │
│  │ C# Generation      │ │ TypeScript Gen     │ │ Documentation      │       │
│  │ (NJsonSchema)      │ │ (json-schema-to-ts)│ │ (docgen)           │       │
│  ├────────────────────┤ ├────────────────────┤ ├────────────────────┤       │
│  │ • DTOs             │ │ • Type interfaces  │ │ • Markdown docs    │       │
│  │ • Validators       │ │ • Zod/Yup schemas  │ │ • Rule tables      │       │
│  │ • Constants        │ │ • Constants        │ │ • Examples         │       │
│  └────────────────────┘ └────────────────────┘ └────────────────────┘       │
│               │                   │                   │                      │
│               ▼                   ▼                   ▼                      │
│  ┌────────────────────┐ ┌────────────────────┐ ┌────────────────────┐       │
│  │ SyRF.Validation    │ │ @syrf/validation   │ │ /docs/generated    │       │
│  │ .Generated.dll     │ │ npm package        │ │ rules.md           │       │
│  └────────────────────┘ └────────────────────┘ └────────────────────┘       │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

3.3 Schema Example

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://syrf.org/schemas/annotation-question.json",
  "title": "AnnotationQuestion",
  "description": "Schema for annotation question validation",

  "definitions": {
    "SystemQuestionGuid": {
      "type": "string",
      "enum": [
        "bdb6e257-5a08-42ef-aad0-829668679b0e",
        "b18aa936-a4c6-446b-ac98-88ac38930878",
        "b02e3072-74f0-44e0-a468-f472b3b09991",
        "d04ec2d7-3e10-4847-9999-befe7ee4c454",
        "dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f",
        "62c852ad-3390-48a4-ac13-439bf6b6587f",
        "7c555b6e-1fb6-4036-9982-c09a5db82ace"
      ],
      "x-enum-descriptions": {
        "bdb6e257-5a08-42ef-aad0-829668679b0e": "diseaseModelInductionLabel",
        "b18aa936-a4c6-446b-ac98-88ac38930878": "modelControl",
        "b02e3072-74f0-44e0-a468-f472b3b09991": "treatmentLabel",
        "d04ec2d7-3e10-4847-9999-befe7ee4c454": "treatmentControl",
        "dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f": "outcomeAssessmentLabel",
        "62c852ad-3390-48a4-ac13-439bf6b6587f": "cohortLabel",
        "7c555b6e-1fb6-4036-9982-c09a5db82ace": "experimentLabel"
      }
    },

    "Category": {
      "type": "string",
      "enum": [
        "Study",
        "Disease Model Induction",
        "Treatment",
        "Outcome Assessment",
        "Cohort",
        "Experiment",
        "Hidden"
      ]
    },

    "CategoryRules": {
      "description": "Rules for FIRST-LEVEL custom questions. Nested questions parent to other custom questions instead.",
      "type": "object",
      "properties": {
        "Study": {
          "type": "object",
          "properties": {
            "firstLevelParent": { "type": "null", "description": "Study has no system parent - first-level questions are root" },
            "controlParameterRequired": { "const": false }
          }
        },
        "Disease Model Induction": {
          "type": "object",
          "properties": {
            "firstLevelParent": { "const": "b18aa936-a4c6-446b-ac98-88ac38930878", "description": "First-level parent to modelControl; nested parent to custom questions" },
            "controlParameterRequired": { "const": true, "description": "Only for first-level questions" }
          }
        },
        "Treatment": {
          "type": "object",
          "properties": {
            "firstLevelParent": { "const": "d04ec2d7-3e10-4847-9999-befe7ee4c454", "description": "First-level parent to treatmentControl; nested parent to custom questions" },
            "controlParameterRequired": { "const": true, "description": "Only for first-level questions" }
          }
        },
        "Outcome Assessment": {
          "type": "object",
          "properties": {
            "firstLevelParent": { "const": "dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f", "description": "First-level parent to outcomeAssessmentLabel; nested parent to custom questions" },
            "controlParameterRequired": { "const": false }
          }
        },
        "Cohort": {
          "type": "object",
          "properties": {
            "firstLevelParent": { "const": "62c852ad-3390-48a4-ac13-439bf6b6587f", "description": "First-level parent to cohortLabel; nested parent to custom questions" },
            "controlParameterRequired": { "const": false }
          }
        },
        "Experiment": {
          "type": "object",
          "properties": {
            "firstLevelParent": { "const": "7c555b6e-1fb6-4036-9982-c09a5db82ace", "description": "First-level parent to experimentLabel; nested parent to custom questions" },
            "controlParameterRequired": { "const": false }
          }
        }
      }
    },

    "ConditionalParentAnswers": {
      "oneOf": [
        { "type": "null" },
        {
          "type": "object",
          "properties": {
            "conditionType": {
              "type": "integer",
              "enum": [0, 1],
              "description": "0 = Boolean, 1 = Option"
            },
            "targetParentBoolean": {
              "type": ["boolean", "null"]
            },
            "targetParentOptions": {
              "type": "array",
              "items": { "type": "string" }
            }
          }
        }
      ]
    },

    "Target": {
      "oneOf": [
        { "type": "null" },
        {
          "type": "object",
          "properties": {
            "parentId": {
              "oneOf": [
                { "type": "null" },
                { "type": "string", "format": "uuid" }
              ]
            },
            "conditionalParentAnswers": {
              "$ref": "#/definitions/ConditionalParentAnswers"
            }
          },
          "required": ["parentId"]
        }
      ]
    }
  },

  "type": "object",
  "properties": {
    "id": { "type": "string", "format": "uuid" },
    "category": { "$ref": "#/definitions/Category" },
    "system": { "type": "boolean" },
    "target": { "$ref": "#/definitions/Target" },
    "text": { "type": "string", "minLength": 1 },
    "questionType": { "type": "string" },
    "controlType": { "type": "string" }
  },
  "required": ["category", "text"],

  "x-validation-note": "IMPORTANT: JSON Schema alone cannot fully validate parent rules because nested questions can parent to ANY custom question in the same category. The allOf rules below only validate FIRST-LEVEL questions. Full validation requires runtime checking that the parentId references either (a) the system parent GUID for first-level questions, or (b) a valid custom question in the same category for nested questions. See section 3.4 for complete validation logic.",

  "allOf": [
    {
      "description": "Study: first-level questions must be root (target null)",
      "if": {
        "properties": {
          "system": { "const": false },
          "category": { "const": "Study" }
        }
      },
      "then": {
        "description": "Study allows root questions OR nested questions parenting to custom questions",
        "properties": {
          "target": {
            "oneOf": [
              { "type": "null", "description": "Root question (first-level)" },
              {
                "type": "object",
                "properties": {
                  "parentId": {
                    "oneOf": [
                      { "type": "null", "description": "Root question" },
                      { "type": "string", "format": "uuid", "description": "Nested question - parentId is custom question ID" }
                    ]
                  }
                }
              }
            ]
          }
        }
      }
    },
    {
      "description": "Disease Model Induction: first-level questions parent to modelControl, nested parent to custom questions",
      "if": {
        "properties": {
          "system": { "const": false },
          "category": { "const": "Disease Model Induction" }
        }
      },
      "then": {
        "properties": {
          "target": {
            "type": "object",
            "properties": {
              "parentId": {
                "type": "string",
                "format": "uuid",
                "description": "FIRST-LEVEL: must be b18aa936-... (modelControl). NESTED: can be any custom question ID. Runtime validation required."
              }
            },
            "required": ["parentId"]
          }
        },
        "required": ["target"]
      }
    }
  ],

  "x-nested-question-validation": {
    "description": "For nested questions, parentId validation must be done at runtime",
    "rules": [
      "Parent question must exist in the question collection",
      "Parent question must have system: false (is a custom question)",
      "Parent question must have the same category as the child",
      "No circular references allowed"
    ]
  }
}
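
As a rough illustration of how a generation step might consume this schema, the sketch below turns the `x-enum-descriptions` map of `SystemQuestionGuid` into a TypeScript constants declaration. The function name `emitGuidConstants`, the constant name `SystemQuestionGuids`, and the `GuidEnumDefinition` shape are hypothetical; the actual generator may be structured differently.

```typescript
// Hypothetical sketch of one generator step: turning the schema's
// "x-enum-descriptions" map into a TypeScript constants declaration.
// Names here are illustrative and are not the real tool's API.

interface GuidEnumDefinition {
  enum: string[];
  'x-enum-descriptions': Record<string, string>;
}

// Abbreviated copy of definitions.SystemQuestionGuid from the schema above.
const systemQuestionGuid: GuidEnumDefinition = {
  enum: [
    'b02e3072-74f0-44e0-a468-f472b3b09991',
    'd04ec2d7-3e10-4847-9999-befe7ee4c454',
  ],
  'x-enum-descriptions': {
    'b02e3072-74f0-44e0-a468-f472b3b09991': 'treatmentLabel',
    'd04ec2d7-3e10-4847-9999-befe7ee4c454': 'treatmentControl',
  },
};

function emitGuidConstants(constName: string, def: GuidEnumDefinition): string {
  const lines = def.enum.map((guid) => {
    const name = def['x-enum-descriptions'][guid];
    if (name === undefined) {
      // Fail the build rather than emit an unnamed constant.
      throw new Error(`No description for enum value ${guid}`);
    }
    return `  ${name}: '${guid}',`;
  });
  return [`export const ${constName} = {`, ...lines, '} as const;'].join('\n');
}

// Source text that a build script could write to a generated .ts file.
const generatedSource = emitGuidConstants('SystemQuestionGuids', systemQuestionGuid);
```

Throwing on a missing description keeps the enum and its names from drifting apart silently, which matters when the same schema also feeds the C# generator.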

3.4 Generated Code Examples

C# (via NJsonSchema):

// Auto-generated from annotation-question.schema.json
// DO NOT EDIT - regenerate with: dotnet run --project tools/schema-gen

using System;
using System.Collections.Generic;
using System.Linq;

namespace SyRF.Validation.Generated;

public static class CategoryRules
{
    /// <summary>
    /// Rules for FIRST-LEVEL custom questions only.
    /// Nested questions parent to other custom questions - validated separately.
    /// </summary>
    public static readonly IReadOnlyDictionary<string, CategoryRule> Rules =
        new Dictionary<string, CategoryRule>
        {
            ["Study"] = new CategoryRule(
                FirstLevelParent: null,  // Study first-level questions are root
                ControlParameterRequired: false
            ),
            ["Disease Model Induction"] = new CategoryRule(
                FirstLevelParent: Guid.Parse("b18aa936-a4c6-446b-ac98-88ac38930878"),
                ControlParameterRequired: true  // Only for first-level
            ),
            ["Treatment"] = new CategoryRule(
                FirstLevelParent: Guid.Parse("d04ec2d7-3e10-4847-9999-befe7ee4c454"),
                ControlParameterRequired: true  // Only for first-level
            ),
            ["Outcome Assessment"] = new CategoryRule(
                FirstLevelParent: Guid.Parse("dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f"),
                ControlParameterRequired: false
            ),
            ["Cohort"] = new CategoryRule(
                FirstLevelParent: Guid.Parse("62c852ad-3390-48a4-ac13-439bf6b6587f"),
                ControlParameterRequired: false
            ),
            ["Experiment"] = new CategoryRule(
                FirstLevelParent: Guid.Parse("7c555b6e-1fb6-4036-9982-c09a5db82ace"),
                ControlParameterRequired: false
            )
        };
}

public record CategoryRule(
    Guid? FirstLevelParent,  // Applies only to first-level questions; null means first-level questions are root
    bool ControlParameterRequired
);

public class AnnotationQuestionValidator
{
    private readonly IReadOnlyCollection<AnnotationQuestionDto> _allQuestions;

    public AnnotationQuestionValidator(IReadOnlyCollection<AnnotationQuestionDto> allQuestions)
    {
        _allQuestions = allQuestions;
    }

    public ValidationResult Validate(AnnotationQuestionDto dto)
    {
        if (dto.System) return ValidationResult.Success;

        if (!CategoryRules.Rules.TryGetValue(dto.Category, out var rule))
            return ValidationResult.Error($"Unknown category: {dto.Category}");

        var actualParent = dto.Target?.ParentId;

        // CASE 1: Study category - first-level must be root, nested can parent to custom
        if (rule.FirstLevelParent == null)
        {
            if (actualParent == null)
                return ValidationResult.Success;  // Valid first-level (root)

            // Has a parent - must be a valid nested question
            return ValidateNestedQuestion(dto, actualParent.Value);
        }

        // CASE 2: Other categories - check if first-level or nested
        if (actualParent == null)
        {
            return ValidationResult.Error(
                $"Questions in {dto.Category} must have a parent");
        }

        if (actualParent == rule.FirstLevelParent)
        {
            // Valid FIRST-LEVEL question - parents to system question
            // Validate control parameter for first-level only
            if (rule.ControlParameterRequired)
            {
                // TODO: enforce that conditionalParentAnswers is explicitly set
                // (it can be null for "both"); this check is not implemented yet.
            }
            return ValidationResult.Success;
        }

        // Parent is not the system question - must be a valid NESTED question
        return ValidateNestedQuestion(dto, actualParent.Value);
    }

    /// <summary>
    /// Validates nested questions that parent to other custom questions.
    /// </summary>
    private ValidationResult ValidateNestedQuestion(AnnotationQuestionDto dto, Guid parentId)
    {
        var parentQuestion = _allQuestions.FirstOrDefault(q => q.Id == parentId);

        if (parentQuestion == null)
            return ValidationResult.Error($"Parent question {parentId} not found");

        if (parentQuestion.System)
            return ValidationResult.Error(
                $"Nested questions cannot parent to system questions. " +
                $"Use first-level rules for parenting to system questions.");

        if (parentQuestion.Category != dto.Category)
            return ValidationResult.Error(
                $"Parent question must be in same category ({dto.Category}), " +
                $"but parent is in {parentQuestion.Category}");

        // TODO: Check for circular references

        return ValidationResult.Success;
    }
}

TypeScript (via json-schema-to-typescript + custom transform):

// Auto-generated from annotation-question.schema.json
// DO NOT EDIT - regenerate with: npm run generate:validation

/**
 * Rules for FIRST-LEVEL custom questions only.
 * Nested questions parent to other custom questions - validated separately.
 */
export const categoryRules = {
  Study: {
    firstLevelParent: null,  // Study first-level questions are root
    controlParameterRequired: false,
  },
  'Disease Model Induction': {
    firstLevelParent: 'b18aa936-a4c6-446b-ac98-88ac38930878',
    controlParameterRequired: true,  // Only for first-level
  },
  Treatment: {
    firstLevelParent: 'd04ec2d7-3e10-4847-9999-befe7ee4c454',
    controlParameterRequired: true,  // Only for first-level
  },
  'Outcome Assessment': {
    firstLevelParent: 'dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f',
    controlParameterRequired: false,
  },
  Cohort: {
    firstLevelParent: '62c852ad-3390-48a4-ac13-439bf6b6587f',
    controlParameterRequired: false,
  },
  Experiment: {
    firstLevelParent: '7c555b6e-1fb6-4036-9982-c09a5db82ace',
    controlParameterRequired: false,
  },
} as const;

export type Category = keyof typeof categoryRules;

/**
 * Validates annotation question placement rules.
 * Handles both first-level questions (parent to system) and nested questions (parent to custom).
 */
export function validateAnnotationQuestion(
  dto: AnnotationQuestionDto,
  allQuestions: AnnotationQuestionDto[]  // Needed to validate nested question parents
): ValidationResult {
  if (dto.system) return { valid: true };

  const rule = categoryRules[dto.category as Category];
  if (!rule) {
    return { valid: false, error: `Unknown category: ${dto.category}` };
  }

  const actualParent = dto.target?.parentId ?? null;

  // CASE 1: Study category - first-level must be root, nested can parent to custom
  if (rule.firstLevelParent === null) {
    if (actualParent === null) {
      return { valid: true };  // Valid first-level (root)
    }
    // Has a parent - must be a valid nested question
    return validateNestedQuestion(dto, actualParent, allQuestions);
  }

  // CASE 2: Other categories - check if first-level or nested
  if (actualParent === null) {
    return {
      valid: false,
      error: `Questions in ${dto.category} must have a parent`,
    };
  }

  if (actualParent === rule.firstLevelParent) {
    // Valid FIRST-LEVEL question - parents to system question
    return { valid: true };
  }

  // Parent is not the system question - must be a valid NESTED question
  return validateNestedQuestion(dto, actualParent, allQuestions);
}

/**
 * Validates nested questions that parent to other custom questions.
 */
function validateNestedQuestion(
  dto: AnnotationQuestionDto,
  parentId: string,
  allQuestions: AnnotationQuestionDto[]
): ValidationResult {
  const parentQuestion = allQuestions.find(q => q.id === parentId);

  if (!parentQuestion) {
    return { valid: false, error: `Parent question ${parentId} not found` };
  }

  if (parentQuestion.system) {
    return {
      valid: false,
      error: `Nested questions cannot parent to system questions. ` +
             `Use first-level rules for parenting to system questions.`,
    };
  }

  if (parentQuestion.category !== dto.category) {
    return {
      valid: false,
      error: `Parent question must be in same category (${dto.category}), ` +
             `but parent is in ${parentQuestion.category}`,
    };
  }

  // TODO: Check for circular references

  return { valid: true };
}
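
The circular-reference check left as a TODO in both validators could look roughly like the sketch below. It walks `parentId` links from the question being validated and reports a cycle if any question is visited twice. `QuestionNode` and `hasCircularParent` are hypothetical names, and the minimal shape is an assumption; the real DTO carries more fields.

```typescript
// Sketch of the circular-reference check left as a TODO above.
// QuestionNode is a hypothetical minimal shape, not the real DTO.

interface QuestionNode {
  id: string;
  target?: { parentId: string | null } | null;
}

/**
 * Returns true if following parentId links from `start` revisits a question.
 * Parent IDs that are not in `allQuestions` (for example system question
 * GUIDs) end the walk and count as "no cycle".
 */
function hasCircularParent(start: QuestionNode, allQuestions: QuestionNode[]): boolean {
  const byId = new Map<string, QuestionNode>();
  for (const q of allQuestions) {
    byId.set(q.id, q);
  }
  // Prefer the candidate being validated over any stale copy in the list.
  byId.set(start.id, start);

  const visited = new Set<string>();
  let currentId: string | null = start.id;
  while (currentId !== null) {
    if (visited.has(currentId)) {
      return true; // Seen before: the parent chain loops back on itself.
    }
    visited.add(currentId);
    const node: QuestionNode | undefined = byId.get(currentId);
    const nextId: string | null = node?.target?.parentId ?? null;
    currentId = nextId;
  }
  return false;
}
```

Called from `validateNestedQuestion` with the candidate `dto`, a check like this would reject both self-parenting and longer loops before the question is saved.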

4. Lookup Referential Integrity

4.1 The Problem

Lookup questions reference annotations from other categories. When source annotations are deleted, lookups can reference non-existent data (orphaned references).

┌─────────────────────────────────────────────────────────────────────────────┐
│                    LOOKUP REFERENTIAL INTEGRITY PROBLEM                      │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  1. User creates Treatment annotation: "Aspirin 100mg"                       │
│                                                                              │
│  2. User creates Cohort with lookup referencing "Aspirin 100mg"              │
│     cohortTreatments: ["aspirin-100mg-annotation-id"]                        │
│                                                                              │
│  3. User DELETES the Treatment annotation "Aspirin 100mg"                    │
│                                                                              │
│  4. PROBLEM: Cohort still references deleted annotation!                     │
│     cohortTreatments: ["aspirin-100mg-annotation-id"]  ← ORPHAN!             │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

4.2 Unified Schema: Placement + Referential Integrity

Key Design Decision: Referential integrity rules are part of the SAME schema as placement rules (section 3). This ensures:

  • Single source of truth for ALL annotation question rules
  • One code generation pipeline produces ALL validators
  • Consistent validation across frontend/backend

The YAML schema from section 2.1 is extended to include lookup relationships:

# annotation-question-rules.yaml (EXTENDED from section 2.1)
# This is the COMPLETE schema covering placement AND referential integrity

categories:
  Treatment:
    # Placement rules (from section 2.1)
    customQuestionPlacement:
      type: "control-descendant"
      systemParent: "d04ec2d7-3e10-4847-9999-befe7ee4c454"
      nestedQuestions:
        allowed: true
    controlParameterRule:
      type: "required-for-first-level"

    # NEW: Referential integrity - who references this category's labels?
    labelReferentialIntegrity:
      labelQuestionGuid: "b02e3072-74f0-44e0-a468-f472b3b09991"
      referencedBy:
        - lookupQuestionGuid: "a3f2e5bb-3ade-4830-bb66-b5550a3cc85b"  # cohortTreatments
          lookupCategory: "Cohort"
          relationship: "many-to-many"
          onSourceDelete: "prevent"  # prevent | cascade-nullify | soft-delete
      referencesTo: []  # Treatment doesn't lookup from other categories

  Cohort:
    customQuestionPlacement:
      type: "label-descendant"
      systemParent: "62c852ad-3390-48a4-ac13-439bf6b6587f"
      nestedQuestions:
        allowed: true
    controlParameterRule:
      type: "forbidden"

    # Cohort's labels are referenced by Experiment
    labelReferentialIntegrity:
      labelQuestionGuid: "62c852ad-3390-48a4-ac13-439bf6b6587f"
      referencedBy:
        - lookupQuestionGuid: "e7a84ba2-4ef2-4a14-83cb-7decf469d1a2"  # experimentCohorts
          lookupCategory: "Experiment"
          onSourceDelete: "prevent"

      # Cohort LOOKUPS from these categories
      referencesTo:
        - lookupQuestionGuid: "ecb550a5-ed95-473f-84bf-262c9faa7541"  # cohortDiseaseModels
          sourceLabelGuid: "bdb6e257-5a08-42ef-aad0-829668679b0e"
          sourceCategory: "Disease Model Induction"
          validation:
            - "Referenced annotations must exist"
            - "Referenced annotations must be from same study"
            - "Referenced annotations must be from diseaseModelInductionLabel question"

        - lookupQuestionGuid: "a3f2e5bb-3ade-4830-bb66-b5550a3cc85b"  # cohortTreatments
          sourceLabelGuid: "b02e3072-74f0-44e0-a468-f472b3b09991"
          sourceCategory: "Treatment"

        - lookupQuestionGuid: "12ecd826-85a4-499a-844c-bd35ea6624ad"  # cohortOutcomes
          sourceLabelGuid: "dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f"
          sourceCategory: "Outcome Assessment"
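
Because each lookup relationship is declared twice in this schema (as `referencedBy` on the source category and as `referencesTo` on the consuming category), a build-time consistency check may be worth considering. The sketch below is one possible form of such a check. The interfaces mirror the YAML keys above, but `findUnmirroredLookups` and the plain-object transcription of the YAML are assumptions, not part of the existing generator.

```typescript
// Hypothetical build-time check: every referencedBy entry should be mirrored
// by a referencesTo entry on the consuming category.

interface ReferencedByEntry {
  lookupQuestionGuid: string;
  lookupCategory: string;
}

interface ReferencesToEntry {
  lookupQuestionGuid: string;
  sourceLabelGuid: string;
  sourceCategory: string;
}

interface IntegrityRules {
  labelQuestionGuid: string;
  referencedBy: ReferencedByEntry[];
  referencesTo: ReferencesToEntry[];
}

function findUnmirroredLookups(rules: Record<string, IntegrityRules>): string[] {
  const problems: string[] = [];
  for (const sourceCategory of Object.keys(rules)) {
    const source = rules[sourceCategory];
    if (source === undefined) continue;
    for (const ref of source.referencedBy) {
      const consumer = rules[ref.lookupCategory];
      // Categories missing from this fragment (e.g. Experiment) are skipped.
      if (consumer === undefined) continue;
      const mirrored = consumer.referencesTo.some(
        (r) =>
          r.lookupQuestionGuid === ref.lookupQuestionGuid &&
          r.sourceCategory === sourceCategory &&
          r.sourceLabelGuid === source.labelQuestionGuid
      );
      if (!mirrored) {
        problems.push(`${sourceCategory} -> ${ref.lookupCategory}: ${ref.lookupQuestionGuid}`);
      }
    }
  }
  return problems;
}

// The Treatment/Cohort fragment from the YAML above, transcribed as an object.
const yamlFragment: Record<string, IntegrityRules> = {
  Treatment: {
    labelQuestionGuid: 'b02e3072-74f0-44e0-a468-f472b3b09991',
    referencedBy: [
      { lookupQuestionGuid: 'a3f2e5bb-3ade-4830-bb66-b5550a3cc85b', lookupCategory: 'Cohort' },
    ],
    referencesTo: [],
  },
  Cohort: {
    labelQuestionGuid: '62c852ad-3390-48a4-ac13-439bf6b6587f',
    referencedBy: [
      { lookupQuestionGuid: 'e7a84ba2-4ef2-4a14-83cb-7decf469d1a2', lookupCategory: 'Experiment' },
    ],
    referencesTo: [
      { lookupQuestionGuid: 'ecb550a5-ed95-473f-84bf-262c9faa7541', sourceLabelGuid: 'bdb6e257-5a08-42ef-aad0-829668679b0e', sourceCategory: 'Disease Model Induction' },
      { lookupQuestionGuid: 'a3f2e5bb-3ade-4830-bb66-b5550a3cc85b', sourceLabelGuid: 'b02e3072-74f0-44e0-a468-f472b3b09991', sourceCategory: 'Treatment' },
      { lookupQuestionGuid: '12ecd826-85a4-499a-844c-bd35ea6624ad', sourceLabelGuid: 'dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f', sourceCategory: 'Outcome Assessment' },
    ],
  },
};

// Empty for this fragment: Treatment -> Cohort is declared in both directions.
const unmirrored = findUnmirroredLookups(yamlFragment);
```

Running a check of this kind inside the generator would surface a one-sided declaration at build time instead of as a missed dependency during deletion.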

4.3 Generated Code: Unified Validator

The code generator (section 3.4) produces a single validator class that handles BOTH placement AND referential integrity:

// Auto-generated from annotation-question-rules.yaml
// Handles BOTH placement rules AND referential integrity
using System;
using System.Collections.Generic;
using System.Linq;

namespace SyRF.Validation.Generated;

public class AnnotationQuestionValidator
{
    private readonly IReadOnlyCollection<AnnotationQuestionDto> _allQuestions;
    private readonly IReadOnlyCollection<AnnotationDto> _allAnnotations;

    public AnnotationQuestionValidator(
        IReadOnlyCollection<AnnotationQuestionDto> allQuestions,
        IReadOnlyCollection<AnnotationDto> allAnnotations)
    {
        _allQuestions = allQuestions;
        _allAnnotations = allAnnotations;
    }

    // === PLACEMENT VALIDATION (from section 3.4) ===

    public ValidationResult ValidatePlacement(AnnotationQuestionDto dto)
    {
        // ... same as section 3.4 ...
    }

    // === REFERENTIAL INTEGRITY VALIDATION ===

    /// <summary>
    /// Validates a lookup answer references valid, existing annotations.
    /// Called when saving a Cohort/Experiment with lookup selections.
    /// </summary>
    public ValidationResult ValidateLookupAnswer(LookupAnswerDto dto)
    {
        var lookupRule = LookupRules.Rules.GetValueOrDefault(dto.LookupQuestionGuid);
        if (lookupRule == null)
            return ValidationResult.Error($"Unknown lookup question: {dto.LookupQuestionGuid}");

        foreach (var annotationId in dto.SelectedAnnotationIds)
        {
            var annotation = _allAnnotations.FirstOrDefault(a => a.Id == annotationId);

            if (annotation == null)
                return ValidationResult.Error(
                    $"Referenced annotation {annotationId} does not exist",
                    "AQ007");

            if (annotation.QuestionId != lookupRule.SourceLabelGuid)
                return ValidationResult.Error(
                    $"Annotation {annotationId} is not from expected source " +
                    $"({lookupRule.SourceCategory} label question)",
                    "AQ008");

            if (annotation.StudyId != dto.StudyId)
                return ValidationResult.Error(
                    $"Annotation {annotationId} belongs to different study",
                    "AQ009");
        }

        return ValidationResult.Success;
    }

    /// <summary>
    /// Checks if a label annotation can be deleted (referential integrity).
    /// Called before deleting a Treatment/DiseaseModel/Outcome/Cohort annotation.
    /// </summary>
    public ValidationResult CanDeleteLabelAnnotation(
        Guid annotationId,
        Guid labelQuestionGuid)
    {
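        // Assumes the generated CategoryRule also carries LabelQuestionGuid and ReferencedBy
        // (from labelReferentialIntegrity); the record shown in section 3.4 omits them.
        var categoryRule = CategoryRules.Rules.Values
            .FirstOrDefault(r => r.LabelQuestionGuid == labelQuestionGuid);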

        if (categoryRule?.ReferencedBy == null)
            return ValidationResult.Success;  // No dependents

        // Check if any lookup answers reference this annotation
        var dependentLookups = _allAnnotations
            .Where(a => categoryRule.ReferencedBy
                .Any(r => a.QuestionId == r.LookupQuestionGuid))
            .Where(a => a.SelectedAnnotationIds?.Contains(annotationId) == true)
            .ToList();

        if (dependentLookups.Any())
        {
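            // Apply the delete behavior declared for the lookup that actually holds the
            // reference, not whichever entry happens to be first in ReferencedBy.
            var referencingQuestionId = dependentLookups[0].QuestionId;
            var onDelete = categoryRule.ReferencedBy
                .First(r => r.LookupQuestionGuid == referencingQuestionId)
                .OnSourceDelete;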
            return onDelete switch
            {
                DeleteBehavior.Prevent => ValidationResult.Error(
                    $"Cannot delete: referenced by {dependentLookups.Count} lookup(s)",
                    "AQ010"),
                DeleteBehavior.CascadeNullify => ValidationResult.Warning(
                    $"Will remove from {dependentLookups.Count} lookup(s)"),
                DeleteBehavior.SoftDelete => ValidationResult.Success,
                _ => ValidationResult.Success
            };
        }

        return ValidationResult.Success;
    }
}

// Generated lookup rules (alongside CategoryRules from section 3.4)
public static class LookupRules
{
    public static readonly IReadOnlyDictionary<Guid, LookupRule> Rules =
        new Dictionary<Guid, LookupRule>
        {
            // cohortDiseaseModels
            [Guid.Parse("ecb550a5-ed95-473f-84bf-262c9faa7541")] = new LookupRule(
                SourceLabelGuid: Guid.Parse("bdb6e257-5a08-42ef-aad0-829668679b0e"),
                SourceCategory: "Disease Model Induction"),

            // cohortTreatments
            [Guid.Parse("a3f2e5bb-3ade-4830-bb66-b5550a3cc85b")] = new LookupRule(
                SourceLabelGuid: Guid.Parse("b02e3072-74f0-44e0-a468-f472b3b09991"),
                SourceCategory: "Treatment"),

            // cohortOutcomes
            [Guid.Parse("12ecd826-85a4-499a-844c-bd35ea6624ad")] = new LookupRule(
                SourceLabelGuid: Guid.Parse("dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f"),
                SourceCategory: "Outcome Assessment"),

            // experimentCohorts
            [Guid.Parse("e7a84ba2-4ef2-4a14-83cb-7decf469d1a2")] = new LookupRule(
                SourceLabelGuid: Guid.Parse("62c852ad-3390-48a4-ac13-439bf6b6587f"),
                SourceCategory: "Cohort"),
        };
}

public record LookupRule(Guid SourceLabelGuid, string SourceCategory);
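
Since the same rules are meant to be enforced on the frontend, a TypeScript counterpart of `ValidateLookupAnswer` might look like the sketch below. The DTO shapes (`StudyAnnotation`, `LookupAnswer`), the `lookupRulesSketch` map, and the result type are assumptions made for illustration. Only the GUIDs and the error codes AQ007 to AQ009 are taken from the C# version above.

```typescript
// Hypothetical frontend counterpart of ValidateLookupAnswer.
// All interfaces here are assumed shapes, not the real generated types.

interface LookupRuleSketch {
  sourceLabelGuid: string;
  sourceCategory: string;
}

// Only cohortTreatments is listed; a generated map would contain every lookup.
const lookupRulesSketch: Record<string, LookupRuleSketch> = {
  'a3f2e5bb-3ade-4830-bb66-b5550a3cc85b': {
    sourceLabelGuid: 'b02e3072-74f0-44e0-a468-f472b3b09991',
    sourceCategory: 'Treatment',
  },
};

interface StudyAnnotation {
  id: string;
  questionId: string;
  studyId: string;
}

interface LookupAnswer {
  lookupQuestionGuid: string;
  studyId: string;
  selectedAnnotationIds: string[];
}

type LookupCheck = { valid: true } | { valid: false; error: string; code?: string };

function validateLookupAnswer(
  answer: LookupAnswer,
  allAnnotations: StudyAnnotation[]
): LookupCheck {
  const rule = lookupRulesSketch[answer.lookupQuestionGuid];
  if (rule === undefined) {
    return { valid: false, error: `Unknown lookup question: ${answer.lookupQuestionGuid}` };
  }

  for (const annotationId of answer.selectedAnnotationIds) {
    const annotation = allAnnotations.find((a) => a.id === annotationId);
    if (annotation === undefined) {
      return { valid: false, code: 'AQ007', error: `Referenced annotation ${annotationId} does not exist` };
    }
    if (annotation.questionId !== rule.sourceLabelGuid) {
      return {
        valid: false,
        code: 'AQ008',
        error: `Annotation ${annotationId} is not from expected source (${rule.sourceCategory} label question)`,
      };
    }
    if (annotation.studyId !== answer.studyId) {
      return { valid: false, code: 'AQ009', error: `Annotation ${annotationId} belongs to different study` };
    }
  }
  return { valid: true };
}
```

Keeping the check order identical to the C# method means both sides report the same error code for the same invalid input.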

4.4 UI Implementation: Cascade Delete with User Confirmation

EXISTING IMPLEMENTATION: The frontend already implements Option B (cascade delete with confirmation).

Source: annotation-unit.component.ts:386-466

┌─────────────────────────────────────────────────────────────────────────────┐
│                    LOOKUP REFERENTIAL INTEGRITY (IMPLEMENTED)                │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  SCENARIO: User deletes a Treatment/DiseaseModel/Cohort/Outcome annotation  │
│            that is referenced by a lookup question in Cohort/Experiment     │
│                                                                              │
│  CURRENT BEHAVIOR (IMPLEMENTED in annotation-unit.component.ts):             │
│  ═════════════════════════════════════════════════════════════════════════  │
│                                                                              │
│  1. User clicks delete on "Aspirin 100mg" treatment                         │
│                                                                              │
│  2. Component checks `this._containedIn` (populated by ngrx selector)       │
│     - Uses `categoryMap` to find lookup relationships                       │
│     - Queries `selectExtractionCharacteristicsForCurrentStudy` selector     │
│     - Returns list of containing units (cohorts referencing this treatment) │
│                                                                              │
│  3. If references exist, shows confirmation dialog:                          │
│  ┌──────────────────────────────────────────────────────────────────┐       │
│  │ Delete 'Aspirin 100mg' treatment                                  │       │
│  │                                                                   │       │
│  │ This treatment is currently used in the following cohort(s):     │       │
│  │ • 'Control Group'                                                 │       │
│  │ • 'Treatment Group'                                               │       │
│  │                                                                   │       │
│  │ Deleting this treatment will also remove it from each cohort     │       │
│  │ above.                                                            │       │
│  │                                                                   │       │
│  │  [DELETE TREATMENT]  [Cancel]                                     │       │
│  └──────────────────────────────────────────────────────────────────┘       │
│                                                                              │
│  4. On confirmation, cascades removal through form state:                    │
│     - Finds each containing unit's lookup question (e.g., cohortTreatments) │
│     - Filters out the deleted unit's ID from the answer array               │
│     - Then emits removeGroup to delete the annotation itself                │
│                                                                              │
│  5. If no references exist, deletes immediately (no dialog)                  │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Implementation Details:

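// annotation-unit.component.ts lines 386-466
// Abridged excerpt: the declarations of label, unitType, cuType, containingUnitsList,
// rootAnnotationGroup, cuIndex, ansValue and deletedUnitId are omitted here.
removeAnswerGroup() {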
  if (this._containedIn && this._containedIn.length > 0) {
    // Show confirmation dialog with list of containing units
    this._dialog.open(ConfirmationDialogComponent, {
      data: {
        title: `Delete '${label}' ${unitType}`,
        contentHtml: `This ${unitType} is currently used in the following ${cuType}(s):
                      <ul>${containingUnitsList}</ul>
                      Deleting this ${unitType} will also remove it from each ${cuType} above.`,
        okText: `DELETE ${unitType.toUpperCase()}`,
      }
    }).afterClosed().subscribe((ok) => {
      if (ok) {
        // CASCADE: Remove from all containing units' lookup answers
        this._containedIn.forEach((cu) => {
          const lookupAnswerControl = rootAnnotationGroup.get([
            categoryMap[this.category].cuCategory,        // e.g., "Cohort"
            categoryMap[this.category].cuLabelQId,        // cohortLabel GUID
            'answers', cuIndex, 'subquestions',
            categoryMap[this.category].subqIdOnCu,        // e.g., cohortTreatments
            'answers', 0, 'answer'
          ]);
          // Filter out the deleted unit's ID
          lookupAnswerControl.setValue(
            ansValue.filter((id) => id !== deletedUnitId)
          );
        });
        // Then delete the annotation itself
        this.removeGroup.emit();
      }
    });
  } else {
    // No references - delete immediately
    this.removeGroup.emit();
  }
}
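
The cascade step itself, separated from the Angular form plumbing, reduces to filtering one ID out of each containing unit's lookup answer. The sketch below shows that step as a pure function. `ContainingUnit` and `removeUnitFromLookups` are hypothetical names and do not appear in the component.

```typescript
// Minimal sketch of the cascade step as a pure function.
// ContainingUnit is an assumed shape: one unit (e.g. a cohort) plus the IDs
// currently selected in its lookup answer (e.g. cohortTreatments).

interface ContainingUnit {
  unitId: string;
  lookupAnswer: string[];
}

/**
 * Returns copies of the containing units with `deletedUnitId` removed from
 * each lookup answer. The input array and its items are left unmodified.
 */
function removeUnitFromLookups(
  containingUnits: ContainingUnit[],
  deletedUnitId: string
): ContainingUnit[] {
  return containingUnits.map((cu) => ({
    ...cu,
    lookupAnswer: cu.lookupAnswer.filter((id) => id !== deletedUnitId),
  }));
}

// Example: deleting the treatment "aspirin-100mg" referenced by two cohorts.
const cohortsBefore: ContainingUnit[] = [
  { unitId: 'control-group', lookupAnswer: ['aspirin-100mg', 'placebo'] },
  { unitId: 'treatment-group', lookupAnswer: ['aspirin-100mg'] },
];
const cohortsAfter = removeUnitFromLookups(cohortsBefore, 'aspirin-100mg');
```

Expressed this way, the same logic could be unit-tested without a reactive form, or reused by a backend-side cascade if one is introduced later.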

Category Mapping (for lookup traversal):

The categoryMap object (lines 79-121) defines the relationships:

| Deleted Unit Type | Lookup Category | Lookup Question (GUID) |
|---|---|---|
| Treatment | Cohort | cohortTreatments (a3f2e5bb-3ade-4830-bb66-b5550a3cc85b) |
| Disease Model | Cohort | cohortDiseaseModels (ecb550a5-ed95-473f-84bf-262c9faa7541) |
| Outcome | Cohort | cohortOutcomes (12ecd826-85a4-499a-844c-bd35ea6624ad) |
| Cohort | Experiment | experimentCohorts (e7a84ba2-4ef2-4a14-83cb-7decf469d1a2) |

Scope Limitation: This cascade logic currently operates only within the annotation form's reactive form state. It does NOT:

  • Query the database for cross-study references
  • Persist to backend independently (changes are saved when the form is submitted)
  • Prevent deletion at the API level

Future Enhancement: Consider implementing backend validation using the generated validator (section 4.3) so that referential integrity is also enforced at the API level.


5. Code Generation Strategy

5.1 Clarification: Compile-Time vs Runtime

Key Decision: This specification uses compile-time code generation, NOT runtime schema loading.

┌─────────────────────────────────────────────────────────────────────────────┐
│                    CODE GENERATION STRATEGY CLARIFICATION                    │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ❌ NOT THIS: Runtime Schema Loading                                         │
│  ─────────────────────────────────────────────────────────────────────────  │
│  • Load YAML at application startup                                          │
│  • Parse rules dynamically                                                   │
│  • Lose compile-time type safety                                             │
│  • "CategoryService.loadCategoryRules()" pattern                             │
│                                                                              │
│  ✅ THIS: Compile-Time Code Generation                                       │
│  ─────────────────────────────────────────────────────────────────────────  │
│  • YAML schema is SOURCE, consumed at BUILD TIME only                        │
│  • Generator produces STATIC C# classes and TypeScript types                 │
│  • Full compile-time type safety in both languages                           │
│  • YAML never loaded at runtime - it's "baked in" to generated code          │
│                                                                              │
│  WHY COMPILE-TIME?                                                           │
│  ─────────────────────────────────────────────────────────────────────────  │
│  • Strong typing catches errors at compile time, not runtime                 │
│  • IDE autocomplete for category names, GUIDs, rules                         │
│  • No YAML parsing overhead at application startup                           │
│  • Rules are immutable constants - exactly what we want for domain rules     │
│  • Categories rarely change - no need for runtime flexibility                │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

5.2 What Gets Generated (Compile-Time)

The YAML schema is the single source of truth, but it's only consumed at build time by the code generator. The output is strongly-typed code:

┌─────────────────────────────────────────────────────────────────────────────┐
│                    COMPILE-TIME CODE GENERATION PIPELINE                     │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │  annotation-question-rules.yaml                                       │   │
│  │  ═══════════════════════════════                                      │   │
│  │  SINGLE SOURCE OF TRUTH (human-editable, version-controlled)          │   │
│  │                                                                       │   │
│  │  • Category definitions (Study, Treatment, Cohort, etc.)              │   │
│  │  • Placement rules (systemParent GUIDs, control requirements)         │   │
│  │  • Referential integrity (lookup dependencies, onDelete behavior)     │   │
│  │  • GUID constants (all system question GUIDs)                         │   │
│  └────────────────────────────────┬─────────────────────────────────────┘   │
│                                   │                                          │
│                                   │  BUILD TIME                              │
│                                   │  (npm run generate:validation)           │
│                                   │  (dotnet run --project tools/schema-gen) │
│                                   │                                          │
│            ┌──────────────────────┼──────────────────────┐                  │
│            │                      │                      │                  │
│            ▼                      ▼                      ▼                  │
│  ┌───────────────────┐  ┌───────────────────┐  ┌───────────────────┐       │
│  │ C# (Backend)      │  │ TypeScript (FE)   │  │ Documentation     │       │
│  │ ═══════════════   │  │ ═══════════════   │  │ ═══════════════   │       │
│  │                   │  │                   │  │                   │       │
│  │ CategoryRules.cs  │  │ categoryRules.ts  │  │ rules.md          │       │
│  │ LookupRules.cs    │  │ lookupRules.ts    │  │                   │       │
│  │ SystemGuids.cs    │  │ systemGuids.ts    │  │ Auto-generated    │       │
│  │ Validator.cs      │  │ validator.ts      │  │ rule tables       │       │
│  │                   │  │                   │  │                   │       │
│  │ STATIC, TYPED     │  │ STATIC, TYPED     │  │                   │       │
│  │ Compile-time safe │  │ Compile-time safe │  │                   │       │
│  └───────────────────┘  └───────────────────┘  └───────────────────┘       │
│            │                      │                                          │
│            │     RUNTIME          │                                          │
│            │  (Application runs)  │                                          │
│            │                      │                                          │
│            ▼                      ▼                                          │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │  Application uses GENERATED CODE directly                              │  │
│  │  ─────────────────────────────────────────────────────────────────────│  │
│  │                                                                        │  │
│  │  C#:  CategoryRules.Treatment.FirstLevelParent                         │  │
│  │       → Guid (compile-time checked)                                    │  │
│  │                                                                        │  │
│  │  TS:  categoryRules.Treatment.firstLevelParent                         │  │
│  │       → string literal type (compile-time checked)                     │  │
│  │                                                                        │  │
│  │  NO YAML PARSING AT RUNTIME. NO "CategoryService.load()".              │  │
│  │  Rules are CONSTANTS compiled into the application.                    │  │
│  │                                                                        │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
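
To make the build-time step concrete, here is a minimal sketch of the part of the generator that emits TypeScript. It assumes the YAML has already been parsed into a plain object by the build script. The `RuleSchema` shape and the `emitCategoryRules` function are hypothetical and cover only the two fields shown in this document.

```typescript
// Hypothetical shape of the parsed annotation-question-rules.yaml
// (an assumption for illustration, not the real schema).
interface RuleSchema {
  categories: Record<
    string,
    { firstLevelParent: string | null; controlParameterRequired: boolean }
  >;
}

// Builds the source text of the generated module. Intended to run at build time only.
function emitCategoryRules(schema: RuleSchema): string {
  const entries = Object.entries(schema.categories)
    .sort(([a], [b]) => a.localeCompare(b)) // stable order keeps generated diffs small
    .map(([name, rule]) => {
      // Quote keys that are not valid identifiers, e.g. "Disease Model".
      const key = /^[A-Za-z_$][A-Za-z0-9_$]*$/.test(name) ? name : JSON.stringify(name);
      return [
        `  ${key}: {`,
        `    firstLevelParent: ${JSON.stringify(rule.firstLevelParent)},`,
        `    controlParameterRequired: ${rule.controlParameterRequired},`,
        `  },`,
      ].join('\n');
    });

  return [
    '// AUTO-GENERATED FROM annotation-question-rules.yaml - DO NOT EDIT',
    'export const categoryRules = {',
    ...entries,
    '} as const;',
    '',
    'export type Category = keyof typeof categoryRules;',
    '',
  ].join('\n');
}
```

The function returns a string, so the build script decides where to write it. That also makes snapshot testing of the generator output straightforward.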

5.3 Generated Code Structure

The generated code provides strongly-typed constants - there's no generic "CategoryService" that loads rules at runtime:

C# Generated Code:

// ============================================================================
// AUTO-GENERATED FROM annotation-question-rules.yaml
// DO NOT EDIT - Regenerate with: dotnet run --project tools/schema-gen
// ============================================================================

namespace SyRF.Validation.Generated;

/// <summary>
/// Static category rules - no runtime loading, no service pattern.
/// These are compile-time constants generated from the YAML schema.
/// </summary>
public static class CategoryRules
{
    // Strongly-typed dictionary - category string → rule record
    public static readonly IReadOnlyDictionary<string, CategoryRule> Rules = ...;

    // Direct accessors for common use cases (compile-time checked)
    public static class Treatment
    {
        public static readonly Guid FirstLevelParent =
            Guid.Parse("d04ec2d7-3e10-4847-9999-befe7ee4c454");
        public const bool ControlParameterRequired = true;
    }

    public static class Cohort
    {
        public static readonly Guid FirstLevelParent =
            Guid.Parse("62c852ad-3390-48a4-ac13-439bf6b6587f");
        public const bool ControlParameterRequired = false;
    }
    // ... etc for each category
}

/// <summary>
/// Strongly-typed GUID constants for system questions.
/// IDE autocomplete, compile-time checking, no magic strings.
/// </summary>
public static class SystemQuestionGuids
{
    public static readonly Guid TreatmentLabel =
        Guid.Parse("b02e3072-74f0-44e0-a468-f472b3b09991");
    public static readonly Guid TreatmentControl =
        Guid.Parse("d04ec2d7-3e10-4847-9999-befe7ee4c454");
    // ... all GUIDs
}

TypeScript Generated Code:

// ============================================================================
// AUTO-GENERATED FROM annotation-question-rules.yaml
// DO NOT EDIT - Regenerate with: npm run generate:validation
// ============================================================================

/**
 * Static category rules as const object.
 * Provides compile-time type checking and IDE autocomplete.
 */
export const categoryRules = {
  Treatment: {
    firstLevelParent: 'd04ec2d7-3e10-4847-9999-befe7ee4c454',
    controlParameterRequired: true,
  },
  Cohort: {
    firstLevelParent: '62c852ad-3390-48a4-ac13-439bf6b6587f',
    controlParameterRequired: false,
  },
  // ... etc
} as const;

// Type-safe category name (not just 'string')
export type Category = keyof typeof categoryRules;

// Compile-time checked GUID constants
export const systemQuestionGuids = {
  treatmentLabel: 'b02e3072-74f0-44e0-a468-f472b3b09991',
  treatmentControl: 'd04ec2d7-3e10-4847-9999-befe7ee4c454',
  // ... all GUIDs
} as const;
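
Category names still arrive as plain strings from API responses and stored questions, so a consumer needs one narrowing step before the typed constants apply. The following is a minimal sketch. It redeclares a two-entry subset of the generated constants so that it stands alone, and `isCategory` and `requiredParentFor` are hypothetical helpers, not part of the generated output.

```typescript
// Redeclared subset of the generated constants so this snippet is self-contained.
const categoryRulesSample = {
  Treatment: {
    firstLevelParent: 'd04ec2d7-3e10-4847-9999-befe7ee4c454',
    controlParameterRequired: true,
  },
  Cohort: {
    firstLevelParent: '62c852ad-3390-48a4-ac13-439bf6b6587f',
    controlParameterRequired: false,
  },
} as const;

type SampleCategory = keyof typeof categoryRulesSample;

// Type guard: narrows an untrusted string to the category union.
// hasOwnProperty is used instead of `in` so inherited keys like "toString" are rejected.
function isCategory(value: string): value is SampleCategory {
  return Object.prototype.hasOwnProperty.call(categoryRulesSample, value);
}

// Returns the required first-level parent GUID, or undefined for an unknown category.
function requiredParentFor(raw: string): string | undefined {
  return isCategory(raw) ? categoryRulesSample[raw].firstLevelParent : undefined;
}
```

Under these assumptions, `requiredParentFor('Treatment')` yields the treatmentControl GUID, while a misspelled runtime value such as `'Treament'` yields `undefined` instead of throwing.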

5.4 Why NOT Runtime Loading?

┌─────────────────────────────────────────────────────────────────────────────┐
│                    RUNTIME LOADING ANTI-PATTERN                              │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ❌ ANTI-PATTERN: Runtime Schema Loading                                     │
│                                                                              │
│  // DON'T DO THIS:                                                           │
│  class CategoryService {                                                     │
│    loadCategoryRules(): Promise<CategoryRules> {                             │
│      return fetch('/schemas/categories.yaml').then(yaml.parse);              │
│    }                                                                         │
│                                                                              │
│    getRequiredParent(category: string): string | null {                      │
│      return this.rules[category]?.systemParent;  // Runtime lookup           │
│    }                                                                         │
│  }                                                                           │
│                                                                              │
│  PROBLEMS:                                                                   │
│  ─────────────────────────────────────────────────────────────────────────  │
│  1. ❌ Type safety lost - 'category' is just a string at compile time       │
│  2. ❌ Typos not caught - getRequiredParent("Treament") compiles fine       │
│  3. ❌ Runtime parsing overhead - YAML parsed on every app start            │
│  4. ❌ Async complexity - must await rules before validation                 │
│  5. ❌ No IDE autocomplete for category names or GUIDs                      │
│  6. ❌ Testing harder - must mock service instead of importing constants    │
│                                                                              │
│  ✅ CORRECT: Generated Static Constants                                      │
│                                                                              │
│  // DO THIS:                                                                 │
│  import { categoryRules, Category } from './generated/categoryRules';       │
│                                                                              │
│  function getRequiredParent(category: Category): string | null {            │
│    return categoryRules[category].firstLevelParent;                         │
│  }                                                                           │
│                                                                              │
│  BENEFITS:                                                                   │
│  ─────────────────────────────────────────────────────────────────────────  │
│  1. ✅ Type safety - Category is a union type, not string                   │
│  2. ✅ Typos caught at compile time - "Treament" is a type error            │
│  3. ✅ No runtime overhead - constants are in the compiled bundle           │
│  4. ✅ Synchronous - no async, just direct property access                  │
│  5. ✅ Full IDE autocomplete for categories and properties                  │
│  6. ✅ Easy testing - just import and use, no mocking                       │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

5.5 Future Consideration: Dynamic Categories

If SyRF ever needs user-defined categories (unlikely for domain reasons), that would require runtime loading. For the current fixed set of 7 categories with stable rules, compile-time generation is the correct approach.

The YAML schema provides:

  • Human-readable single source of truth
  • Version-controlled rule changes
  • Cross-language consistency via generation

But the YAML is consumed at build time, not runtime.


6. Implementation Recommendations

6.1 Phased Approach

CRITICAL: Each phase includes mandatory testing requirements. Code changes are NOT complete until tests pass with required coverage.

╔═══════════════════════════════════════════════════════════════════════════════╗
║                    IMPLEMENTATION PHASES                                       ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  PHASE 1: FORMALIZE RULES (Low Risk, High Value)                               ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  DELIVERABLES:                                                                 ║
║  • Create annotation-question-rules.yaml as single source of truth             ║
║  • Document all rules precisely (this document)                                ║
║  • No code changes yet, just documentation                                     ║
║                                                                                ║
║  TESTING (Preparation):                                                        ║
║  • Create test case matrix document (all rules → test cases)                   ║
║  • Create test data fixtures for each category (YAML format)                   ║
║  • Create example valid/invalid DTOs for each rule                             ║
║                                                                                ║
║  Effort: 1-2 days                                                              ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  PHASE 2: BACKEND VALIDATION (Medium Risk, High Value) ✅ COMPLETE             ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  DELIVERABLES:                                                                 ║
║  ✅ AnnotationQuestionPlacementRules.cs - category rules with GUIDs           ║
║  ✅ AnnotationQuestionPlacementValidator.cs - validation logic                 ║
║  ✅ ValidationResult.cs - validation result type                               ║
║  ✅ Integration with Project.UpsertCustomAnnotationQuestion()                  ║
║  ✅ Characterization tests for existing invalid data (see below)               ║
║                                                                                ║
║  TESTING: ✅ PASSED                                                            ║
║  • 31 unit tests for AnnotationQuestionPlacementValidator                      ║
║  • 24 integration tests for Project.UpsertCustomAnnotationQuestion()           ║
║  • All categories covered: Study, DMI, Treatment, Outcome, Cohort, Experiment  ║
║  • Valid + invalid cases for first-level and nested questions                  ║
║  • Hidden category rejection                                                   ║
║                                                                                ║
║  CHARACTERIZATION TEST RESULTS (2025-12-27):                                   ║
║  • Study questions with system parent: 0 violations ✅                         ║
║  • Hidden category custom questions: 0 violations ✅                           ║
║  • Questions without required parent: 32 violations in 24 projects             ║
║    - Experiment: 28 (mostly test projects)                                     ║
║    - Disease Model Induction: 3                                                ║
║    - Outcome Assessment: 1                                                     ║
║  • Impact: Low - mostly test/dev projects, questions still functional          ║
║  • Action: No migration needed - new validation prevents future violations     ║
║                                                                                ║
║  Completed: 2025-12-27                                                         ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  PHASE 3: REFERENTIAL INTEGRITY - SCOPE REDUCED                                ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  STATUS: Largely not needed due to existing architecture                       ║
║                                                                                ║
║  EXISTING PROTECTIONS:                                                         ║
║  ✅ Annotations are scoped to individual Studies (no cross-study leakage)      ║
║  ✅ Annotations are scoped to individual Reviewers (no cross-reviewer leakage) ║
║  ✅ Frontend cascade delete with confirmation (annotation-unit.ts)             ║
║  ✅ Lookup selections only show annotations from current study context         ║
║                                                                                ║
║  REMAINING (Low Priority - only if data issues found):                         ║
║  • Orphan detection script for historical data cleanup                         ║
║  • Backend validation would duplicate frontend logic (not recommended)         ║
║                                                                                ║
║  FUTURE ENHANCEMENT (Defensive Programming):                                   ║
║  • Add validation that annotations belong to correct Study/Reviewer            ║
║  • Purpose: Catch potential bugs where annotations leak between contexts       ║
║  • Would throw if annotation references wrong study or wrong reviewer          ║
║  • Not urgent since architecture prevents this, but useful for bug detection   ║
║                                                                                ║
║  REMOVED FROM SCOPE:                                                           ║
║  ✗ Cross-study reference checking - Not possible due to data model             ║
║  ✗ Backend delete validation - Frontend already handles cascade properly       ║
║                                                                                ║
║  Effort: Minimal (only if orphan cleanup needed)                               ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  PHASE 4: CODE GENERATION PIPELINE (Higher Effort, Long-term Value)            ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  DELIVERABLES:                                                                 ║
║  • Set up JSON Schema as authoritative definition                              ║
║  • Add code generation for TypeScript types                                    ║
║  • Add code generation for C# validators                                       ║
║  • Integrate into build pipeline                                               ║
║                                                                                ║
║  TESTING (Required - 100% generated code coverage):                            ║
║  • Generator produces valid C#/TypeScript syntax                               ║
║  • Generated code compiles without errors                                      ║
║  • Snapshot tests: generated code matches expected output                      ║
║  • Property-based tests: any valid DTO passes, invalid DTOs fail               ║
║                                                                                ║
║  Effort: 5-8 days (includes generator tests)                                   ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  PHASE 5: DECLARATIVE CATEGORIES (Major Refactor - Optional)                   ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  DELIVERABLES:                                                                 ║
║  • Move to YAML-based category definitions                                     ║
║  • Generate all code from schemas                                              ║
║  • Enable runtime category extensibility                                       ║
║                                                                                ║
║  TESTING: Full regression suite + new extensibility tests                      ║
║                                                                                ║
║  Effort: 2-4 weeks                                                             ║
║                                                                                ║
╚═══════════════════════════════════════════════════════════════════════════════╝
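
The orphan detection script listed as remaining work under Phase 3 could start from something like the sketch below. It assumes a simplified, hypothetical view of one study's annotations (`UnitRecord`, `LookupRecord`) rather than the real ExtractionInfo model. It reports lookup answers that reference unit IDs no longer present in the same study.

```typescript
// Hypothetical, flattened view of one study's annotation data.
interface UnitRecord {
  id: string; // annotation ID of a unit (Treatment, Cohort, ...)
}

interface LookupRecord {
  containerId: string; // the unit holding the lookup answer
  questionId: string;  // GUID of the lookup question, e.g. cohortTreatments
  unitIds: string[];   // the referenced unit IDs
}

interface OrphanReference {
  containerId: string;
  questionId: string;
  missingUnitId: string;
}

// Lists every lookup reference whose target unit does not exist in the study.
function findOrphanReferences(
  units: UnitRecord[],
  lookups: LookupRecord[],
): OrphanReference[] {
  const knownIds = new Set(units.map((u) => u.id));
  const orphans: OrphanReference[] = [];
  for (const lookup of lookups) {
    for (const unitId of lookup.unitIds) {
      if (!knownIds.has(unitId)) {
        orphans.push({
          containerId: lookup.containerId,
          questionId: lookup.questionId,
          missingUnitId: unitId,
        });
      }
    }
  }
  return orphans;
}
```

Because annotations are scoped to a single study and reviewer, a check like this would only need to run per study. It reports problems and leaves any cleanup decision to a separate step.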

6.2 Decision Matrix

Approach Complexity Value Risk Recommendation
Manual sync (current) Low Low High (drift) Avoid
Backend validation only Medium High Low Do first
JSON Schema codegen Medium-High High Medium Phase 4
Declarative YAML categories High Very High Medium Future consideration

6.3 Testing Strategy: Philosophy and Examples

NOTE: Phase-specific testing requirements are defined in section 6.1. This section provides the testing philosophy, example test structures, and supplementary reference material.

┌─────────────────────────────────────────────────────────────────────────────┐
│                    TESTING PHILOSOPHY                                        │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  1. TEST-FIRST DEVELOPMENT                                                   │
│     Write tests BEFORE implementing validation logic.                        │
│     Each rule in this document = at least one test case.                     │
│                                                                              │
│  2. COVERAGE REQUIREMENTS                                                    │
│     • Backend validators: 95-100% branch coverage (95% CI gate)              │
│     • Frontend validators: 90-100% branch coverage (90% CI gate)             │
│     • Generated code: 100% line coverage (generated, so easier)              │
│                                                                              │
│  3. TEST CATEGORIES                                                          │
│     • Unit tests: Each validation rule in isolation                          │
│     • Integration tests: Validator + real domain objects                     │
│     • Characterization tests: Capture existing (possibly invalid) behavior   │
│     • Property-based tests: Fuzz testing for edge cases                      │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

6.3.1 Example Test Structures

Backend Placement Validator Tests (C#)

// ============================================================================
// TEST STRUCTURE: AnnotationQuestionPlacementValidatorTests.cs
// ============================================================================

namespace SyRF.ProjectManagement.Core.Tests.Validation;

[TestFixture]
public class AnnotationQuestionPlacementValidatorTests
{
    // ========================================================================
    // CATEGORY PLACEMENT TESTS - One test class per category
    // ========================================================================

    [TestFixture]
    public class TreatmentCategoryTests
    {
        // VALID CASES
        [Test]
        public void FirstLevel_WithTreatmentControlParent_IsValid()
        {
            var dto = CreateTreatmentQuestion(
                parentId: SystemGuids.TreatmentControl,
                controlParam: true);

            var result = _validator.ValidatePlacement(dto);

            Assert.That(result.IsValid, Is.True);
        }

        [Test]
        public void FirstLevel_WithControlParamFalse_IsValid()
        {
            var dto = CreateTreatmentQuestion(
                parentId: SystemGuids.TreatmentControl,
                controlParam: false);

            var result = _validator.ValidatePlacement(dto);

            Assert.That(result.IsValid, Is.True);
        }

        [Test]
        public void FirstLevel_WithControlParamNull_IsValid()
        {
            var dto = CreateTreatmentQuestion(
                parentId: SystemGuids.TreatmentControl,
                controlParam: null);  // Shows for both control and non-control

            var result = _validator.ValidatePlacement(dto);

            Assert.That(result.IsValid, Is.True);
        }

        [Test]
        public void Nested_WithCustomQuestionParent_IsValid()
        {
            var customParentId = Guid.NewGuid();
            var dto = CreateTreatmentQuestion(
                parentId: customParentId,  // NOT treatmentControl
                controlParam: null);

            var result = _validator.ValidatePlacement(dto);

            Assert.That(result.IsValid, Is.True);
        }

        // INVALID CASES
        [Test]
        public void FirstLevel_WithWrongParent_ReturnsError()
        {
            var dto = CreateTreatmentQuestion(
                parentId: SystemGuids.CohortLabel,  // Wrong!
                controlParam: true);

            var result = _validator.ValidatePlacement(dto);

            Assert.That(result.IsValid, Is.False);
            Assert.That(result.ErrorCode, Is.EqualTo("AQ003"));
        }

        [Test]
        public void FirstLevel_WithNullParent_ReturnsError()
        {
            var dto = CreateTreatmentQuestion(
                parentId: null,  // Treatment can't be root
                controlParam: true);

            var result = _validator.ValidatePlacement(dto);

            Assert.That(result.IsValid, Is.False);
            Assert.That(result.ErrorCode, Is.EqualTo("AQ002"));
        }

        [Test]
        public void FirstLevel_MissingControlParam_ReturnsError()
        {
            var dto = CreateTreatmentQuestion(
                parentId: SystemGuids.TreatmentControl,
                controlParam: null,
                hasConditionalParentAnswers: false);  // Missing entirely

            var result = _validator.ValidatePlacement(dto);

            Assert.That(result.IsValid, Is.False);
            Assert.That(result.ErrorCode, Is.EqualTo("AQ005"));
        }
    }

    [TestFixture]
    public class StudyCategoryTests
    {
        // Study is special - can have null parent
        [Test]
        public void FirstLevel_WithNullParent_IsValid()
        {
            var dto = CreateStudyQuestion(parentId: null);

            var result = _validator.ValidatePlacement(dto);

            Assert.That(result.IsValid, Is.True);
        }

        [Test]
        public void FirstLevel_WithAnyParent_ReturnsError()
        {
            var dto = CreateStudyQuestion(
                parentId: SystemGuids.TreatmentLabel);  // Wrong!

            var result = _validator.ValidatePlacement(dto);

            Assert.That(result.IsValid, Is.False);
        }
    }

    // Similar test classes for: CohortCategoryTests, ExperimentCategoryTests,
    // OutcomeCategoryTests, DiseaseModelCategoryTests, HiddenCategoryTests
}
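
For comparison, the placement rules these tests exercise for categories with a required parent can be written as one small function. The sketch below is in TypeScript, as a possible frontend mirror, and is an illustration rather than the shipped validator. The `PlacementInput` shape, the partial `knownSystemIds` set, and the mapping of checks to the AQ002, AQ003, and AQ005 codes are inferred from the test cases above. The Study category is left out because the tests do not give an error code for it.

```typescript
interface PlacementRule {
  firstLevelParent: string;
  controlParameterRequired: boolean;
}

type ParentedCategory = 'Treatment' | 'Cohort';

const placementRules: Record<ParentedCategory, PlacementRule> = {
  Treatment: { firstLevelParent: 'd04ec2d7-3e10-4847-9999-befe7ee4c454', controlParameterRequired: true },
  Cohort: { firstLevelParent: '62c852ad-3390-48a4-ac13-439bf6b6587f', controlParameterRequired: false },
};

// System question GUIDs quoted in this document (a partial list, for illustration only).
const knownSystemIds = new Set<string>([
  'b02e3072-74f0-44e0-a468-f472b3b09991', // treatmentLabel
  'd04ec2d7-3e10-4847-9999-befe7ee4c454', // treatmentControl
  '62c852ad-3390-48a4-ac13-439bf6b6587f', // cohortLabel
]);

interface PlacementInput {
  category: ParentedCategory;
  parentId: string | null;
  hasConditionalParentAnswers: boolean;
}

type PlacementResult =
  | { isValid: true }
  | { isValid: false; errorCode: 'AQ002' | 'AQ003' | 'AQ005' };

function validatePlacementSketch(input: PlacementInput): PlacementResult {
  const rule = placementRules[input.category];

  // These categories cannot be root questions.
  if (input.parentId === null) return { isValid: false, errorCode: 'AQ002' };

  if (input.parentId !== rule.firstLevelParent) {
    // A different system question is the wrong parent. Any other ID is treated
    // as a custom parent, which makes this a nested question.
    return knownSystemIds.has(input.parentId)
      ? { isValid: false, errorCode: 'AQ003' }
      : { isValid: true };
  }

  // First-level question under the required parent: check the control parameter.
  if (rule.controlParameterRequired && !input.hasConditionalParentAnswers) {
    return { isValid: false, errorCode: 'AQ005' };
  }
  return { isValid: true };
}
```

The discriminated union return type means callers must check `isValid` before reading `errorCode`, which matches how the C# tests assert on `result.IsValid` first.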

Frontend Cascade Delete Tests (TypeScript)

// ============================================================================
// FRONTEND TEST STRUCTURE: annotation-unit.component.spec.ts
// ============================================================================

describe('AnnotationUnitComponent - Cascade Delete', () => {
  // ========================================================================
  // EXISTING BEHAVIOR TESTS (characterization tests)
  // ========================================================================

  describe('when deleting a Treatment with no references', () => {
    it('should delete immediately without dialog', () => {
      const treatment = createTreatmentUnit('Aspirin 100mg');
      component.containedIn = [];  // No references

      component.removeAnswerGroup();

      expect(dialogService.open).not.toHaveBeenCalled();
      expect(component.removeGroup.emit).toHaveBeenCalled();
    });
  });

  describe('when deleting a Treatment referenced by Cohorts', () => {
    it('should show confirmation dialog with reference list', () => {
      const treatment = createTreatmentUnit('Aspirin 100mg');
      component.containedIn = [
        { id: 'cohort-1', name: 'Control Group' },
        { id: 'cohort-2', name: 'Treatment Group' }
      ];

      component.removeAnswerGroup();

      expect(dialogService.open).toHaveBeenCalledWith(
        ConfirmationDialogComponent,
        expect.objectContaining({
          data: expect.objectContaining({
            contentHtml: expect.stringContaining('Control Group')
          })
        })
      );
    });

    it('should cascade remove references when user confirms', fakeAsync(() => {
      // setupCascadeScenario is assumed to return the Treatment unit under test
      const { treatment } = setupCascadeScenario();
      dialogResult$.next(true);  // User clicks DELETE

      tick();

      // Verify Treatment removed from Cohort 1's cohortTreatments
      const cohort1Treatments = getCohortTreatments('cohort-1');
      expect(cohort1Treatments).not.toContain(treatment.annotationId);

      // Verify Treatment removed from Cohort 2's cohortTreatments
      const cohort2Treatments = getCohortTreatments('cohort-2');
      expect(cohort2Treatments).not.toContain(treatment.annotationId);

      // Verify Treatment unit itself deleted
      expect(component.removeGroup.emit).toHaveBeenCalled();
    }));

    it('should NOT delete when user cancels', fakeAsync(() => {
      setupCascadeScenario();
      dialogResult$.next(false);  // User clicks Cancel

      tick();

      expect(component.removeGroup.emit).not.toHaveBeenCalled();
    }));
  });

  // Test all category combinations
  describe.each([
    ['Treatment', 'Cohort', 'cohortTreatments'],
    ['Disease Model', 'Cohort', 'cohortDiseaseModels'],
    ['Outcome', 'Cohort', 'cohortOutcomes'],
    ['Cohort', 'Experiment', 'experimentCohorts'],
  ])('when deleting %s referenced by %s', (unitType, containerType, lookupQuestion) => {
    // Registered as a todo so the placeholder does not pass vacuously
    it.todo(`should remove from ${lookupQuestion} on cascade`);
  });
});
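
The cascade step that the parameterized tests are meant to verify can be pictured as a pure function. The following is a minimal sketch, assuming lookup answers are held as arrays of annotation IDs per container unit. The `LookupAnswers` type and the `removeUnitReferences` function are hypothetical names for illustration and are not taken from the codebase.

```typescript
// Hypothetical shape: for each container unit (e.g. a Cohort), the annotation
// IDs selected in each lookup question (e.g. cohortTreatments).
type LookupAnswers = Record<string, Record<string, string[]>>;

// Returns a copy of `answers` with `unitId` removed from the given lookup
// question in every container. The input object is not mutated.
function removeUnitReferences(
  answers: LookupAnswers,
  lookupQuestion: string,
  unitId: string
): LookupAnswers {
  const result: LookupAnswers = {};
  for (const [containerId, lookups] of Object.entries(answers)) {
    const copy: Record<string, string[]> = { ...lookups };
    const selected = copy[lookupQuestion];
    if (selected !== undefined) {
      copy[lookupQuestion] = selected.filter((id) => id !== unitId);
    }
    result[containerId] = copy;
  }
  return result;
}

// Illustrative data only: two cohorts that both reference treatment "t-1".
const before: LookupAnswers = {
  'cohort-1': { cohortTreatments: ['t-1', 't-2'], cohortOutcomes: ['o-1'] },
  'cohort-2': { cohortTreatments: ['t-1'] },
};
const after = removeUnitReferences(before, 'cohortTreatments', 't-1');
```

Keeping the removal free of side effects lets each row of the `describe.each` table assert on the returned structure without component setup.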

6.3.2 Test Data Fixtures

# test-fixtures/annotation-questions.yaml
# Reusable test data for all validation tests

validQuestions:
  treatment_first_level:
    id: "11111111-1111-1111-1111-111111111111"
    category: "Treatment"
    system: false
    target:
      parentId: "d04ec2d7-3e10-4847-9999-befe7ee4c454"  # treatmentControl
      conditionalParentAnswers:
        conditionType: 0
        targetParentBoolean: false

  treatment_nested:
    id: "22222222-2222-2222-2222-222222222222"
    category: "Treatment"
    system: false
    target:
      parentId: "11111111-1111-1111-1111-111111111111"  # custom parent
      conditionalParentAnswers: null  # Not required for nested

  study_root:
    id: "33333333-3333-3333-3333-333333333333"
    category: "Study"
    system: false
    target: null  # Study can be root

invalidQuestions:
  treatment_wrong_parent:
    id: "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
    category: "Treatment"
    target:
      parentId: "62c852ad-3390-48a4-ac13-439bf6b6587f"  # cohortLabel - WRONG!
    expectedError: "AQ003"

  cohort_missing_parent:
    id: "bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb"
    category: "Cohort"
    target: null  # Cohort cannot be root
    expectedError: "AQ002"

  treatment_missing_control_param:
    id: "cccccccc-cccc-cccc-cccc-cccccccccccc"
    category: "Treatment"
    target:
      parentId: "d04ec2d7-3e10-4847-9999-befe7ee4c454"
      conditionalParentAnswers: null  # Missing when parent is control
    expectedError: "AQ005"
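
One way to consume these fixtures is a table-driven check that compares each fixture's `expectedError` with what a validator returns. The sketch below assumes the YAML has already been parsed into plain objects. `QuestionFixture`, `Validator`, and `checkFixtures` are hypothetical names, and `stubValidator` is a stand-in for demonstration only, not the real validation logic.

```typescript
interface QuestionFixture {
  id: string;
  category: string;
  target: { parentId: string; conditionalParentAnswers?: object | null } | null;
  expectedError?: string; // present only on invalid fixtures
}

// A validator returns an error code such as "AQ002", or null when valid.
type Validator = (question: QuestionFixture) => string | null;

// Runs every fixture through `validate` and describes each mismatch found.
function checkFixtures(
  fixtures: Record<string, QuestionFixture>,
  validate: Validator
): string[] {
  const mismatches: string[] = [];
  for (const [name, fixture] of Object.entries(fixtures)) {
    const expected = fixture.expectedError ?? null;
    const actual = validate(fixture);
    if (actual !== expected) {
      mismatches.push(`${name}: expected ${expected}, got ${actual}`);
    }
  }
  return mismatches;
}

// Two fixtures copied from the YAML above.
const sampleFixtures: Record<string, QuestionFixture> = {
  study_root: {
    id: '33333333-3333-3333-3333-333333333333',
    category: 'Study',
    target: null,
  },
  cohort_missing_parent: {
    id: 'bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb',
    category: 'Cohort',
    target: null,
    expectedError: 'AQ002',
  },
};

// Demonstration stub: flags only a Cohort question without a parent.
const stubValidator: Validator = (q) =>
  q.category === 'Cohort' && q.target === null ? 'AQ002' : null;

const fixtureMismatches = checkFixtures(sampleFixtures, stubValidator);
```

Because the runner takes the validator as a parameter, the same fixture file could in principle drive both the generated frontend validators and any TypeScript port of the backend rules.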

6.3.3 CI/CD Integration

# .github/workflows/test-coverage.yml (additions)
annotation-question-validation-tests:
  runs-on: ubuntu-latest
  steps:
    - name: Run Backend Validation Tests
      run: |
        dotnet test \
          --filter "Category=AnnotationQuestionValidation" \
          --collect:"XPlat Code Coverage" \
          --results-directory ./coverage

    - name: Check Coverage Threshold
      run: |
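        # Read line-rate from the root <coverage> element only; packages and classes carry their own line-rate
        COVERAGE=$(grep -ohP '<coverage[^>]*?line-rate="\K[^"]+' coverage/*/coverage.cobertura.xml | head -n 1)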
        if (( $(echo "$COVERAGE < 0.95" | bc -l) )); then
          echo "Coverage $COVERAGE is below 95% threshold"
          exit 1
        fi

    - name: Run Frontend Validation Tests
      working-directory: src/services/web
      run: |
        npx ng test --no-watch --code-coverage \
          --include="**/annotation-*.spec.ts"

    - name: Upload to SonarCloud
      uses: SonarSource/sonarqube-scan-action@v2
      with:
        args: >
          -Dsonar.projectKey=syrf-annotation-questions
          -Dsonar.coverage.exclusions=**/*.generated.ts,**/*.generated.cs
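
Cobertura reports carry a `line-rate` attribute on the root `<coverage>` element as well as on each package and class, so a threshold check needs the root value only. If the check were moved out of shell, it might look like the sketch below. `rootLineRate` and `meetsThreshold` are hypothetical helpers, the sample XML is made up, and reading the report file from disk is left out.

```typescript
// Extracts the line-rate attribute of the root <coverage> element.
// Returns null when the element or a numeric value cannot be found.
function rootLineRate(coberturaXml: string): number | null {
  const match = /<coverage\b[^>]*?\bline-rate="([^"]+)"/.exec(coberturaXml);
  if (match === null) {
    return null;
  }
  const rate = Number(match[1]);
  return Number.isFinite(rate) ? rate : null;
}

// True when the overall line coverage is at or above the threshold (default 95%).
function meetsThreshold(coberturaXml: string, threshold = 0.95): boolean {
  const rate = rootLineRate(coberturaXml);
  return rate !== null && rate >= threshold;
}

// Made-up report: the package-level rate (0.5) must not be picked up.
const sampleReport = `<?xml version="1.0"?>
<coverage line-rate="0.97" branch-rate="0.9">
  <packages><package name="p" line-rate="0.5"></package></packages>
</coverage>`;

const sampleRate = rootLineRate(sampleReport);
const samplePasses = meetsThreshold(sampleReport);
```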

6.3.4 Test Coverage Matrix

| Rule Category | Backend Tests | Frontend Tests | Coverage Target |
|---|---|---|---|
| Category Placement (7 categories) | 21+ tests | N/A | 100% |
| First-Level vs Nested | 14+ tests | N/A | 100% |
| Control Parameter Rules | 8+ tests | N/A | 100% |
| Lookup Referential Integrity | 12+ tests | 8+ tests | 95% |
| Cascade Delete UI | N/A | 16+ tests | 90% |
| Generated Validators | Property-based | Property-based | 100% |

Total Estimated Tests: 79+ unit tests, 12+ integration tests


7. Appendix: Complete Rule Reference

7.1 Category Parent Rules (First-Level Custom Questions)

IMPORTANT: These rules apply ONLY to first-level custom questions (created via category header "Add Question" button).

Nested custom questions (created via "Add Related") have target.parentId set to the parent custom question's ID, NOT the system question GUID. See section 2.3.

| Category | First-Level Parent | System Parent GUID | Control Param | Has System Questions |
|---|---|---|---|---|
| Study | null (true root) | N/A | N/A | No |
| Disease Model Induction | modelControl | b18aa936-... | Required | Yes |
| Treatment | treatmentControl | d04ec2d7-... | Required | Yes |
| Outcome Assessment | outcomeLabel | dbe2720c-... | Forbidden | Yes |
| Cohort | cohortLabel | 62c852ad-... | Forbidden | Yes |
| Experiment | experimentLabel | 7c555b6e-... | Forbidden | Yes |
| Hidden | N/A (no custom) | N/A | N/A | Yes (system-only) |

Study Category Special Case:

  • Study has no system questions - it is the only category where custom questions are true root-level
  • First-level custom questions in Study have target: null
  • Nested questions in Study parent to other custom questions (like all other categories)

Example - Treatment Category:

First-level question:  target.parentId = "d04ec2d7-3e10-4847-9999-befe7ee4c454" (treatmentControl GUID)
Nested question:       target.parentId = "abc123..." (parent custom question's ID, NOT treatmentControl)
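
The first-level rules in the table above can be sketched as a lookup plus a few checks. This is an illustrative sketch, not the backend implementation. It assumes that the "control param" corresponds to a non-null `target.conditionalParentAnswers`, that a missing parent maps to AQ002 (as in the `cohort_missing_parent` fixture), and that the caller supplies the system GUIDs through the hypothetical `systemGuids` map. `validateFirstLevel` and the type names are likewise hypothetical. Nested questions are out of scope here, and the Hidden category is not checked because no error code for it appears in section 7.3.

```typescript
type ControlParam = 'required' | 'forbidden' | 'n/a';

interface CategoryRule {
  parent: string | null; // name of the system parent question, null when none applies
  controlParam: ControlParam;
}

// Transcribed from the table in 7.1.
const CATEGORY_RULES = new Map<string, CategoryRule>([
  ['Study', { parent: null, controlParam: 'n/a' }],
  ['Disease Model Induction', { parent: 'modelControl', controlParam: 'required' }],
  ['Treatment', { parent: 'treatmentControl', controlParam: 'required' }],
  ['Outcome Assessment', { parent: 'outcomeLabel', controlParam: 'forbidden' }],
  ['Cohort', { parent: 'cohortLabel', controlParam: 'forbidden' }],
  ['Experiment', { parent: 'experimentLabel', controlParam: 'forbidden' }],
  ['Hidden', { parent: null, controlParam: 'n/a' }],
]);

interface FirstLevelQuestion {
  category: string;
  target: { parentId: string; conditionalParentAnswers?: object | null } | null;
}

// Returns an error code from section 7.3, or null when no rule is violated.
function validateFirstLevel(
  question: FirstLevelQuestion,
  systemGuids: Record<string, string>
): string | null {
  const rule = CATEGORY_RULES.get(question.category);
  if (rule === undefined) return 'AQ001';
  if (rule.parent === null) return null; // Study is root; Hidden is not checked here
  if (question.target === null) return 'AQ002';
  if (question.target.parentId !== systemGuids[rule.parent]) return 'AQ003';
  const hasControl = question.target.conditionalParentAnswers != null;
  if (rule.controlParam === 'required' && !hasControl) return 'AQ005';
  if (rule.controlParam === 'forbidden' && hasControl) return 'AQ006';
  return null;
}

// Only the two GUIDs that appear in full in this document are used.
const knownGuids: Record<string, string> = {
  treatmentControl: 'd04ec2d7-3e10-4847-9999-befe7ee4c454',
  cohortLabel: '62c852ad-3390-48a4-ac13-439bf6b6587f',
};

const wrongParentResult = validateFirstLevel(
  { category: 'Treatment', target: { parentId: knownGuids.cohortLabel } },
  knownGuids
);
```

A parent name that is absent from `systemGuids` is reported as AQ003, so a real caller would need to pass the complete GUID map.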

7.2 Lookup Dependency Matrix

| Lookup Question | Source Category | Source Question | Multi-Select |
|---|---|---|---|
| cohortDiseaseModels | Disease Model Induction | diseaseModelLabel | Yes |
| cohortTreatments | Treatment | treatmentLabel | Yes |
| cohortOutcomes | Outcome Assessment | outcomeLabel | Yes |
| experimentCohorts | Cohort | cohortLabel | Yes |
| outcomePdfGraphs | Hidden | pdfReferences | Yes |
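
The matrix above pairs each lookup question with the only source question its answers may point to. A referential integrity check built on it might look like the following sketch. The `AnnotationRef` shape (including its `questionName` field), the `validateLookup` function, and the order in which the three checks run are assumptions made for illustration. The error codes are those listed in section 7.3.

```typescript
// Source question for each lookup, transcribed from the matrix in 7.2.
const LOOKUP_SOURCES = new Map<string, string>([
  ['cohortDiseaseModels', 'diseaseModelLabel'],
  ['cohortTreatments', 'treatmentLabel'],
  ['cohortOutcomes', 'outcomeLabel'],
  ['experimentCohorts', 'cohortLabel'],
  ['outcomePdfGraphs', 'pdfReferences'],
]);

// Hypothetical minimal view of a stored annotation.
interface AnnotationRef {
  id: string;
  questionName: string; // name of the question this annotation answers
  studyId: string;
}

interface LookupViolation {
  code: 'AQ007' | 'AQ008' | 'AQ009';
  annotationId: string;
}

// Checks every referenced ID: it must exist (AQ007), belong to the same
// study (AQ009), and answer the expected source question (AQ008).
function validateLookup(
  lookupQuestion: string,
  referencedIds: string[],
  studyId: string,
  annotations: AnnotationRef[]
): LookupViolation[] {
  const expectedSource = LOOKUP_SOURCES.get(lookupQuestion);
  if (expectedSource === undefined) {
    throw new Error(`Not a lookup question: ${lookupQuestion}`);
  }
  const byId = new Map<string, AnnotationRef>();
  for (const annotation of annotations) {
    byId.set(annotation.id, annotation);
  }
  const violations: LookupViolation[] = [];
  for (const id of referencedIds) {
    const found = byId.get(id);
    if (found === undefined) {
      violations.push({ code: 'AQ007', annotationId: id });
    } else if (found.studyId !== studyId) {
      violations.push({ code: 'AQ009', annotationId: id });
    } else if (found.questionName !== expectedSource) {
      violations.push({ code: 'AQ008', annotationId: id });
    }
  }
  return violations;
}

// Illustrative data only.
const sampleAnnotations: AnnotationRef[] = [
  { id: 'a1', questionName: 'treatmentLabel', studyId: 's1' },
  { id: 'a2', questionName: 'outcomeLabel', studyId: 's1' },
  { id: 'a3', questionName: 'treatmentLabel', studyId: 's2' },
];
const lookupViolations = validateLookup(
  'cohortTreatments',
  ['a1', 'a2', 'a3', 'a4'],
  's1',
  sampleAnnotations
);
```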

7.3 Validation Error Messages

validationErrors:
  INVALID_CATEGORY:
    code: "AQ001"
    message: "Unknown category: {category}"

  MISSING_REQUIRED_PARENT:
    code: "AQ002"
    message: "Questions in {category} must have parent {expectedParent}"

  WRONG_PARENT:
    code: "AQ003"
    message: "Questions in {category} must have parent {expectedParent}, got {actualParent}"

  ROOT_NOT_ALLOWED:
    code: "AQ004"
    message: "Questions in {category} cannot be root-level"

  MISSING_CONTROL_PARAMETER:
    code: "AQ005"
    message: "{category} questions must specify control parameter (true/false/null)"

  CONTROL_PARAMETER_NOT_ALLOWED:
    code: "AQ006"
    message: "{category} questions must not have control parameter"

  ORPHANED_LOOKUP_REFERENCE:
    code: "AQ007"
    message: "Lookup references non-existent annotation: {annotationId}"

  WRONG_LOOKUP_SOURCE:
    code: "AQ008"
    message: "Annotation {annotationId} is not from expected source question"

  CROSS_STUDY_REFERENCE:
    code: "AQ009"
    message: "Cannot reference annotation from different study"

  # Frontend UI validation errors (create-question.component.ts)
  QUESTION_TEXT_TOO_LONG:
    code: "AQ010"
    message: "Question text exceeds maximum length of 80 characters"

  QUESTION_TEXT_REQUIRED:
    code: "AQ011"
    message: "Question text is required"

  OPTIONS_REQUIRED:
    code: "AQ012"
    message: "Options are required for dropdown/radio/checklist/autocomplete controls"

  DUPLICATE_OPTIONS:
    code: "AQ013"
    message: "Option values must be unique"

  INVALID_NUMERIC_OPTION:
    code: "AQ014"
    message: "Options must be valid {type} values"

  CONDITIONAL_ANSWERS_REQUIRED:
    code: "AQ015"
    message: "At least one conditional parent answer must be selected"

  INVALID_CONDITIONAL_ANSWER:
    code: "AQ016"
    message: "Conditional parent answer must match one of parent's options"
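
The messages above use `{name}` placeholders. A small formatter that fills them could look like the sketch below. `formatValidationError` is a hypothetical helper, and only two of the templates are copied in to keep the example short.

```typescript
// Two templates copied from the list above; a full table would hold all codes.
const MESSAGE_TEMPLATES = new Map<string, string>([
  ['AQ003', 'Questions in {category} must have parent {expectedParent}, got {actualParent}'],
  ['AQ007', 'Lookup references non-existent annotation: {annotationId}'],
]);

// Replaces each {placeholder} with the matching entry in `params`.
// Placeholders without a matching entry are left in place.
function formatValidationError(code: string, params: Record<string, string>): string {
  const template = MESSAGE_TEMPLATES.get(code);
  if (template === undefined) {
    throw new Error(`Unknown error code: ${code}`);
  }
  return template.replace(
    /\{(\w+)\}/g,
    (placeholder: string, key: string) => params[key] ?? placeholder
  );
}

const wrongParentMessage = formatValidationError('AQ003', {
  category: 'Treatment',
  expectedParent: 'treatmentControl',
  actualParent: 'cohortLabel',
});
```

Leaving unknown placeholders untouched makes a missing parameter visible in the output instead of silently producing an empty string.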

Appendix A: Characterization Test Results (2025-12-27)

This appendix documents the results of running characterization tests against the production MongoDB database (syrftest) to identify existing annotation questions that violate the placement rules defined in this specification.

Summary

| Validation Check | Result |
|---|---|
| Study questions with system parent | 0 violations ✅ |
| Hidden category custom questions | 0 violations ✅ |
| Questions without required parent | 32 violations ⚠️ |

Violation Details

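32 custom questions in categories that require a parent (Disease Model Induction, Treatment, Outcome Assessment, Cohort, Experiment) were found with Target = null (no parent). All 32 fall in three of those categories: Disease Model Induction, Experiment, and Outcome Assessment.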

Disease Model Induction (3 violations)

| Project | Question | Created |
|---|---|---|
| Internal Testing 2024 | tryingout | 2024-09-02 |
| Internal testing Aug 2024 | Somethingelse | 2024-08-15 |
| TestMVP1508 | new | 2024-08-22 |

Experiment (28 violations)

| Project | Question | Created |
|---|---|---|
| Cannabinoids | What was the time (post model induction;hours) of outcome assessment measurement where the difference between control and treatment is greatest? | 2019-04-10 |
| Cannabinoids | What was the time (post model induction;hours) of the last outcome assessment measurement? | 2019-04-10 |
| ChABC SCI | Number of Controls | 2018-07-04 |
| ChABC SCI | Number in Treatment Group | 2018-07-04 |
| ChABC SCI | Total Number of Animals? | 2018-12-13 |
| ChABC SCI | Number of Sham Animals | 2019-01-16 |
| ChABC SCI | Number of Comparisons | 2019-01-16 |
| Comparison of brain serotonin levels between light and dark | when are the lights on ? | 2018-01-15 |
| For Alex Again | hn | 2019-01-20 |
| Is the result of intra-peritoneal coated meshes comparable for hernia in all research groups in animals? | Does it contain an animal experiment? | 2018-03-26 |
| NAFLD test 3 | How many developed HCC? | 2018-01-16 |
| NAFLD1 test | Were any side effects noted? | 2017-07-10 |
| NAFLD1 test | test | 2017-07-10 |
| PC12 OGD Data Extraction | Statistical Test Used | 2019-03-05 |
| Pre-clinical studies in tissue-engineering for long-gap oesophageal atresia in pigs. A systematic literature review. | Pre seeding | 2019-03-29 |
| Preclinical and Clinical Studies on the Use of Stem cells in Craniofacial Bone Regeneration: A Systematic Review. | M | 2018-04-15 |
| ProjectName | experiment | 2019-06-24 |
| SR Tool Feature Analysis | Is HPLC used? | 2018-06-29 |
| Systematic review and meta-analysis of preclinical studies of efficacya and safety of M. charantia on type 2 diabetes mellitus | design | 2019-01-03 |
| Systematic review of drug delivery methods to the inner ear | Is there a comparator? | 2018-01-09 |
| TLR4/liver regeneration | was TLR4 measured | 2017-05-31 |
| Test | test question | 2018-07-19 |
| Teste | What is the figure? | 2017-05-03 |
| Teste | What is the figure (alternate)? | 2017-05-03 |
| The influene of bioinorganic elements included in calcium phosphate based bone subsitutes | species? | 2017-07-07 |
| VGF derived neuropeptides and pain | Was VGF measured and was is up regulated or down regulated | 2017-09-21 |
| exposed to A have effects on B | is this a trial or observation study | 2019-03-10 |
| gfsdkfgi | hhch? | 2018-10-29 |

Outcome Assessment (1 violation)

| Project | Question | Created |
|---|---|---|
| Internal Testing 2024 | question | 2024-09-02 |

Analysis

  1. Most violations are in test/development projects: Project names like "Test", "Teste", "NAFLD1 test", "Internal Testing 2024", "TestMVP1508", "For Alex Again", "gfsdkfgi" indicate these are not production reviews.

  2. All violations are historical: The newest violations are from September 2024 (internal testing), with most from 2017-2019.

  3. Questions still function: Despite missing the required parent, these questions are still usable in the UI. The parent relationship affects hierarchy display but does not break core functionality.

  4. No data integrity issues: The questions are self-contained and do not reference non-existent entities.

Recommendation

No migration or cleanup is required.

  • The new validation in Project.UpsertCustomAnnotationQuestion() prevents future violations
  • Existing questions continue to function
  • Most affected projects are test/development projects
  • Cleaning up would require project owner coordination with minimal benefit

MongoDB Query Used

db.pmProject.aggregate([
  { $unwind: "$AnnotationQuestions" },
  { $match: {
      "AnnotationQuestions.System": false,
      "AnnotationQuestions.Category": {
        $in: ["Disease Model Induction", "Treatment", "Outcome Assessment", "Cohort", "Experiment"]
      },
      "AnnotationQuestions.Target": null
  }},
  { $project: {
      projectName: "$Name",
      category: "$AnnotationQuestions.Category",
      question: "$AnnotationQuestions.Question",
      createdDate: "$AnnotationQuestions.DateTimeCreated"
  }},
  { $sort: { category: 1, projectName: 1 }}
])
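
In MongoDB, a filter of the form `{ field: null }` matches documents where the field is either null or missing, so the query above catches both cases. The same predicate can be mirrored in TypeScript, for example to run the characterization check against exported project documents in a unit test. The sketch below assumes the field names used in the query. The interfaces, the `findQuestionsWithoutParent` function, and the sample data are hypothetical.

```typescript
interface StoredQuestion {
  System: boolean;
  Category: string;
  Question: string;
  Target?: object | null;
  DateTimeCreated?: string;
}

interface StoredProject {
  Name: string;
  AnnotationQuestions: StoredQuestion[];
}

interface MissingParentRow {
  projectName: string;
  category: string;
  question: string;
  createdDate: string | undefined;
}

// Categories whose custom questions require a parent (same list as the $in filter).
const PARENT_REQUIRED = new Set<string>([
  'Disease Model Induction', 'Treatment', 'Outcome Assessment', 'Cohort', 'Experiment',
]);

// Plain code-unit ordering, used for the category/projectName sort.
function compareStrings(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Mirrors the pipeline: unwind questions, match, project, then sort.
function findQuestionsWithoutParent(projects: StoredProject[]): MissingParentRow[] {
  const rows: MissingParentRow[] = [];
  for (const project of projects) {
    for (const q of project.AnnotationQuestions) {
      // `== null` is true for both null and undefined (missing) targets.
      if (!q.System && PARENT_REQUIRED.has(q.Category) && q.Target == null) {
        rows.push({
          projectName: project.Name,
          category: q.Category,
          question: q.Question,
          createdDate: q.DateTimeCreated,
        });
      }
    }
  }
  return rows.sort(
    (a, b) =>
      compareStrings(a.category, b.category) ||
      compareStrings(a.projectName, b.projectName)
  );
}

// Made-up documents for illustration.
const sampleProjects: StoredProject[] = [
  { Name: 'Project B', AnnotationQuestions: [
    { System: false, Category: 'Experiment', Question: 'q1', Target: null },
    { System: true, Category: 'Experiment', Question: 'system q', Target: null },
    { System: false, Category: 'Study', Question: 'root q', Target: null },
  ] },
  { Name: 'Project A', AnnotationQuestions: [
    { System: false, Category: 'Experiment', Question: 'q2' },
    { System: false, Category: 'Cohort', Question: 'q3', Target: { ParentId: 'x' } },
    { System: false, Category: 'Disease Model Induction', Question: 'q4', Target: null },
  ] },
];
const missingParentRows = findQuestionsWithoutParent(sampleProjects);
```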