Annotation Questions Formal Specification

Overview

This document provides a formal specification of annotation question business rules with precise definitions, cross-language validation strategies, and referential integrity requirements. It addresses the need for a single source of truth that can be enforced consistently across frontend (TypeScript), backend (C#), and database seeding.

Related documents:

Annotation Questions Business Logic - Business context
Architecture Analysis - Architectural options
Category Question Structure - Category details
Question Hierarchy Diagrams - Visual representations

1. Terminology Glossary

Entity Definitions

Term	Definition	Example
Annotation Question	A form field definition that collects data from reviewers during systematic review annotation	"What was the sample size?"
Annotation	A recorded answer to an annotation question for a specific study	Answer: "42" for study ABC
Unit	A named instance created by answering a Label Question	"Aspirin 100mg" (a Treatment unit)
Category	A logical grouping of annotation questions by experimental domain	"Treatment", "Outcome Assessment"
Stage	A review workflow phase where questions are presented. Has an `Extraction` boolean that controls whether system questions (data extraction questions) are shown.	"Screening" (`extraction=false`), "Data Extraction" (`extraction=true`)
ExtractionInfo	Container within a Study that holds all annotations, sessions, and outcome data	`Study.ExtractionInfo.Annotations`
OutcomeData	Quantitative data extracted from studies, stored per Experiment-Cohort-Outcome combination. Contains TimePoints, statistical metadata, and unit references.	`Study.ExtractionInfo.OutcomeData`
TimePoint	A measurement at a specific time, containing `time`, `average` (mean/median), and `error` (SD/SEM/IQR) values	`{ time: 24, average: 45.3, error: 12.1 }`

Question Types

Term	Definition	Identifying Property
System Question	Pre-defined, immutable question essential for domain structure	`system: true`
Custom Question	User-created question to capture project-specific data	`system: false`
Label Question	System question that creates named unit instances	`labelQuestion: true`
Control Question	Boolean question indicating whether a unit is a "control"	`questionType: boolean`, specific GUIDs
Lookup Question	Question that cross-references units from another category	`annotationLookup: true`
Root Question	A question with no parent (top-level in category)	`target: null` OR `target.parentId: null`
Child Question	A question nested under a parent question	`target.parentId: {guid}`
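
The identifying properties in the table can be read as a small classification routine. The following is a minimal TypeScript sketch, assuming a simplified question shape; `QuestionSketch`, `QuestionKind`, and `classifyQuestion` are illustrative names, not types from the codebase. Control Questions are left out because the table identifies them by specific GUIDs rather than by a flag.

```typescript
// Hypothetical, simplified shape: only the identifying properties from the table.
interface QuestionSketch {
  system: boolean;
  labelQuestion?: boolean;
  annotationLookup?: boolean;
  target: { parentId: string | null } | null;
}

type QuestionKind = "system" | "custom" | "label" | "lookup" | "root" | "child";

// Returns every term from the table that applies to the given question.
function classifyQuestion(q: QuestionSketch): QuestionKind[] {
  const kinds: QuestionKind[] = [q.system ? "system" : "custom"];
  if (q.labelQuestion === true) kinds.push("label");
  if (q.annotationLookup === true) kinds.push("lookup");
  // Root: `target: null` OR `target.parentId: null`; otherwise the question is a child.
  const isRoot = q.target === null || q.target.parentId === null;
  kinds.push(isRoot ? "root" : "child");
  return kinds;
}
```

A question can carry several terms at once, which is why the sketch returns a list rather than a single kind.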

Relationship Terms

Term	Definition
Parent Question	The question under which a child question is nested
Conditional Parent	A parent question whose answer value determines whether a child question is displayed
Lookup Source	The Label Question whose annotations populate a Lookup Question
Lookup Target	The Lookup Question that displays annotations from a source

2. Formal Rule Specification

2.1 Category Placement Rules

These rules define WHERE custom questions can be placed within each category's hierarchy.

Key Concept: Custom questions form trees within each category. The rules below apply to first-level custom questions (direct children of system questions). Nested custom questions (children of other custom questions) reference their parent custom question instead of the system question.

# annotation-question-rules.yaml
# Single source of truth for category placement rules

categories:
  Study:
    # Study is unique: no system parent, custom questions are true root-level
    customQuestionPlacement:
      type: "root"
      constraint: "First-level custom questions MUST have target = null or target.parentId = null"
      systemParent: null
      nestedQuestions:
        allowed: true
        parentTo: "any custom question in Study category"
    hasLabelQuestion: false
    hasControlQuestion: false

  "Disease Model Induction":
    customQuestionPlacement:
      type: "control-descendant"  # descendants, not just direct children
      constraint: "First-level questions parent to modelControl; nested parent to custom questions"
      systemParent: "b18aa936-a4c6-446b-ac98-88ac38930878"  # modelControl
      nestedQuestions:
        allowed: true
        parentTo: "any custom question in Disease Model Induction category"
    controlParameterRule:
      type: "required-for-first-level"
      constraint: "First-level questions MUST specify conditionalParentAnswers (true/false/null)"
      allowedValues: [true, false, null]
      semantics:
        true: "Show only when parent checkbox is CHECKED (is control procedure)"
        false: "Show only when parent checkbox is UNCHECKED (is not control)"
        null: "Show regardless of parent checkbox value"
    hasLabelQuestion: true
    labelQuestionGuid: "bdb6e257-5a08-42ef-aad0-829668679b0e"
    hasControlQuestion: true
    controlQuestionGuid: "b18aa936-a4c6-446b-ac98-88ac38930878"

  Treatment:
    customQuestionPlacement:
      type: "control-descendant"
      constraint: "First-level questions parent to treatmentControl; nested parent to custom questions"
      systemParent: "d04ec2d7-3e10-4847-9999-befe7ee4c454"  # treatmentControl
      nestedQuestions:
        allowed: true
        parentTo: "any custom question in Treatment category"
    controlParameterRule:
      type: "required-for-first-level"
      constraint: "First-level questions MUST specify conditionalParentAnswers (true/false/null)"
      allowedValues: [true, false, null]
      semantics:
        true: "Show only when parent checkbox is CHECKED (is control procedure)"
        false: "Show only when parent checkbox is UNCHECKED (is not control)"
        null: "Show regardless of parent checkbox value"
    hasLabelQuestion: true
    labelQuestionGuid: "b02e3072-74f0-44e0-a468-f472b3b09991"
    hasControlQuestion: true
    controlQuestionGuid: "d04ec2d7-3e10-4847-9999-befe7ee4c454"

  "Outcome Assessment":
    customQuestionPlacement:
      type: "label-descendant"
      constraint: "First-level questions parent to outcomeLabel; nested parent to custom questions"
      systemParent: "dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f"  # outcomeLabel
      nestedQuestions:
        allowed: true
        parentTo: "any custom question in Outcome Assessment category"
    controlParameterRule:
      type: "forbidden"
      constraint: "MUST NOT specify control-based conditionalParentAnswers"
    hasLabelQuestion: true
    labelQuestionGuid: "dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f"
    hasControlQuestion: false

  Cohort:
    customQuestionPlacement:
      type: "label-descendant"
      constraint: "First-level questions parent to cohortLabel; nested parent to custom questions"
      systemParent: "62c852ad-3390-48a4-ac13-439bf6b6587f"  # cohortLabel
      nestedQuestions:
        allowed: true
        parentTo: "any custom question in Cohort category"
    controlParameterRule:
      type: "forbidden"
      constraint: "MUST NOT specify control-based conditionalParentAnswers"
    hasLabelQuestion: true
    labelQuestionGuid: "62c852ad-3390-48a4-ac13-439bf6b6587f"
    hasControlQuestion: false

  Experiment:
    customQuestionPlacement:
      type: "label-descendant"
      constraint: "First-level questions parent to experimentLabel; nested parent to custom questions"
      systemParent: "7c555b6e-1fb6-4036-9982-c09a5db82ace"  # experimentLabel
      nestedQuestions:
        allowed: true
        parentTo: "any custom question in Experiment category"
    controlParameterRule:
      type: "forbidden"
      constraint: "MUST NOT specify control-based conditionalParentAnswers"
    hasLabelQuestion: true
    labelQuestionGuid: "7c555b6e-1fb6-4036-9982-c09a5db82ace"
    hasControlQuestion: false

  Hidden:
    customQuestionPlacement:
      type: "none"
      constraint: "Custom questions NOT ALLOWED in Hidden category"
      nestedQuestions:
        allowed: false
    hasLabelQuestion: true
    labelQuestionGuid: "7ee21ff9-e309-4387-8d30-719201497682"
    hasControlQuestion: false
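
To illustrate how a consumer might use these rules, here is a minimal TypeScript sketch of the same table with a lookup for the required first-level parent. The GUIDs and placement types are copied from the YAML above; the TypeScript shape, `PLACEMENT_RULES`, and `firstLevelParentFor` are assumptions made for illustration, not existing code.

```typescript
// Sketch only: the shape is an assumption; values are copied from the YAML rules.
type PlacementType = "root" | "control-descendant" | "label-descendant" | "none";

interface CategoryPlacementRule {
  type: PlacementType;
  systemParent: string | null; // required parent GUID for first-level questions
  controlParameter: "required" | "forbidden" | "not-applicable";
}

const PLACEMENT_RULES: { [category: string]: CategoryPlacementRule | undefined } = {
  "Study": { type: "root", systemParent: null, controlParameter: "not-applicable" },
  "Disease Model Induction": {
    type: "control-descendant",
    systemParent: "b18aa936-a4c6-446b-ac98-88ac38930878", // modelControl
    controlParameter: "required",
  },
  "Treatment": {
    type: "control-descendant",
    systemParent: "d04ec2d7-3e10-4847-9999-befe7ee4c454", // treatmentControl
    controlParameter: "required",
  },
  "Outcome Assessment": {
    type: "label-descendant",
    systemParent: "dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f", // outcomeLabel
    controlParameter: "forbidden",
  },
  "Cohort": {
    type: "label-descendant",
    systemParent: "62c852ad-3390-48a4-ac13-439bf6b6587f", // cohortLabel
    controlParameter: "forbidden",
  },
  "Experiment": {
    type: "label-descendant",
    systemParent: "7c555b6e-1fb6-4036-9982-c09a5db82ace", // experimentLabel
    controlParameter: "forbidden",
  },
  "Hidden": { type: "none", systemParent: null, controlParameter: "not-applicable" },
};

// Returns the parentId a first-level custom question must use (null for Study).
// Throws for unknown categories and for categories that forbid custom questions.
function firstLevelParentFor(category: string): string | null {
  const rule = PLACEMENT_RULES[category];
  if (rule === undefined) throw new Error(`Unknown category: ${category}`);
  if (rule.type === "none") throw new Error(`Custom questions are not allowed in ${category}`);
  return rule.systemParent;
}
```

Keeping the rules in one data table means the lookup never needs per-category branches, which is the property a generated validator would also rely on.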

2.2 Custom Question Placement Rules (Precise)

CRITICAL DISTINCTION: First-Level vs Nested Custom Questions

Custom questions in each category form trees that must be rooted at specific system questions. The rules differ depending on whether you're creating a first-level custom question (direct child of system question) or a nested custom question (child of another custom question).

┌─────────────────────────────────────────────────────────────────────────────┐
│                    CUSTOM QUESTION TREE STRUCTURE                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  System Question (e.g., Treatment Control)                                   │
│       │                                                                      │
│       ├── First-Level Custom Question A  ← Created via category header button│
│       │       │                                                              │
│       │       ├── Nested Custom Question A1  ← Created via "Add Related"    │
│       │       │       │                                                      │
│       │       │       └── Deeply Nested A1a  ← Created via "Add Related"    │
│       │       │                                                              │
│       │       └── Nested Custom Question A2                                  │
│       │                                                                      │
│       └── First-Level Custom Question B                                      │
│               │                                                              │
│               └── Nested Custom Question B1                                  │
│                                                                              │
│  KEY INSIGHT: Custom questions can nest ARBITRARILY DEEP within a category. │
│  The only constraint is that the ROOT of each tree is a system question.     │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Rule Summary:

First-level custom questions (created via category header "Add Question" button): target.parentId MUST equal the category's system question GUID
Nested custom questions (created via "Add Related" button on any custom question): target.parentId equals the parent custom question's ID - NOT the system question GUID
All custom questions must be descendants of their category's system question (transitively through the parent chain), but only first-level questions directly reference the system question

╔═══════════════════════════════════════════════════════════════════════════════╗
║                    FIRST-LEVEL CUSTOM QUESTION RULES                           ║
║          (Questions created via category header "Add Question" button)         ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  CATEGORY: Study                                                               ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  RULE: target MUST be null OR target.parentId MUST be null                     ║
║  WHY:  Study has no system parent - custom questions are root-level            ║
║                                                                                ║
║  VALID:   { target: null }                                                     ║
║  VALID:   { target: { parentId: null } }                                       ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  CATEGORY: Disease Model Induction                                             ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  FIRST-LEVEL RULE:                                                             ║
║    target.parentId MUST equal "b18aa936-a4c6-446b-ac98-88ac38930878"           ║
║    (modelControl GUID)                                                         ║
║    AND conditionalParentAnswers specifies control visibility                   ║
║                                                                                ║
║  VALID:   { target: { parentId: "b18aa936...", conditionalParentAnswers:       ║
║              { conditionType: 0, targetParentBoolean: true } } }  # Control    ║
║  VALID:   { target: { parentId: "b18aa936...", conditionalParentAnswers:       ║
║              { conditionType: 0, targetParentBoolean: false } } } # Non-control║
║  VALID:   { target: { parentId: "b18aa936...", conditionalParentAnswers:       ║
║              null } }  # Both                                                  ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  CATEGORY: Treatment                                                           ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  FIRST-LEVEL RULE:                                                             ║
║    target.parentId MUST equal "d04ec2d7-3e10-4847-9999-befe7ee4c454"           ║
║    (treatmentControl GUID)                                                     ║
║    AND conditionalParentAnswers specifies control visibility                   ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  CATEGORY: Outcome Assessment                                                  ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  FIRST-LEVEL RULE:                                                             ║
║    target.parentId MUST equal "dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f"           ║
║    (outcomeLabel GUID)                                                         ║
║    AND conditionalParentAnswers MUST be null (no control parameter)            ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  CATEGORY: Cohort                                                              ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  FIRST-LEVEL RULE:                                                             ║
║    target.parentId MUST equal "62c852ad-3390-48a4-ac13-439bf6b6587f"           ║
║    (cohortLabel GUID)                                                          ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  CATEGORY: Experiment                                                          ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  FIRST-LEVEL RULE:                                                             ║
║    target.parentId MUST equal "7c555b6e-1fb6-4036-9982-c09a5db82ace"           ║
║    (experimentLabel GUID)                                                      ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  CATEGORY: Hidden                                                              ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  RULE: Custom questions NOT ALLOWED                                            ║
║                                                                                ║
╚═══════════════════════════════════════════════════════════════════════════════╝
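
The first-level rules above can be checked mechanically. The following is a hedged TypeScript sketch of such a check; `TargetSketch`, `ConditionalAnswersSketch`, and `validateFirstLevelTarget` are hypothetical names, and the shapes mirror only the fields shown in the VALID examples. For the label-rooted categories it applies the "no control parameter" rule to Cohort and Experiment as well, following the `forbidden` entries in section 2.1.

```typescript
// Hypothetical shapes that mirror the fields used in the VALID examples.
interface ConditionalAnswersSketch { conditionType: number; targetParentBoolean: boolean; }
interface TargetSketch {
  parentId: string | null;
  conditionalParentAnswers?: ConditionalAnswersSketch | null;
}

// Required first-level parents, copied from the rules above.
const CONTROL_PARENT: { [category: string]: string | undefined } = {
  "Disease Model Induction": "b18aa936-a4c6-446b-ac98-88ac38930878",
  "Treatment": "d04ec2d7-3e10-4847-9999-befe7ee4c454",
};
const LABEL_PARENT: { [category: string]: string | undefined } = {
  "Outcome Assessment": "dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f",
  "Cohort": "62c852ad-3390-48a4-ac13-439bf6b6587f",
  "Experiment": "7c555b6e-1fb6-4036-9982-c09a5db82ace",
};

// Returns a list of rule violations; an empty list means the target is valid.
function validateFirstLevelTarget(category: string, target: TargetSketch | null): string[] {
  if (category === "Hidden") return ["Custom questions are not allowed in Hidden"];
  if (category === "Study") {
    const isRoot = target === null || target.parentId === null;
    return isRoot ? [] : ["Study questions must be root-level (no parentId)"];
  }
  const controlParent = CONTROL_PARENT[category];
  const labelParent = LABEL_PARENT[category];
  const expected = controlParent !== undefined ? controlParent : labelParent;
  if (expected === undefined) return [`Unknown category: ${category}`];

  const errors: string[] = [];
  if (target === null || target.parentId !== expected) {
    errors.push(`target.parentId must equal ${expected}`);
  }
  // Control categories accept true, false, or null, so only label categories are restricted.
  const conditional = target === null ? null : target.conditionalParentAnswers;
  if (labelParent !== undefined && conditional !== null && conditional !== undefined) {
    errors.push("conditionalParentAnswers must be null in label-rooted categories");
  }
  return errors;
}
```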

2.3 Nested Custom Question Rules

Nested custom questions (created via the "Add Related" button on an existing custom question) follow a different set of rules:

╔═══════════════════════════════════════════════════════════════════════════════╗
║  CRITICAL: NESTED QUESTIONS DO NOT PARENT TO SYSTEM QUESTIONS                 ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  When a custom question is created via "Add Related" on another custom        ║
║  question, it parents DIRECTLY to that custom question.                       ║
║                                                                                ║
║  The "required system parent" rules in section 2.2 DO NOT APPLY to nested     ║
║  questions. Those rules only apply to FIRST-LEVEL custom questions.           ║
║                                                                                ║
║  Example (Treatment category):                                                 ║
║  ─────────────────────────────                                                 ║
║  • First-level: target.parentId = treatmentControl (system GUID)              ║
║  • Nested:      target.parentId = {custom-question-guid} (NOT system GUID)    ║
║                                                                                ║
╚═══════════════════════════════════════════════════════════════════════════════╝

# Nested custom question rules
nestedQuestionRules:
  # Key rule: Nested questions parent to CUSTOM QUESTIONS, not system questions
  parentingRule:
    constraint: "target.parentId references the parent custom question's ID (NOT the system question)"
    validation:
      - "Parent question MUST exist"
      - "Parent question MUST be in the SAME category as the child"
      - "Parent question MUST be a custom question (system: false)"
      - "Circular references MUST NOT exist"

  # Category is INHERITED from parent, not specified independently
  categoryInheritance:
    constraint: "Child question's category MUST equal parent question's category"
    reason: "Questions cannot cross category boundaries"

  # Conditional display based on parent answer
  conditionalDisplay:
    forBooleanParent:
      constraint: "conditionalParentAnswers.targetParentBoolean = true | false | null"
      semantics:
        true: "Show when parent checkbox is checked"
        false: "Show when parent checkbox is unchecked"
        null: "Show always regardless of parent answer"

    forDropdownParent:
      constraint: "conditionalParentAnswers.targetParentOptions = [option-values]"
      semantics: "Show when parent answer matches one of the specified options"

    forTextParent:
      constraint: "conditionalParentAnswers typically null or omitted"
      semantics: "Text questions generally don't support conditional children"
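
The parenting and category-inheritance rules above lend themselves to a simple parent-chain walk. Below is a minimal TypeScript sketch under stated assumptions: `NestedQuestionSketch` is a flattened, hypothetical shape (the real model stores the parent under `target.parentId`), and `validateNestedParent` is an illustrative name.

```typescript
// Hypothetical flattened shape; parentId stands in for target.parentId.
interface NestedQuestionSketch {
  id: string;
  system: boolean;
  category: string;
  parentId: string | null;
}

type QuestionIndex = { [id: string]: NestedQuestionSketch | undefined };

// Checks a nested custom question against the parenting rules.
// Returns a list of violations; an empty list means the question is valid.
function validateNestedParent(child: NestedQuestionSketch, byId: QuestionIndex): string[] {
  if (child.parentId === null) return ["Nested question must reference a parent"];
  const parent = byId[child.parentId];
  if (parent === undefined) return ["Parent question does not exist"];

  const errors: string[] = [];
  if (parent.system) errors.push("Parent must be a custom question");
  if (parent.category !== child.category) {
    errors.push("Child category must match parent category");
  }

  // Walk up the parent chain; revisiting an id means the chain is circular.
  const seen: { [id: string]: boolean } = {};
  seen[child.id] = true;
  let current: NestedQuestionSketch | undefined = parent;
  while (current !== undefined) {
    if (seen[current.id]) {
      errors.push("Circular reference detected");
      break;
    }
    seen[current.id] = true;
    current = current.parentId === null ? undefined : byId[current.parentId];
  }
  return errors;
}
```

The walk stops as soon as a parent is missing from the index, so a first-level ancestor that points at a system question outside the index ends the loop without an error.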

Frontend Implementation Reference:

The dialog data structure in create-question.component.ts controls this behavior:

// CreateQuestionDialogData interface (lines 87-91)
export interface CreateQuestionDialogData {
  parentQuestion: IAnnotationQuestion | null;  // <-- KEY: null = first-level, non-null = nested
  category: Category | null;                    // Only needed when parentQuestion is null
  control: boolean | null;                      // Only for Treatment/ModelInduction first-level
}

Parent Assignment Logic (from getCreateQuestion() method, lines 370-402):

const target: Target | null = this.data.parentQuestion
  ? createTarget(
      this.data.parentQuestion.id,  // ← Nested: parent is the selected custom question
      // ... conditional logic based on parent's question type
    )
  : category === categories.cohort  // ← First-level: use category-specific system parent
      ? createTarget(systemAnnotationQuestionGuids.cohortLabel)
      : category === categories.modelInduction
        ? createTarget(systemAnnotationQuestionGuids.modelControl, this.data.control)
        : category === categories.treatment
          ? createTarget(systemAnnotationQuestionGuids.treatmentControl, this.data.control)
        // ... other categories

2.4 Frontend UI Validation Rules

The following validation rules are enforced in the frontend create-question.component.ts:

# Frontend UI validation rules
# Source: create-question.component.ts (lines 742-779, 580-712)

fieldValidation:
  questionText:
    maxLength: 80  # Line 745: Validators.maxLength(80)
    required: true
    errorMessage: "Please enter a maximum of 80 characters"

  description:
    required: false
    maxLength: null  # No explicit limit

  options:
    required: "When controlType is dropdown, radio, checklist, or autocomplete"
    unique: true  # Lines 640-712: duplicateValidator
    errorMessages:
      empty: "Required. Please enter at least one value"
      duplicate: "Duplicate values. Please enter unique values"
      duplicateFilter: "Same option cannot both be always shown and shown for filtered options"

  numericOptions:
    # When questionType is 'integer' or 'decimal', options must be valid numbers
    integer:
      constraint: "All option values must be parseable as integers"
      errorMessage: "Please enter an integer value"
    decimal:
      constraint: "All option values must be parseable as numbers"
      errorMessage: "Please enter a number"

controlTypeBehavior:
  checkbox:
    # Lines 481-492: Boolean questions become checkbox with multiple=false
    questionType: "boolean"
    multipleForced: false
    optionalForced: false  # Lines 421-423: checkbox questions are always required
    defaultStatus: "Unchecked"  # Line 770

  dropdown:
    requiresOptions: true  # Lines 184-199: groupValidate
    supportsMultiple: true
    supportsConditionalChildren: true

  radio:
    requiresOptions: true
    multipleForced: false  # Radio implies single selection
    supportsConditionalChildren: true

  autocomplete:
    requiresOptions: true
    supportsMultiple: true
    supportsConditionalChildren: true

  checklist:
    requiresOptions: true
    supportsMultiple: true
    supportsConditionalChildren: true

  textbox:
    requiresOptions: false
    supportsConditionalChildren: false  # Lines 337-338: typically no conditional

categoryInheritance:
  # Lines 748-751: Category is inherited from parent for nested questions
  nested:
    constraint: "Category is inherited from parent and disabled (read-only)"
    source: "this.parentQuestion.category"
  firstLevel:
    constraint: "Category is provided via dialog data"
    source: "this.data.category"

conditionalParentAnswers:
  # Lines 723-740: validateConditionalParentAnswers
  schemaVersion0:
    type: "string | null"
    constraint: "Must be null or match one of parent's answers"
  schemaVersion1Plus:
    type: "string[]"
    constraint: "Must contain at least one answer when conditional is true"

2.5 Stage Context Rules

The Stage.Extraction boolean is a critical control that determines which questions are available for annotation in a given stage.

The Master Switch: `Stage.Extraction`

# Stage extraction rules
stage:
  extraction:
    type: boolean
    default: false
    semantics:
      true: "Enable data extraction - show system questions + custom questions"
      false: "Disable data extraction - show only custom questions"

  availableQuestions:
    rule: |
      IF stage.extraction = true:
        questions = SystemQuestionIds ∪ stage.annotationQuestions
      ELSE:
        questions = stage.annotationQuestions

Backend Implementation

Location: Stage.cs:93-96

public ImmutableHashSet<Guid> AllStageAnnotationQuestions =>
    ImmutableHashSet.CreateRange(AnnotationQuestions).Union(Extraction
        ? AnnotationQuestion.SystemQuestionIds
        : ImmutableHashSet<Guid>.Empty);

Frontend Implementation

Location: annotation-question.selectors.ts:133-154

// Parallel logic in NgRx selector
return stage.extraction
  ? _.union(sysAnnotationQuestions, stageQuestions)
  : stageQuestions;

Category Visibility Rules

Category	`extraction=false`	`extraction=true`
Study	Custom questions only	Custom questions only
Disease Model Induction	Hidden	Label + Control + Custom
Treatment	Hidden	Label + Control + Custom
Outcome Assessment	Hidden	Label + Outcome Qs + Custom
Cohort	Hidden	Label + Lookups + Custom
Experiment	Hidden	Label + Lookups + Custom
Hidden	Hidden	System questions only

Validation Rule

stageContextValidation:
  rule: |
    WHEN validating an AnnotationQuestion for a Stage:
      IF question.system = true:
        REQUIRE stage.extraction = true
        REASON: "System questions are only available when data extraction is enabled"
      IF question.category IN ["Disease Model Induction", "Treatment", "Outcome Assessment", "Cohort", "Experiment"]:
        REQUIRE stage.extraction = true
        REASON: "Unit-based categories are only available when data extraction is enabled"
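
The validation rule above translates almost line for line into code. This is a minimal TypeScript sketch, assuming simplified `StageSketch` and `StageQuestionSketch` shapes (both hypothetical); the reason strings are taken from the rule.

```typescript
// Hypothetical minimal shapes for the rule's inputs.
interface StageSketch { extraction: boolean; }
interface StageQuestionSketch { system: boolean; category: string; }

// Categories that the rule treats as unit-based.
const UNIT_BASED_CATEGORIES: string[] = [
  "Disease Model Induction",
  "Treatment",
  "Outcome Assessment",
  "Cohort",
  "Experiment",
];

// Returns the reasons a question is not allowed in the stage (empty when allowed).
function validateQuestionForStage(q: StageQuestionSketch, stage: StageSketch): string[] {
  // Both requirements are satisfied as soon as extraction is enabled.
  if (stage.extraction) return [];
  const errors: string[] = [];
  if (q.system) {
    errors.push("System questions are only available when data extraction is enabled");
  }
  if (UNIT_BASED_CATEGORIES.indexOf(q.category) !== -1) {
    errors.push("Unit-based categories are only available when data extraction is enabled");
  }
  return errors;
}
```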

2.6 OutcomeData Field Mapping

This section documents how system questions populate OutcomeData fields. Understanding this mapping is essential for data extraction workflows.

Field Mapping Table

OutcomeData Field	Source System Question	Category	Question GUID / Reference
`units`	Outcome Units	Outcome Assessment	`66eb1736-a838-4692-a78b-96b0671a377c`
`averageType`	Average Type	Outcome Assessment	`3a287115-5000-4d3f-8c41-7c46fae9adcf`
`errorType`	Error Type	Outcome Assessment	`8dbea59f-54d2-4e41-87e7-fde9e73a72d5`
`greaterIsWorse`	Greater Is Worse	Outcome Assessment	`45351e04-47b2-4785-9a72-713284e917b8`
`numberOfAnimals`	Number of Animals	Cohort	`83caa64f-86a1-4f6e-a278-ebbd25297677`
`graphId`	PDF Graphs (lookup)	Outcome Assessment	`016278e8-7e60-40d4-9568-d7fa42670c32`
`experimentId`	(from annotation form)	Experiment	Links to Experiment unit
`cohortId`	(from annotation form)	Cohort	Links to Cohort unit
`outcomeId`	(from annotation form)	Outcome Assessment	Links to Outcome unit

Note: numberOfAnimals comes from the Cohort category, not Outcome Assessment. This allows different animal numbers per cohort.

TimePoint Field Semantics

Field	Meaning	Interpretation
`time`	Time point of measurement	Units depend on study (e.g., 24 = 24 hours post-treatment)
`average`	Central tendency value	Mean if `averageType = "Mean"`, Median if `averageType = "Median"`
`error`	Variability measure	SD/SEM if `averageType = "Mean"`, IQR if `averageType = "Median"`

Formal Mapping Rule

outcomeDataFieldMapping:
  rule: |
    FOR each OutcomeData entry:
      # Unit references (from annotation linking)
      outcomeData.experimentId = annotationForm.selectedExperiment.id
      outcomeData.cohortId = annotationForm.selectedCohort.id
      outcomeData.outcomeId = annotationForm.selectedOutcome.id

      # Statistical metadata (from system question answers)
      outcomeData.units = answerFor(GUID: "66eb1736-a838-4692-a78b-96b0671a377c")
      outcomeData.averageType = answerFor(GUID: "3a287115-5000-4d3f-8c41-7c46fae9adcf")
      outcomeData.errorType = answerFor(GUID: "8dbea59f-54d2-4e41-87e7-fde9e73a72d5")
      outcomeData.greaterIsWorse = answerFor(GUID: "45351e04-47b2-4785-9a72-713284e917b8")
      outcomeData.numberOfAnimals = answerFor(GUID: "83caa64f-86a1-4f6e-a278-ebbd25297677")
      outcomeData.graphId = answerFor(GUID: "016278e8-7e60-40d4-9568-d7fa42670c32")  # Optional

      # TimePoints (entered directly in OutcomeData UI)
      outcomeData.timePoints = userEnteredTimePoints[]
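
The mapping rule can be sketched as a single function that assembles an OutcomeData entry from the selected units, the system question answers, and the entered time points. The TypeScript below is illustrative only: `buildOutcomeData`, `UnitSelectionSketch`, `TimePointSketch`, and the answers-by-GUID lookup are assumptions, and answer values are left loosely typed because the source does not state their types. The GUIDs come from the field mapping table.

```typescript
type AnswerValue = string | number | boolean;

interface TimePointSketch { time: number; average: number; error: number; }
interface UnitSelectionSketch { experimentId: string; cohortId: string; outcomeId: string; }

// GUIDs copied from the field mapping table.
const OUTCOME_FIELD_GUIDS = {
  units: "66eb1736-a838-4692-a78b-96b0671a377c",
  averageType: "3a287115-5000-4d3f-8c41-7c46fae9adcf",
  errorType: "8dbea59f-54d2-4e41-87e7-fde9e73a72d5",
  greaterIsWorse: "45351e04-47b2-4785-9a72-713284e917b8",
  numberOfAnimals: "83caa64f-86a1-4f6e-a278-ebbd25297677",
  graphId: "016278e8-7e60-40d4-9568-d7fa42670c32",
};

// Assembles one OutcomeData entry following the formal mapping rule.
function buildOutcomeData(
  selection: UnitSelectionSketch,
  answers: { [guid: string]: AnswerValue | undefined },
  timePoints: TimePointSketch[],
) {
  return {
    // Unit references (from annotation linking)
    experimentId: selection.experimentId,
    cohortId: selection.cohortId,
    outcomeId: selection.outcomeId,
    // Statistical metadata (from system question answers)
    units: answers[OUTCOME_FIELD_GUIDS.units],
    averageType: answers[OUTCOME_FIELD_GUIDS.averageType],
    errorType: answers[OUTCOME_FIELD_GUIDS.errorType],
    greaterIsWorse: answers[OUTCOME_FIELD_GUIDS.greaterIsWorse],
    numberOfAnimals: answers[OUTCOME_FIELD_GUIDS.numberOfAnimals],
    graphId: answers[OUTCOME_FIELD_GUIDS.graphId], // optional; undefined when unanswered
    // TimePoints (entered directly by the user); copied so the input is not shared
    timePoints: timePoints.slice(),
  };
}
```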

3. Cross-Language Validation Strategies

3.1 The Challenge

┌─────────────────────────────────────────────────────────────────────────────┐
│                    THE CROSS-LANGUAGE CHALLENGE                              │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  FRONTEND (TypeScript)                 BACKEND (C#)                         │
│  ══════════════════════               ════════════                          │
│  • Pre-validation for UX               • Authoritative validation            │
│  • Immediate feedback                  • Database seeding                    │
│  • UI constraints                      • API enforcement                     │
│                                                                              │
│                    ┌─────────────────────────┐                               │
│                    │   RULES MUST BE         │                               │
│                    │   IDENTICAL             │                               │
│                    │   IN BOTH LANGUAGES     │                               │
│                    └─────────────────────────┘                               │
│                                                                              │
│  OPTIONS:                                                                    │
│  ────────                                                                    │
│  1. Schema-First (JSON Schema) - Generate validators for both languages     │
│  2. TypeSpec - Microsoft's API design language with code generation         │
│  3. JSON Typedef (JTD) - Portable schemas with multi-language codegen       │
│  4. Manual Sync - Keep rules in both places manually (error-prone)          │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

3.2 Recommended: JSON Schema with Code Generation

Why JSON Schema?

Industry standard with excellent tooling
NJsonSchema for C# (generate validators and DTOs)
AJV for TypeScript (fast validation)
Human-readable as documentation
Supports custom validation keywords for complex rules

Implementation Architecture:

┌─────────────────────────────────────────────────────────────────────────────┐
│                    JSON SCHEMA CODE GENERATION PIPELINE                      │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│                   ┌──────────────────────────────────┐                       │
│                   │  annotation-question.schema.json │ ◄── Single Source     │
│                   │  ─────────────────────────────── │     of Truth          │
│                   │  • Type definitions               │                       │
│                   │  • Category rules                 │                       │
│                   │  • GUID constants                 │                       │
│                   │  • Validation constraints         │                       │
│                   └───────────────┬──────────────────┘                       │
│                                   │                                          │
│               ┌───────────────────┼───────────────────┐                      │
│               │                   │                   │                      │
│               ▼                   ▼                   ▼                      │
│  ┌────────────────────┐ ┌────────────────────┐ ┌────────────────────┐       │
│  │ C# Generation      │ │ TypeScript Gen     │ │ Documentation      │       │
│  │ (NJsonSchema)      │ │ (json-schema-to-ts)│ │ (docgen)           │       │
│  ├────────────────────┤ ├────────────────────┤ ├────────────────────┤       │
│  │ • DTOs             │ │ • Type interfaces  │ │ • Markdown docs    │       │
│  │ • Validators       │ │ • Zod/Yup schemas  │ │ • Rule tables      │       │
│  │ • Constants        │ │ • Constants        │ │ • Examples         │       │
│  └────────────────────┘ └────────────────────┘ └────────────────────┘       │
│               │                   │                   │                      │
│               ▼                   ▼                   ▼                      │
│  ┌────────────────────┐ ┌────────────────────┐ ┌────────────────────┐       │
│  │ SyRF.Validation    │ │ @syrf/validation   │ │ /docs/generated    │       │
│  │ .Generated.dll     │ │ npm package        │ │ rules.md           │       │
│  └────────────────────┘ └────────────────────┘ └────────────────────┘       │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
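
The generation step in the diagram can be sketched as a pure function that turns parsed rules into TypeScript source text. This is only an illustration under stated assumptions: `RuleInput` and `emitCategoryRules` are hypothetical names, the rules are passed in as an already-parsed object rather than read from `annotation-question.schema.json`, and the real generators named in the diagram are not invoked.

```typescript
// Hypothetical input shape: one entry per category, already parsed from the schema.
interface RuleInput {
  firstLevelParent: string | null;
  controlParameterRequired: boolean;
}

// Emits the source text of a `categoryRules` constant.
// JSON.stringify takes care of quoting keys such as "Disease Model Induction".
function emitCategoryRules(rules: Record<string, RuleInput>): string {
  const entries = Object.entries(rules).map(
    ([category, rule]) =>
      `  ${JSON.stringify(category)}: {\n` +
      `    firstLevelParent: ${JSON.stringify(rule.firstLevelParent)},\n` +
      `    controlParameterRequired: ${rule.controlParameterRequired},\n` +
      `  },`
  );
  return [
    "// Auto-generated - DO NOT EDIT",
    "export const categoryRules = {",
    ...entries,
    "} as const;",
    "",
  ].join("\n");
}

// Two of the six categories, using the GUIDs from this specification.
const generatedSource = emitCategoryRules({
  Study: { firstLevelParent: null, controlParameterRequired: false },
  Treatment: {
    firstLevelParent: "d04ec2d7-3e10-4847-9999-befe7ee4c454",
    controlParameterRequired: true,
  },
});
```

Because the function only builds a string, the same rule object could feed a second emitter for C# constants, which is what keeps the two languages in step.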

3.3 Schema Example¶

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://syrf.org/schemas/annotation-question.json",
  "title": "AnnotationQuestion",
  "description": "Schema for annotation question validation",

  "definitions": {
    "SystemQuestionGuid": {
      "type": "string",
      "enum": [
        "bdb6e257-5a08-42ef-aad0-829668679b0e",
        "b18aa936-a4c6-446b-ac98-88ac38930878",
        "b02e3072-74f0-44e0-a468-f472b3b09991",
        "d04ec2d7-3e10-4847-9999-befe7ee4c454",
        "dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f",
        "62c852ad-3390-48a4-ac13-439bf6b6587f",
        "7c555b6e-1fb6-4036-9982-c09a5db82ace"
      ],
      "x-enum-descriptions": {
        "bdb6e257-5a08-42ef-aad0-829668679b0e": "diseaseModelInductionLabel",
        "b18aa936-a4c6-446b-ac98-88ac38930878": "modelControl",
        "b02e3072-74f0-44e0-a468-f472b3b09991": "treatmentLabel",
        "d04ec2d7-3e10-4847-9999-befe7ee4c454": "treatmentControl",
        "dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f": "outcomeAssessmentLabel",
        "62c852ad-3390-48a4-ac13-439bf6b6587f": "cohortLabel",
        "7c555b6e-1fb6-4036-9982-c09a5db82ace": "experimentLabel"
      }
    },

    "Category": {
      "type": "string",
      "enum": [
        "Study",
        "Disease Model Induction",
        "Treatment",
        "Outcome Assessment",
        "Cohort",
        "Experiment",
        "Hidden"
      ]
    },

    "CategoryRules": {
      "description": "Rules for FIRST-LEVEL custom questions. Nested questions parent to other custom questions instead.",
      "type": "object",
      "properties": {
        "Study": {
          "type": "object",
          "properties": {
            "firstLevelParent": { "type": "null", "description": "Study has no system parent - first-level questions are root" },
            "controlParameterRequired": { "const": false }
          }
        },
        "Disease Model Induction": {
          "type": "object",
          "properties": {
            "firstLevelParent": { "const": "b18aa936-a4c6-446b-ac98-88ac38930878", "description": "First-level parent to modelControl; nested parent to custom questions" },
            "controlParameterRequired": { "const": true, "description": "Only for first-level questions" }
          }
        },
        "Treatment": {
          "type": "object",
          "properties": {
            "firstLevelParent": { "const": "d04ec2d7-3e10-4847-9999-befe7ee4c454", "description": "First-level parent to treatmentControl; nested parent to custom questions" },
            "controlParameterRequired": { "const": true, "description": "Only for first-level questions" }
          }
        },
        "Outcome Assessment": {
          "type": "object",
          "properties": {
            "firstLevelParent": { "const": "dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f", "description": "First-level parent to outcomeAssessmentLabel; nested parent to custom questions" },
            "controlParameterRequired": { "const": false }
          }
        },
        "Cohort": {
          "type": "object",
          "properties": {
            "firstLevelParent": { "const": "62c852ad-3390-48a4-ac13-439bf6b6587f", "description": "First-level parent to cohortLabel; nested parent to custom questions" },
            "controlParameterRequired": { "const": false }
          }
        },
        "Experiment": {
          "type": "object",
          "properties": {
            "firstLevelParent": { "const": "7c555b6e-1fb6-4036-9982-c09a5db82ace", "description": "First-level parent to experimentLabel; nested parent to custom questions" },
            "controlParameterRequired": { "const": false }
          }
        }
      }
    },

    "ConditionalParentAnswers": {
      "oneOf": [
        { "type": "null" },
        {
          "type": "object",
          "properties": {
            "conditionType": {
              "type": "integer",
              "enum": [0, 1],
              "description": "0 = Boolean, 1 = Option"
            },
            "targetParentBoolean": {
              "type": ["boolean", "null"]
            },
            "targetParentOptions": {
              "type": "array",
              "items": { "type": "string" }
            }
          }
        }
      ]
    },

    "Target": {
      "oneOf": [
        { "type": "null" },
        {
          "type": "object",
          "properties": {
            "parentId": {
              "oneOf": [
                { "type": "null" },
                { "type": "string", "format": "uuid" }
              ]
            },
            "conditionalParentAnswers": {
              "$ref": "#/definitions/ConditionalParentAnswers"
            }
          },
          "required": ["parentId"]
        }
      ]
    }
  },

  "type": "object",
  "properties": {
    "id": { "type": "string", "format": "uuid" },
    "category": { "$ref": "#/definitions/Category" },
    "system": { "type": "boolean" },
    "target": { "$ref": "#/definitions/Target" },
    "text": { "type": "string", "minLength": 1 },
    "questionType": { "type": "string" },
    "controlType": { "type": "string" }
  },
  "required": ["category", "text"],

  "x-validation-note": "IMPORTANT: JSON Schema alone cannot fully validate parent rules because nested questions can parent to ANY custom question in the same category. The allOf rules below only validate FIRST-LEVEL questions. Full validation requires runtime checking that the parentId references either (a) the system parent GUID for first-level questions, or (b) a valid custom question in the same category for nested questions. See section 3.4 for complete validation logic.",

  "allOf": [
    {
      "description": "Study: first-level questions must be root (target null)",
      "if": {
        "properties": {
          "system": { "const": false },
          "category": { "const": "Study" }
        }
      },
      "then": {
        "description": "Study allows root questions OR nested questions parenting to custom questions",
        "properties": {
          "target": {
            "oneOf": [
              { "type": "null", "description": "Root question (first-level)" },
              {
                "type": "object",
                "properties": {
                  "parentId": {
                    "oneOf": [
                      { "type": "null", "description": "Root question" },
                      { "type": "string", "format": "uuid", "description": "Nested question - parentId is custom question ID" }
                    ]
                  }
                }
              }
            ]
          }
        }
      }
    },
    {
      "description": "Disease Model Induction: first-level questions parent to modelControl, nested parent to custom questions",
      "if": {
        "properties": {
          "system": { "const": false },
          "category": { "const": "Disease Model Induction" }
        }
      },
      "then": {
        "properties": {
          "target": {
            "type": "object",
            "properties": {
              "parentId": {
                "type": "string",
                "format": "uuid",
                "description": "FIRST-LEVEL: must be b18aa936-... (modelControl). NESTED: can be any custom question ID. Runtime validation required."
              }
            },
            "required": ["parentId"]
          }
        },
        "required": ["target"]
      }
    }
  ],

  "x-nested-question-validation": {
    "description": "For nested questions, parentId validation must be done at runtime",
    "rules": [
      "Parent question must exist in the question collection",
      "Parent question must have system: false (is a custom question)",
      "Parent question must have the same category as the child",
      "No circular references allowed"
    ]
  }
}

3.4 Generated Code Examples¶

C# (via NJsonSchema):

// Auto-generated from annotation-question.schema.json
// DO NOT EDIT - regenerate with: dotnet run --project tools/schema-gen

using System;
using System.Collections.Generic;
using System.Linq;

namespace SyRF.Validation.Generated;

public static class CategoryRules
{
    /// <summary>
    /// Rules for FIRST-LEVEL custom questions only.
    /// Nested questions parent to other custom questions - validated separately.
    /// </summary>
    public static readonly IReadOnlyDictionary<string, CategoryRule> Rules =
        new Dictionary<string, CategoryRule>
        {
            ["Study"] = new CategoryRule(
                FirstLevelParent: null,  // Study first-level questions are root
                ControlParameterRequired: false
            ),
            ["Disease Model Induction"] = new CategoryRule(
                FirstLevelParent: Guid.Parse("b18aa936-a4c6-446b-ac98-88ac38930878"),
                ControlParameterRequired: true  // Only for first-level
            ),
            ["Treatment"] = new CategoryRule(
                FirstLevelParent: Guid.Parse("d04ec2d7-3e10-4847-9999-befe7ee4c454"),
                ControlParameterRequired: true  // Only for first-level
            ),
            ["Outcome Assessment"] = new CategoryRule(
                FirstLevelParent: Guid.Parse("dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f"),
                ControlParameterRequired: false
            ),
            ["Cohort"] = new CategoryRule(
                FirstLevelParent: Guid.Parse("62c852ad-3390-48a4-ac13-439bf6b6587f"),
                ControlParameterRequired: false
            ),
            ["Experiment"] = new CategoryRule(
                FirstLevelParent: Guid.Parse("7c555b6e-1fb6-4036-9982-c09a5db82ace"),
                ControlParameterRequired: false
            )
        };
}

public record CategoryRule(
    Guid? FirstLevelParent,  // Renamed: only applies to first-level questions
    bool ControlParameterRequired
);

public class AnnotationQuestionValidator
{
    private readonly IReadOnlyCollection<AnnotationQuestionDto> _allQuestions;

    public AnnotationQuestionValidator(IReadOnlyCollection<AnnotationQuestionDto> allQuestions)
    {
        _allQuestions = allQuestions;
    }

    public ValidationResult Validate(AnnotationQuestionDto dto)
    {
        if (dto.System) return ValidationResult.Success;

        if (!CategoryRules.Rules.TryGetValue(dto.Category, out var rule))
            return ValidationResult.Error($"Unknown category: {dto.Category}");

        var actualParent = dto.Target?.ParentId;

        // CASE 1: Study category - first-level must be root, nested can parent to custom
        if (rule.FirstLevelParent == null)
        {
            if (actualParent == null)
                return ValidationResult.Success;  // Valid first-level (root)

            // Has a parent - must be a valid nested question
            return ValidateNestedQuestion(dto, actualParent.Value);
        }

        // CASE 2: Other categories - check if first-level or nested
        if (actualParent == null)
        {
            return ValidationResult.Error(
                $"Questions in {dto.Category} must have a parent");
        }

        if (actualParent == rule.FirstLevelParent)
        {
            // Valid FIRST-LEVEL question - parents to system question
            // Validate control parameter for first-level only
            if (rule.ControlParameterRequired)
            {
                // TODO: enforce that conditionalParentAnswers is explicitly set
                // (null is allowed and means "both")
            }
            return ValidationResult.Success;
        }

        // Parent is not the system question - must be a valid NESTED question
        return ValidateNestedQuestion(dto, actualParent.Value);
    }

    /// <summary>
    /// Validates nested questions that parent to other custom questions.
    /// </summary>
    private ValidationResult ValidateNestedQuestion(AnnotationQuestionDto dto, Guid parentId)
    {
        var parentQuestion = _allQuestions.FirstOrDefault(q => q.Id == parentId);

        if (parentQuestion == null)
            return ValidationResult.Error($"Parent question {parentId} not found");

        if (parentQuestion.System)
            return ValidationResult.Error(
                $"Nested questions cannot parent to system questions. " +
                $"Use first-level rules for parenting to system questions.");

        if (parentQuestion.Category != dto.Category)
            return ValidationResult.Error(
                $"Parent question must be in same category ({dto.Category}), " +
                $"but parent is in {parentQuestion.Category}");

        // TODO: Check for circular references

        return ValidationResult.Success;
    }
}

TypeScript (via json-schema-to-typescript + custom transform):

// Auto-generated from annotation-question.schema.json
// DO NOT EDIT - regenerate with: npm run generate:validation

/**
 * Rules for FIRST-LEVEL custom questions only.
 * Nested questions parent to other custom questions - validated separately.
 */
export const categoryRules = {
  Study: {
    firstLevelParent: null,  // Study first-level questions are root
    controlParameterRequired: false,
  },
  'Disease Model Induction': {
    firstLevelParent: 'b18aa936-a4c6-446b-ac98-88ac38930878',
    controlParameterRequired: true,  // Only for first-level
  },
  Treatment: {
    firstLevelParent: 'd04ec2d7-3e10-4847-9999-befe7ee4c454',
    controlParameterRequired: true,  // Only for first-level
  },
  'Outcome Assessment': {
    firstLevelParent: 'dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f',
    controlParameterRequired: false,
  },
  Cohort: {
    firstLevelParent: '62c852ad-3390-48a4-ac13-439bf6b6587f',
    controlParameterRequired: false,
  },
  Experiment: {
    firstLevelParent: '7c555b6e-1fb6-4036-9982-c09a5db82ace',
    controlParameterRequired: false,
  },
} as const;

export type Category = keyof typeof categoryRules;

/**
 * Validates annotation question placement rules.
 * Handles both first-level questions (parent to system) and nested questions (parent to custom).
 */
export function validateAnnotationQuestion(
  dto: AnnotationQuestionDto,
  allQuestions: AnnotationQuestionDto[]  // Needed to validate nested question parents
): ValidationResult {
  if (dto.system) return { valid: true };

  const rule = categoryRules[dto.category as Category];
  if (!rule) {
    return { valid: false, error: `Unknown category: ${dto.category}` };
  }

  const actualParent = dto.target?.parentId ?? null;

  // CASE 1: Study category - first-level must be root, nested can parent to custom
  if (rule.firstLevelParent === null) {
    if (actualParent === null) {
      return { valid: true };  // Valid first-level (root)
    }
    // Has a parent - must be a valid nested question
    return validateNestedQuestion(dto, actualParent, allQuestions);
  }

  // CASE 2: Other categories - check if first-level or nested
  if (actualParent === null) {
    return {
      valid: false,
      error: `Questions in ${dto.category} must have a parent`,
    };
  }

  if (actualParent === rule.firstLevelParent) {
    // Valid FIRST-LEVEL question - parents to system question
    return { valid: true };
  }

  // Parent is not the system question - must be a valid NESTED question
  return validateNestedQuestion(dto, actualParent, allQuestions);
}

/**
 * Validates nested questions that parent to other custom questions.
 */
function validateNestedQuestion(
  dto: AnnotationQuestionDto,
  parentId: string,
  allQuestions: AnnotationQuestionDto[]
): ValidationResult {
  const parentQuestion = allQuestions.find(q => q.id === parentId);

  if (!parentQuestion) {
    return { valid: false, error: `Parent question ${parentId} not found` };
  }

  if (parentQuestion.system) {
    return {
      valid: false,
      error: `Nested questions cannot parent to system questions. ` +
             `Use first-level rules for parenting to system questions.`,
    };
  }

  if (parentQuestion.category !== dto.category) {
    return {
      valid: false,
      error: `Parent question must be in same category (${dto.category}), ` +
             `but parent is in ${parentQuestion.category}`,
    };
  }

  // TODO: Check for circular references

  return { valid: true };
}
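
Both validators leave the circular-reference check as a TODO. One way to fill it in is to walk the `parentId` chain and stop when a question is seen twice. The sketch below is not part of the generated code: `QuestionNode` and `hasCircularParentChain` are hypothetical names, and the node shape is reduced to the two fields the walk needs.

```typescript
// Minimal shape needed for the check; the real AnnotationQuestionDto has more fields.
interface QuestionNode {
  id: string;
  target?: { parentId: string | null } | null;
}

// Returns true if following parentId links from `startId` revisits a question.
// Parent IDs that are not in `questions` (for example system question GUIDs)
// simply end the walk.
function hasCircularParentChain(startId: string, questions: QuestionNode[]): boolean {
  const parentOf = new Map<string, string | null>();
  for (const q of questions) {
    parentOf.set(q.id, q.target?.parentId ?? null);
  }

  const visited = new Set<string>();
  let current: string | null = startId;
  while (current !== null) {
    if (visited.has(current)) return true; // came back to an already-seen question
    visited.add(current);
    current = parentOf.get(current) ?? null;
  }
  return false;
}

// "a" -> "c" -> "b" -> "a" is a cycle; "d" parents to treatmentControl and is fine.
const sampleQuestions: QuestionNode[] = [
  { id: "a", target: { parentId: "c" } },
  { id: "b", target: { parentId: "a" } },
  { id: "c", target: { parentId: "b" } },
  { id: "d", target: { parentId: "d04ec2d7-3e10-4847-9999-befe7ee4c454" } },
];
const cycleFound = hasCircularParentChain("a", sampleQuestions);
const noCycle = hasCircularParentChain("d", sampleQuestions);
```

When validating a new or edited question, it would need to be included in `questions` with its proposed parent so that the walk sees the link being added.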

4. Lookup Referential Integrity¶

4.1 The Problem¶

Lookup questions reference annotations from other categories. When source annotations are deleted, lookups can reference non-existent data (orphaned references).

┌─────────────────────────────────────────────────────────────────────────────┐
│                    LOOKUP REFERENTIAL INTEGRITY PROBLEM                      │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  1. User creates Treatment annotation: "Aspirin 100mg"                       │
│                                                                              │
│  2. User creates Cohort with lookup referencing "Aspirin 100mg"              │
│     cohortTreatments: ["aspirin-100mg-annotation-id"]                        │
│                                                                              │
│  3. User DELETES the Treatment annotation "Aspirin 100mg"                    │
│                                                                              │
│  4. PROBLEM: Cohort still references deleted annotation!                     │
│     cohortTreatments: ["aspirin-100mg-annotation-id"]  ← ORPHAN!             │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
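
The scenario in the diagram can be reproduced with a small check that lists lookup selections whose target no longer exists. This is a sketch for illustration only: `LookupAnswer`, `OrphanedReference`, and `findOrphanedReferences` are hypothetical names, and the cohort label is made up.

```typescript
interface LookupAnswer {
  ownerLabel: string;              // label of the unit holding the lookup, e.g. a cohort
  selectedAnnotationIds: string[]; // IDs the lookup points at
}

interface OrphanedReference {
  ownerLabel: string;
  missingAnnotationId: string;
}

// Reports every selected ID that is not among the annotations that still exist.
function findOrphanedReferences(
  lookups: LookupAnswer[],
  existingAnnotationIds: Iterable<string>
): OrphanedReference[] {
  const existing = new Set(existingAnnotationIds);
  const orphans: OrphanedReference[] = [];
  for (const lookup of lookups) {
    for (const id of lookup.selectedAnnotationIds) {
      if (!existing.has(id)) {
        orphans.push({ ownerLabel: lookup.ownerLabel, missingAnnotationId: id });
      }
    }
  }
  return orphans;
}

// Step 2: the cohort references the treatment annotation.
const cohortLookups: LookupAnswer[] = [
  { ownerLabel: "Control Group", selectedAnnotationIds: ["aspirin-100mg-annotation-id"] },
];
// Before step 3 the annotation exists; after step 3 it does not.
const orphansBeforeDelete = findOrphanedReferences(cohortLookups, ["aspirin-100mg-annotation-id"]);
const orphansAfterDelete = findOrphanedReferences(cohortLookups, []);
```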

4.2 Unified Schema: Placement + Referential Integrity¶

Key Design Decision: Referential integrity rules are part of the SAME schema as placement rules (section 3). This ensures:

Single source of truth for ALL annotation question rules
One code generation pipeline produces ALL validators
Consistent validation across frontend/backend

The YAML schema from section 2.1 is extended to include lookup relationships:

# annotation-question-rules.yaml (EXTENDED from section 2.1)
# This is the COMPLETE schema covering placement AND referential integrity

categories:
  Treatment:
    # Placement rules (from section 2.1)
    customQuestionPlacement:
      type: "control-descendant"
      systemParent: "d04ec2d7-3e10-4847-9999-befe7ee4c454"
      nestedQuestions:
        allowed: true
    controlParameterRule:
      type: "required-for-first-level"

    # NEW: Referential integrity - who references this category's labels?
    labelReferentialIntegrity:
      labelQuestionGuid: "b02e3072-74f0-44e0-a468-f472b3b09991"
      referencedBy:
        - lookupQuestionGuid: "a3f2e5bb-3ade-4830-bb66-b5550a3cc85b"  # cohortTreatments
          lookupCategory: "Cohort"
          relationship: "many-to-many"
          onSourceDelete: "prevent"  # prevent | cascade-nullify | soft-delete
      referencesTo: []  # Treatment doesn't look up labels from other categories

  Cohort:
    customQuestionPlacement:
      type: "label-descendant"
      systemParent: "62c852ad-3390-48a4-ac13-439bf6b6587f"
      nestedQuestions:
        allowed: true
    controlParameterRule:
      type: "forbidden"

    # Cohort's labels are referenced by Experiment
    labelReferentialIntegrity:
      labelQuestionGuid: "62c852ad-3390-48a4-ac13-439bf6b6587f"
      referencedBy:
        - lookupQuestionGuid: "e7a84ba2-4ef2-4a14-83cb-7decf469d1a2"  # experimentCohorts
          lookupCategory: "Experiment"
          onSourceDelete: "prevent"

      # Cohort LOOKS UP labels from these categories
      referencesTo:
        - lookupQuestionGuid: "ecb550a5-ed95-473f-84bf-262c9faa7541"  # cohortDiseaseModels
          sourceLabelGuid: "bdb6e257-5a08-42ef-aad0-829668679b0e"
          sourceCategory: "Disease Model Induction"
          validation:
            - "Referenced annotations must exist"
            - "Referenced annotations must be from same study"
            - "Referenced annotations must be from diseaseModelInductionLabel question"

        - lookupQuestionGuid: "a3f2e5bb-3ade-4830-bb66-b5550a3cc85b"  # cohortTreatments
          sourceLabelGuid: "b02e3072-74f0-44e0-a468-f472b3b09991"
          sourceCategory: "Treatment"

        - lookupQuestionGuid: "12ecd826-85a4-499a-844c-bd35ea6624ad"  # cohortOutcomes
          sourceLabelGuid: "dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f"
          sourceCategory: "Outcome Assessment"
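
Because the YAML records each lookup twice (once under the source category's `referencedBy` and once under the consumer's `referencesTo`), a build-time consistency check may be worth running before generation. The following is a rough sketch, assuming the YAML has already been parsed into plain objects; the interface names and `findMismatchedLookups` are hypothetical.

```typescript
interface ReferencedBy {
  lookupQuestionGuid: string;
  lookupCategory: string;
}

interface ReferencesTo {
  lookupQuestionGuid: string;
  sourceLabelGuid: string;
  sourceCategory: string;
}

interface CategoryIntegrity {
  labelQuestionGuid: string;
  referencedBy: ReferencedBy[];
  referencesTo: ReferencesTo[];
}

// Every referencedBy entry should have a matching referencesTo entry on the
// consuming category that points back at this category's label question.
function findMismatchedLookups(categories: Record<string, CategoryIntegrity>): string[] {
  const problems: string[] = [];
  for (const [name, category] of Object.entries(categories)) {
    for (const ref of category.referencedBy) {
      const consumer: CategoryIntegrity | undefined = categories[ref.lookupCategory];
      const match = consumer?.referencesTo.find(
        (r) => r.lookupQuestionGuid === ref.lookupQuestionGuid
      );
      if (!match) {
        problems.push(
          `${name}: ${ref.lookupQuestionGuid} is not declared in ${ref.lookupCategory}.referencesTo`
        );
      } else if (
        match.sourceLabelGuid !== category.labelQuestionGuid ||
        match.sourceCategory !== name
      ) {
        problems.push(`${name}: ${ref.lookupQuestionGuid} points at a different source label`);
      }
    }
  }
  return problems;
}

// Treatment and Cohort as in the excerpt (Cohort's other lookups left out for brevity).
const excerpt: Record<string, CategoryIntegrity> = {
  Treatment: {
    labelQuestionGuid: "b02e3072-74f0-44e0-a468-f472b3b09991",
    referencedBy: [
      { lookupQuestionGuid: "a3f2e5bb-3ade-4830-bb66-b5550a3cc85b", lookupCategory: "Cohort" },
    ],
    referencesTo: [],
  },
  Cohort: {
    labelQuestionGuid: "62c852ad-3390-48a4-ac13-439bf6b6587f",
    referencedBy: [
      { lookupQuestionGuid: "e7a84ba2-4ef2-4a14-83cb-7decf469d1a2", lookupCategory: "Experiment" },
    ],
    referencesTo: [
      {
        lookupQuestionGuid: "a3f2e5bb-3ade-4830-bb66-b5550a3cc85b",
        sourceLabelGuid: "b02e3072-74f0-44e0-a468-f472b3b09991",
        sourceCategory: "Treatment",
      },
    ],
  },
};
const excerptProblems = findMismatchedLookups(excerpt);
```

Run against the excerpt alone, the check reports the `experimentCohorts` entry, since the Experiment category is not part of the excerpt; with the complete rules file the list would be expected to be empty.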

4.3 Generated Code: Unified Validator¶

The code generator (section 3.4) produces a single validator class that handles BOTH placement AND referential integrity:

// Auto-generated from annotation-question-rules.yaml
// Handles BOTH placement rules AND referential integrity
namespace SyRF.Validation.Generated;

public class AnnotationQuestionValidator
{
    private readonly IReadOnlyCollection<AnnotationQuestionDto> _allQuestions;
    private readonly IReadOnlyCollection<AnnotationDto> _allAnnotations;

    public AnnotationQuestionValidator(
        IReadOnlyCollection<AnnotationQuestionDto> allQuestions,
        IReadOnlyCollection<AnnotationDto> allAnnotations)
    {
        _allQuestions = allQuestions;
        _allAnnotations = allAnnotations;
    }

    // === PLACEMENT VALIDATION (from section 3.4) ===

    public ValidationResult ValidatePlacement(AnnotationQuestionDto dto)
    {
        // ... same as section 3.4 ...
    }

    // === REFERENTIAL INTEGRITY VALIDATION ===

    /// <summary>
    /// Validates a lookup answer references valid, existing annotations.
    /// Called when saving a Cohort/Experiment with lookup selections.
    /// </summary>
    public ValidationResult ValidateLookupAnswer(LookupAnswerDto dto)
    {
        var lookupRule = LookupRules.Rules.GetValueOrDefault(dto.LookupQuestionGuid);
        if (lookupRule == null)
            return ValidationResult.Error($"Unknown lookup question: {dto.LookupQuestionGuid}");

        foreach (var annotationId in dto.SelectedAnnotationIds)
        {
            var annotation = _allAnnotations.FirstOrDefault(a => a.Id == annotationId);

            if (annotation == null)
                return ValidationResult.Error(
                    $"Referenced annotation {annotationId} does not exist",
                    "AQ007");

            if (annotation.QuestionId != lookupRule.SourceLabelGuid)
                return ValidationResult.Error(
                    $"Annotation {annotationId} is not from expected source " +
                    $"({lookupRule.SourceCategory} label question)",
                    "AQ008");

            if (annotation.StudyId != dto.StudyId)
                return ValidationResult.Error(
                    $"Annotation {annotationId} belongs to different study",
                    "AQ009");
        }

        return ValidationResult.Success;
    }

    /// <summary>
    /// Checks if a label annotation can be deleted (referential integrity).
    /// Called before deleting a Treatment/DiseaseModel/Outcome/Cohort annotation.
    /// </summary>
    public ValidationResult CanDeleteLabelAnnotation(
        Guid annotationId,
        Guid labelQuestionGuid)
    {
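        // Assumes the generated CategoryRule is extended with LabelQuestionGuid and
        // ReferencedBy (from labelReferentialIntegrity); the record in section 3.4
        // carries only the placement fields.
        var categoryRule = CategoryRules.Rules.Values
            .FirstOrDefault(r => r.LabelQuestionGuid == labelQuestionGuid);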

        if (categoryRule?.ReferencedBy == null)
            return ValidationResult.Success;  // No dependents

        // Check if any lookup answers reference this annotation
        var dependentLookups = _allAnnotations
            .Where(a => categoryRule.ReferencedBy
                .Any(r => a.QuestionId == r.LookupQuestionGuid))
            .Where(a => a.SelectedAnnotationIds?.Contains(annotationId) == true)
            .ToList();

        if (dependentLookups.Any())
        {
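            // Use the rule of a lookup that actually references this annotation,
            // not simply the first rule declared for the category
            var onDelete = categoryRule.ReferencedBy
                .First(r => dependentLookups.Any(a => a.QuestionId == r.LookupQuestionGuid))
                .OnSourceDelete;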
            return onDelete switch
            {
                DeleteBehavior.Prevent => ValidationResult.Error(
                    $"Cannot delete: referenced by {dependentLookups.Count} lookup(s)",
                    "AQ010"),
                DeleteBehavior.CascadeNullify => ValidationResult.Warning(
                    $"Will remove from {dependentLookups.Count} lookup(s)"),
                DeleteBehavior.SoftDelete => ValidationResult.Success,
                _ => ValidationResult.Success
            };
        }

        return ValidationResult.Success;
    }
}

// Generated lookup rules (alongside CategoryRules from section 3.4)
public static class LookupRules
{
    public static readonly IReadOnlyDictionary<Guid, LookupRule> Rules =
        new Dictionary<Guid, LookupRule>
        {
            // cohortDiseaseModels
            [Guid.Parse("ecb550a5-ed95-473f-84bf-262c9faa7541")] = new LookupRule(
                SourceLabelGuid: Guid.Parse("bdb6e257-5a08-42ef-aad0-829668679b0e"),
                SourceCategory: "Disease Model Induction"),

            // cohortTreatments
            [Guid.Parse("a3f2e5bb-3ade-4830-bb66-b5550a3cc85b")] = new LookupRule(
                SourceLabelGuid: Guid.Parse("b02e3072-74f0-44e0-a468-f472b3b09991"),
                SourceCategory: "Treatment"),

            // cohortOutcomes
            [Guid.Parse("12ecd826-85a4-499a-844c-bd35ea6624ad")] = new LookupRule(
                SourceLabelGuid: Guid.Parse("dbe2720c-2e08-4f47-bcd0-3fe4ae8b8c7f"),
                SourceCategory: "Outcome Assessment"),

            // experimentCohorts
            [Guid.Parse("e7a84ba2-4ef2-4a14-83cb-7decf469d1a2")] = new LookupRule(
                SourceLabelGuid: Guid.Parse("62c852ad-3390-48a4-ac13-439bf6b6587f"),
                SourceCategory: "Cohort"),
        };
}

public record LookupRule(Guid SourceLabelGuid, string SourceCategory);
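
For the frontend side of the same pipeline, a TypeScript counterpart of `ValidateLookupAnswer` might look like the sketch below. It is not generated output: the interface names and field names are assumptions that mirror the C# DTOs, only the `cohortTreatments` rule is listed, and the error codes are the ones used in the C# listing.

```typescript
// Hypothetical TypeScript shapes mirroring the C# DTOs; field names are assumptions.
interface AnnotationRef {
  id: string;
  questionId: string;
  studyId: string;
}

interface LookupAnswerInput {
  lookupQuestionGuid: string;
  studyId: string;
  selectedAnnotationIds: string[];
}

interface LookupRule {
  sourceLabelGuid: string;
  sourceCategory: string;
}

type LookupValidation =
  | { valid: true }
  | { valid: false; error: string; code?: string };

// Only cohortTreatments is shown; the other three rules follow the same pattern.
const lookupRules: Record<string, LookupRule> = {
  "a3f2e5bb-3ade-4830-bb66-b5550a3cc85b": {
    sourceLabelGuid: "b02e3072-74f0-44e0-a468-f472b3b09991",
    sourceCategory: "Treatment",
  },
};

function validateLookupAnswer(
  dto: LookupAnswerInput,
  allAnnotations: AnnotationRef[]
): LookupValidation {
  const rule: LookupRule | undefined = lookupRules[dto.lookupQuestionGuid];
  if (!rule) {
    return { valid: false, error: `Unknown lookup question: ${dto.lookupQuestionGuid}` };
  }

  for (const annotationId of dto.selectedAnnotationIds) {
    const annotation = allAnnotations.find((a) => a.id === annotationId);
    if (!annotation) {
      return { valid: false, code: "AQ007", error: `Referenced annotation ${annotationId} does not exist` };
    }
    if (annotation.questionId !== rule.sourceLabelGuid) {
      return {
        valid: false,
        code: "AQ008",
        error: `Annotation ${annotationId} is not from expected source (${rule.sourceCategory} label question)`,
      };
    }
    if (annotation.studyId !== dto.studyId) {
      return { valid: false, code: "AQ009", error: `Annotation ${annotationId} belongs to different study` };
    }
  }
  return { valid: true };
}
```

The checks run in the same order as in the C# version, so both sides report the same code for the same input.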

4.4 UI Implementation: Cascade Delete with User Confirmation¶

EXISTING IMPLEMENTATION: The frontend already implements Option B (cascade delete with confirmation).

Source: annotation-unit.component.ts:386-466

┌─────────────────────────────────────────────────────────────────────────────┐
│                    LOOKUP REFERENTIAL INTEGRITY (IMPLEMENTED)                │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  SCENARIO: User deletes a Treatment/DiseaseModel/Cohort/Outcome annotation  │
│            that is referenced by a lookup question in Cohort/Experiment     │
│                                                                              │
│  CURRENT BEHAVIOR (IMPLEMENTED in annotation-unit.component.ts):             │
│  ═════════════════════════════════════════════════════════════════════════  │
│                                                                              │
│  1. User clicks delete on "Aspirin 100mg" treatment                         │
│                                                                              │
│  2. Component checks `this._containedIn` (populated by ngrx selector)       │
│     - Uses `categoryMap` to find lookup relationships                       │
│     - Queries `selectExtractionCharacteristicsForCurrentStudy` selector     │
│     - Returns list of containing units (cohorts referencing this treatment) │
│                                                                              │
│  3. If references exist, shows confirmation dialog:                          │
│  ┌──────────────────────────────────────────────────────────────────┐       │
│  │ Delete 'Aspirin 100mg' treatment                                  │       │
│  │                                                                   │       │
│  │ This treatment is currently used in the following cohort(s):     │       │
│  │ • 'Control Group'                                                 │       │
│  │ • 'Treatment Group'                                               │       │
│  │                                                                   │       │
│  │ Deleting this treatment will also remove it from each cohort     │       │
│  │ above.                                                            │       │
│  │                                                                   │       │
│  │  [DELETE TREATMENT]  [Cancel]                                     │       │
│  └──────────────────────────────────────────────────────────────────┘       │
│                                                                              │
│  4. On confirmation, cascades removal through form state:                    │
│     - Finds each containing unit's lookup question (e.g., cohortTreatments) │
│     - Filters out the deleted unit's ID from the answer array               │
│     - Then emits removeGroup to delete the annotation itself                │
│                                                                              │
│  5. If no references exist, deletes immediately (no dialog)                  │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Implementation Details:

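// annotation-unit.component.ts lines 386-466 (abridged: label, unitType, cuType,
// containingUnitsList, rootAnnotationGroup, cuIndex, ansValue and deletedUnitId
// are not defined in this excerpt)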
removeAnswerGroup() {
  if (this._containedIn && this._containedIn.length > 0) {
    // Show confirmation dialog with list of containing units
    this._dialog.open(ConfirmationDialogComponent, {
      data: {
        title: `Delete '${label}' ${unitType}`,
        contentHtml: `This ${unitType} is currently used in the following ${cuType}(s):
                      <ul>${containingUnitsList}</ul>
                      Deleting this ${unitType} will also remove it from each ${cuType} above.`,
        okText: `DELETE ${unitType.toUpperCase()}`,
      }
    }).afterClosed().subscribe((ok) => {
      if (ok) {
        // CASCADE: Remove from all containing units' lookup answers
        this._containedIn.forEach((cu) => {
          const lookupAnswerControl = rootAnnotationGroup.get([
            categoryMap[this.category].cuCategory,        // e.g., "Cohort"
            categoryMap[this.category].cuLabelQId,        // cohortLabel GUID
            'answers', cuIndex, 'subquestions',
            categoryMap[this.category].subqIdOnCu,        // e.g., cohortTreatments
            'answers', 0, 'answer'
          ]);
          // Filter out the deleted unit's ID
          lookupAnswerControl.setValue(
            ansValue.filter((id) => id !== deletedUnitId)
          );
        });
        // Then delete the annotation itself
        this.removeGroup.emit();
      }
    });
  } else {
    // No references - delete immediately
    this.removeGroup.emit();
  }
}

Category Mapping (for lookup traversal):

The categoryMap object (lines 79-121) defines the relationships:

Deleted Unit Type	Lookup Category	Lookup Question GUID
Treatment	Cohort	`cohortTreatments` (a3f2e5bb-3ade-4830-bb66-b5550a3cc85b)
Disease Model	Cohort	`cohortDiseaseModels` (ecb550a5-ed95-473f-84bf-262c9faa7541)
Outcome	Cohort	`cohortOutcomes` (12ecd826-85a4-499a-844c-bd35ea6624ad)
Cohort	Experiment	`experimentCohorts` (e7a84ba2-4ef2-4a14-83cb-7decf469d1a2)
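
The table can also be read as a typed constant plus one pure helper for the cascade step. This is a sketch, not the component's actual `categoryMap`: the field names (`cuCategory`, `cuLabelQId`, `subqIdOnCu`) are taken from the excerpt above, while the key names, the overall shape, and the use of the `experimentLabel` GUID as the Experiment entry's `cuLabelQId` are assumptions.

```typescript
interface ContainingUnitMapping {
  cuCategory: string; // category of the containing unit
  cuLabelQId: string; // label question GUID of the containing unit
  subqIdOnCu: string; // lookup question GUID on the containing unit
}

const COHORT_LABEL = "62c852ad-3390-48a4-ac13-439bf6b6587f";     // cohortLabel
const EXPERIMENT_LABEL = "7c555b6e-1fb6-4036-9982-c09a5db82ace"; // experimentLabel (assumed)

// Keyed by the category of the unit being deleted (key names are assumptions).
const categoryMapSketch: Record<string, ContainingUnitMapping> = {
  Treatment: {
    cuCategory: "Cohort",
    cuLabelQId: COHORT_LABEL,
    subqIdOnCu: "a3f2e5bb-3ade-4830-bb66-b5550a3cc85b", // cohortTreatments
  },
  "Disease Model Induction": {
    cuCategory: "Cohort",
    cuLabelQId: COHORT_LABEL,
    subqIdOnCu: "ecb550a5-ed95-473f-84bf-262c9faa7541", // cohortDiseaseModels
  },
  "Outcome Assessment": {
    cuCategory: "Cohort",
    cuLabelQId: COHORT_LABEL,
    subqIdOnCu: "12ecd826-85a4-499a-844c-bd35ea6624ad", // cohortOutcomes
  },
  Cohort: {
    cuCategory: "Experiment",
    cuLabelQId: EXPERIMENT_LABEL,
    subqIdOnCu: "e7a84ba2-4ef2-4a14-83cb-7decf469d1a2", // experimentCohorts
  },
};

// Pure version of the cascade step: drop the deleted unit's ID from a lookup answer.
function removeFromLookupAnswer(answer: readonly string[], deletedUnitId: string): string[] {
  return answer.filter((id) => id !== deletedUnitId);
}

const remainingTreatments = removeFromLookupAnswer(["aspirin-id", "placebo-id"], "aspirin-id");
```

Keeping the filter step as a pure function makes the cascade testable without constructing the reactive form.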

Scope Limitation: This cascade logic currently operates only within the annotation form's reactive form state. It does NOT:

Query the database for cross-study references
Persist to backend independently (changes are saved when the form is submitted)
Prevent deletion at the API level

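Future Enhancement: Consider implementing backend validation using the generated validator (section 4.3) to enforce referential integrity at the API level as well. Section 6.1 (Phase 3) treats this as low priority because it would duplicate the frontend cascade logic.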
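
If such a check is ever added, its core is small: within one study, every unit ID stored in a lookup answer should refer to a unit that still exists. The sketch below illustrates that check on a flattened, hypothetical data shape (`LookupAnswer`, `DanglingReference`, and `findDanglingReferences` are not part of the existing codebase). The same function could back an orphan detection script for historical data.

```typescript
// Hypothetical flattened view of the lookup answers in one study.
interface LookupAnswer {
  containingUnitId: string;    // e.g. the Cohort that holds the lookup answer
  lookupQuestionId: string;    // e.g. the cohortTreatments GUID
  referencedUnitIds: string[]; // unit IDs selected in the lookup
}

interface DanglingReference {
  containingUnitId: string;
  lookupQuestionId: string;
  missingUnitId: string;
}

// Reports every referenced unit ID that is not among the study's existing units.
function findDanglingReferences(
  existingUnitIds: string[],
  lookupAnswers: LookupAnswer[],
): DanglingReference[] {
  const known: Record<string, boolean> = {};
  for (const id of existingUnitIds) {
    known[id] = true;
  }

  const dangling: DanglingReference[] = [];
  for (const answer of lookupAnswers) {
    for (const id of answer.referencedUnitIds) {
      if (known[id] !== true) {
        dangling.push({
          containingUnitId: answer.containingUnitId,
          lookupQuestionId: answer.lookupQuestionId,
          missingUnitId: id,
        });
      }
    }
  }
  return dangling;
}
```

An empty result means the study's lookup answers are internally consistent. Because annotations are scoped to a single study, the check never needs to look across studies.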

5. Code Generation Strategy¶

5.1 Clarification: Compile-Time vs Runtime¶

Key Decision: This specification uses compile-time code generation, NOT runtime schema loading.

┌─────────────────────────────────────────────────────────────────────────────┐
│                    CODE GENERATION STRATEGY CLARIFICATION                    │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ❌ NOT THIS: Runtime Schema Loading                                         │
│  ─────────────────────────────────────────────────────────────────────────  │
│  • Load YAML at application startup                                          │
│  • Parse rules dynamically                                                   │
│  • Lose compile-time type safety                                             │
│  • "CategoryService.loadCategoryRules()" pattern                             │
│                                                                              │
│  ✅ THIS: Compile-Time Code Generation                                       │
│  ─────────────────────────────────────────────────────────────────────────  │
│  • YAML schema is SOURCE, consumed at BUILD TIME only                        │
│  • Generator produces STATIC C# classes and TypeScript types                 │
│  • Full compile-time type safety in both languages                           │
│  • YAML never loaded at runtime - it's "baked in" to generated code          │
│                                                                              │
│  WHY COMPILE-TIME?                                                           │
│  ─────────────────────────────────────────────────────────────────────────  │
│  • Strong typing catches errors at compile time, not runtime                 │
│  • IDE autocomplete for category names, GUIDs, rules                         │
│  • No YAML parsing overhead at application startup                           │
│  • Rules are immutable constants - exactly what we want for domain rules     │
│  • Categories rarely change - no need for runtime flexibility                │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

5.2 What Gets Generated (Compile-Time)¶

The YAML schema is the single source of truth, but it's only consumed at build time by the code generator. The output is strongly-typed code:

┌─────────────────────────────────────────────────────────────────────────────┐
│                    COMPILE-TIME CODE GENERATION PIPELINE                     │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │  annotation-question-rules.yaml                                       │   │
│  │  ═══════════════════════════════                                      │   │
│  │  SINGLE SOURCE OF TRUTH (human-editable, version-controlled)          │   │
│  │                                                                       │   │
│  │  • Category definitions (Study, Treatment, Cohort, etc.)              │   │
│  │  • Placement rules (systemParent GUIDs, control requirements)         │   │
│  │  • Referential integrity (lookup dependencies, onDelete behavior)     │   │
│  │  • GUID constants (all system question GUIDs)                         │   │
│  └────────────────────────────────┬─────────────────────────────────────┘   │
│                                   │                                          │
│                                   │  BUILD TIME                              │
│                                   │  (npm run generate:validation)           │
│                                   │  (dotnet run --project tools/schema-gen) │
│                                   │                                          │
│            ┌──────────────────────┼──────────────────────┐                  │
│            │                      │                      │                  │
│            ▼                      ▼                      ▼                  │
│  ┌───────────────────┐  ┌───────────────────┐  ┌───────────────────┐       │
│  │ C# (Backend)      │  │ TypeScript (FE)   │  │ Documentation     │       │
│  │ ═══════════════   │  │ ═══════════════   │  │ ═══════════════   │       │
│  │                   │  │                   │  │                   │       │
│  │ CategoryRules.cs  │  │ categoryRules.ts  │  │ rules.md          │       │
│  │ LookupRules.cs    │  │ lookupRules.ts    │  │                   │       │
│  │ SystemGuids.cs    │  │ systemGuids.ts    │  │ Auto-generated    │       │
│  │ Validator.cs      │  │ validator.ts      │  │ rule tables       │       │
│  │                   │  │                   │  │                   │       │
│  │ STATIC, TYPED     │  │ STATIC, TYPED     │  │                   │       │
│  │ Compile-time safe │  │ Compile-time safe │  │                   │       │
│  └───────────────────┘  └───────────────────┘  └───────────────────┘       │
│            │                      │                                          │
│            │     RUNTIME          │                                          │
│            │  (Application runs)  │                                          │
│            │                      │                                          │
│            ▼                      ▼                                          │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │  Application uses GENERATED CODE directly                              │  │
│  │  ─────────────────────────────────────────────────────────────────────│  │
│  │                                                                        │  │
│  │  C#:  CategoryRules.Treatment.FirstLevelParent                         │  │
│  │       → Guid (compile-time checked)                                    │  │
│  │                                                                        │  │
│  │  TS:  categoryRules.Treatment.firstLevelParent                         │  │
│  │       → string literal type (compile-time checked)                     │  │
│  │                                                                        │  │
│  │  NO YAML PARSING AT RUNTIME. NO "CategoryService.load()".              │  │
│  │  Rules are CONSTANTS compiled into the application.                    │  │
│  │                                                                        │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
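
To make the build-time step concrete, here is a rough sketch of the TypeScript half of such a generator. It assumes the YAML has already been parsed into a plain object by whatever YAML library the build uses, and it models only the two fields that appear in the generated examples. `CategorySchema`, `renderKey`, and `renderCategoryRules` are hypothetical names, not the actual tool.

```typescript
// Hypothetical shape of one parsed category entry from the YAML schema.
interface CategorySchema {
  firstLevelParent: string | null;
  controlParameterRequired: boolean;
}

// Emit a bare identifier when possible, otherwise a quoted key (e.g. "Disease Model").
function renderKey(name: string): string {
  return /^[A-Za-z_$][A-Za-z0-9_$]*$/.test(name) ? name : JSON.stringify(name);
}

// Turns the parsed schema into TypeScript source text for a generated rules file.
function renderCategoryRules(schema: Record<string, CategorySchema>): string {
  const lines: string[] = [
    '// AUTO-GENERATED FROM annotation-question-rules.yaml',
    '// DO NOT EDIT - Regenerate with: npm run generate:validation',
    'export const categoryRules = {',
  ];

  // Sort keys so that regenerating from an unchanged schema yields an identical file.
  for (const name of Object.keys(schema).sort()) {
    const rule = schema[name] as CategorySchema;
    lines.push(`  ${renderKey(name)}: {`);
    lines.push(`    firstLevelParent: ${JSON.stringify(rule.firstLevelParent)},`);
    lines.push(`    controlParameterRequired: ${rule.controlParameterRequired},`);
    lines.push('  },');
  }

  lines.push('} as const;');
  lines.push('');
  lines.push('export type Category = keyof typeof categoryRules;');
  return lines.join('\n') + '\n';
}
```

Deterministic output matters here because it lets a CI step regenerate the files and fail the build if the committed copies differ from the schema.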

5.3 Generated Code Structure¶

The generated code provides strongly-typed constants - there's no generic "CategoryService" that loads rules at runtime:

C# Generated Code:

// ============================================================================
// AUTO-GENERATED FROM annotation-question-rules.yaml
// DO NOT EDIT - Regenerate with: dotnet run --project tools/schema-gen
// ============================================================================

using System;
using System.Collections.Generic;

namespace SyRF.Validation.Generated;

/// <summary>
/// Static category rules - no runtime loading, no service pattern.
/// These are compile-time constants generated from the YAML schema.
/// </summary>
public static class CategoryRules
{
    // Strongly-typed dictionary - category string → rule record
    public static readonly IReadOnlyDictionary<string, CategoryRule> Rules = ...;

    // Direct accessors for common use cases (compile-time checked)
    public static class Treatment
    {
        public static readonly Guid FirstLevelParent =
            Guid.Parse("d04ec2d7-3e10-4847-9999-befe7ee4c454");
        public const bool ControlParameterRequired = true;
    }

    public static class Cohort
    {
        public static readonly Guid FirstLevelParent =
            Guid.Parse("62c852ad-3390-48a4-ac13-439bf6b6587f");
        public const bool ControlParameterRequired = false;
    }
    // ... etc for each category
}

/// <summary>
/// Strongly-typed GUID constants for system questions.
/// IDE autocomplete, compile-time checking, no magic strings.
/// </summary>
public static class SystemQuestionGuids
{
    public static readonly Guid TreatmentLabel =
        Guid.Parse("b02e3072-74f0-44e0-a468-f472b3b09991");
    public static readonly Guid TreatmentControl =
        Guid.Parse("d04ec2d7-3e10-4847-9999-befe7ee4c454");
    // ... all GUIDs
}

TypeScript Generated Code:

// ============================================================================
// AUTO-GENERATED FROM annotation-question-rules.yaml
// DO NOT EDIT - Regenerate with: npm run generate:validation
// ============================================================================

/**
 * Static category rules as const object.
 * Provides compile-time type checking and IDE autocomplete.
 */
export const categoryRules = {
  Treatment: {
    firstLevelParent: 'd04ec2d7-3e10-4847-9999-befe7ee4c454',
    controlParameterRequired: true,
  },
  Cohort: {
    firstLevelParent: '62c852ad-3390-48a4-ac13-439bf6b6587f',
    controlParameterRequired: false,
  },
  // ... etc
} as const;

// Type-safe category name (not just 'string')
export type Category = keyof typeof categoryRules;

// Compile-time checked GUID constants
export const systemQuestionGuids = {
  treatmentLabel: 'b02e3072-74f0-44e0-a468-f472b3b09991',
  treatmentControl: 'd04ec2d7-3e10-4847-9999-befe7ee4c454',
  // ... all GUIDs
} as const;

5.4 Why NOT Runtime Loading?¶

┌─────────────────────────────────────────────────────────────────────────────┐
│                    RUNTIME LOADING ANTI-PATTERN                              │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ❌ ANTI-PATTERN: Runtime Schema Loading                                     │
│                                                                              │
│  // DON'T DO THIS:                                                           │
│  class CategoryService {                                                     │
│    loadCategoryRules(): Promise<CategoryRules> {                             │
│      return fetch('/schemas/categories.yaml').then(yaml.parse);              │
│    }                                                                         │
│                                                                              │
│    getRequiredParent(category: string): string | null {                      │
│      return this.rules[category]?.systemParent;  // Runtime lookup           │
│    }                                                                         │
│  }                                                                           │
│                                                                              │
│  PROBLEMS:                                                                   │
│  ─────────────────────────────────────────────────────────────────────────  │
│  1. ❌ Type safety lost - 'category' is just a string at compile time       │
│  2. ❌ Typos not caught - getRequiredParent("Treament") compiles fine       │
│  3. ❌ Runtime parsing overhead - YAML parsed on every app start            │
│  4. ❌ Async complexity - must await rules before validation                 │
│  5. ❌ No IDE autocomplete for category names or GUIDs                      │
│  6. ❌ Testing harder - must mock service instead of importing constants    │
│                                                                              │
│  ✅ CORRECT: Generated Static Constants                                      │
│                                                                              │
│  // DO THIS:                                                                 │
│  import { categoryRules, Category } from './generated/categoryRules';       │
│                                                                              │
│  function getRequiredParent(category: Category): string | null {            │
│    return categoryRules[category].firstLevelParent;                         │
│  }                                                                           │
│                                                                              │
│  BENEFITS:                                                                   │
│  ─────────────────────────────────────────────────────────────────────────  │
│  1. ✅ Type safety - Category is a union type, not string                   │
│  2. ✅ Typos caught at compile time - "Treament" is a type error            │
│  3. ✅ No runtime overhead - constants are in the compiled bundle           │
│  4. ✅ Synchronous - no async, just direct property access                  │
│  5. ✅ Full IDE autocomplete for categories and properties                  │
│  6. ✅ Easy testing - just import and use, no mocking                       │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
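
The compile-time guarantee covers code that names categories directly. Values that arrive at runtime, such as a category string in an API payload, are still plain strings and have to be narrowed once at the boundary. A minimal sketch of that narrowing follows, with the constants inlined so the example is self-contained; `isCategory` and `parseCategory` are hypothetical helpers, not part of the generated output.

```typescript
// Inlined copy of the generated constants from section 5.3 (two categories only).
const categoryRules = {
  Treatment: {
    firstLevelParent: 'd04ec2d7-3e10-4847-9999-befe7ee4c454',
    controlParameterRequired: true,
  },
  Cohort: {
    firstLevelParent: '62c852ad-3390-48a4-ac13-439bf6b6587f',
    controlParameterRequired: false,
  },
} as const;

type Category = keyof typeof categoryRules;

// Type guard: narrows an arbitrary string to the Category union.
// hasOwnProperty avoids false positives for inherited keys such as "toString".
function isCategory(value: string): value is Category {
  return Object.prototype.hasOwnProperty.call(categoryRules, value);
}

// Boundary helper: fail loudly on unknown input instead of passing it along.
function parseCategory(raw: string): Category {
  if (!isCategory(raw)) {
    throw new Error(`Unknown category: ${raw}`);
  }
  return raw;
}

// After parsing, property access is fully typed again.
const parent: string = categoryRules[parseCategory('Treatment')].firstLevelParent;
```

Everything downstream of `parseCategory` can then accept `Category` rather than `string`, which keeps the typo protection described above intact.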

5.5 Future Consideration: Dynamic Categories¶

If SyRF ever needs user-defined categories (unlikely for domain reasons), THAT would require runtime loading. But for the current fixed set of 7 categories with stable rules, compile-time generation is the correct approach.

The YAML schema provides:

Human-readable single source of truth
Version-controlled rule changes
Cross-language consistency via generation

But the YAML is consumed at build time, not runtime.

6. Implementation Recommendations¶

6.1 Phased Approach¶

CRITICAL: Each phase includes mandatory testing requirements. Code changes are NOT complete until tests pass with required coverage.

╔═══════════════════════════════════════════════════════════════════════════════╗
║                    IMPLEMENTATION PHASES                                       ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  PHASE 1: FORMALIZE RULES (Low Risk, High Value)                               ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  DELIVERABLES:                                                                 ║
║  • Create annotation-question-rules.yaml as single source of truth             ║
║  • Document all rules precisely (this document)                                ║
║  • No code changes yet, just documentation                                     ║
║                                                                                ║
║  TESTING (Preparation):                                                        ║
║  • Create test case matrix document (all rules → test cases)                   ║
║  • Create test data fixtures for each category (YAML format)                   ║
║  • Create example valid/invalid DTOs for each rule                             ║
║                                                                                ║
║  Effort: 1-2 days                                                              ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  PHASE 2: BACKEND VALIDATION (Medium Risk, High Value) ✅ COMPLETE             ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  DELIVERABLES:                                                                 ║
║  ✅ AnnotationQuestionPlacementRules.cs - category rules with GUIDs           ║
║  ✅ AnnotationQuestionPlacementValidator.cs - validation logic                 ║
║  ✅ ValidationResult.cs - validation result type                               ║
║  ✅ Integration with Project.UpsertCustomAnnotationQuestion()                  ║
║  ✅ Characterization tests for existing invalid data (see below)               ║
║                                                                                ║
║  TESTING: ✅ PASSED                                                            ║
║  • 31 unit tests for AnnotationQuestionPlacementValidator                      ║
║  • 24 integration tests for Project.UpsertCustomAnnotationQuestion()           ║
║  • All categories covered: Study, DMI, Treatment, Outcome, Cohort, Experiment  ║
║  • Valid + invalid cases for first-level and nested questions                  ║
║  • Hidden category rejection                                                   ║
║                                                                                ║
║  CHARACTERIZATION TEST RESULTS (2025-12-27):                                   ║
║  • Study questions with system parent: 0 violations ✅                         ║
║  • Hidden category custom questions: 0 violations ✅                           ║
║  • Questions without required parent: 32 violations in 24 projects             ║
║    - Experiment: 28 (mostly test projects)                                     ║
║    - Disease Model Induction: 3                                                ║
║    - Outcome Assessment: 1                                                     ║
║  • Impact: Low - mostly test/dev projects, questions still functional          ║
║  • Action: No migration needed - new validation prevents future violations     ║
║                                                                                ║
║  Completed: 2025-12-27                                                         ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  PHASE 3: REFERENTIAL INTEGRITY - SCOPE REDUCED                                ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  STATUS: Largely not needed due to existing architecture                       ║
║                                                                                ║
║  EXISTING PROTECTIONS:                                                         ║
║  ✅ Annotations are scoped to individual Studies (no cross-study leakage)      ║
║  ✅ Annotations are scoped to individual Reviewers (no cross-reviewer leakage) ║
║  ✅ Frontend cascade delete with confirmation (annotation-unit.ts)             ║
║  ✅ Lookup selections only show annotations from current study context         ║
║                                                                                ║
║  REMAINING (Low Priority - only if data issues found):                         ║
║  • Orphan detection script for historical data cleanup                         ║
║  • Backend validation would duplicate frontend logic (not recommended)         ║
║                                                                                ║
║  FUTURE ENHANCEMENT (Defensive Programming):                                   ║
║  • Add validation that annotations belong to correct Study/Reviewer            ║
║  • Purpose: Catch potential bugs where annotations leak between contexts       ║
║  • Would throw if annotation references wrong study or wrong reviewer          ║
║  • Not urgent since architecture prevents this, but useful for bug detection   ║
║                                                                                ║
║  REMOVED FROM SCOPE:                                                           ║
║  ✗ Cross-study reference checking - Not possible due to data model             ║
║  ✗ Backend delete validation - Frontend already handles cascade properly       ║
║                                                                                ║
║  Effort: Minimal (only if orphan cleanup needed)                               ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  PHASE 4: CODE GENERATION PIPELINE (Higher Effort, Long-term Value)            ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  DELIVERABLES:                                                                 ║
║  • Set up JSON Schema as authoritative definition                              ║
║  • Add code generation for TypeScript types                                    ║
║  • Add code generation for C# validators                                       ║
║  • Integrate into build pipeline                                               ║
║                                                                                ║
║  TESTING (Required - 100% generated code coverage):                            ║
║  • Generator produces valid C#/TypeScript syntax                               ║
║  • Generated code compiles without errors                                      ║
║  • Snapshot tests: generated code matches expected output                      ║
║  • Property-based tests: any valid DTO passes, invalid DTOs fail               ║
║                                                                                ║
║  Effort: 5-8 days (includes generator tests)                                   ║
║                                                                                ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║                                                                                ║
║  PHASE 5: DECLARATIVE CATEGORIES (Major Refactor - Optional)                   ║
║  ─────────────────────────────────────────────────────────────────────────────║
║  DELIVERABLES:                                                                 ║
║  • Move to YAML-based category definitions                                     ║
║  • Generate all code from schemas                                              ║
║  • Enable runtime category extensibility                                       ║
║                                                                                ║
║  TESTING: Full regression suite + new extensibility tests                      ║
║                                                                                ║
║  Effort: 2-4 weeks                                                             ║
║                                                                                ║
╚═══════════════════════════════════════════════════════════════════════════════╝

6.2 Decision Matrix¶

Approach	Complexity	Value	Risk	Recommendation
Manual sync (current)	Low	Low	High (drift)	Avoid
Backend validation only	Medium	High	Low	Do first
JSON Schema codegen	Medium-High	High	Medium	Phase 4
Declarative YAML categories	High	Very High	Medium	Future consideration

6.3 Testing Strategy: Philosophy and Examples¶

NOTE: Phase-specific testing requirements are defined in section 6.1. This section provides the testing philosophy, example test structures, and supplementary reference material.

┌─────────────────────────────────────────────────────────────────────────────┐
│                    TESTING PHILOSOPHY                                        │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  1. TEST-FIRST DEVELOPMENT                                                   │
│     Write tests BEFORE implementing validation logic.                        │
│     Each rule in this document = at least one test case.                     │
│                                                                              │
│  2. COVERAGE REQUIREMENTS                                                    │
│     • Backend validators: 95-100% branch coverage (95% CI gate)              │
│     • Frontend validators: 90-100% branch coverage (90% CI gate)             │
│     • Generated code: 100% line coverage (generated, so easier)              │
│                                                                              │
│  3. TEST CATEGORIES                                                          │
│     • Unit tests: Each validation rule in isolation                          │
│     • Integration tests: Validator + real domain objects                     │
│     • Characterization tests: Capture existing (possibly invalid) behavior   │
│     • Property-based tests: Fuzz testing for edge cases                      │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

6.3.1 Example Test Structures¶

Backend Placement Validator Tests (C#)

// ============================================================================
// TEST STRUCTURE: AnnotationQuestionPlacementValidatorTests.cs
// ============================================================================

using System;
using NUnit.Framework;

namespace SyRF.ProjectManagement.Core.Tests.Validation;

[TestFixture]
public class AnnotationQuestionPlacementValidatorTests
{
    // ========================================================================
    // CATEGORY PLACEMENT TESTS - One test class per category
    // ========================================================================

    [TestFixture]
    public class TreatmentCategoryTests
    {
        // VALID CASES
        [Test]
        public void FirstLevel_WithTreatmentControlParent_IsValid()
        {
            var dto = CreateTreatmentQuestion(
                parentId: SystemGuids.TreatmentControl,
                controlParam: true);

            var result = _validator.ValidatePlacement(dto);

            Assert.That(result.IsValid, Is.True);
        }

        [Test]
        public void FirstLevel_WithControlParamFalse_IsValid()
        {
            var dto = CreateTreatmentQuestion(
                parentId: SystemGuids.TreatmentControl,
                controlParam: false);

            var result = _validator.ValidatePlacement(dto);

            Assert.That(result.IsValid, Is.True);
        }

        [Test]
        public void FirstLevel_WithControlParamNull_IsValid()
        {
            var dto = CreateTreatmentQuestion(
                parentId: SystemGuids.TreatmentControl,
                controlParam: null);  // Shows for both control and non-control

            var result = _validator.ValidatePlacement(dto);

            Assert.That(result.IsValid, Is.True);
        }

        [Test]
        public void Nested_WithCustomQuestionParent_IsValid()
        {
            var customParentId = Guid.NewGuid();
            var dto = CreateTreatmentQuestion(
                parentId: customParentId,  // NOT treatmentControl
                controlParam: null);

            var result = _validator.ValidatePlacement(dto);

            Assert.That(result.IsValid, Is.True);
        }

        // INVALID CASES
        [Test]
        public void FirstLevel_WithWrongParent_ReturnsError()
        {
            var dto = CreateTreatmentQuestion(
                parentId: SystemGuids.CohortLabel,  // Wrong!
                controlParam: true);

            var result = _validator.ValidatePlacement(dto);

            Assert.That(result.IsValid, Is.False);
            Assert.That(result.ErrorCode, Is.EqualTo("AQ003"));
        }

        [Test]
        public void FirstLevel_WithNullParent_ReturnsError()
        {
            var dto = CreateTreatmentQuestion(
                parentId: null,  // Treatment can't be root
                controlParam: true);

            var result = _validator.ValidatePlacement(dto);

            Assert.That(result.IsValid, Is.False);
            Assert.That(result.ErrorCode, Is.EqualTo("AQ002"));
        }

        [Test]
        public void FirstLevel_MissingControlParam_ReturnsError()
        {
            var dto = CreateTreatmentQuestion(
                parentId: SystemGuids.TreatmentControl,
                controlParam: null,
                hasConditionalParentAnswers: false);  // Missing entirely

            var result = _validator.ValidatePlacement(dto);

            Assert.That(result.IsValid, Is.False);
            Assert.That(result.ErrorCode, Is.EqualTo("AQ005"));
        }
    }

    [TestFixture]
    public class StudyCategoryTests
    {
        // Study is special - can have null parent
        [Test]
        public void FirstLevel_WithNullParent_IsValid()
        {
            var dto = CreateStudyQuestion(parentId: null);

            var result = _validator.ValidatePlacement(dto);

            Assert.That(result.IsValid, Is.True);
        }

        [Test]
        public void FirstLevel_WithAnyParent_ReturnsError()
        {
            var dto = CreateStudyQuestion(
                parentId: SystemGuids.TreatmentLabel);  // Wrong!

            var result = _validator.ValidatePlacement(dto);

            Assert.That(result.IsValid, Is.False);
        }
    }

    // Similar test classes for: CohortCategoryTests, ExperimentCategoryTests,
    // OutcomeCategoryTests, DiseaseModelCategoryTests, HiddenCategoryTests
}
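
The behaviour these tests pin down can be summarised as a small decision procedure. The sketch below is a TypeScript rendering of it, inferred only from the test cases above and not taken from the real `AnnotationQuestionPlacementValidator`. The DTO shape, the `systemQuestionIds` list, the Study rule values, and the use of `AQ003` for a Study question placed under a system parent are assumptions; the tests only show that such a question is invalid.

```typescript
type Category = 'Study' | 'Treatment' | 'Cohort';

interface PlacementRule {
  firstLevelParent: string | null; // null: first-level questions have no parent
  controlParameterRequired: boolean;
}

const treatmentControl = 'd04ec2d7-3e10-4847-9999-befe7ee4c454';
const treatmentLabel = 'b02e3072-74f0-44e0-a468-f472b3b09991';
const cohortLabel = '62c852ad-3390-48a4-ac13-439bf6b6587f';

const rules: Record<Category, PlacementRule> = {
  Study: { firstLevelParent: null, controlParameterRequired: false }, // assumed values
  Treatment: { firstLevelParent: treatmentControl, controlParameterRequired: true },
  Cohort: { firstLevelParent: cohortLabel, controlParameterRequired: false },
};

// Assumption: the validator can tell system question GUIDs from custom ones.
const systemQuestionIds: string[] = [treatmentControl, treatmentLabel, cohortLabel];

// Hypothetical, simplified DTO.
interface QuestionDto {
  category: Category;
  parentId: string | null;
  hasConditionalParentAnswers: boolean;
}

interface ValidationResult {
  isValid: boolean;
  errorCode?: string;
}

function validatePlacement(dto: QuestionDto): ValidationResult {
  const rule = rules[dto.category];

  if (dto.parentId === null) {
    // Only categories without a required parent (Study) may be root questions.
    return rule.firstLevelParent === null ? { isValid: true } : { isValid: false, errorCode: 'AQ002' };
  }
  if (dto.parentId === rule.firstLevelParent) {
    // First-level placement: the control parameter block must be present where required.
    if (rule.controlParameterRequired && !dto.hasConditionalParentAnswers) {
      return { isValid: false, errorCode: 'AQ005' };
    }
    return { isValid: true };
  }
  if (systemQuestionIds.indexOf(dto.parentId) !== -1) {
    // Attached to a system question that is not this category's required parent.
    return { isValid: false, errorCode: 'AQ003' };
  }
  // Any other parent is treated as a custom question, i.e. a nested placement.
  return { isValid: true };
}
```

Each branch corresponds to one group of tests above, so a failing test points directly at the rule that was broken.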

Frontend Cascade Delete Tests (TypeScript)

// ============================================================================
// FRONTEND TEST STRUCTURE: annotation-unit.component.spec.ts
// ============================================================================

import { fakeAsync, tick } from '@angular/core/testing';

describe('AnnotationUnitComponent - Cascade Delete', () => {
  // ========================================================================
  // EXISTING BEHAVIOR TESTS (characterization tests)
  // ========================================================================

  describe('when deleting a Treatment with no references', () => {
    it('should delete immediately without dialog', () => {
      const treatment = createTreatmentUnit('Aspirin 100mg');
      component.containedIn = [];  // No references

      component.removeAnswerGroup();

      expect(dialogService.open).not.toHaveBeenCalled();
      expect(component.removeGroup.emit).toHaveBeenCalled();
    });
  });

  describe('when deleting a Treatment referenced by Cohorts', () => {
    it('should show confirmation dialog with reference list', () => {
      const treatment = createTreatmentUnit('Aspirin 100mg');
      component.containedIn = [
        { id: 'cohort-1', name: 'Control Group' },
        { id: 'cohort-2', name: 'Treatment Group' }
      ];

      component.removeAnswerGroup();

      expect(dialogService.open).toHaveBeenCalledWith(
        ConfirmationDialogComponent,
        expect.objectContaining({
          data: expect.objectContaining({
            contentHtml: expect.stringContaining('Control Group')
          })
        })
      );
    });

    it('should cascade remove references when user confirms', fakeAsync(() => {
      // Assumes the setup helper returns the Treatment unit that is being deleted
      const { treatment } = setupCascadeScenario();
      dialogResult$.next(true);  // User clicks DELETE

      tick();

      // Verify Treatment removed from Cohort 1's cohortTreatments
      const cohort1Treatments = getCohortTreatments('cohort-1');
      expect(cohort1Treatments).not.toContain(treatment.annotationId);

      // Verify Treatment removed from Cohort 2's cohortTreatments
      const cohort2Treatments = getCohortTreatments('cohort-2');
      expect(cohort2Treatments).not.toContain(treatment.annotationId);

      // Verify Treatment unit itself deleted
      expect(component.removeGroup.emit).toHaveBeenCalled();
    }));

    it('should NOT delete when user cancels', fakeAsync(() => {
      setupCascadeScenario();
      dialogResult$.next(false);  // User clicks Cancel

      tick();

      expect(component.removeGroup.emit).not.toHaveBeenCalled();
    }));
  });

  // Test all category combinations
  describe.each([
    ['Treatment', 'Cohort', 'cohortTreatments'],
    ['Disease Model', 'Cohort', 'cohortDiseaseModels'],
    ['Outcome', 'Cohort', 'cohortOutcomes'],
    ['Cohort', 'Experiment', 'experimentCohorts'],
  ])('when deleting %s referenced by %s', (unitType, containerType, lookupQuestion) => {
    // Marked as todo so the placeholder is reported as pending rather than passing vacuously
    it.todo(`should remove from ${lookupQuestion} on cascade`);
  });
});
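
The parameterized cases above all exercise the same operation: removing a deleted unit's id from every lookup answer that references it. A minimal sketch of that operation as a pure function follows. The `LookupAnnotation` shape, its field names, and `removeUnitReferences` are hypothetical and chosen for illustration; they are not taken from the component's actual implementation.

```typescript
// Hypothetical shape: a lookup annotation stores the ids of the units it references.
interface LookupAnnotation {
  questionName: string;        // e.g. 'cohortTreatments' (assumed field name)
  ownerUnitId: string;         // the Cohort or Experiment that holds the lookup
  referencedUnitIds: string[]; // annotation ids of the referenced units
}

// Returns a new list in which `deletedUnitId` no longer appears in any answer
// to `lookupQuestion`. The input list and its items are not mutated.
function removeUnitReferences(
  lookups: LookupAnnotation[],
  lookupQuestion: string,
  deletedUnitId: string
): LookupAnnotation[] {
  return lookups.map((lookup) => {
    if (lookup.questionName !== lookupQuestion) {
      return lookup; // other lookup questions are left untouched
    }
    return {
      questionName: lookup.questionName,
      ownerUnitId: lookup.ownerUnitId,
      referencedUnitIds: lookup.referencedUnitIds.filter((id) => id !== deletedUnitId),
    };
  });
}

// Illustrative data: Treatment 't-1' is referenced by two Cohorts.
const beforeDelete: LookupAnnotation[] = [
  { questionName: 'cohortTreatments', ownerUnitId: 'cohort-1', referencedUnitIds: ['t-1', 't-2'] },
  { questionName: 'cohortTreatments', ownerUnitId: 'cohort-2', referencedUnitIds: ['t-1'] },
  { questionName: 'cohortOutcomes', ownerUnitId: 'cohort-1', referencedUnitIds: ['o-1'] },
];

const afterDelete = removeUnitReferences(beforeDelete, 'cohortTreatments', 't-1');
```

Keeping the removal logic pure would let each row of the `describe.each` table reuse the same assertions with only the lookup question name changing.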

6.3.2 Test Data Fixtures¶

# test-fixtures/annotation-questions.yaml
# Reusable test data for all validation tests

validQuestions:
  treatment_first_level:
    id: "11111111-1111-1111-1111-111111111111"
    category: "Treatment"
    system: false
    target:
      parentId: "d04ec2d7-3e10-4847-9999-befe7ee4c454"  # treatmentControl
      conditionalParentAnswers:
        conditionType: 0
        targetParentBoolean: false

  treatment_nested:
    id: "22222222-2222-2222-2222-222222222222"
    category: "Treatment"
    system: false
    target:
      parentId: "11111111-1111-1111-1111-111111111111"  # custom parent
      conditionalParentAnswers: null  # Not required for nested

  study_root:
    id: "33333333-3333-3333-3333-333333333333"
    category: "Study"
    system: false
    target: null  # Study can be root

invalidQuestions:
  treatment_wrong_parent:
    id: "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
    category: "Treatment"
    target:
      parentId: "62c852ad-3390-48a4-ac13-439bf6b6587f"  # cohortLabel - WRONG!
    expectedError: "AQ003"

  cohort_missing_parent:
    id: "bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb"
    category: "Cohort"
    target: null  # Cohort cannot be root
    expectedError: "AQ002"

  treatment_missing_control_param:
    id: "cccccccc-cccc-cccc-cccc-cccccccccccc"
    category: "Treatment"
    target:
      parentId: "d04ec2d7-3e10-4847-9999-befe7ee4c454"
      conditionalParentAnswers: null  # Missing when parent is control
    expectedError: "AQ005"
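
One way to consume these fixtures is a small runner that compares each fixture's `expectedError` with whatever a validator returns. The sketch below assumes the YAML has already been parsed into plain objects; `QuestionFixture`, `PlacementValidator`, `collectFixtureFailures`, and the stand-in validator are hypothetical names, and the stand-in covers only the root-level rule.

```typescript
// Shape of one parsed fixture entry (assumed to mirror the YAML keys above).
interface QuestionFixture {
  id: string;
  category: string;
  target: { parentId: string; conditionalParentAnswers?: object | null } | null;
  expectedError?: string; // present only under invalidQuestions
}

// A validator returns an error code such as 'AQ002', or null when the question is valid.
type PlacementValidator = (question: QuestionFixture) => string | null;

// Runs every fixture through the validator and reports mismatches by fixture name.
function collectFixtureFailures(
  fixtures: Record<string, QuestionFixture>,
  validate: PlacementValidator
): string[] {
  const failures: string[] = [];
  for (const name of Object.keys(fixtures)) {
    const fixture = fixtures[name];
    const expected = fixture.expectedError === undefined ? null : fixture.expectedError;
    const actual = validate(fixture);
    if (actual !== expected) {
      failures.push(name + ': expected ' + String(expected) + ', got ' + String(actual));
    }
  }
  return failures;
}

// Two entries copied from the fixture file above.
const sampleFixtures: Record<string, QuestionFixture> = {
  study_root: { id: '33333333-3333-3333-3333-333333333333', category: 'Study', target: null },
  cohort_missing_parent: {
    id: 'bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb',
    category: 'Cohort',
    target: null,
    expectedError: 'AQ002',
  },
};

// Stand-in validator for illustration only: it knows just the root-level rule.
const rootOnlyValidator: PlacementValidator = (question) =>
  question.target === null && question.category !== 'Study' ? 'AQ002' : null;

const fixtureFailures = collectFixtureFailures(sampleFixtures, rootOnlyValidator);
```

Because the runner only depends on the validator's signature, the same fixture file could drive both the TypeScript and the C# validation tests.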

6.3.3 CI/CD Integration¶

# .github/workflows/test-coverage.yml (additions)
annotation-question-validation-tests:
  runs-on: ubuntu-latest
  steps:
    - name: Run Backend Validation Tests
      run: |
        dotnet test \
          --filter "Category=AnnotationQuestionValidation" \
          --collect:"XPlat Code Coverage" \
          --results-directory ./coverage

    - name: Check Coverage Threshold
      run: |
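        # Use only the first match (the root <coverage> element); Cobertura repeats line-rate per package and class
        COVERAGE=$(grep -ohP 'line-rate="\K[^"]+' coverage/*/coverage.cobertura.xml | head -n 1)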
        if (( $(echo "$COVERAGE < 0.95" | bc -l) )); then
          echo "Coverage $COVERAGE is below 95% threshold"
          exit 1
        fi

    - name: Run Frontend Validation Tests
      working-directory: src/services/web
      run: |
        npx ng test --no-watch --code-coverage \
          --include="**/annotation-*.spec.ts"

    - name: Upload to SonarCloud
      uses: SonarSource/sonarqube-scan-action@v2
      with:
        args: >
          -Dsonar.projectKey=syrf-annotation-questions
          -Dsonar.coverage.exclusions=**/*.generated.ts,**/*.generated.cs

6.3.4 Test Coverage Matrix¶

Rule Category	Backend Tests	Frontend Tests	Coverage Target
Category Placement (7 categories)	21+ tests	N/A	100%
First-Level vs Nested	14+ tests	N/A	100%
Control Parameter Rules	8+ tests	N/A	100%
Lookup Referential Integrity	12+ tests	8+ tests	95%
Cascade Delete UI	N/A	16+ tests	90%
Generated Validators	Property-based	Property-based	100%

Total Estimated Tests: 79+ unit tests, 12+ integration tests

7. Appendix: Complete Rule Reference¶

7.1 Category Parent Rules (First-Level Custom Questions)¶

IMPORTANT: These rules apply ONLY to first-level custom questions (created via category header "Add Question" button).

Nested custom questions (created via "Add Related") have target.parentId set to the parent custom question's ID, NOT the system question GUID. See section 2.3.

Category	First-Level Parent	System Parent GUID	Control Param	Has System Questions
Study	`null` (true root)	N/A	N/A	No
Disease Model Induction	modelControl	`b18aa936-...`	Required	Yes
Treatment	treatmentControl	`d04ec2d7-...`	Required	Yes
Outcome Assessment	outcomeLabel	`dbe2720c-...`	Forbidden	Yes
Cohort	cohortLabel	`62c852ad-...`	Forbidden	Yes
Experiment	experimentLabel	`7c555b6e-...`	Forbidden	Yes
Hidden	N/A (no custom)	N/A	N/A	Yes (system-only)

Study Category Special Case:

Study has no system questions - it is the only category where custom questions are true root-level
First-level custom questions in Study have target: null
Nested questions in Study parent to other custom questions (like all other categories)

Example - Treatment Category:

First-level question:  target.parentId = "d04ec2d7-3e10-4847-9999-befe7ee4c454" (treatmentControl GUID)
Nested question:       target.parentId = "abc123..." (parent custom question's ID, NOT treatmentControl)
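
The table and the first-level versus nested distinction above can be sketched as a single validation function. This is an illustration under stated assumptions, not the project's validator: `CategoryRule`, `CustomQuestion`, `PLACEMENT_RULES`, and `validatePlacement` are hypothetical names, the rule map lists only the categories whose system parent GUID appears in full in this document, and a question is treated as nested when its parent is a known custom question in the same category.

```typescript
interface CategoryRule {
  systemParentId: string | null; // null means first-level questions are root-level (Study)
  controlParam: 'required' | 'forbidden' | 'n/a';
}

interface CustomQuestion {
  id: string;
  category: string;
  target: { parentId: string; conditionalParentAnswers: object | null } | null;
}

// Partial rule map: the remaining categories would be added with their full GUIDs.
const PLACEMENT_RULES: Record<string, CategoryRule> = {
  'Study': { systemParentId: null, controlParam: 'n/a' },
  'Treatment': { systemParentId: 'd04ec2d7-3e10-4847-9999-befe7ee4c454', controlParam: 'required' },
  'Cohort': { systemParentId: '62c852ad-3390-48a4-ac13-439bf6b6587f', controlParam: 'forbidden' },
};

// Returns an error code from section 7.3, or null when the placement is valid.
// `sameCategoryCustomIds` lists the ids of existing custom questions in q.category.
function validatePlacement(q: CustomQuestion, sameCategoryCustomIds: string[]): string | null {
  if (!Object.prototype.hasOwnProperty.call(PLACEMENT_RULES, q.category)) {
    return 'AQ001'; // category missing from this sketch's rule map
  }
  const rule = PLACEMENT_RULES[q.category];

  if (q.target === null) {
    return rule.systemParentId === null ? null : 'AQ002';
  }
  // Nested question: parent is another custom question, so control rules do not apply.
  if (sameCategoryCustomIds.indexOf(q.target.parentId) !== -1) {
    return null;
  }
  // First-level question: parent must be the category's system parent.
  if (q.target.parentId !== rule.systemParentId) {
    return 'AQ003';
  }
  const hasControlParam = q.target.conditionalParentAnswers !== null;
  if (rule.controlParam === 'required' && !hasControlParam) {
    return 'AQ005';
  }
  if (rule.controlParam === 'forbidden' && hasControlParam) {
    return 'AQ006';
  }
  return null;
}
```

Passing the known custom question ids in explicitly keeps the function free of database access, which makes it straightforward to mirror in another language.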

7.2 Lookup Dependency Matrix¶

Lookup Question	Source Category	Source Question	Multi-Select
cohortDiseaseModels	Disease Model Induction	diseaseModelLabel	Yes
cohortTreatments	Treatment	treatmentLabel	Yes
cohortOutcomes	Outcome Assessment	outcomeLabel	Yes
experimentCohorts	Cohort	cohortLabel	Yes
outcomePdfGraphs	Hidden	pdfReferences	Yes
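
The matrix above translates directly into a referential integrity check for a single lookup answer. The following sketch assumes simplified annotation shapes; `StudyAnnotation`, `LookupAnswer`, `LookupProblem`, and `checkLookupAnswer` are hypothetical names, and the order in which the three checks run (existence, then study, then source question) is a design assumption rather than something this specification prescribes.

```typescript
// Lookup question name mapped to the source question it must reference (from the matrix above).
const LOOKUP_SOURCES: Record<string, string> = {
  cohortDiseaseModels: 'diseaseModelLabel',
  cohortTreatments: 'treatmentLabel',
  cohortOutcomes: 'outcomeLabel',
  experimentCohorts: 'cohortLabel',
  outcomePdfGraphs: 'pdfReferences',
};

// Simplified, assumed shapes for illustration.
interface StudyAnnotation { id: string; questionName: string; studyId: string; }
interface LookupAnswer { lookupQuestion: string; studyId: string; referencedIds: string[]; }
interface LookupProblem { annotationId: string; code: 'AQ007' | 'AQ008' | 'AQ009'; }

// Reports at most one problem per referenced id, using the error codes from section 7.3.
function checkLookupAnswer(lookup: LookupAnswer, annotations: StudyAnnotation[]): LookupProblem[] {
  const expectedSource = LOOKUP_SOURCES[lookup.lookupQuestion];
  const problems: LookupProblem[] = [];
  for (const annotationId of lookup.referencedIds) {
    const matches = annotations.filter((a) => a.id === annotationId);
    if (matches.length === 0) {
      problems.push({ annotationId: annotationId, code: 'AQ007' }); // orphaned reference
    } else if (matches[0].studyId !== lookup.studyId) {
      problems.push({ annotationId: annotationId, code: 'AQ009' }); // cross-study reference
    } else if (matches[0].questionName !== expectedSource) {
      problems.push({ annotationId: annotationId, code: 'AQ008' }); // wrong source question
    }
  }
  return problems;
}

// Illustrative data: one valid reference and one of each failure mode.
const knownAnnotations: StudyAnnotation[] = [
  { id: 't-1', questionName: 'treatmentLabel', studyId: 'study-A' },
  { id: 'c-1', questionName: 'cohortLabel', studyId: 'study-A' },
  { id: 't-9', questionName: 'treatmentLabel', studyId: 'study-B' },
];

const lookupProblems = checkLookupAnswer(
  { lookupQuestion: 'cohortTreatments', studyId: 'study-A', referencedIds: ['t-1', 'gone', 'c-1', 't-9'] },
  knownAnnotations
);
```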

7.3 Validation Error Messages¶

validationErrors:
  INVALID_CATEGORY:
    code: "AQ001"
    message: "Unknown category: {category}"

  MISSING_REQUIRED_PARENT:
    code: "AQ002"
    message: "Questions in {category} must have parent {expectedParent}"

  WRONG_PARENT:
    code: "AQ003"
    message: "Questions in {category} must have parent {expectedParent}, got {actualParent}"

  ROOT_NOT_ALLOWED:
    code: "AQ004"
    message: "Questions in {category} cannot be root-level"

  MISSING_CONTROL_PARAMETER:
    code: "AQ005"
    message: "{category} questions must specify control parameter (true/false/null)"

  CONTROL_PARAMETER_NOT_ALLOWED:
    code: "AQ006"
    message: "{category} questions must not have control parameter"

  ORPHANED_LOOKUP_REFERENCE:
    code: "AQ007"
    message: "Lookup references non-existent annotation: {annotationId}"

  WRONG_LOOKUP_SOURCE:
    code: "AQ008"
    message: "Annotation {annotationId} is not from expected source question"

  CROSS_STUDY_REFERENCE:
    code: "AQ009"
    message: "Cannot reference annotation from different study"

  # Frontend UI validation errors (create-question.component.ts)
  QUESTION_TEXT_TOO_LONG:
    code: "AQ010"
    message: "Question text exceeds maximum length of 80 characters"

  QUESTION_TEXT_REQUIRED:
    code: "AQ011"
    message: "Question text is required"

  OPTIONS_REQUIRED:
    code: "AQ012"
    message: "Options are required for dropdown/radio/checklist/autocomplete controls"

  DUPLICATE_OPTIONS:
    code: "AQ013"
    message: "Option values must be unique"

  INVALID_NUMERIC_OPTION:
    code: "AQ014"
    message: "Options must be valid {type} values"

  CONDITIONAL_ANSWERS_REQUIRED:
    code: "AQ015"
    message: "At least one conditional parent answer must be selected"

  INVALID_CONDITIONAL_ANSWER:
    code: "AQ016"
    message: "Conditional parent answer must match one of parent's options"
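
The `{placeholder}` tokens in these messages need to be filled in before display. A minimal sketch of such a formatter is shown below; `VALIDATION_MESSAGES` holds only three of the entries above for brevity, and `formatValidationError` is a hypothetical helper name. Leaving unknown placeholders untouched is a design choice made here so that a missing parameter stays visible in the output.

```typescript
// A subset of the message templates above, keyed by error code.
const VALIDATION_MESSAGES: Record<string, string> = {
  AQ001: 'Unknown category: {category}',
  AQ003: 'Questions in {category} must have parent {expectedParent}, got {actualParent}',
  AQ007: 'Lookup references non-existent annotation: {annotationId}',
};

// Substitutes each {name} token with params[name]; tokens without a value are kept as-is.
function formatValidationError(code: string, params: Record<string, string>): string {
  const template = VALIDATION_MESSAGES[code];
  if (template === undefined) {
    throw new Error('Unknown validation error code: ' + code);
  }
  return template.replace(/\{(\w+)\}/g, (placeholder: string, name: string) =>
    Object.prototype.hasOwnProperty.call(params, name) ? params[name] : placeholder
  );
}

const wrongParentMessage = formatValidationError('AQ003', {
  category: 'Treatment',
  expectedParent: 'treatmentControl',
  actualParent: 'cohortLabel',
});
```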

Appendix A: Characterization Test Results (2025-12-27)¶

This appendix documents the results of running characterization tests against the production MongoDB database (syrftest) to identify existing annotation questions that violate the placement rules defined in this specification.

Summary¶

Validation Check	Result
Study questions with system parent	0 violations ✅
Hidden category custom questions	0 violations ✅
Questions without required parent	32 violations ⚠️

Violation Details¶

32 custom questions in categories that require a parent (DMI, Treatment, Outcome Assessment, Cohort, Experiment) were found with Target = null (no parent).

Disease Model Induction (3 violations)¶

Project	Question	Created
Internal Testing 2024	tryingout	2024-09-02
Internal testing Aug 2024	Somethingelse	2024-08-15
TestMVP1508	new	2024-08-22

Experiment (28 violations)¶

Project	Question	Created
Cannabinoids	What was the time (post model induction;hours) of outcome assessment measurement where the difference between control and treatment is greatest?	2019-04-10
Cannabinoids	What was the time (post model induction;hours) of the last outcome assessment measurement?	2019-04-10
ChABC SCI	Number of Controls	2018-07-04
ChABC SCI	Number in Treatment Group	2018-07-04
ChABC SCI	Total Number of Animals?	2018-12-13
ChABC SCI	Number of Sham Animals	2019-01-16
ChABC SCI	Number of Comparisons	2019-01-16
Comparison of brain serotonin levels between light and dark	when are the lights on ?	2018-01-15
For Alex Again	hn	2019-01-20
Is the result of intra-peritoneal coated meshes comparable for hernia in all research groups in animals?	Does it contain an animal experiment?	2018-03-26
NAFLD test 3	How many developed HCC?	2018-01-16
NAFLD1 test	Were any side effects noted?	2017-07-10
NAFLD1 test	test	2017-07-10
PC12 OGD Data Extraction	Statistical Test Used	2019-03-05
Pre-clinical studies in tissue-engineering for long-gap oesophageal atresia in pigs. A systematic literature review.	Pre seeding	2019-03-29
Preclinical and Clinical Studies on the Use of Stem cells in Craniofacial Bone Regeneration: A Systematic Review.	M	2018-04-15
ProjectName	experiment	2019-06-24
SR Tool Feature Analysis	Is HPLC used?	2018-06-29
Systematic review and meta-analysis of preclinical studies of efficacya and safety of M. charantia on type 2 diabetes mellitus	design	2019-01-03
Systematic review of drug delivery methods to the inner ear	Is there a comparator?	2018-01-09
TLR4/liver regeneration	was TLR4 measured	2017-05-31
Test	test question	2018-07-19
Teste	What is the figure?	2017-05-03
Teste	What is the figure (alternate)?	2017-05-03
The influene of bioinorganic elements included in calcium phosphate based bone subsitutes	species?	2017-07-07
VGF derived neuropeptides and pain	Was VGF measured and was is up regulated or down regulated	2017-09-21
exposed to A have effects on B	is this a trial or observation study	2019-03-10
gfsdkfgi	hhch?	2018-10-29

Outcome Assessment (1 violation)¶

Project	Question	Created
Internal Testing 2024	question	2024-09-02

Analysis¶

Most violations are in test/development projects: Project names like "Test", "Teste", "NAFLD1 test", "Internal Testing 2024", "TestMVP1508", "For Alex Again", "gfsdkfgi" indicate these are not production reviews.
All violations are historical: The newest violations are from September 2024 (internal testing), with most from 2017-2019.
Questions still function: Despite missing the required parent, these questions are still usable in the UI. The parent relationship affects hierarchy display but does not break core functionality.
No data integrity issues: The questions are self-contained and do not reference non-existent entities.

Recommendation¶

No migration or cleanup is required.

The new validation in Project.UpsertCustomAnnotationQuestion() prevents future violations
Existing questions continue to function
Most affected projects are test/development projects
Cleaning up would require project owner coordination with minimal benefit

MongoDB Query Used¶

db.pmProject.aggregate([
  { $unwind: "$AnnotationQuestions" },
  { $match: {
      "AnnotationQuestions.System": false,
      "AnnotationQuestions.Category": {
        $in: ["Disease Model Induction", "Treatment", "Outcome Assessment", "Cohort", "Experiment"]
      },
      "AnnotationQuestions.Target": null
  }},
  { $project: {
      projectName: "$Name",
      category: "$AnnotationQuestions.Category",
      question: "$AnnotationQuestions.Question",
      createdDate: "$AnnotationQuestions.DateTimeCreated"
  }},
  { $sort: { category: 1, projectName: 1 }}
])
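
For unit tests that should not depend on a live database, the same filter can be expressed over in-memory project documents. The sketch below mirrors the pipeline's `$unwind`, `$match`, `$project`, and `$sort` stages; `StoredProject`, `StoredQuestion`, and `findParentlessQuestions` are hypothetical names, the `createdDate` projection is omitted for brevity, and `Target` is treated as absent when it is either `null` or missing, which is how a MongoDB `null` match behaves.

```typescript
// Assumed document shapes, using the field names that appear in the query above.
interface StoredQuestion { System: boolean; Category: string; Question: string; Target?: object | null; }
interface StoredProject { Name: string; AnnotationQuestions: StoredQuestion[]; }
interface ViolationRow { projectName: string; category: string; question: string; }

const PARENT_REQUIRED_CATEGORIES = [
  'Disease Model Induction', 'Treatment', 'Outcome Assessment', 'Cohort', 'Experiment',
];

function findParentlessQuestions(projects: StoredProject[]): ViolationRow[] {
  const rows: ViolationRow[] = [];
  for (const project of projects) {
    for (const q of project.AnnotationQuestions) {        // $unwind
      const requiresParent = PARENT_REQUIRED_CATEGORIES.indexOf(q.Category) !== -1;
      const hasNoTarget = q.Target === null || q.Target === undefined;
      if (!q.System && requiresParent && hasNoTarget) {   // $match
        rows.push({ projectName: project.Name, category: q.Category, question: q.Question }); // $project
      }
    }
  }
  rows.sort((a, b) => {                                   // $sort: category, then projectName
    if (a.category !== b.category) { return a.category < b.category ? -1 : 1; }
    if (a.projectName !== b.projectName) { return a.projectName < b.projectName ? -1 : 1; }
    return 0;
  });
  return rows;
}

// Illustrative data, not taken from the database.
const sampleProjects: StoredProject[] = [
  { Name: 'Beta', AnnotationQuestions: [
    { System: false, Category: 'Treatment', Question: 'q1', Target: null },
    { System: true, Category: 'Treatment', Question: 'system question', Target: null },
    { System: false, Category: 'Study', Question: 'root question', Target: null },
  ] },
  { Name: 'Alpha', AnnotationQuestions: [
    { System: false, Category: 'Experiment', Question: 'q2' },
    { System: false, Category: 'Cohort', Question: 'has parent', Target: { ParentId: 'x' } },
  ] },
];

const violationRows = findParentlessQuestions(sampleProjects);
```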