Bulk Study Update: Blank Field Behavior Inconsistency¶
Summary¶
A discrepancy exists between the documented behavior and actual implementation of the Bulk Study Update feature when handling blank/empty cells for customId and pdfRelativePath fields.
| Field | Documentation Says | Code Actually Does |
|---|---|---|
customId (blank cell) |
Erases existing value | Preserves existing value |
pdfRelativePath (blank cell) |
Erases existing value | Preserves existing value |
| Screening columns (blank cell) | Preserves existing value | Preserves existing value ✓ |
Evidence¶
Documentation (User Guide)¶
File: /user-guide/manage-studies.md (lines 45, 67-69)
> **Warning:** Providing an empty/blank cell for `pdfRelativePath` or `customId`
> in the update file *will erase* any existing value for that field in SyRF for that study.
> **Screening Decisions:** ... Blank cells in screening columns are *ignored*
> and do not erase existing decisions.
Code Implementation¶
File: src/libs/project-management/SyRF.ProjectManagement.Core/Services/StudyReferenceFileParser.cs (lines 179-180)
if (!string.IsNullOrWhiteSpace(u.CustomId)) study.CustomId = u.CustomId;
if (!string.IsNullOrWhiteSpace(u.PdfRelativePath)) study.PdfRelativePath = u.PdfRelativePath;
File: src/libs/project-management/SyRF.ProjectManagement.Core/Services/ParserImplementations/NewParser/StudyUpdateRecordProcessor.cs (lines 172-186)
private string? GetValueFromColumn(IReadOnlyList<string> fields, string columnName)
{
if (!_columnDictionary.TryGetValue(columnName.ToLowerInvariant(), out var columnIndex))
{
return null; // Missing column → null
}
var val = fields[columnIndex];
return string.IsNullOrWhiteSpace(val) ? "" : val; // Blank field → empty string
}
Data Flow Analysis¶
| CSV State | GetValueFromColumn() Returns |
IsNullOrWhiteSpace() |
Update Behavior |
|---|---|---|---|
| Column missing entirely | null |
true |
Preserved |
| Column present, cell empty | "" |
true |
Preserved |
| Column present, cell has value | "value" |
false |
Updated |
Both missing columns AND blank fields result in preservation, contradicting documentation.
E2E Test Plan¶
Objective¶
Confirm the actual runtime behavior in a deployed environment to definitively verify the bug before implementing a fix.
Test Scenarios¶
Scenario 1: Blank CustomId Field¶
Setup:
- Create a project with test studies
- Ensure studies have existing
customIdvalues (e.g., "ORIGINAL-001", "ORIGINAL-002")
Test:
- Export studies to get
studyIdvalues - Create CSV with columns:
studyId,customId - Leave
customIdcells blank for test studies - Upload via Bulk Study Update
Expected (per docs): customId values should be erased (null/empty)
Expected (per code): customId values should be preserved ("ORIGINAL-001", etc.)
Scenario 2: Blank PdfRelativePath Field¶
Setup:
- Use same project/studies
- Ensure studies have existing
pdfRelativePathvalues
Test:
- Create CSV with columns:
studyId,pdfRelativePath - Leave
pdfRelativePathcells blank - Upload via Bulk Study Update
Expected (per docs): pdfRelativePath values should be erased
Expected (per code): pdfRelativePath values should be preserved
Scenario 3: Missing Column vs Blank Field¶
Setup:
- Studies with both
customIdandpdfRelativePathset
Test A - Missing Column:
- CSV with only
studyIdandcustomIdcolumns (nopdfRelativePathcolumn) customIdhas new values- Upload and verify
pdfRelativePathis preserved
Test B - Blank Field:
- CSV with
studyId,customId,pdfRelativePathcolumns customIdhas new values,pdfRelativePathis blank- Upload and verify behavior matches Test A (both preserved)
Scenario 4: Screening Baseline (Control)¶
Test:
- CSV with screening columns containing blank cells
- Verify blank screening cells are ignored (preserved)
- This confirms screening behavior matches documentation
Test Environment¶
- Recommended: Staging environment with
syrftestdatabase - Alternative: Local development with test database
- Cleanup: Delete test project after verification
Verification Query¶
// MongoDB query to verify field values after bulk update
db.pmStudy.find(
{ _id: { $in: [CSUUID("study-id-1"), CSUUID("study-id-2")] } },
{ customId: 1, pdfRelativePath: 1 }
)
Options Analysis¶
Option A: Fix Code to Match Documentation (Blank = Erase)¶
Implementation:
// In StudyReferenceFileParser.cs
// Change conditional updates to unconditional for non-null StudyUpdate values
if (u.CustomId != null) study.CustomId = string.IsNullOrWhiteSpace(u.CustomId) ? null : u.CustomId;
if (u.PdfRelativePath != null) study.PdfRelativePath = string.IsNullOrWhiteSpace(u.PdfRelativePath) ? null : u.PdfRelativePath;
Also requires change in GetValueFromColumn():
// Return null for missing column, empty string for blank field
if (!_columnDictionary.TryGetValue(columnName.ToLowerInvariant(), out var columnIndex))
{
return null; // Missing column - don't touch field
}
return fields[columnIndex]; // Return actual value (including empty string)
Pros:
- Matches existing user documentation and training materials
- Allows users to intentionally clear field values
- More powerful: can both update AND clear in single operation
- Documentation was written with this behavior in mind (original intent)
Cons:
- Breaking change for users who rely on current behavior
- Risk of accidental data loss if users don't carefully prepare CSVs
- Requires users to remove columns entirely to preserve values
- More destructive default behavior
Risk Assessment: MEDIUM-HIGH
- Users who have been using the feature with blank cells expecting preservation will suddenly lose data
Option B: Fix Documentation to Match Code (Blank = Preserve)¶
Implementation:
- No code changes required
- Update
/user-guide/manage-studies.mdto reflect actual behavior
Documentation Change:
> **Note:** Blank/empty cells for `pdfRelativePath` or `customId` are *ignored*
> during the update - they will NOT change existing values. Only provide values
> in cells where you want to update data.
Pros:
- No code changes, minimal effort
- Safer default - less risk of accidental data loss
- Simpler mental model: "only non-empty values cause updates"
- Consistent with screening column behavior
- No breaking change for existing users
Cons:
- No way to intentionally clear a field value via bulk update
- Original documentation suggests erase behavior was intended
- May have been an implementation bug that became the de facto standard
- Reduces feature capability
Risk Assessment: LOW
- Documentation-only change, no functional impact on existing workflows
Option C: Add Explicit Sentinel Value¶
Implementation:
private const string ClearSentinel = "[CLEAR]";
// In StudyReferenceFileParser.cs
if (!string.IsNullOrWhiteSpace(u.CustomId))
{
study.CustomId = u.CustomId == ClearSentinel ? null : u.CustomId;
}
if (!string.IsNullOrWhiteSpace(u.PdfRelativePath))
{
study.PdfRelativePath = u.PdfRelativePath == ClearSentinel ? null : u.PdfRelativePath;
}
Documentation:
- Blank cells: Existing values are preserved (no change)
- `[CLEAR]`: Explicitly removes/erases the existing value
- Any other value: Updates to that value
Pros:
- Explicit user intent - no ambiguity
- Backward compatible with current behavior
- Provides full functionality (update, preserve, AND clear)
- Best of both worlds
- Clear audit trail in CSV files
Cons:
- More complex for users to learn
- Documentation and training materials need updates
- Additional code complexity
- Sentinel value could theoretically conflict with actual data (unlikely for "[CLEAR]")
Risk Assessment: LOW-MEDIUM
- Backward compatible, but adds complexity
Option D: Distinguish Missing Column vs Blank Field¶
Implementation:
// GetValueFromColumn already distinguishes:
// - Missing column → null
// - Blank field → ""
// In StudyReferenceFileParser.cs
if (u.CustomId != null) // Column exists in CSV
{
study.CustomId = string.IsNullOrWhiteSpace(u.CustomId) ? null : u.CustomId;
}
// If u.CustomId is null (column missing), don't touch study.CustomId
Behavior: | CSV State | Result | |-----------|--------| | Column missing | Field preserved (untouched) | | Column present, cell blank | Field erased (set to null) | | Column present, cell has value | Field updated |
Pros:
- Matches original documentation intent
- Fine-grained control via CSV structure
- Allows both preservation (omit column) and clearing (include blank column)
- Leverages existing parsing infrastructure
Cons:
- Subtle distinction users may misunderstand
- Current behavior treats both cases identically (breaking change)
- Requires careful CSV column management
- Mental model complexity: "presence of column matters, not just value"
Risk Assessment: MEDIUM
- Breaking change for blank field behavior, but preserves missing column behavior
Recommendation¶
Preferred Option: Option D (Distinguish Missing Column vs Blank Field)¶
Rationale:
-
Aligns with Documentation Intent: The user guide explicitly states blank cells erase values. This suggests the original feature design intended this behavior. Option D restores that intent.
-
Maintains Intuitive Default: Users who don't include a column don't affect that field - this is already true and remains unchanged.
-
Enables Full Functionality: Users can:
- Preserve: Don't include the column in CSV
- Update: Include column with new value
-
Clear: Include column with blank value
-
Leverages Existing Infrastructure: The
GetValueFromColumn()method already distinguishesnull(missing column) from""(blank field). The fix only requires changing the update logic. -
Matches User Expectations: Users reading the current documentation would expect blank cells to erase. The current behavior is the bug, not the documentation.
Implementation Complexity¶
Low - Single file change in StudyReferenceFileParser.cs:
// Current (buggy):
if (!string.IsNullOrWhiteSpace(u.CustomId)) study.CustomId = u.CustomId;
if (!string.IsNullOrWhiteSpace(u.PdfRelativePath)) study.PdfRelativePath = u.PdfRelativePath;
// Fixed (Option D):
if (u.CustomId != null) study.CustomId = string.IsNullOrWhiteSpace(u.CustomId) ? null : u.CustomId;
if (u.PdfRelativePath != null) study.PdfRelativePath = string.IsNullOrWhiteSpace(u.PdfRelativePath) ? null : u.PdfRelativePath;
Migration Considerations¶
- Announce in Release Notes: Clearly document the behavior change
- Update What's New: Add entry explaining the fix
- User Communication: Email users who have used bulk update feature
- Timing: Deploy during low-activity period
Alternative Consideration¶
If breaking change risk is deemed too high, Option C (sentinel value) provides equivalent functionality without changing existing behavior. It requires more documentation effort but is fully backward compatible.
Behavioral Consistency Analysis¶
The Inconsistency Question¶
The original documentation intentionally specifies different behavior for different column types:
| Column Type | Documented Behavior | Rationale |
|---|---|---|
| Screening | Blank = Preserved | Screening decisions are binary (0/1). Blank means "no decision made" - preserving existing decisions is safer. |
| CustomId | Blank = Erased | Metadata fields. Blank could be intentional "clear this field". |
| PdfRelativePath | Blank = Erased | Metadata fields. Blank could be intentional "clear this field". |
Option E: Consistent Behavior (All Preserve)¶
Make ALL fields behave the same way: blank = preserve.
Implementation:
- No code changes (current behavior)
- Update documentation to match code
Behavior: | Column Type | Blank Cell | |-------------|------------| | Screening | Preserved | | CustomId | Preserved | | PdfRelativePath | Preserved |
Pros:
- Perfectly consistent - one rule for all columns
- Safest - no accidental data loss
- No code changes needed
- Current behavior preserved
Cons:
- No way to clear ANY field via bulk update
- Reduces feature capability significantly
- Original design intentionally had different behavior
Option F: Consistent Behavior (All Erase)¶
Make ALL fields behave the same way: blank = erase.
Implementation:
- Change screening to also erase on blank
- Change CustomId/PdfRelativePath to erase on blank
Behavior: | Column Type | Blank Cell | |-------------|------------| | Screening | Erased | | CustomId | Erased | | PdfRelativePath | Erased |
Pros:
- Perfectly consistent - one rule for all columns
- Maximum capability - can clear any field
Cons:
- DANGEROUS for screening - blank could accidentally clear decisions
- Breaking change for all column types
- Screening "erase" semantically unclear (what does null screening mean?)
- High risk of data loss
Risk Assessment: HIGH - Not recommended
Revisited Recommendation¶
After considering consistency, I still recommend Option D (different behavior per column type) because:
- Semantic Difference: Screening decisions and metadata fields ARE fundamentally different:
- Screening: Ternary state (Include/Exclude/No Decision)
-
Metadata: String value or no value
-
Original Design Intent: The documentation explicitly describes different behavior, suggesting this was a deliberate design choice, not an oversight.
-
Risk Profile: Clearing a screening decision is more dangerous than clearing a PDF path. Different risk warrants different default behavior.
-
User Mental Model: Users likely think of screening columns differently than metadata columns - they're mapped to users and stages, not simple text fields.
However, if perfect consistency is required, then Option E (all preserve) is the safest choice, accepting the trade-off that fields cannot be cleared via bulk update.
Consistency Decision Matrix¶
| Priority | Recommended Option |
|---|---|
| Feature capability + original intent | Option D (different behavior) |
| Perfect consistency + safety | Option E (all preserve) |
| Perfect consistency + power | Option F (all erase) - NOT RECOMMENDED |
| Backward compatibility + flexibility | Option C (sentinel value) |
Post-E2E Testing Action Plan¶
If E2E Testing Confirms Bug¶
Once testing confirms the code diverges from documentation:
Immediate Actions¶
- Document Findings: Record exact test results with screenshots/logs
- Severity Assessment: Classify as medium priority (functional deviation, not data corruption)
- Stakeholder Notification: Brief Product Owner (Alex) on findings
- Decision Meeting: Schedule 30-min decision session
Decision Timeline¶
| Day | Action |
|---|---|
| 0 | E2E test confirms bug |
| 1-2 | Share Product Owner Report (below) with Alex |
| 3-5 | Decision meeting, select option |
| 6-10 | Implementation (if code change selected) |
| 11-12 | QA verification |
| 13-14 | Staging deployment + user communication |
| 15+ | Production deployment |
Product Owner Report¶
To: Alex (Product Owner) From: Development Team Subject: Bulk Study Update - Behavior Discrepancy Requiring Decision
Executive Summary¶
We discovered that the Bulk Study Update feature behaves differently than documented in the user guide. This affects how blank cells in CSV uploads are processed for customId and pdfRelativePath fields.
Impact: Users following the documentation may have unexpected results. No data corruption has occurred, but the feature doesn't work as described.
Current Situation¶
| What Documentation Says | What Actually Happens |
|---|---|
Blank cell erases existing customId |
Blank cell preserves existing customId |
Blank cell erases existing pdfRelativePath |
Blank cell preserves existing pdfRelativePath |
| Blank screening cell preserves decision | Blank screening cell preserves decision ✓ |
User Impact Assessment¶
Affected Users: Any project administrator who has used Bulk Study Update with blank cells expecting erasure.
Likelihood of Impact: LOW-MEDIUM
- Feature is relatively new (March 2025)
- Erasing fields is a less common use case than updating them
- Users may not have noticed if they weren't specifically trying to clear fields
Data Risk: NONE
- Current behavior preserves data (safer than documented)
- No data has been lost; fields may not have been cleared when users expected
Decision Required¶
We need your input to select one of the following approaches:
Option A: Fix Code to Match Documentation¶
"Make it work as documented"
What changes: Blank cells will erase field values (as documented)
Pros:
- Matches what users read in documentation
- Enables users to clear fields via bulk update
- No documentation changes needed
Cons:
- Breaking change for any users relying on current behavior
- Risk of accidental data erasure if users aren't careful with CSVs
- Requires user communication about behavior change
Effort: 2-3 days development + testing Risk: Medium (behavior change)
Option B: Fix Documentation to Match Code¶
"Document what it actually does"
What changes: Update user guide to say blank cells preserve existing values
Pros:
- No code changes, minimal effort
- Safest option - preserves data by default
- No risk of breaking existing workflows
Cons:
- Users cannot clear fields via bulk update at all
- Reduces feature capability
- May frustrate users who read current docs and expected erasure
Effort: 1 day documentation update Risk: Low
Option C: Add Explicit Clear Syntax¶
"Best of both worlds with [CLEAR] keyword"
What changes:
- Blank cells preserve (current behavior)
- Special value
[CLEAR]explicitly erases
Pros:
- Backward compatible
- Explicit user intent - no accidental erasure
- Full feature capability
- Clear audit trail in CSV files
Cons:
- Users must learn new syntax
- Slightly more complex to explain
- Documentation update needed
Effort: 3-4 days development + testing + documentation Risk: Low (additive change)
Option D: Smart Column Detection (Recommended by Dev Team)¶
"Match documented behavior precisely"
What changes:
- Missing column = field preserved (don't touch what's not mentioned)
- Blank cell in present column = field erased (explicit blank means clear)
Pros:
- Matches original documentation exactly
- Intuitive: "if you include the column, you're managing that field"
- Preserves screening behavior (already works this way conceptually)
Cons:
- Subtle distinction users must understand
- Breaking change for blank cell behavior
- Requires user communication
Effort: 2-3 days development + testing Risk: Medium (behavior change, but matches docs)
Recommendation¶
Development team recommends Option D because:
- It matches what the documentation already says
- The distinction (missing column vs blank cell) is intuitive
- It restores the original design intent
However, if minimal risk is priority, choose Option B (fix docs). If maximum flexibility is priority, choose Option C (sentinel value).
Questions for Decision¶
- How many users have actively used Bulk Study Update? (Helps assess impact)
- Have we received any support tickets about this behavior?
- Is clearing fields via bulk update a common/important use case?
- What is our appetite for user communication about a behavior change?
Technical Implementation Plan (Dev Team)¶
Prerequisites¶
- E2E testing completed confirming bug
- Product Owner decision on selected option
- Sprint capacity allocated
Option D Implementation (Recommended)¶
Phase 1: Code Changes (1-2 days)¶
File: src/libs/project-management/SyRF.ProjectManagement.Core/Services/StudyReferenceFileParser.cs
Current Code (lines 179-180):
if (!string.IsNullOrWhiteSpace(u.CustomId)) study.CustomId = u.CustomId;
if (!string.IsNullOrWhiteSpace(u.PdfRelativePath)) study.PdfRelativePath = u.PdfRelativePath;
New Code:
// null = column missing (preserve), "" = blank cell (erase), "value" = update
if (u.CustomId != null)
{
study.CustomId = string.IsNullOrWhiteSpace(u.CustomId) ? null : u.CustomId;
}
if (u.PdfRelativePath != null)
{
study.PdfRelativePath = string.IsNullOrWhiteSpace(u.PdfRelativePath) ? null : u.PdfRelativePath;
}
File: src/libs/project-management/SyRF.ProjectManagement.Core/Services/ParserImplementations/NewParser/StudyUpdateRecordProcessor.cs
Current Code (line 185):
Verify: This already returns "" for blank and null for missing column - no change needed here.
Phase 2: Unit Tests (0.5 days)¶
File: src/libs/project-management/SyRF.ProjectManagement.Core.Tests/StudyUpdateTests.cs (new file)
public class BulkStudyUpdateBehaviorTests
{
[Fact]
public void MissingCustomIdColumn_ShouldPreserveExistingValue()
{
// Arrange: Study with CustomId = "EXISTING"
// Act: Apply update with CustomId = null (missing column)
// Assert: CustomId still "EXISTING"
}
[Fact]
public void BlankCustomIdCell_ShouldEraseExistingValue()
{
// Arrange: Study with CustomId = "EXISTING"
// Act: Apply update with CustomId = "" (blank cell)
// Assert: CustomId is null
}
[Fact]
public void ValueInCustomIdCell_ShouldUpdateValue()
{
// Arrange: Study with CustomId = "EXISTING"
// Act: Apply update with CustomId = "NEW"
// Assert: CustomId is "NEW"
}
// Repeat for PdfRelativePath...
[Fact]
public void BlankScreeningCell_ShouldPreserveExistingDecision()
{
// Verify screening behavior unchanged
}
}
Phase 3: Integration Tests (0.5 days)¶
File: src/services/project-management/SyRF.ProjectManagement.Endpoint.Tests/BulkStudyUpdateIntegrationTests.cs
Test full flow:
- Create test project with studies having CustomId/PdfRelativePath values
- Generate CSV with various blank/missing column scenarios
- Execute bulk update
- Verify database state matches expected behavior
Phase 4: Documentation Updates (0.5 days)¶
User Guide (/user-guide/manage-studies.md):
- Clarify distinction between missing column and blank cell
- Add examples showing both scenarios
- Update warning box to be more precise
What's New (/user-guide/whats-new/):
- Add entry for behavior clarification/fix
Phase 5: Release & Communication (0.5 days)¶
Release Notes:
### Bulk Study Update - Behavior Clarification
The Bulk Study Update feature now correctly distinguishes between:
- **Missing column**: Field is preserved (not modified)
- **Blank cell**: Field is cleared (set to empty)
This matches the documented behavior. If you were relying on blank cells
to preserve values, please adjust your CSV files to omit the column entirely.
User Communication (if deemed necessary):
- Email to project administrators
- In-app notification on bulk update page
Option C Implementation (Sentinel Value)¶
If Option C is selected instead:
Code Changes¶
File: StudyReferenceFileParser.cs
private const string ClearSentinel = "[CLEAR]";
// In update logic:
if (!string.IsNullOrWhiteSpace(u.CustomId))
{
study.CustomId = u.CustomId.Equals(ClearSentinel, StringComparison.OrdinalIgnoreCase)
? null
: u.CustomId;
}
if (!string.IsNullOrWhiteSpace(u.PdfRelativePath))
{
study.PdfRelativePath = u.PdfRelativePath.Equals(ClearSentinel, StringComparison.OrdinalIgnoreCase)
? null
: u.PdfRelativePath;
}
Documentation: Must clearly explain [CLEAR] syntax with examples.
Option B Implementation (Documentation Only)¶
If Option B is selected:
File: /user-guide/manage-studies.md
Change lines 44-45 from:
> **Warning:** Providing an empty/blank cell for `pdfRelativePath` or `customId`
> in the update file *will erase* any existing value
To:
> **Note:** Empty or blank cells for `pdfRelativePath` or `customId` are ignored
> during the update - they will NOT change existing values. To clear a field,
> you must use the SyRF web interface directly.
Implementation Checklist¶
Pre-Implementation¶
- Run E2E tests to confirm current behavior
- Share Product Owner Report with Alex
- Decision meeting completed
- Option selected: ______
Implementation (Option D)¶
- Update
StudyReferenceFileParser.cswith new logic - Create
BulkStudyUpdateBehaviorTests.csunit tests - Create integration tests
- All tests passing locally
- Code review completed
- Update user guide documentation
- Create What's New entry
- Draft release notes
- Draft user communication (if needed)
Deployment¶
- Deploy to staging
- QA verification on staging
- Send user communication (if approved)
- Deploy to production
- Monitor for support tickets
- Create release notes entry
- Update What's New page
- Plan user communication if breaking change
Completed: CustomId Accessibility Fix¶
Issue¶
The Study entity had inconsistent property accessibility:
| Property | Before | After |
|---|---|---|
PdfRelativePath |
public get, internal set |
public get, internal set |
CustomId |
public get, public set |
public get, internal set |
Change Made¶
File: src/libs/project-management/SyRF.ProjectManagement.Core/Model/StudyAggregate/Study.cs
// Before (line 93)
public string? CustomId { get; set; }
// After
public string? CustomId { get; internal set; }
Verification¶
Usages checked - Only 2 locations set CustomId:
StudyReferenceFileParser.cs:179(same assembly) ✓StudyShould.cs:121(test project withInternalsVisibleTo) ✓
Tests: All 9 StudyTest tests pass after change.
Rationale¶
- DDD Encapsulation: Domain entities should control their own state changes
- Consistency: Both bulk-updatable metadata fields now have identical accessibility
- Safety: Prevents external assemblies from bypassing domain logic
- InternalsVisibleTo: Test assembly explicitly granted access via csproj configuration
Related Configuration¶
File: src/libs/project-management/SyRF.ProjectManagement.Core/SyRF.ProjectManagement.Core.csproj
This allows the test project to access internal members for testing purposes.
References¶
- User Guide: Managing Studies (see
user-guide/manage-studies.mdin user-guide service) - What's New: Bulk Study Update (see
user-guide/whats-new/whats-new-bulk-update-study.md) - Source:
src/libs/project-management/SyRF.ProjectManagement.Core/Services/StudyReferenceFileParser.cs - Source:
src/libs/project-management/SyRF.ProjectManagement.Core/Services/ParserImplementations/NewParser/StudyUpdateRecordProcessor.cs