Add flat file import process#22

Open
sprucely wants to merge 11 commits into main from swe/Import

@sprucely sprucely commented Jun 16, 2026

Collaborator

Closes #7

Summary by CodeRabbit

Release Notes

  • New Features
    • Added flat-file import functionality supporting CSV and XLSX file uploads for predefined datasets.
    • Implemented data validation with detailed error reporting for invalid rows and cells.
    • Added import history tracking with recent import status summaries.
    • Enabled downloading validation errors as CSV for analysis.
    • Enhanced pagination controls in data tables with configurable row-size options.

coderabbitai Bot commented Jun 16, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5de4834c-8a1c-43c5-8616-778676efbfc9

📥 Commits

Reviewing files that changed from the base of the PR and between 6169875 and 648c31d.

📒 Files selected for processing (9)
  • client/src/components/FlatFileImportPanel.tsx
  • client/src/test/components/FlatFileImportPanel.test.tsx
  • server.core/Migrations/20260617170435_ImportLog_Dataset_Idx.Designer.cs
  • server.core/Migrations/20260617170435_ImportLog_Dataset_Idx.cs
  • server.core/Migrations/AppDbContextModelSnapshot.cs
  • server/Import/FlatFileImportRegistry.cs
  • server/Import/FlatFileImportService.cs
  • server/Program.cs
  • tests/server.tests/Import/FlatFileImportServiceTests.cs
💤 Files with no reviewable changes (1)
  • server.core/Migrations/AppDbContextModelSnapshot.cs
✅ Files skipped from review due to trivial changes (2)
  • server.core/Migrations/20260617170435_ImportLog_Dataset_Idx.cs
  • server.core/Migrations/20260617170435_ImportLog_Dataset_Idx.Designer.cs
🚧 Files skipped from review as they are similar to previous changes (6)
  • server/Program.cs
  • server/Import/FlatFileImportRegistry.cs
  • client/src/test/components/FlatFileImportPanel.test.tsx
  • client/src/components/FlatFileImportPanel.tsx
  • tests/server.tests/Import/FlatFileImportServiceTests.cs
  • server/Import/FlatFileImportService.cs

📝 Walkthrough

Walkthrough

Adds an end-to-end CSV/XLSX flat-file dataset import feature: an ImportLog EF entity with migrations, an ImportDatasetDefinition domain model and registry of hard-coded dataset schemas, a FlatFileImportService that parses and stages files via CsvHelper/ClosedXML/Dapper, an ImportsController with three REST endpoints, an enhanced DataTable with full pagination controls, and a FlatFileImportPanel React component wired into the project-identification workflow stage.

Changes

Flat-File Import Feature

| Layer / File(s) | Summary |
|---|---|
| **ImportLog entity, DbContext, and EF migrations**<br>`server.core/Domain/ImportLog.cs`, `server.core/Data/AppDbContext.cs`, `server.core/Migrations/20260609182232_AddImportLog.*`, `server.core/Migrations/20260617170435_ImportLog_Dataset_Idx.*`, `server.core/Migrations/AppDbContextModelSnapshot.cs`, `tests/server.tests/Data/AppDbContextTests.cs` | ImportLog entity with EF data annotations is added, registered as ImportLogs DbSet, and mapped to app.ImportLog. Two migrations create the table and add a composite index on (Dataset, CompletedAt desc, Id desc). Tests assert schema mapping and index configuration. |
| **Import domain contracts and API response records**<br>`server/Import/ImportDatasetDefinition.cs`, `server/Models/Imports/ImportResponses.cs` | ImportColumnType enum, ImportColumn/ImportUniqueKey records, and ImportDatasetDefinition with normalized-header lookup and collision detection. Full set of sealed record response types for cell/file/row errors, validation, success, dataset summaries, and history payloads. |
| **FlatFileImportRegistry: dataset catalog**<br>`server/Import/FlatFileImportRegistry.cs` | IFlatFileImportRegistry interface and FlatFileImportRegistry with a static in-memory catalog of dataset definitions (all-projects, active-projects, assistance-listing-numbers), each with column mappings, source header aliases, and factory helpers. |
| **FlatFileImportService: parsing, validation, and staging**<br>`server/Import/FlatFileImportService.cs`, `server/server.csproj`, `tests/server.tests/server.tests.csproj` | IFlatFileImportService.ImportAsync validates file type, resolves the dataset, parses CSV via CsvHelper and XLSX via ClosedXML, checks required/duplicate headers, validates cell types, bulk-copies valid rows into #FlatFileImportRows, runs SQL staging duplicate-key validation, replaces the target table in a transaction, serializes validation history JSON, and logs every attempt to ImportLog. Adds ClosedXML, CsvHelper, Dapper, and Microsoft.Data.SqlClient dependencies. |
| **ImportsController endpoints and DI wiring**<br>`server/Controllers/ImportsController.cs`, `server/Program.cs` | GET /recent returns per-dataset latest import with deserialized validation stats; GET /{id} returns import log detail with validation history; POST /{dataset} (50 MB limit) delegates to FlatFileImportService and maps result variants to 200/400. IFlatFileImportRegistry registered as singleton and IFlatFileImportService as scoped. |
| **Server-side tests**<br>`tests/server.tests/Import/FlatFileImportServiceTests.cs`, `tests/server.tests/Controllers/ImportsControllerTests.cs`, `tests/server.tests/Data/AppDbContextTests.cs` | FlatFileImportServiceTests covers header normalization, file type rejection, malformed workbooks, missing/duplicate headers, no-data-row rejection, per-row cell errors, flat-file column order, persistence failure logging, and payload truncation. ImportsControllerTests verifies GET /recent returns the correct latest import per dataset. |
| **DataTable: pagination controls and column-meta styling**<br>`client/src/shared/dataTable.tsx` | ColumnMeta augmented with cellClassName/headerClassName. Adds pageSizeOptions and tableClassName props with defaults. Replaces minimal Previous/Next controls with first/prev/next/last navigation, rows-per-page dropdown, and a validated page-number input with clamping. Applies per-column class names to header and body cells. |
| **FlatFileImportPanel React component**<br>`client/src/components/FlatFileImportPanel.tsx` | TypeScript interfaces, API helpers (upload, fetch recent, fetch detail), dataset selector and file input, React Query integration, upload mutation with validation-vs-generic-error routing, RecentImportSummaryList cards, HistoricalImportDetails DataTable, ValidationResults with sortable Errors column and per-cell highlighting, Download CSV action, and status/date/error formatting helpers. |
| **Workflow route integration and client tests**<br>`client/src/routes/(authenticated)/workflow.$stageId.tsx`, `client/src/test/components/FlatFileImportPanel.test.tsx`, `client/src/test/routes/(authenticated)/workflow.test.tsx` | FlatFileImportPanel rendered inside a "Load required data" SectionPanel for the project-identification stage; other stages keep the "Coming soon" placeholder. Test suite covers recent-import status rendering, history loading on card click, upload success/reselect enforcement/post-success clear, same-filename re-upload, generic 400 failure, validation error highlighting and sorting, flat-file column order, pagination, and CSV download. |
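The "validated page-number input with clamping" mentioned in the DataTable row can be illustrated with a small helper. This is a minimal sketch, not the code in `client/src/shared/dataTable.tsx`: the name `clampPageNumber`, the 1-based page convention, and the fallback to page 1 for non-numeric input are all assumptions made for illustration.

```typescript
// Hypothetical helper: parse a user-typed page number and clamp it
// into the valid 1-based range [1, pageCount].
function clampPageNumber(rawInput: string, pageCount: number): number {
  // Treat an empty table as having a single page.
  const lastPage = Math.max(1, Math.floor(pageCount));
  const parsed = Number.parseInt(rawInput.trim(), 10);
  if (Number.isNaN(parsed)) {
    return 1; // non-numeric input falls back to the first page (assumed behaviour)
  }
  return Math.min(Math.max(parsed, 1), lastPage);
}
```

With five pages, `clampPageNumber("99", 5)` yields the last page and `clampPageNumber("0", 5)` yields the first, so the table never navigates outside its own range.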

Sequence Diagram(s)

sequenceDiagram
  participant Browser as FlatFileImportPanel
  participant RQ as React Query
  participant API as ImportsController
  participant Service as FlatFileImportService
  participant SQL as SQL Server

  Browser->>RQ: mount — fetch /api/imports/recent
  RQ->>API: GET /api/imports/recent
  API->>SQL: Query ImportLogs (latest per dataset)
  SQL-->>API: RecentImportLog[]
  API-->>RQ: ImportDatasetSummaryResponse[]
  RQ-->>Browser: Render per-dataset summary cards

  Browser->>API: POST /api/imports/{dataset} (FormData)
  API->>Service: ImportAsync(datasetId, file, user)
  Service->>SQL: Parse, validate, BulkCopy `#FlatFileImportRows`
  Service->>SQL: StagingValidation + DELETE/INSERT target table
  Service->>SQL: INSERT ImportLog (Succeeded)
  Service-->>API: ImportSucceeded
  API-->>Browser: 200 OK ImportSuccessResponse

  alt Validation errors
    Service-->>API: ImportValidationFailed
    API-->>Browser: 400 ImportValidationResponse
    Browser->>Browser: Render ValidationResults with highlighted cells
  end

  Browser->>API: GET /api/imports/{id} (on card click)
  API->>SQL: Load ImportLog by id
  SQL-->>API: ImportLog with ErrorPayload JSON
  API-->>Browser: ImportLogDetailResponse with validation history
  Browser->>Browser: Render HistoricalImportDetails DataTable
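
The 200/400 branching in the diagram could be handled on the client roughly as follows. This is a sketch under stated assumptions rather than the logic in `FlatFileImportPanel.tsx`: the `UploadOutcome` type, the `classifyUploadResponse` function, and the use of a `rows` array to recognise a validation payload are hypothetical.

```typescript
// Hypothetical shapes; the real response records live in
// server/Models/Imports/ImportResponses.cs and may differ.
type UploadOutcome =
  | { kind: "success"; body: unknown }
  | { kind: "validation"; body: unknown }
  | { kind: "error"; message: string };

// Classify an already-parsed HTTP response by status code and body shape.
function classifyUploadResponse(status: number, body: unknown): UploadOutcome {
  if (status >= 200 && status < 300) {
    return { kind: "success", body };
  }
  // Assumption: a validation payload is a JSON object carrying a `rows` array.
  const looksLikeValidation =
    status === 400 &&
    typeof body === "object" &&
    body !== null &&
    Array.isArray((body as { rows?: unknown }).rows);
  if (looksLikeValidation) {
    return { kind: "validation", body };
  }
  // Any other failure (including a 400 without row details) is a generic error.
  return { kind: "error", message: `Import failed with status ${status}` };
}
```

Separating the classification from rendering keeps the "validation versus generic 400" distinction testable without mounting the component.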

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Suggested reviewers

  • srkirkland
  • jSylvestre
  • rmartinsen-ucd

Poem

🐇 A bunny hops in with a spreadsheet in paw,
CSV rows and XLSX columns it saw.
Stage temp tables built, duplicate keys caught,
Validation highlights every cell that fought.
Download the errors — a CSV treat!
The import is done, the pipeline complete. 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 7.44% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Add flat file import process' clearly and directly summarizes the main objective of the PR: implementing comprehensive flat file import functionality across the entire application. |
| Linked Issues check | ✅ Passed | The PR implementation fully addresses all coding requirements from issue #7: React UI with file picker and dataset selection on Project Identification stage; backend API endpoint accepting spreadsheet files; column mapping and validation; data loading with full-replace semantics; and import metadata persistence to app.ImportLog via Entity Framework. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to the flat file import feature: client-side UI component, route integration, data table enhancements (supporting validation results display), backend service/controller/registry implementation, database schema, migrations, dependency injection setup, and comprehensive tests. No unrelated changes detected. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (1)
server.core/Data/AppDbContext.cs (1)

19-20: 💤 Low value

Table naming convention inconsistency.

ImportLog uses a singular table name while Users is plural. Consider standardizing on plural table names for consistency (e.g., ImportLogs).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server.core/Data/AppDbContext.cs` around lines 19 - 20, The ImportLog entity
uses a singular table name "ImportLog" while the User entity uses a plural table
name "Users", creating an inconsistency in naming conventions. In the
modelBuilder configuration, change the ImportLog entity's ToTable call to use
the plural form "ImportLogs" instead of "ImportLog" to align with the plural
naming convention used throughout the database schema.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@client/src/components/FlatFileImportPanel.tsx`:
- Around line 211-232: The onError handler in FlatFileImportPanel.tsx has two
issues that need fixing. First, the queryClient.invalidateQueries call that
refreshes recent import summaries is only executed in the validation error path
but should apply to all failure scenarios, so move it outside of the if block to
ensure recent imports always refresh after any error. Second, the fallback
validation state object uses the component state variable dataset instead of
variables.dataset, which can cause the wrong dataset to display if the user
changes their selection while a request is in flight; replace the component
state dataset reference with variables.dataset in the fallback validation
object.
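
The two fixes requested for the `onError` handler can be sketched as below. This is an illustration only, not the code in `FlatFileImportPanel.tsx`: `QueryClientLike` is a local stand-in for the React Query client so the example runs on its own, and `UploadError`, `ValidationState`, `makeOnError`, and the `["imports", "recent"]` query key are hypothetical names.

```typescript
// Minimal stand-ins so the sketch runs without React Query installed.
interface QueryClientLike {
  invalidateQueries(filters: { queryKey: readonly unknown[] }): void;
}
interface UploadVariables { dataset: string }
interface ValidationState { dataset: string; errors: string[] }
type UploadError =
  | { kind: "validation"; errors: string[] }
  | { kind: "generic"; message: string };

function makeOnError(
  queryClient: QueryClientLike,
  setValidation: (state: ValidationState | null) => void,
  setGenericError: (message: string | null) => void,
) {
  // Same (error, variables) argument order a mutation passes to onError.
  return (error: UploadError, variables: UploadVariables): void => {
    if (error.kind === "validation") {
      // Use the dataset captured with the request, not the current selection,
      // so a selection change while the upload is in flight cannot mislabel it.
      setValidation({ dataset: variables.dataset, errors: error.errors });
      setGenericError(null);
    } else {
      setValidation(null);
      setGenericError(error.message);
    }
    // Outside the branch: every failure path refreshes the recent-import summaries.
    queryClient.invalidateQueries({ queryKey: ["imports", "recent"] }); // assumed key
  };
}
```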

In `@server.core/Domain/ImportLog.cs`:
- Line 40: The ErrorPayload property in the ImportLog class is declared as
nvarchar(max) without size constraints, while other string fields like Dataset,
Filename, Name, and Email all use explicit Truncate() calls with defined limits.
This allows validation payloads from SerializeValidationLogPayload() to grow
unbounded and degrade SQL Server performance. Add a [MaxLength] attribute to the
ErrorPayload property to enforce a reasonable size limit (such as 65535 or
1048576 for 1MB), consistent with the truncation pattern used for other string
fields in the class.

In `@server/Controllers/ImportsController.cs`:
- Around line 19-24: The query in ImportsController is applying a global
Take(datasetIds.Count * 10) limit before grouping by dataset, which causes
datasets with fewer recent logs to be dropped entirely if other datasets
dominate the results. Refactor the LINQ query to group by dataset (using GroupBy
on log.Dataset) after ordering by CompletedAt, then apply the Take(10) limit per
dataset group rather than globally. This ensures each dataset in datasetIds gets
its own recent import logs regardless of how many logs other datasets have. The
Select should then project from the grouped results into RecentImportLog.

In `@server/Import/FlatFileImportService.cs`:
- Around line 167-238: The ParseWorkbook method lacks exception handling around
the workbook file operations (the XLWorkbook initialization and worksheet
access), while the parallel ParseCsv method includes proper exception handling
for malformed files. Wrap the workbook opening and initial worksheet reading
operations in a try-catch block, and when any exception occurs, return a
FlatFileParseResult containing an ImportFileError describing the parsing
failure, rather than allowing the exception to propagate. This ensures malformed
.xlsx files are treated as validation failures that return structured error
results instead of unhandled exceptions.

In `@server/Import/ImportDatasetDefinition.cs`:
- Around line 42-53: The loop building the columnsByNormalizedHeader dictionary
silently overwrites entries when multiple ImportColumn instances have source
headers that normalize to the same key, causing incorrect column mapping. Before
assigning to the columnsByNormalizedHeader dictionary in the nested loop (where
sourceHeader is iterated), check if the normalized key already exists. If it
does, throw an appropriate exception with details about which columns have the
conflicting headers to ensure collisions are detected and reported during
construction rather than causing silent failures at runtime.
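
The collision check described for `ImportDatasetDefinition` can be sketched as follows. This TypeScript version only illustrates the idea and is not the C# constructor: the normalization rule (trim, lower-case, drop non-alphanumeric characters), the `ImportColumnSketch` shape, and the choice to allow one column to list two aliases that normalize identically are assumptions.

```typescript
interface ImportColumnSketch {
  name: string;
  sourceHeaders: string[];
}

// Assumed normalization: trim, lower-case, and drop non-alphanumeric characters.
function normalizeHeader(header: string): string {
  return header.trim().toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Build the lookup, failing fast when two different columns claim the same key.
function buildHeaderLookup(
  columns: readonly ImportColumnSketch[],
): Map<string, ImportColumnSketch> {
  const lookup = new Map<string, ImportColumnSketch>();
  for (const column of columns) {
    for (const sourceHeader of column.sourceHeaders) {
      const key = normalizeHeader(sourceHeader);
      const existing = lookup.get(key);
      if (existing !== undefined && existing !== column) {
        throw new Error(
          `Header "${sourceHeader}" normalizes to "${key}", which is already ` +
            `mapped to column "${existing.name}" (conflicts with "${column.name}")`,
        );
      }
      lookup.set(key, column);
    }
  }
  return lookup;
}
```

Throwing during construction surfaces a misconfigured dataset definition at startup instead of letting a later entry silently overwrite an earlier one.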


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 76d7bfb4-cfdc-4b64-9622-e9bea38564de

📥 Commits

Reviewing files that changed from the base of the PR and between 1e2d970 and 2d789db.

📒 Files selected for processing (20)
  • client/src/components/FlatFileImportPanel.tsx
  • client/src/routes/(authenticated)/workflow.$stageId.tsx
  • client/src/shared/dataTable.tsx
  • client/src/test/components/FlatFileImportPanel.test.tsx
  • client/src/test/routes/(authenticated)/workflow.test.tsx
  • server.core/Data/AppDbContext.cs
  • server.core/Domain/ImportLog.cs
  • server.core/Migrations/20260609182232_AddImportLog.Designer.cs
  • server.core/Migrations/20260609182232_AddImportLog.cs
  • server.core/Migrations/AppDbContextModelSnapshot.cs
  • server/Controllers/ImportsController.cs
  • server/Import/FlatFileImportRegistry.cs
  • server/Import/FlatFileImportService.cs
  • server/Import/ImportDatasetDefinition.cs
  • server/Models/Imports/ImportResponses.cs
  • server/Program.cs
  • server/server.csproj
  • tests/server.tests/Data/AppDbContextTests.cs
  • tests/server.tests/Import/FlatFileImportServiceTests.cs
  • tests/server.tests/server.tests.csproj

Comment thread client/src/components/FlatFileImportPanel.tsx
Comment thread server.core/Domain/ImportLog.cs
Comment thread server/Controllers/ImportsController.cs
Comment thread server/Import/FlatFileImportService.cs
Comment thread server/Import/ImportDatasetDefinition.cs

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/server.tests/Controllers/ImportsControllerTests.cs (1)

14-67: ⚡ Quick win

Add a same-timestamp tie-break assertion for /recent.

Line 14 currently validates latest-by-time, but not the new Id tie-break path. Add two all-projects logs with identical CompletedAt and assert the higher Id entry wins.

Proposed test extension
     [Fact]
     public async Task Recent_returns_latest_import_for_each_dataset()
     {
         await using var db = TestDbContextFactory.CreateInMemory();
         var completedAt = DateTimeOffset.Parse("2026-06-17T12:00:00Z");
@@
         db.ImportLogs.Add(new ImportLog
         {
             Dataset = "active-projects",
             Filename = "active-projects.csv",
@@
         });
+
+        // Same timestamp tie-break should pick the higher Id
+        db.ImportLogs.Add(new ImportLog
+        {
+            Dataset = "all-projects",
+            Filename = "all-projects-tie-a.csv",
+            Status = "Succeeded",
+            AttemptedRows = 999,
+            RowsImported = 999,
+            StartedAt = completedAt.AddHours(1).AddSeconds(-30),
+            CompletedAt = completedAt.AddHours(1),
+        });
+        db.ImportLogs.Add(new ImportLog
+        {
+            Dataset = "all-projects",
+            Filename = "all-projects-tie-b.csv",
+            Status = "Succeeded",
+            AttemptedRows = 1000,
+            RowsImported = 1000,
+            StartedAt = completedAt.AddHours(1).AddSeconds(-20),
+            CompletedAt = completedAt.AddHours(1),
+        });
         await db.SaveChangesAsync();
@@
-        allProjects.LastImport!.Filename.Should().Be("all-projects-29.csv");
+        allProjects.LastImport!.Filename.Should().Be("all-projects-tie-b.csv");
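
The ordering this test is meant to pin down (newest `CompletedAt` first, higher `Id` on a tie, matching the composite index described in the walkthrough) can be expressed as a comparator. The sketch below is TypeScript over plain objects for illustration only; `CompletedImport`, `compareLatestFirst`, and `latestImport` are hypothetical names and the ids are made up.

```typescript
interface CompletedImport {
  id: number;
  filename: string;
  completedAt: string; // ISO-8601 UTC timestamp
}

// Newest CompletedAt first; on equal timestamps the higher id wins.
function compareLatestFirst(a: CompletedImport, b: CompletedImport): number {
  const byTime = Date.parse(b.completedAt) - Date.parse(a.completedAt);
  return byTime !== 0 ? byTime : b.id - a.id;
}

// Return the winning log, or undefined when there are none.
function latestImport(logs: readonly CompletedImport[]): CompletedImport | undefined {
  return [...logs].sort(compareLatestFirst)[0];
}
```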
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/server.tests/Controllers/ImportsControllerTests.cs` around lines 14 -
67, The test method Recent_returns_latest_import_for_each_dataset() currently
validates latest-import-by-time behavior but does not test the tie-break logic
when two imports have the same CompletedAt timestamp. Add two ImportLog entries
for the "all-projects" dataset with identical CompletedAt values before the
SaveChangesAsync() call, ensuring they will have different Id values. Then add
an assertion to verify that the ImportLog entry with the higher Id is the one
returned in the allProjects.LastImport result, confirming that the Id tie-break
logic works correctly when timestamps are equal.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server.core/Domain/User.cs`:
- Around line 8-9: Remove the unnecessary Index attributes that are decorating
the Email and Name properties in the User class. The [Index(nameof(Email))] and
[Index(nameof(Name))] attributes should be deleted since they add storage and
maintenance overhead without providing query performance benefits, as the only
User entity query in the codebase filters exclusively by EntraId, not by Email
or Name.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3adb1bbc-7fb4-4ab6-afe1-12fd375568f8

📥 Commits

Reviewing files that changed from the base of the PR and between 2d789db and 6169875.

📒 Files selected for processing (13)
  • client/src/components/FlatFileImportPanel.tsx
  • client/src/test/components/FlatFileImportPanel.test.tsx
  • server.core/Domain/ImportLog.cs
  • server.core/Domain/User.cs
  • server.core/Migrations/20260617161705_AddIndexes.Designer.cs
  • server.core/Migrations/20260617161705_AddIndexes.cs
  • server.core/Migrations/AppDbContextModelSnapshot.cs
  • server/Controllers/ImportsController.cs
  • server/Import/FlatFileImportService.cs
  • server/Import/ImportDatasetDefinition.cs
  • tests/server.tests/Controllers/ImportsControllerTests.cs
  • tests/server.tests/Data/AppDbContextTests.cs
  • tests/server.tests/Import/FlatFileImportServiceTests.cs
✅ Files skipped from review due to trivial changes (3)
  • server.core/Migrations/20260617161705_AddIndexes.cs
  • server.core/Migrations/AppDbContextModelSnapshot.cs
  • server.core/Migrations/20260617161705_AddIndexes.Designer.cs
🚧 Files skipped from review as they are similar to previous changes (6)
  • server.core/Domain/ImportLog.cs
  • server/Import/ImportDatasetDefinition.cs
  • tests/server.tests/Data/AppDbContextTests.cs
  • client/src/test/components/FlatFileImportPanel.test.tsx
  • client/src/components/FlatFileImportPanel.tsx
  • server/Import/FlatFileImportService.cs

Comment thread server.core/Domain/User.cs Outdated
@sprucely sprucely requested a review from rmartinsen-ucd June 17, 2026 17:08

Development

Successfully merging this pull request may close these issues.

Build spreadsheet import process (React UI + backend) for reporting tables

1 participant