Harden NFD write-normalization against dropped fields#2322
Conversation
📝 WalkthroughWalkthroughThis PR refactors the Unicode NFD normalization strategy and test infrastructure. The wrapper now uses Copy()-and-overwrite to preserve all model fields automatically instead of explicit object construction. Test assertions are unified via a StringCheck abstraction that replaces form-specific helpers with decomposed/decomposable validators. All entity tests and data factories are updated accordingly. ChangesNFD Normalization Strategy and Test Infrastructure
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Previously NormalizationAssert only verified strings were in the expected form. ASCII satisfies both NFC and NFD, so a factory or test fixture using ASCII content silently bypassed the normalizer — the assertion passed without exercising any normalization work. New `requireNonTrivial` parameter asserts each string differs from its conversion to the opposite form, ensuring the input actually contains content the normalizer would have to act on. Applied to `AllNfcFactories_ProduceNfcData`, which now enforces non-triviality for every NfcTestData factory in one place. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add BeEquivalentTo(input) with an NFC≡NFD string comparer alongside the existing AssertAllNfd(captured). The pure form-check ignores primitives, enums, Guids, and dates, so a normalizer that drops a non-string field (the HomographNumber regression) passes silently — this commit closes that hole. - New `NormalizedStrings()` equivalency option treats NFC-equal-to-NFD as equal so the assertion accepts the normalizer's string changes while requiring everything else to match exactly. - New `AssertNormalized` helper combines AssertAllNfd + BeEquivalentTo. - NfcTestData factories now populate non-default values for non-string scalars (HomographNumber, Order, EntryId, etc.) so the identity assertion has something to compare. A default-valued field can't catch a normalizer that drops it. Verified by reintroducing the Sense.Order drop: previously 0 tests caught it; now 5 fail with `Expected property Senses[0].Order to be 1.5, but found 0`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The write-normalization wrapper rebuilt each model with object initializers, duplicating the field list, so a field added to a model later but missed here would be silently reset to default on write. The seven non-record normalizers now copy-then-overwrite text (records already get this from with-expressions); non-text fields, including future ones, ride along via the per-type Copy() method, whose completeness is enforced by EntityCopyMethodTests. Behaviour is unchanged for every current field. Also make Publication.Copy() return Publication instead of IObjectWithId (it implemented only the non-generic IPossibility), dropping two now-redundant casts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Remove a vacuous AssertAllDecomposable(WritingSystem) call (all its WS string fields are skipped, so it asserted nothing); isolate the empty-MultiString negative test so it cannot pass on an unrelated "no values" message; and replace the NormalizationAssert (form, requireNonTrivial) flag pair, two params encoding one bit threaded through five methods, with a single named leaf-check delegate. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
6b0024d to
f91c0a8
Compare
UI unit Tests 1 files 59 suites 28s ⏱️ Results for commit 30ac45e. ♻️ This comment has been updated with latest results. |
|
The latest updates on your projects. Learn more about Argos notifications ↗︎
|
Comments now describe the guard as catching data byte-identical in NFC and NFD (ASCII being one example), not ASCII specifically. Rename AssertNormalized to AssertNormalizedToNfd since it is NFD-specific. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replaces the (description, Action<string, string, List<string>>) pair threaded through the traversal — two loosely coupled params a caller could mismatch — with a single StringCheck record bundling the failure-header label and a validator that maps a string to an issue message or null. Removes the mutable-list delegate and the mismatch risk; behaviour and failure messages are unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The "ASCII bypasses the normalizer" reason was stated three times: in the AssertAllDecomposable docstring, again above the validators, and in the DecomposableNfc failure message. Keep the docstring and the failure message; remove the redundant block comment. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
C# FwHeadless Unit Tests48 tests 48 ✅ 16s ⏱️ Results for commit 30ac45e. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/FwLite/MiniLcm/Normalization/MiniLcmApiWriteNormalizationWrapper.cs`:
- Around line 37-39: EntityCopyMethodTests currently enumerates object types
from LcmCrdtKernel.ConfigureCrdt and misses Translation, so Translation.Copy()
completeness isn't enforced while
MiniLcmApiWriteNormalizationWrapper.NormalizeTranslation relies on
translation.Copy(); add coverage by either registering Translation in
LcmCrdtKernel.ConfigureCrdt's .Add<Translation>() list used by
EntityCopyMethodTests or add a focused unit test that calls Translation.Copy()
and asserts field preservation (i.e., create a Translation instance with all
fields set, call Copy(), and verify no text-bearing fields are lost) so
Translation.Copy() is exercised alongside the other Copy()-based normalizers.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 886db3be-e4e5-44e7-9f45-457cadd8d3c3
📒 Files selected for processing (6)
backend/FwLite/MiniLcm.Tests/Helpers/NfcTestData.csbackend/FwLite/MiniLcm.Tests/Helpers/NormalizationAssert.csbackend/FwLite/MiniLcm.Tests/WriteNormalizationTests.csbackend/FwLite/MiniLcm/Models/Entry.csbackend/FwLite/MiniLcm/Models/Publication.csbackend/FwLite/MiniLcm/Normalization/MiniLcmApiWriteNormalizationWrapper.cs
Hardens the NFD write-normalization path from #2146 so a model field can't silently slip through dropped or un-normalized.
The write-normalization wrapper rebuilt each model with
new X { ... }initializers, re-listing every field — so a field added to a model later but missed there would be reset to its default on write, uncaught unless test data also happened to set it non-default. The seven non-record normalizers now copy-then-overwrite only the text fields (records already get this fromwith), so non-text fields, including ones added in future, are preserved by each type'sCopy()— whose completeness is already enforced byEntityCopyMethodTests. Behaviour is unchanged for every current field, andPublication.Copy()is now strongly typed, dropping two redundant casts.On the test side: factory data uses non-default scalars and genuinely-decomposable (non-ASCII) NFC content so a dropped or no-op'd field actually fails an assertion; the empty-MultiString negative test is isolated; a vacuous
WritingSystemassertion is removed; andNormalizationAssert's per-leaf check is now a single namedStringCheck(failure label + validator) instead of a threadedNormalizationForm.Build clean for MiniLcm/LcmCrdt; WriteNormalizationTests, NormalizationAssertTests, and EntityCopyMethodTests pass.
🤖 Generated with Claude Code