iceberg: don't reject new fields with required descendants#30585
Conversation
a309eef to
b64e4bc
Compare
b64e4bc to
034cfc8
Compare
There was a problem hiding this comment.
Pull request overview
Fixes Iceberg schema evolution validation incorrectly rejecting newly-added optional fields whose nested types contain structurally-required descendants (e.g., map keys), by avoiding validation descent into the subtree of any field annotated as is_new.
Changes:
- Update
validate_schema_transformto stop descending into a field’s nested structure once that field is identified as newly added (is_new), preventing structurally-required descendants from triggeringnew_required_field. - Add a regression test covering adding a new optional map column (with required key field by construction).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
src/v/iceberg/compatibility.cc |
Adjusts validation traversal to skip validating descendants of newly-added fields, avoiding false new_required_field errors for structurally-required descendants. |
src/v/iceberg/tests/compatibility_test.cc |
Adds a regression test ensuring an optional added map field with a required key descendant is accepted. |
CI test resultstest results on build#84866
test results on build#84877
|
oleiman
left a comment
There was a problem hiding this comment.
code lgtm. one question about applying the same reasoning to structs.
| if (is_new_field) { | ||
| // Treat a new field's substructure as part of the new type | ||
| // definition. In particular, structurally-required children (e.g. | ||
| // map keys, list elements declared required) must not trigger | ||
| // new_required_field — that check applies only to top-level new | ||
| // columns. |
There was a problem hiding this comment.
map & list case is very clear, but this also applies to structs w/ required subfields, presumably. at a glance it seems fine, but I don't know if it's required/preferable. if not required, we could have a carve out for structs, otherwise we should probably call it out here and maybe add a quick unit test.
validate_schema_transform incorrectly rejected adding a new optional column whose nested type contains structurally-required descendants (e.g. an optional map, whose keys are always required). The walk visited every annotated field and triggered new_required_field on the nested required key/element. Stop descending into a subtree once an is_new ancestor is encountered: the substructure is part of the new type's definition, not a separate top-level addition. The same reasoning applies to required list elements and required struct subfields, though in practice only the map-key case arises today. The datalake record translator coerces all incoming schema fields to optional (see record_translator.cc), leaving map keys as the only structurally-required descendants. The list and struct cases are covered by regression tests for completeness.
034cfc8 to
07afd25
Compare
|
/backport v26.1.x |
|
/backport v25.3.x |
validate_schema_transform incorrectly rejected adding a new optional
column whose nested type contains structurally-required descendants
(e.g. an optional map, whose keys are always required). The walk
visited every annotated field and triggered new_required_field on the
nested required key/element.
Stop descending into a subtree once an is_new ancestor is encountered:
the substructure is part of the new type's definition, not a separate
top-level addition. The same reasoning applies to required list
elements and required struct subfields, though in practice only the
map-key case arises today. The datalake record translator coerces
all incoming schema fields to optional (see record_translator.cc),
leaving map keys as the only structurally-required descendants. The
list and struct cases are covered by regression tests for
completeness.
https://redpandadata.atlassian.net/browse/CORE-16367
Backports Required
Release Notes
Bug Fixes