Skip to content

iceberg: don't reject new fields with required descendants#30585

Merged
nvartolomei merged 1 commit into
redpanda-data:devfrom
nvartolomei:nv/iceberg-compat-new-field-required-descendants
May 22, 2026
Merged

iceberg: don't reject new fields with required descendants#30585
nvartolomei merged 1 commit into
redpanda-data:devfrom
nvartolomei:nv/iceberg-compat-new-field-required-descendants

Conversation

@nvartolomei
Copy link
Copy Markdown
Contributor

@nvartolomei nvartolomei commented May 22, 2026

validate_schema_transform incorrectly rejected adding a new optional
column whose nested type contains structurally-required descendants
(e.g. an optional map, whose keys are always required). The walk
visited every annotated field and triggered new_required_field on the
nested required key/element.

Stop descending into a subtree once an is_new ancestor is encountered:
the substructure is part of the new type's definition, not a separate
top-level addition. The same reasoning applies to required list
elements and required struct subfields, though in practice only the
map-key case arises today. The datalake record translator coerces
all incoming schema fields to optional (see record_translator.cc),
leaving map keys as the only structurally-required descendants. The
list and struct cases are covered by regression tests for
completeness.

https://redpandadata.atlassian.net/browse/CORE-16367

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v26.1.x
  • v25.3.x
  • v25.2.x

Release Notes

Bug Fixes

  • Fix schema evolution incorrectly rejecting new optional Iceberg columns whose nested types contain structurally-required fields (e.g. maps, whose keys are always required).

Copilot AI review requested due to automatic review settings May 22, 2026 12:53
@nvartolomei nvartolomei changed the title iceberg/compat: don't reject new fields with required descendants iceberg: don't reject new fields with required descendants May 22, 2026
@nvartolomei nvartolomei force-pushed the nv/iceberg-compat-new-field-required-descendants branch from a309eef to b64e4bc Compare May 22, 2026 12:53
@nvartolomei nvartolomei requested review from oleiman and wdberkeley May 22, 2026 12:53
@nvartolomei nvartolomei force-pushed the nv/iceberg-compat-new-field-required-descendants branch from b64e4bc to 034cfc8 Compare May 22, 2026 12:55
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes Iceberg schema evolution validation incorrectly rejecting newly-added optional fields whose nested types contain structurally-required descendants (e.g., map keys), by avoiding validation descent into the subtree of any field annotated as is_new.

Changes:

  • Update validate_schema_transform to stop descending into a field’s nested structure once that field is identified as newly added (is_new), preventing structurally-required descendants from triggering new_required_field.
  • Add a regression test covering adding a new optional map column (with required key field by construction).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
src/v/iceberg/compatibility.cc Adjusts validation traversal to skip validating descendants of newly-added fields, avoiding false new_required_field errors for structurally-required descendants.
src/v/iceberg/tests/compatibility_test.cc Adds a regression test ensuring an optional added map field with a required key descendant is accepted.

Comment thread src/v/iceberg/tests/compatibility_test.cc
@vbotbuildovich
Copy link
Copy Markdown
Collaborator

vbotbuildovich commented May 22, 2026

CI test results

test results on build#84866
test_status test_class test_method test_arguments test_kind job_url passed reason test_history
FLAKY(PASS) NodeWiseRecoveryTest test_recovery_local_data_missing {"wait_for_final_manifest_uploads": true} integration https://buildkite.com/redpanda/redpanda/builds/84866#019e4fcc-4c29-49b8-bf13-8f241e517a67 10/11 Test PASSES after retries.No significant increase in flaky rate(baseline=0.0053, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000) https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=NodeWiseRecoveryTest&test_method=test_recovery_local_data_missing
test results on build#84877
test_status test_class test_method test_arguments test_kind job_url passed reason test_history
FLAKY(PASS) NodeWiseRecoveryTest test_recovery_local_data_missing {"wait_for_final_manifest_uploads": false} integration https://buildkite.com/redpanda/redpanda/builds/84877#019e5082-2718-49c2-b19c-d18ba171e822 10/11 Test PASSES after retries.No significant increase in flaky rate(baseline=0.0018, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000) https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=NodeWiseRecoveryTest&test_method=test_recovery_local_data_missing
FLAKY(PASS) RemoteLabelsTest test_share_bucket_delete_topic {"cloud_storage_type": 2} integration https://buildkite.com/redpanda/redpanda/builds/84877#019e508a-465f-427a-a172-8a3d049a2d0f 10/11 Test PASSES after retries.No significant increase in flaky rate(baseline=0.0000, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000) https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=RemoteLabelsTest&test_method=test_share_bucket_delete_topic

Copy link
Copy Markdown
Member

@oleiman oleiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code lgtm. one question about applying the same reasoning to structs.

Comment thread src/v/iceberg/compatibility.cc Outdated
Comment on lines +501 to +506
if (is_new_field) {
// Treat a new field's substructure as part of the new type
// definition. In particular, structurally-required children (e.g.
// map keys, list elements declared required) must not trigger
// new_required_field — that check applies only to top-level new
// columns.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

map & list case is very clear, but this also applies to structs w/ required subfields, presumably. at a glance it seems fine, but I don't know if it's required/preferable. if not required, we could have a carve out for structs, otherwise we should probably call it out here and maybe add a quick unit test.

oleiman
oleiman previously approved these changes May 22, 2026
validate_schema_transform incorrectly rejected adding a new optional
column whose nested type contains structurally-required descendants
(e.g. an optional map, whose keys are always required). The walk
visited every annotated field and triggered new_required_field on the
nested required key/element.

Stop descending into a subtree once an is_new ancestor is encountered:
the substructure is part of the new type's definition, not a separate
top-level addition. The same reasoning applies to required list
elements and required struct subfields, though in practice only the
map-key case arises today. The datalake record translator coerces
all incoming schema fields to optional (see record_translator.cc),
leaving map keys as the only structurally-required descendants. The
list and struct cases are covered by regression tests for
completeness.
@nvartolomei nvartolomei force-pushed the nv/iceberg-compat-new-field-required-descendants branch from 034cfc8 to 07afd25 Compare May 22, 2026 16:14
@oleiman oleiman self-requested a review May 22, 2026 16:16
Copy link
Copy Markdown
Member

@oleiman oleiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢

@nvartolomei nvartolomei enabled auto-merge May 22, 2026 16:25
@nvartolomei nvartolomei merged commit 86e3dce into redpanda-data:dev May 22, 2026
20 of 21 checks passed
@vbotbuildovich
Copy link
Copy Markdown
Collaborator

/backport v26.1.x

@vbotbuildovich
Copy link
Copy Markdown
Collaborator

/backport v25.3.x

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants