[v26.1.x] iceberg: Add case-insensitive schema matching #30577

Open
vbotbuildovich wants to merge 5 commits into redpanda-data:v26.1.x from vbotbuildovich:ai-backport-pr-30459-v26.1.x-1779384736

@vbotbuildovich
Collaborator

Backport of PR #30459

  • Command: git cherry-pick -x 84df79a 0924bb2 a3ad07e
  • Commits backported: 3
  • Conflicts resolved: 2
  • Commits skipped (already on target): 0
  • Backport branch: ai-backport-pr-30459-v26.1.x-1779384736

Conflict details

  • 84df79a (MODULE.bazel.lock): generated lockfile conflicted with the v26.1.x version; accepted theirs from the source commit (regeneration required, see below).
  • a3ad07e (src/v/model/model.cc): the incoming commit added format_to(iceberg_schema_case_insensitive ...) and operator>>(iceberg_schema_case_insensitive&) adjacent to fips_mode_flag's formatter. On dev, fips_mode_flag had already migrated from operator<< to format_to; v26.1.x still has the operator<< form. Kept v26.1.x's existing operator<<(fips_mode_flag&) unchanged and inserted the new iceberg_schema_case_insensitive formatter/parser above it. The incoming fips_mode_flag format_to was an unrelated context change from the source branch, not part of this PR's intent.
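For reviewers who want a picture of what the formatter/parser pair for `iceberg_schema_case_insensitive` does, here is a minimal standalone sketch. It is not the code in `src/v/model/model.cc`: it uses plain iostream `operator<<` and `operator>>` rather than `format_to`, and the enumerator names (`automatic` in particular, since `auto` is a C++ keyword) are assumptions made for illustration.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Enumerator names are assumptions; only the textual values yes/no/auto
// come from the PR description.
enum class iceberg_schema_case_insensitive { yes, no, automatic };

// Formatter: writes the textual form of the value.
inline std::ostream&
operator<<(std::ostream& os, iceberg_schema_case_insensitive v) {
    switch (v) {
    case iceberg_schema_case_insensitive::yes:
        return os << "yes";
    case iceberg_schema_case_insensitive::no:
        return os << "no";
    case iceberg_schema_case_insensitive::automatic:
        return os << "auto";
    }
    return os << "unknown";
}

// Parser: reads one whitespace-delimited token and sets failbit on any
// value other than yes/no/auto, leaving `v` untouched in that case.
inline std::istream&
operator>>(std::istream& is, iceberg_schema_case_insensitive& v) {
    std::string token;
    is >> token;
    if (token == "yes") {
        v = iceberg_schema_case_insensitive::yes;
    } else if (token == "no") {
        v = iceberg_schema_case_insensitive::no;
    } else if (token == "auto") {
        v = iceberg_schema_case_insensitive::automatic;
    } else {
        is.setstate(std::ios_base::failbit);
    }
    return is;
}

// Round-trips "auto" through the parser and the formatter.
inline void check_round_trip() {
    std::istringstream in("auto");
    iceberg_schema_case_insensitive v{};
    in >> v;
    assert(!in.fail());
    std::ostringstream out;
    out << v;
    assert(out.str() == "auto");
}
```

Keeping the parser strict (failbit on unknown tokens) means a misspelled value is reported rather than silently mapped to a default.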

⚠️ Generated files

The following files were cherry-picked and may need regeneration:

  • MODULE.bazel
  • MODULE.bazel.lock

These files were accepted as-is from the source branch. Before merging,
regenerate them on the target branch to ensure they're correct. For example:

  • MODULE.bazel.lock: run bazel mod deps --lockfile_mode=update
  • *.pb.go / *.pb.cc / *.pb.h: rebuild protobuf targets
  • go.sum: run go mod tidy

Passes field_name_comparison to catalog_schema_manager::ensure_table_schema,
compatibility::check, and table_metadata field lookups so that name
comparisons can be made case-insensitively when the caller requests it.

coordinator and translation/deps pass field_name_comparison::verbatim as
a placeholder; config-driven resolution is wired up in the next commit.

(cherry picked from commit 0924bb2)

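The effect of threading `field_name_comparison` through the lookups can be illustrated with a small sketch. This is not the code from the PR: `verbatim` is the only enumerator named in the description, so `case_insensitive` and the helper `field_names_match` are hypothetical names, and the folding here is ASCII-only for brevity (the lockfile changes suggest the real implementation relies on utf8proc).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

// `verbatim` is named in the PR description; `case_insensitive` is an
// assumed name for the other mode.
enum class field_name_comparison { verbatim, case_insensitive };

// ASCII-only case-insensitive equality. A simplification for this sketch:
// it does not perform Unicode case folding.
inline bool ascii_iequals(std::string_view a, std::string_view b) {
    if (a.size() != b.size()) {
        return false;
    }
    for (std::size_t i = 0; i < a.size(); ++i) {
        const int ca = std::tolower(static_cast<unsigned char>(a[i]));
        const int cb = std::tolower(static_cast<unsigned char>(b[i]));
        if (ca != cb) {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: compares two field names under the requested mode.
inline bool field_names_match(
  std::string_view a, std::string_view b, field_name_comparison cmp) {
    if (cmp == field_name_comparison::case_insensitive) {
        return ascii_iequals(a, b);
    }
    return a == b;
}

// A name lower-cased by the catalog matches only in case-insensitive mode.
inline void check_examples() {
    assert(field_names_match(
      "UserId", "userid", field_name_comparison::case_insensitive));
    assert(
      !field_names_match("UserId", "userid", field_name_comparison::verbatim));
}
```

Passing the mode as an explicit argument, rather than reading configuration inside the comparison, keeps each call site's behavior visible, which fits the placeholder `verbatim` arguments described above.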
Adds the cluster-level property iceberg_schema_case_insensitive with
values yes/no/auto (default auto). This controls whether Iceberg schema
field name matching is done case-insensitively.

auto enables case-insensitive matching when the configured REST catalog
is AWS Glue (detected via SigV4 auth + service name "glue"), and uses
exact matching otherwise. This addresses a sporadic issue where AWS Glue
returns schema field names lower-cased rather than verbatim.

The resolution logic lives in datalake/coordinator/catalog_config, which
wires the cluster config and Glue detection together into a
field_name_comparison value that is then threaded into coordinator and
translation/deps.

(cherry picked from commit a3ad07e)

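As a rough illustration of the yes/no/auto resolution described in this commit message, the following sketch maps the cluster setting plus a Glue check onto a `field_name_comparison`. All type and function names other than `field_name_comparison`, `verbatim`, and `iceberg_schema_case_insensitive` are hypothetical, and the struct is a stand-in for whatever the catalog configuration actually exposes.

```cpp
#include <cassert>
#include <string>

// Enumerator names are assumptions (`auto` is a C++ keyword).
enum class iceberg_schema_case_insensitive { yes, no, automatic };
enum class field_name_comparison { verbatim, case_insensitive };

// Hypothetical summary of the REST catalog's authentication settings.
struct rest_catalog_auth {
    bool uses_sigv4 = false;
    std::string sigv4_service_name;
};

// Glue detection as the description states it: SigV4 auth with the
// service name "glue".
inline bool looks_like_glue(const rest_catalog_auth& auth) {
    return auth.uses_sigv4 && auth.sigv4_service_name == "glue";
}

// Hypothetical resolver: explicit yes/no win; auto depends on the catalog.
inline field_name_comparison resolve_field_name_comparison(
  iceberg_schema_case_insensitive setting, const rest_catalog_auth& auth) {
    switch (setting) {
    case iceberg_schema_case_insensitive::yes:
        return field_name_comparison::case_insensitive;
    case iceberg_schema_case_insensitive::no:
        return field_name_comparison::verbatim;
    case iceberg_schema_case_insensitive::automatic:
        return looks_like_glue(auth) ? field_name_comparison::case_insensitive
                                     : field_name_comparison::verbatim;
    }
    return field_name_comparison::verbatim;
}

// With the default (auto), only a Glue-like catalog gets case-insensitive
// matching.
inline void check_examples() {
    const rest_catalog_auth glue{true, "glue"};
    const rest_catalog_auth other{};
    assert(
      resolve_field_name_comparison(
        iceberg_schema_case_insensitive::automatic, glue)
      == field_name_comparison::case_insensitive);
    assert(
      resolve_field_name_comparison(
        iceberg_schema_case_insensitive::automatic, other)
      == field_name_comparison::verbatim);
}
```

Under this sketch, an operator can still force exact matching on Glue by setting the property to no, or force case-insensitive matching on any other catalog by setting it to yes.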
@vbotbuildovich vbotbuildovich requested a review from a team as a code owner May 21, 2026 17:34
@vbotbuildovich vbotbuildovich added this to the v26.1.x-next milestone May 21, 2026
@vbotbuildovich vbotbuildovich added the kind/backport PRs targeting a stable branch label May 21, 2026
@vbotbuildovich vbotbuildovich requested a review from wdberkeley May 21, 2026 17:34
The backport bot substituted the dev branch lockfile (lockFileVersion
26, Bazel 9.1.0). Regenerate from v26.1.x base (lockFileVersion 18,
Bazel 8.4.1) to add just the utf8proc and transitive rules_cc entries.

The utf8proc BCR build has includes=["."] commented out, so the header
is not on the angle-bracket include path in Bazel 8.4.1. Use a quoted
include instead, consistent with project conventions for external deps.
@wdberkeley
Contributor

Fixed up the lockfile, plus a knock-on fix.

@wdberkeley wdberkeley requested a review from andrwng May 21, 2026 20:06
Labels

area/build area/redpanda kind/backport PRs targeting a stable branch

2 participants