REST: Add case-insensitive identifier resolution for REST catalog #16069

Open
lunar-shadow wants to merge 1 commit into apache:main from
lunar-shadow:case-insensitive-rest-catalog

@lunar-shadow
Contributor

Summary

Add case-insensitive identifier normalization to RESTCatalog, controlled by two catalog properties:

  • case-insensitive (default: false) - enable/disable case normalization
  • case-insensitive-type (default: lower_case) - conversion mode (lower_case or upper_case)

When enabled, all namespace, table, and view identifiers are normalized to a consistent case before being sent to the catalog backend.
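
The normalization described above can be sketched roughly as follows. This is a minimal illustration in plain Java, not the code from this PR: the `IdentifierCase` class and its method signatures are hypothetical, and only the `lower_case` / `upper_case` mode strings and the use of `Locale.ROOT` come from the PR itself.

```java
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical helper for illustration; not part of Iceberg or this PR. */
final class IdentifierCase {
  private IdentifierCase() {}

  /** Normalizes a single name; returns it unchanged when the feature is off. */
  static String normalize(String name, boolean caseInsensitive, String caseType) {
    if (!caseInsensitive) {
      return name;
    }
    // Locale.ROOT avoids locale-specific surprises such as the Turkish dotless i.
    return "upper_case".equals(caseType)
        ? name.toUpperCase(Locale.ROOT)
        : name.toLowerCase(Locale.ROOT);
  }

  /** Normalizes every level of a multi-level namespace. */
  static String[] normalize(String[] levels, boolean caseInsensitive, String caseType) {
    return Arrays.stream(levels)
        .map(level -> normalize(level, caseInsensitive, caseType))
        .toArray(String[]::new);
  }
}
```

With the feature enabled and the default mode, `CustomerDB.Orders` would be sent to the backend as `customerdb.orders`; with the feature disabled, the identifier passes through unchanged.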

Motivation

REST catalogs that preserve identifier case (e.g., CustomerDB.Orders) cause SCHEMA_NOT_FOUND / TABLE_NOT_FOUND errors when query engines like Spark normalize identifiers to lowercase. This affects interoperability across mixed-client ecosystems (Spark, Flink, Trino).

Closes: #14386

github-actions bot added the core label on Apr 21, 2026
lunar-shadow force-pushed the case-insensitive-rest-catalog branch from 934588b to eec9a76 on April 21, 2026 11:55
"upper_case".equals(caseType)
? ident.name().toUpperCase(Locale.ROOT)
: ident.name().toLowerCase(Locale.ROOT);
return TableIdentifier.of(convertedNs, convertedName);
Member

Trino Iceberg connector supports the iceberg.rest-catalog.case-insensitive-name-matching configuration property, but its behavior differs from the approach in this PR. I'm commenting here because the PR description refers to Trino.

The connector attempts to locate remote objects in a case-insensitive way by iterating over namespaces and tables and comparing their lowercase forms. If it encounters an ambiguous situation (for example, both ORDERS and orders exist), it throws an exception. This logic correctly handles mixed-case names, whereas this PR appears to address only the uppercase case.
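
To make the difference concrete, here is a rough sketch of the lookup-based approach described above. It is not the Trino connector's actual code: the `CaseInsensitiveLookup` class is hypothetical, and the list of remote names stands in for whatever the catalog's list call returns.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.stream.Collectors;

/** Hypothetical illustration of case-insensitive name matching with ambiguity detection. */
final class CaseInsensitiveLookup {
  private CaseInsensitiveLookup() {}

  /**
   * Finds the remote name whose lowercase form equals the lowercase form of the
   * requested name. Returns empty if nothing matches and throws if more than
   * one remote name matches.
   */
  static Optional<String> resolve(List<String> remoteNames, String requested) {
    String wanted = requested.toLowerCase(Locale.ROOT);
    List<String> matches = remoteNames.stream()
        .filter(name -> name.toLowerCase(Locale.ROOT).equals(wanted))
        .collect(Collectors.toList());
    if (matches.size() > 1) {
      throw new IllegalStateException(
          "Ambiguous name '" + requested + "': matches " + matches);
    }
    return matches.stream().findFirst();
  }
}
```

Because the remote name is returned in its original case, a table stored as `MyTable` stays reachable, at the cost of listing the remote objects on lookup.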

@gaborkaszab
Contributor

Hi @lunar-shadow, thanks for the PR!
I'm trying to think through how this would work with catalog server implementations that preserve the case of names. For example, these scenarios might cause some headaches:

  1. If we have a table with a mixed-case name in the catalog (e.g. MyTable), then we won't be able to query it after turning this feature on.
  2. There could be other engines writing tables into the catalog that don't use this feature. As a result, there is no guarantee that tables created by those other engines are accessible from an engine with the feature turned on.
  3. There could be different tables in the catalog whose names differ only by case, e.g. mytable, MYTABLE, MyTable. If the user queries for MyTable with this feature set to lower_case, the user might get a different table than expected.
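
The scenarios above can be illustrated with a small sketch. It assumes a catalog server that stores names case-sensitively and a client that lowercases every identifier; the `CaseCollisionReport` class is hypothetical and only models the name handling, not any real catalog API.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Hypothetical illustration of what a lower_case-normalizing client can and cannot reach. */
final class CaseCollisionReport {
  private CaseCollisionReport() {}

  /** Stored names that differ from their lowercase form, so a lowercasing client never requests them. */
  static List<String> unreachable(List<String> existing) {
    return existing.stream()
        .filter(name -> !name.equals(name.toLowerCase(Locale.ROOT)))
        .collect(Collectors.toList());
  }

  /** Groups stored names by lowercase form; groups larger than one differ only by case. */
  static Map<String, List<String>> groupByLowerCase(List<String> existing) {
    return existing.stream()
        .collect(Collectors.groupingBy(
            name -> name.toLowerCase(Locale.ROOT), TreeMap::new, Collectors.toList()));
  }
}
```

For a catalog containing mytable, MYTABLE, and MyTable, `unreachable` reports MYTABLE and MyTable, and all three names fall into the single group `mytable`, so a query for MyTable would silently resolve to the table stored as mytable.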

Development

Successfully merging this pull request may close these issues.

[REST Catalog] Case-insensitive identifier matching for Polaris and similar catalogs

3 participants