API: Fix Identity projection for mismatched transform types (#15502) #16074

Open
yadavay-amzn wants to merge 1 commit into apache:main from yadavay-amzn:fix/15502-identity-projection

@yadavay-amzn
Contributor

Fixes #15502

Problem

When an Iceberg table has manifests written using a partition spec with identity-transformed timestamp field, queries that filter on that field using a temporal transform like hours() fail with:

ValidationException: Invalid value for conversion to type timestamptz: 490674 (java.lang.Integer)

Identity.projectStrict() creates an unbound predicate with the literal from the input predicate. When the predicate term is a transform (e.g. hours(ts) = 490674), the literal type (integer) does not match the partition field type (timestamptz), causing a ValidationException when the unbound predicate is later bound to the partition schema.
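To make the mismatch concrete: the literal 490674 is a count of hours since the Unix epoch, not a timestamp. The snippet below is a small illustration that uses only java.time. The class and method names are hypothetical and not part of Iceberg, and the arithmetic assumes timestamps at or after the epoch.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper for illustration only; not an Iceberg class.
public class HoursOrdinal {
  // Whole hours elapsed since the Unix epoch (for timestamps at or after the epoch).
  static long toHours(Instant ts) {
    return ChronoUnit.HOURS.between(Instant.EPOCH, ts);
  }

  // Start of the hour that a given ordinal denotes.
  static Instant hourStart(long hours) {
    return Instant.EPOCH.plus(hours, ChronoUnit.HOURS);
  }

  public static void main(String[] args) {
    System.out.println(hourStart(490674)); // 2025-12-22T18:00:00Z
    System.out.println(toHours(Instant.parse("2025-12-22T18:30:00Z"))); // 490674
  }
}
```

An identity partition on ts stores the timestamp itself, so a predicate that compares the partition value against 490674 has no meaningful interpretation.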

Fix

Return null from projectStrict() when the predicate term is not a BoundReference, indicating the identity transform cannot project transform-based predicates. This causes the projection to fall back to alwaysTrue (inclusive) or alwaysFalse (strict), which is the correct behavior.

The fix is a 4-line guard clause at the top of projectStrict(). Since project() delegates to projectStrict(), both paths are covered.
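The shape of the guard can be sketched with stand-in types. Everything below (Ref, TransformTerm, Pred, and the inclusive/strict helpers) is hypothetical and only models the behavior described above; it is not the code in Identity.java.

```java
// Toy model of the guard; all types here are hypothetical stand-ins, not Iceberg classes.
public class IdentityGuardSketch {
  interface Term {}

  // A direct reference to a column, e.g. ts.
  static final class Ref implements Term {
    final String field;
    Ref(String field) { this.field = field; }
  }

  // A transform applied to a column, e.g. hours(ts).
  static final class TransformTerm implements Term {
    final String transform;
    final Ref ref;
    TransformTerm(String transform, Ref ref) { this.transform = transform; this.ref = ref; }
  }

  // An equality predicate: term = literal.
  static final class Pred {
    final Term term;
    final Object literal;
    Pred(Term term, Object literal) { this.term = term; this.literal = literal; }
  }

  // Returns a predicate on the partition field, or null when no projection is possible.
  static Pred projectStrict(String partitionName, Pred pred) {
    // Guard: the literal of a transform predicate is typed for the transform result,
    // not for the source column, so it must not be copied onto the partition field.
    if (!(pred.term instanceof Ref)) {
      return null;
    }
    return new Pred(new Ref(partitionName), pred.literal);
  }

  // Fallbacks when projection returns null, as described in the PR text.
  static String inclusive(String partitionName, Pred pred) {
    return projectStrict(partitionName, pred) == null ? "alwaysTrue" : "projected";
  }

  static String strict(String partitionName, Pred pred) {
    return projectStrict(partitionName, pred) == null ? "alwaysFalse" : "projected";
  }

  public static void main(String[] args) {
    Pred onTransform = new Pred(new TransformTerm("hours", new Ref("ts")), 490674);
    System.out.println(inclusive("ts", onTransform)); // alwaysTrue
    System.out.println(strict("ts", onTransform));    // alwaysFalse
  }
}
```

Predicates on a plain reference are unaffected by the guard and still project as before.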

Testing

Added a regression test in TestProjection that reproduces the exact scenario from the issue: identity-partitioned timestamptz field filtered with hours() transform, verifying both inclusive and strict projections.

@github-actions github-actions Bot added the API label Apr 22, 2026
@Test
public void testIdentityProjectionWithTransformPredicate() {
// Regression test for https://github.com/apache/iceberg/issues/15502
// Identity-partitioned timestamptz field filtered with hours() should not throw
Member

This sentence looks incomplete. "... should not throw ValidationException"?

Contributor Author

Fixed — completed the sentence.


// The identity transform cannot project a hours-transform predicate, so it should
// be replaced with alwaysTrue (inclusive) rather than throwing ValidationException
assertThat(projected).isEqualTo(Expressions.alwaysTrue());
Member

ValidationException isn't thrown in this test even if I revert the change to Identity.java.

Contributor Author

The exception is caught internally by the projection framework and converted to alwaysTrue/alwaysFalse. Without the fix, projectStrict() creates an UnboundPredicate with an integer literal for a timestamptz field, which fails during binding — the test assertion fails because the projection result is neither alwaysTrue nor alwaysFalse. Updated the comments to explain this more clearly.

…5502)

Identity.project() delegates to projectStrict(), which
creates an unbound predicate with the literal from the input predicate.
When the predicate term is a transform (e.g. hours(ts) = 490674), the
literal type (integer) does not match the partition field type
(timestamptz), causing a ValidationException when the unbound predicate
is later bound to the partition schema.

Fix: Return null from projectStrict() when the predicate term is not a
BoundReference, indicating the identity transform cannot project
transform-based predicates. This causes the projection to fall back to
alwaysTrue (inclusive) or alwaysFalse (strict), which is correct.
@yadavay-amzn yadavay-amzn force-pushed the fix/15502-identity-projection branch from 72461b2 to a078b0f Compare April 22, 2026 01:15
Development

Successfully merging this pull request may close these issues.

ValidationException when filtering identity-partitioned timestamp field using other transforms