Lookup join silently ignores non-equi conditions for SEMI / ANTI joins #18686

@gortiz

Summary

LookupJoinOperator supports INNER, LEFT, SEMI, and ANTI join types. For INNER/LEFT it evaluates the join's non-equi conditions after the dimension-table lookup, but for SEMI/ANTI it appears to silently ignore them, which can produce incorrect results.

Details

  • buildJoinedDataBlockDefault (INNER/LEFT) looks up the right row and then applies _nonEquiEvaluators before emitting the joined row.
  • buildJoinedDataBlockSemi / buildJoinedDataBlockAnti only test key existence via _rightTable.containsKey(key) and never reference _nonEquiEvaluators.
  • The constructor builds _nonEquiEvaluators from node.getNonEquiConditions() for every join type, so non-equi conditions present on a SEMI/ANTI lookup join are constructed but never evaluated.
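To make the asymmetry concrete, here is a minimal model of the two code paths. It is not the Pinot source: the class, the method names, and the `int[] {key, value}` row layout are hypothetical, and a plain `Map` stands in for the dimension table. It only mirrors the shape described above, where the default path applies the non-equi predicate and the SEMI path receives it but never uses it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Simplified model only; not the Pinot source. Rows are int[] {key, value}.
public class LookupJoinPathsSketch {

  // Shape of the INNER path: look up the right row, then apply the non-equi predicate.
  static List<int[]> innerPath(List<int[]> left, Map<Integer, int[]> rightTable,
      BiPredicate<int[], int[]> nonEqui) {
    List<int[]> out = new ArrayList<>();
    for (int[] l : left) {
      int[] r = rightTable.get(l[0]);
      if (r != null && nonEqui.test(l, r)) {
        out.add(new int[] {l[0], l[1], r[0], r[1]});
      }
    }
    return out;
  }

  // Shape of the SEMI path as described: key existence only.
  // The nonEqui predicate is passed in but never evaluated.
  static List<int[]> semiPathKeyOnly(List<int[]> left, Map<Integer, int[]> rightTable,
      BiPredicate<int[], int[]> nonEqui) {
    List<int[]> out = new ArrayList<>();
    for (int[] l : left) {
      if (rightTable.containsKey(l[0])) {
        out.add(l);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<int[]> left = List.of(new int[] {1, 10}, new int[] {2, 50});
    Map<Integer, int[]> right = Map.of(1, new int[] {1, 20}, 2, new int[] {2, 20});
    BiPredicate<int[], int[]> leftValueBelowRight = (l, r) -> l[1] < r[1];

    System.out.println(innerPath(left, right, leftValueBelowRight).size());       // prints 1
    System.out.println(semiPathKeyOnly(left, right, leftValueBelowRight).size()); // prints 2
  }
}
```

With the same inputs and the same predicate, the INNER path emits one row while the key-only SEMI path keeps both left rows.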

Can SEMI/ANTI lookup joins carry non-equi conditions?

It appears so. RelToPlanNodeConverter#convertLogicalJoin builds the lookup JoinNode with joinInfo.nonEquiConditions for any join type and does not forbid non-equi conditions for lookup joins. (By contrast, the ASOF branch explicitly calls Preconditions.checkState(joinInfo.nonEquiConditions.isEmpty(), ...).) I couldn't find a guard or a comment indicating that SEMI/ANTI lookup joins are guaranteed to be equi-only.

Impact

For a SEMI/ANTI lookup join that has a non-equi condition, the predicate is dropped: a SEMI join keeps left rows that the predicate should exclude, and an ANTI join drops left rows it should keep.
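A small worked example may help quantify this. The data and the condition `left.id = right.id AND left.amount < right.limit` are made up for illustration, and the class below is a standalone reference calculation, not operator code. It compares the result under the full join condition with the result when only key existence is checked.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical data; left rows are int[] {id, amount}, right table maps id -> limit.
public class SemiAntiImpactExample {
  static final List<int[]> LEFT =
      List.of(new int[] {1, 10}, new int[] {2, 50}, new int[] {3, 5});
  static final Map<Integer, Integer> RIGHT = Map.of(1, 20, 2, 20);

  // Full join condition: left.id = right.id AND left.amount < right.limit
  static boolean matches(int[] l) {
    Integer limit = RIGHT.get(l[0]);
    return limit != null && l[1] < limit;
  }

  // keyOnly = true drops the non-equi predicate; anti = true keeps the non-matching rows.
  static List<Integer> ids(boolean keyOnly, boolean anti) {
    return LEFT.stream()
        .filter(l -> (keyOnly ? RIGHT.containsKey(l[0]) : matches(l)) != anti)
        .map(l -> l[0])
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    System.out.println("SEMI expected: " + ids(false, false)); // [1]
    System.out.println("SEMI key-only: " + ids(true, false));  // [1, 2]
    System.out.println("ANTI expected: " + ids(false, true));  // [2, 3]
    System.out.println("ANTI key-only: " + ids(true, true));   // [3]
  }
}
```

Row 2 has a matching key but fails the predicate, so it is the row that a key-only SEMI join wrongly keeps and a key-only ANTI join wrongly drops.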

Questions / suggestions

  • Is this intended, i.e., are SEMI/ANTI lookup joins guaranteed elsewhere never to carry non-equi conditions? If so, a Preconditions.checkState or a comment would make the invariant explicit.
  • Otherwise, SEMI/ANTI should evaluate the non-equi conditions (which would require fetching the right row via lookupValues instead of containsKey), or the planner should reject the combination (as ASOF does).
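The two options could look roughly like the sketch below. Everything in it is hypothetical: the class, the `JoinType` enum, and the method names are illustrative, `Map.get` stands in for fetching the right row (the real lookupValues signature is not assumed here), and the guard throws `IllegalStateException` directly, which is what Guava's `Preconditions.checkState` throws on failure. It also assumes at most one right row per key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Illustrative sketch of both remedies; not the Pinot source.
public class SemiAntiRemediesSketch {
  enum JoinType { INNER, LEFT, SEMI, ANTI }

  // Option A: fetch the right row and evaluate the non-equi predicate.
  // anti = false gives SEMI semantics, anti = true gives ANTI semantics.
  static List<int[]> semiOrAnti(List<int[]> left, Map<Integer, int[]> rightTable,
      BiPredicate<int[], int[]> nonEqui, boolean anti) {
    List<int[]> out = new ArrayList<>();
    for (int[] l : left) {
      int[] r = rightTable.get(l[0]); // stand-in for the right-row lookup
      boolean matched = r != null && nonEqui.test(l, r);
      if (matched != anti) {
        out.add(l);
      }
    }
    return out;
  }

  // Option B: reject the combination up front, similar in spirit to the ASOF guard.
  static void checkSupported(JoinType type, int nonEquiConditionCount) {
    if ((type == JoinType.SEMI || type == JoinType.ANTI) && nonEquiConditionCount > 0) {
      throw new IllegalStateException(
          "Lookup " + type + " join does not support non-equi conditions");
    }
  }
}
```

Option A changes results for queries that currently run, while Option B turns a silent wrong answer into an explicit failure; which one is preferable depends on whether such plans are meant to be supported.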

References: LookupJoinOperator#buildJoinedDataBlockSemi / #buildJoinedDataBlockAnti vs #buildJoinedDataBlockDefault; RelToPlanNodeConverter#convertLogicalJoin.
