Describe
Auron currently rejects BroadcastHashJoinExec when Spark attaches a non-empty join condition beyond the equality keys. As a result, inner broadcast joins with additional predicates fall back even when the residual predicate could be evaluated natively after the join.
Describe the solution you'd like
Support residual conditions for native BroadcastHashJoinExec when the join type is InnerLike.
As in the first step for SMJ/SHJ, the broadcast hash join itself can still use only the equality keys, while the residual condition is applied as a native filter above the native join output. This keeps the implementation incremental and avoids requiring a protocol change in the first version.
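To illustrate why filtering above the join is safe for inner joins, here is a minimal Python sketch. It is a toy model, not Auron code: the function names, the tuple row layout, and the sample data are all made up for illustration. It checks that an equi-key hash join followed by a filter yields the same rows as a join that evaluates the full condition.

```python
from collections import defaultdict

def hash_join_inner(build, probe, build_key, probe_key):
    """Inner hash join on equality keys only; returns (build_row, probe_row) pairs."""
    table = defaultdict(list)
    for b in build:
        k = build_key(b)
        if k is not None:          # NULL keys never match in an equi-join
            table[k].append(b)
    out = []
    for p in probe:
        k = probe_key(p)
        if k is None:
            continue
        for b in table.get(k, ()):
            out.append((b, p))
    return out

def join_then_filter(build, probe, build_key, probe_key, residual):
    """Proposed shape: equi-key join first, residual predicate as a filter on top."""
    joined = hash_join_inner(build, probe, build_key, probe_key)
    return [(b, p) for (b, p) in joined if residual(b, p)]

def reference_join(build, probe, build_key, probe_key, residual):
    """Reference: nested loop that evaluates keys and residual together."""
    return [
        (b, p)
        for p in probe
        for b in build
        if build_key(b) is not None
        and build_key(b) == probe_key(p)
        and residual(b, p)
    ]

# Hypothetical data: dims are (id, threshold), facts are (dim_id, amount).
dims = [(1, 10), (2, 20), (None, 30)]
facts = [(1, 5), (1, 15), (2, 25), (3, 1), (None, 2)]
key = lambda row: row[0]
residual = lambda dim, fact: fact[1] > dim[1]   # e.g. fact.amount > dim.threshold

result = join_then_filter(dims, facts, key, key, residual)
```

The equivalence relies on inner-join semantics: every output row is a matched pair, so dropping pairs after the join cannot change which rows survive. The same rewrite would not be valid for outer, semi, or anti joins, where the condition decides whether a row counts as matched.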
The scope should be limited to:
- BroadcastHashJoinExec
- InnerLike joins
- residual predicates that are already representable by existing native expression conversion
Unsupported predicates should continue to fall back to Spark.
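The scoping rules above could be sketched as a small decision function. This is a hypothetical model in Python, not Auron's conversion code: `BhjPlan`, `plan_native`, `can_convert`, and the returned labels are invented for illustration, and it assumes that a broadcast hash join without a condition is already converted natively today.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Spark's InnerLike join types are Inner and Cross.
INNER_LIKE = {"Inner", "Cross"}

@dataclass(frozen=True)
class BhjPlan:
    """Hypothetical stand-in for a BroadcastHashJoinExec node."""
    join_type: str
    condition: Optional[str]  # residual predicate beyond the equality keys, if any

def plan_native(join: BhjPlan, can_convert: Callable[[str], bool]) -> str:
    """Decide how a broadcast hash join would be planned under the proposed scope."""
    if join.condition is None:
        return "NativeBHJ"                 # assumed existing behavior, unchanged
    if join.join_type not in INNER_LIKE:
        return "SparkFallback"             # outer/semi/anti stay out of scope
    if not can_convert(join.condition):
        return "SparkFallback"             # predicate not representable natively
    return "NativeFilter(NativeBHJ)"       # residual applied above the join output

# Hypothetical converter: pretend only simple comparisons are supported.
supported = lambda cond: "udf(" not in cond
```

For example, `plan_native(BhjPlan("Inner", "a.x > b.y"), supported)` takes the native filter path, while a `LeftOuter` join with the same condition, or an inner join whose condition contains an unsupported expression, falls back to Spark.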
Describe alternatives you've considered
One alternative is to defer broadcast hash join support until a generic join-time condition field exists in the native join protocol. That would be more uniform, but it also increases the scope significantly.
Another alternative is to keep falling back for all broadcast joins with non-empty conditions, but that leaves an important join path uncovered.
Additional context
This issue should stay focused on inner broadcast hash joins and should not include:
- outer joins
- semi/anti joins
- null-aware anti join special handling
- broadcast nested loop joins
Suggested validation:
- inner BHJ with equi keys + residual predicate
- build-left and build-right cases
- native filter placement above BHJ output
- clean fallback for unsupported predicates