
Support native residual conditions for inner broadcast hash joins #2194

@weimingdiit

Describe
Auron currently rejects BroadcastHashJoinExec when Spark attaches a non-empty join condition beyond the equality keys. As a result, inner broadcast joins with additional predicates fall back to Spark even when the residual predicate could be evaluated natively after the join.

Describe the solution you'd like
Support residual conditions for native BroadcastHashJoinExec when the join type is InnerLike.

As in the first step taken for sort-merge join (SMJ) and shuffled hash join (SHJ), the broadcast hash join itself can still use only the equality keys, while the residual condition is applied as a native filter above the native join output. This keeps the implementation incremental and avoids a protocol change in the first version.
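To make the proposed plan shape concrete, here is a minimal sketch in plain Scala. Every type and function in it (`Plan`, `Scan`, `BroadcastHashJoin`, `NativeJoin`, `NativeFilter`, `convert`) is a hypothetical stand-in invented for illustration, not an Auron or Spark class, and predicates are plain strings rather than real expressions.

```scala
object ResidualFilterSketch {
  // Toy plan nodes. None of these are real Auron or Spark classes.
  sealed trait Plan
  final case class Scan(name: String) extends Plan
  final case class BroadcastHashJoin(
      left: Plan,
      right: Plan,
      equiKeys: Seq[(String, String)],
      residual: Option[String]) extends Plan
  final case class NativeJoin(
      left: Plan,
      right: Plan,
      equiKeys: Seq[(String, String)]) extends Plan
  final case class NativeFilter(predicate: String, child: Plan) extends Plan

  // The native join keeps only the equality keys. A residual predicate,
  // if present, becomes a filter directly above the join output.
  def convert(join: BroadcastHashJoin): Plan = {
    val nativeJoin = NativeJoin(join.left, join.right, join.equiKeys)
    join.residual match {
      case Some(predicate) => NativeFilter(predicate, nativeJoin)
      case None            => nativeJoin
    }
  }
}
```

In this sketch the join node carries no condition field at all, which mirrors the point that the first version would not need a protocol change.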

The scope should be limited to:

  • BroadcastHashJoinExec
  • InnerLike joins
  • residual predicates that are already representable by existing native expression conversion

Unsupported predicates should continue to fall back to Spark.
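The scope rules above could be expressed as a small decision function. The following is only a sketch under stated assumptions: the join types, the `Expr` hierarchy, `toNative`, and `decide` are toy definitions that borrow familiar names but are not Spark's or Auron's actual classes, and `toNative` stands in for whatever native expression conversion already exists. It also assumes that joins without a residual condition keep their current behavior.

```scala
object BhjResidualScopeSketch {
  // Toy join types. InnerLike groups Inner and Cross, as Spark's JoinType does.
  sealed trait JoinType
  sealed trait InnerLike extends JoinType
  case object Inner extends InnerLike
  case object Cross extends InnerLike
  case object LeftOuter extends JoinType
  case object LeftSemi extends JoinType

  // Toy expression tree. OpaqueUdf models a predicate with no native form.
  sealed trait Expr
  final case class Column(name: String) extends Expr
  final case class GreaterThan(left: Expr, right: Expr) extends Expr
  final case class OpaqueUdf(name: String) extends Expr

  // Hypothetical converter: Some(native form) when every sub-expression converts.
  def toNative(expr: Expr): Option[String] = expr match {
    case Column(name) => Some(name)
    case GreaterThan(l, r) =>
      for { a <- toNative(l); b <- toNative(r) } yield s"($a > $b)"
    case OpaqueUdf(_) => None
  }

  sealed trait Decision
  case object FallBackToSpark extends Decision
  final case class RunNatively(residualFilter: Option[String]) extends Decision

  def decide(joinType: JoinType, condition: Option[Expr]): Decision =
    condition match {
      // Assumption: condition-free joins are handled as they are today.
      case None => RunNatively(None)
      case Some(expr) =>
        joinType match {
          case _: InnerLike =>
            toNative(expr) match {
              case Some(native) => RunNatively(Some(native))
              case None         => FallBackToSpark // predicate not convertible
            }
          case _ => FallBackToSpark // outer, semi, anti: out of scope here
        }
    }
}
```

The check is all-or-nothing: a residual that converts only partially still falls back, so no join ever runs natively with part of its condition dropped.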

Describe alternatives you've considered
One alternative is to defer broadcast hash join support until a generic join-time condition field exists in the native join protocol. That would be more uniform, but it also increases the scope significantly.

Another alternative is to keep falling back for all broadcast joins with non-empty conditions, but that leaves an important join path uncovered.

Additional context
This issue should stay focused on inner broadcast hash joins and should not include:

  • outer joins
  • semi/anti joins
  • null-aware anti join special handling
  • broadcast nested loop joins

Suggested validation:

  • inner BHJ with equi keys + residual predicate
  • build-left and build-right cases
  • native filter placement above BHJ output
  • clean fallback for unsupported predicates
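The semantic property behind the first three validation items can be checked on in-memory data. This is a sketch, not an Auron test. The `Order` and `Customer` row types and both functions are made up for illustration. It compares a reference join that evaluates the full condition against a hash join on the equality key followed by a filter, for both build sides.

```scala
object InnerJoinResidualCheck {
  // Hypothetical row types used only for this illustration.
  final case class Order(custId: Int, amount: Int)
  final case class Customer(id: Int, limit: Int)

  // Reference result: nested loop evaluating equality key and residual together.
  def reference(orders: Seq[Order], customers: Seq[Customer]): Set[(Order, Customer)] =
    (for {
      o <- orders
      c <- customers
      if o.custId == c.id && o.amount > c.limit
    } yield (o, c)).toSet

  // Hash join on the equality key only, then the residual as a filter on top.
  // buildRight chooses which side is hashed (the "broadcast" side).
  def hashJoinThenFilter(
      orders: Seq[Order],
      customers: Seq[Customer],
      buildRight: Boolean): Set[(Order, Customer)] = {
    val joined: Seq[(Order, Customer)] =
      if (buildRight) {
        val table = customers.groupBy(_.id)
        orders.flatMap(o => table.getOrElse(o.custId, Nil).map(c => (o, c)))
      } else {
        val table = orders.groupBy(_.custId)
        customers.flatMap(c => table.getOrElse(c.id, Nil).map(o => (o, c)))
      }
    // Results are compared as sets, so the sample data should hold distinct rows.
    joined.filter { case (o, c) => o.amount > c.limit }.toSet
  }
}
```

For inner joins the two forms agree because every output row comes from a matched pair. For outer joins a filter placed after the join would also discard null-extended rows that the join condition alone would have kept, which is consistent with leaving those join types out of scope.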
