HIVE-29503: Prevent Join cardinality overestimation of joins with NDV(0) columns#6356

Open
konstantinb wants to merge 7 commits into apache:master from konstantinb:HIVE-29503
@konstantinb
Contributor

What changes were proposed in this pull request?

HIVE-29503: Use the fallback of half the number of rows when estimating the join product row count with an NDV of 0
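
To illustrate the idea, here is a minimal sketch, not the code in this patch. It assumes the textbook join estimate `leftRows * rightRows / max(leftNdv, rightNdv)` and treats an NDV of 0 as "unknown", substituting half of that side's row count. The class and method names (`JoinNdvFallbackSketch`, `effectiveNdv`, `estimateJoinRows`) are hypothetical and do not correspond to Hive classes.

```java
public final class JoinNdvFallbackSketch {
    private JoinNdvFallbackSketch() {}

    /**
     * Hypothetical helper: picks the NDV used in the join denominator.
     * An NDV of 0 is treated as "unknown" and replaced by half the row
     * count (at least 1), as described in this PR.
     */
    public static long effectiveNdv(long ndv, long numRows) {
        if (ndv > 0) {
            return ndv;
        }
        return Math.max(1L, numRows / 2);
    }

    /**
     * Textbook equi-join estimate (an assumption, not Hive's exact formula):
     * rows(L) * rows(R) / max(ndv(L), ndv(R)).
     */
    public static long estimateJoinRows(long leftRows, long leftNdv,
                                        long rightRows, long rightNdv) {
        long denominator = Math.max(effectiveNdv(leftNdv, leftRows),
                                    effectiveNdv(rightNdv, rightRows));
        double estimate = (double) leftRows * (double) rightRows / (double) denominator;
        return (long) Math.ceil(estimate);
    }

    public static void main(String[] args) {
        // Left side has no NDV (0); the fallback uses 1000 / 2 = 500.
        System.out.println(estimateJoinRows(1000, 0, 200, 50)); // prints 400
    }
}
```

In this sketch, ignoring the NDV(0) side would divide by 50 and yield 4000 rows, and clamping the denominator to 1 would yield the full cross product of 200000 rows; the fallback keeps the estimate at 400.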

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

…ng the join product row count with an NDV of 0
@konstantinb changed the title from "HIVE-29503: Use the fallback of half the number of rows when estimating the join product row count with an NDV of 0" to "HIVE-29503: Prevent Join cardinality overestimation of joins with NDV(0) columns" on Mar 23, 2026
@konstantinb marked this pull request as ready for review on March 23, 2026 23:19
input vertices:
1 Map 2
Statistics: Num rows: 4 Data size: 1184 Basic stats: COMPLETE Column stats: COMPLETE
Statistics: Num rows: 2 Data size: 325 Basic stats: COMPLETE Column stats: NONE
Contributor Author

At the moment, the NDV of datetime/timestamp columns is not assigned to colstats even when it is available. Changing that would improve this estimate; however, doing so impacts over 100 .out files, so perhaps it belongs in a separate story?

@konstantinb
Contributor Author

@sonarqubecloud
