
fix(stats): widen sum_value integer arithmetic to SUM-compatible types#20865

Open
kumarUjjawal wants to merge 3 commits into apache:main from kumarUjjawal:fix/precision_sum_i64

Conversation

@kumarUjjawal
Contributor

Which issue does this PR close?

Rationale for this change

As discussed in the review thread on #20768 and tracked by #20826, sum_value should not keep narrow integer column types during stats aggregation, because merge/multiply paths can overflow before values are widened.
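To make the overflow concern concrete, here is a minimal standalone sketch. It is not DataFusion code: `sum_narrow` and `sum_widened` are hypothetical helpers, and an `i32` column is assumed purely for illustration. It contrasts accumulating in the column's own width with widening each value before adding.

```rust
/// Accumulates in the column's own width (i32 here).
/// Returns None as soon as the running sum no longer fits in i32.
fn sum_narrow(values: &[i32]) -> Option<i32> {
    values.iter().try_fold(0i32, |acc, &v| acc.checked_add(v))
}

/// Widens every value to i64 *before* adding, so two i32 values
/// can never overflow the accumulator on their own.
fn sum_widened(values: &[i32]) -> Option<i64> {
    values
        .iter()
        .try_fold(0i64, |acc, &v| acc.checked_add(i64::from(v)))
}
```

With the input `[i32::MAX, 1]`, the narrow version reports overflow while the widened version yields a valid `i64` sum.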

What changes are included in this PR?

This PR updates statistics sum_value arithmetic to match SUM-style widening for small integer types, and applies that behavior consistently across merge and multiplication paths.
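As a rough illustration of what "SUM-style widening" could look like on both paths, the sketch below uses a toy `Scalar` enum rather than DataFusion's real scalar type. The names `cast_to_sum_type` and `add_for_sum` appear in the review discussion, but their signatures here are assumptions, and `multiply_for_sum` is a hypothetical name. The widening rule assumed is: signed integers go to 64-bit signed, unsigned integers go to 64-bit unsigned.

```rust
// Toy stand-in for a scalar value; NOT DataFusion's ScalarValue.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Scalar {
    Int8(i8),
    Int16(i16),
    Int32(i32),
    Int64(i64),
    UInt8(u8),
    UInt16(u16),
    UInt32(u32),
    UInt64(u64),
}

/// Widens narrow integers to the assumed SUM result type.
fn cast_to_sum_type(v: Scalar) -> Scalar {
    match v {
        Scalar::Int8(x) => Scalar::Int64(i64::from(x)),
        Scalar::Int16(x) => Scalar::Int64(i64::from(x)),
        Scalar::Int32(x) => Scalar::Int64(i64::from(x)),
        Scalar::UInt8(x) => Scalar::UInt64(u64::from(x)),
        Scalar::UInt16(x) => Scalar::UInt64(u64::from(x)),
        Scalar::UInt32(x) => Scalar::UInt64(u64::from(x)),
        wide => wide, // Int64 and UInt64 are already wide
    }
}

/// Merge path: widen both operands first, then add with overflow checking.
fn add_for_sum(a: Scalar, b: Scalar) -> Option<Scalar> {
    match (cast_to_sum_type(a), cast_to_sum_type(b)) {
        (Scalar::Int64(x), Scalar::Int64(y)) => x.checked_add(y).map(Scalar::Int64),
        (Scalar::UInt64(x), Scalar::UInt64(y)) => x.checked_add(y).map(Scalar::UInt64),
        _ => None, // mixed signedness is out of scope for this sketch
    }
}

/// Multiply path: widen first, then scale by a row count.
fn multiply_for_sum(v: Scalar, rows: u64) -> Option<Scalar> {
    match cast_to_sum_type(v) {
        Scalar::Int64(x) => i64::try_from(rows)
            .ok()
            .and_then(|n| x.checked_mul(n))
            .map(Scalar::Int64),
        Scalar::UInt64(x) => x.checked_mul(rows).map(Scalar::UInt64),
        _ => None, // unreachable after widening; kept for exhaustiveness
    }
}
```

Under these assumptions, adding two `Int8(127)` values produces `Int64(254)` instead of overflowing `i8`, and both paths return `None` only when the wide type itself overflows.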

Are these changes tested?

Yes

Are there any user-facing changes?

@github-actions (bot) added the labels physical-expr, common, datasource, and physical-plan on Mar 11, 2026
@kumarUjjawal
Contributor Author

cc @jonathanc-n

@Dandandan
Contributor

I think this looks good. @jonathanc-n, can you take a look as well?

Contributor

@jonathanc-n jonathanc-n left a comment


LGTM, thanks!

@asolimando
Member

Thanks for working on this @kumarUjjawal, but I have a few suggestions:

  1. The original idea in #20826 ("Match Precision sum function against Int64 to prevent overflow") was to change the sum_value field type to always store a wide type, making overflow protection structural. The lazy-widening approach here requires every future call site to remember to use add_for_sum/cast_to_sum_type instead of add/multiply, with no compiler enforcement. At minimum, a doc comment on sum_value warning about this would help.

     Changing the field type forces changes to all consumers, but it would prove more robust over time.

  2. The Exact/Inexact arms in cast_to_sum_type are nearly identical and could be collapsed, I think.
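Both suggestions can be sketched in a few lines. The code below is illustrative only: `Precision` is a local stand-in that mirrors the Exact/Inexact/Absent shape rather than the real DataFusion type, and `WideSum`, `map`, and `widen` are hypothetical names. The `map` helper shows one way the two near-identical arms might collapse, and the `WideSum` newtype shows how storing a wide type could make the widening step impossible to forget.

```rust
// Local stand-in for a precision wrapper; not the DataFusion type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Precision<T> {
    Exact(T),
    Inexact(T),
    Absent,
}

impl<T> Precision<T> {
    /// Transforms the inner value while preserving exactness, so callers
    /// do not need to repeat the same logic in separate Exact/Inexact arms.
    fn map<U>(self, f: impl FnOnce(T) -> U) -> Precision<U> {
        match self {
            Precision::Exact(v) => Precision::Exact(f(v)),
            Precision::Inexact(v) => Precision::Inexact(f(v)),
            Precision::Absent => Precision::Absent,
        }
    }
}

/// Hypothetical wide-only sum. Narrow values must pass through `From`,
/// so the compiler enforces widening at every call site.
#[derive(Debug, Clone, Copy, PartialEq)]
struct WideSum(i64);

impl From<i32> for WideSum {
    fn from(v: i32) -> Self {
        WideSum(i64::from(v))
    }
}

impl WideSum {
    /// Overflow-checked addition in the wide type.
    fn checked_add(self, other: WideSum) -> Option<WideSum> {
        self.0.checked_add(other.0).map(WideSum)
    }
}

/// The cast collapses to a single expression once `map` exists.
fn widen(sum: Precision<i32>) -> Precision<WideSum> {
    sum.map(|v| WideSum::from(v))
}
```

The trade-off matches the one described above: the newtype touches every consumer of the field, but arithmetic on a narrow value simply does not type-check.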

@kumarUjjawal force-pushed the fix/precision_sum_i64 branch from 27eff84 to e88ae02 on March 17, 2026 05:09
@Dandandan
Contributor

@kumarUjjawal, can you fix CI?

@kumarUjjawal
Contributor Author

@Dandandan Done!



Development

Successfully merging this pull request may close these issues.

Match Precision sum function against Int64 to prevent overflow.

4 participants