-
Notifications
You must be signed in to change notification settings - Fork 268
perf: Additional optimizations for cast from string to int #3048
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
||
| macro_rules! cast_utf8_to_int { | ||
| ($array:expr, $eval_mode:expr, $array_type:ty, $cast_method:ident) => {{ | ||
| ($array:expr, $array_type:ty, $parse_fn:expr) => {{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pass in the parsing function as a closure rather than passing in the eval mode and then matching on that for each row.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thats super nice
| match eval_mode { | ||
| EvalMode::Legacy => do_parse_string_to_int_legacy(str, min_value), | ||
| EvalMode::Ansi => do_parse_string_to_int_ansi(str, type_name, min_value), | ||
| EvalMode::Try => do_parse_string_to_int_try(str, min_value), | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was happening once per row
|
Current performance compared to main branch: |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3048 +/- ##
============================================
+ Coverage 56.12% 59.50% +3.37%
- Complexity 976 1381 +405
============================================
Files 119 167 +48
Lines 11743 15522 +3779
Branches 2251 2575 +324
============================================
+ Hits 6591 9236 +2645
- Misses 4012 4989 +977
- Partials 1140 1297 +157 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| .map(|v| v as i16)) | ||
| fn parse_string_to_i8_legacy(str: &str) -> SparkResult<Option<i8>> { | ||
| match do_parse_string_to_int_legacy::<i32>(str, i32::MIN)? { | ||
| None => Ok(None), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| None => Ok(None), |
?
comphead
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @andygrove it is similar to #2600 (comment) suggestion and thanks for addressing those issues
I'll take a look next. |
|
thank you @andygrove |
Which issue does this PR close?
N/A
Rationale for this change
This follows on from #3017 and makes some additional optimizations.
See comment #3048 (comment) for criterion benchmark results.
What changes are included in this PR?
trim_asciiinstead of custom trim logicdo_parse_string_to_int_legacyand remove mutableparse_sign_and_digitsHow are these changes tested?
Existing tests.