Description
Is your feature request related to a problem or challenge?
Accumulators (of aggregate functions) can be used in window functions, even if the aggregate function doesn't support sliding accumulators; see this issue:
This means accumulators cannot consume their internal state in evaluate(), since it may be called more than once on the same accumulator, so they have to clone that state instead. For example, see StringAgg:
datafusion/datafusion/functions-aggregate/src/string_agg.rs
Lines 386 to 394 in a02e683
```rust
fn evaluate(&mut self) -> Result<ScalarValue> {
    if self.has_value {
        Ok(ScalarValue::LargeUtf8(Some(
            self.accumulated_string.clone(),
        )))
    } else {
        Ok(ScalarValue::LargeUtf8(None))
    }
}
```
Previously we'd be able to std::mem::take the string, but because of the above issue we need to clone it for correctness.
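To illustrate why the clone is needed, here is a minimal standalone sketch. It does not use DataFusion types: `ToyStringAgg`, `evaluate_cloning`, `evaluate_taking`, and `running_results` are hypothetical names made up for this example, and the loop only approximates a growing window frame by calling evaluate after every row.

```rust
/// Toy stand-in for a string-concatenating accumulator.
/// Hypothetical type for illustration, not the DataFusion implementation.
#[derive(Default)]
struct ToyStringAgg {
    accumulated: String,
    has_value: bool,
}

impl ToyStringAgg {
    /// Append one value, inserting the delimiter between values.
    fn update(&mut self, value: &str, delimiter: &str) {
        if self.has_value {
            self.accumulated.push_str(delimiter);
        }
        self.accumulated.push_str(value);
        self.has_value = true;
    }

    /// State-preserving: safe to call repeatedly, at the cost of one clone per call.
    fn evaluate_cloning(&mut self) -> Option<String> {
        self.has_value.then(|| self.accumulated.clone())
    }

    /// State-consuming: avoids the clone, but leaves an empty string behind.
    fn evaluate_taking(&mut self) -> Option<String> {
        self.has_value.then(|| std::mem::take(&mut self.accumulated))
    }
}

/// Approximates a growing window frame: evaluate after each new row.
fn running_results(rows: &[&str], consume: bool) -> Vec<Option<String>> {
    let mut acc = ToyStringAgg::default();
    rows.iter()
        .map(|row| {
            acc.update(row, ",");
            if consume {
                acc.evaluate_taking()
            } else {
                acc.evaluate_cloning()
            }
        })
        .collect()
}
```

With rows `a`, `b`, `c`, the cloning variant yields `a`, `a,b`, `a,b,c`, while the taking variant yields `a`, `,b`, `,c`, because each call empties the buffer that later rows still depend on.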
Describe the solution you'd like
Could accumulators be told whether they are being used in a window function, where they must preserve their internal state, or in regular aggregation, where they can consume it as a potential optimization?
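One possible shape for this, purely as a sketch: the caller tells the accumulator how it will be used, and evaluate picks between taking and cloning. Everything here (`EvalMode`, `ModeAwareStringAgg`, the fixed `,` delimiter) is hypothetical and is not an existing DataFusion API; how such a hint would actually reach the accumulator is left open.

```rust
/// Hypothetical hint describing how the accumulator will be evaluated.
#[derive(Clone, Copy, Debug, PartialEq)]
enum EvalMode {
    /// Regular aggregation: evaluate is called once, so state may be consumed.
    Final,
    /// Window usage: evaluate may be called again, so state must survive.
    Preserving,
}

/// Hypothetical accumulator that knows its evaluation mode.
struct ModeAwareStringAgg {
    accumulated: String,
    has_value: bool,
    mode: EvalMode,
}

impl ModeAwareStringAgg {
    fn new(mode: EvalMode) -> Self {
        Self {
            accumulated: String::new(),
            has_value: false,
            mode,
        }
    }

    /// Append one value, separated by a fixed "," delimiter.
    fn update(&mut self, value: &str) {
        if self.has_value {
            self.accumulated.push(',');
        }
        self.accumulated.push_str(value);
        self.has_value = true;
    }

    /// Take the string when allowed, otherwise clone it.
    fn evaluate(&mut self) -> Option<String> {
        if !self.has_value {
            return None;
        }
        Some(match self.mode {
            EvalMode::Final => std::mem::take(&mut self.accumulated),
            EvalMode::Preserving => self.accumulated.clone(),
        })
    }
}
```

In `Final` mode the buffer is moved out without copying; in `Preserving` mode repeated calls keep returning the full accumulated string.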
Describe alternatives you've considered
Maybe not worth doing if the clone would have minimal performance impact?
Also maybe not an issue if we proceed with this issue:
Additional context
No response