Background
Spark's time-window grouping expressions currently fall back to Spark in Comet:
window(timeColumn, windowDuration, [slideDuration], [startTime]) (TimeWindow) - tumbling/sliding windows, common in batch aggregation (GROUP BY window(ts, '1 hour')), not just streaming.
session_window(timeColumn, gapDuration) (SessionWindow) - session windows.
window_time(window) (WindowTime) - extracts the event time from a window column.
Comet has no serde for TimeWindow / SessionWindow / WindowTime today, so any query using them falls back.
Notes
These are not plain scalar functions: Spark's analyzer (TimeWindowing / SessionWindowing rules) rewrites window() / session_window() into an Expand plus grouping on a computed window struct. Native support would need to handle the rewritten form (struct construction and the window-boundary arithmetic) and the grouping that follows.
window() over a fixed duration is the most commonly used of the three and would be the natural starting point.
Acceptance criteria
window, session_window, and window_time execute natively in Comet and match Spark.
- Add SQL file test coverage under
expressions/datetime/.
Background
Spark's time-window grouping expressions currently fall back to Spark in Comet:
window(timeColumn, windowDuration, [slideDuration], [startTime])(TimeWindow) - tumbling/sliding windows, common in batch aggregation (GROUP BY window(ts, '1 hour')), not just streaming.session_window(timeColumn, gapDuration)(SessionWindow) - session windows.window_time(window)(WindowTime) - extracts the event time from a window column.Comet has no serde for
TimeWindow/SessionWindow/WindowTimetoday, so any query using them falls back.Notes
These are not plain scalar functions: Spark's analyzer (
TimeWindowing/SessionWindowingrules) rewriteswindow()/session_window()into anExpandplus grouping on a computed window struct. Native support would need to handle the rewritten form (struct construction and the window-boundary arithmetic) and the grouping that follows.window()over a fixed duration is the most commonly used of the three and would be the natural starting point.Acceptance criteria
window,session_window, andwindow_timeexecute natively in Comet and match Spark.expressions/datetime/.