diff --git a/.prettierignore b/.prettierignore
new file mode 100644
index 0000000000..04933aa70c
--- /dev/null
+++ b/.prettierignore
@@ -0,0 +1,22 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# prettier re-aligns markdown table columns to the widest cell, so adding a
+# single expression row rewrites every row in the table. That produces noisy
+# diffs and frequent merge conflicts between PRs that each add new expressions.
+# This file is almost entirely tables, so exempt it from prettier formatting.
+docs/source/user-guide/latest/expressions.md
diff --git a/docs/source/user-guide/latest/expressions.md b/docs/source/user-guide/latest/expressions.md
index 3627766445..916d4a1785 100644
--- a/docs/source/user-guide/latest/expressions.md
+++ b/docs/source/user-guide/latest/expressions.md
@@ -39,12 +39,12 @@ Most expressions can also be disabled with `spark.comet.expression.EXPRNAME.enab
 
 ## Status legend
 
-| Status | Meaning |
-| ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
-| ✅ Supported | Comet produces Spark-compatible results by default. Some inputs or forms may fall back to Spark, and any incompatible behavior is opt-in (off by default). |
-| ⚠️ Incorrect by default | Comet runs natively by default but can return results that differ from Spark (a wrong value, or a native error on valid input). See the linked detail on each row. |
-| 🔜 Planned | Intended; tracked by an open issue or pull request. |
-| 💤 Not currently planned | Not on the current roadmap; falls back to Spark and may be reconsidered later. |
+| Status | Meaning |
+| --- | --- |
+| ✅ Supported | Comet produces Spark-compatible results by default. Some inputs or forms may fall back to Spark, and any incompatible behavior is opt-in (off by default). |
+| ⚠️ Incorrect by default | Comet runs natively by default but can return results that differ from Spark (a wrong value, or a native error on valid input). See the linked detail on each row. |
+| 🔜 Planned | Intended; tracked by an open issue or pull request. |
+| 💤 Not currently planned | Not on the current roadmap; falls back to Spark and may be reconsidered later. |
 
 ## Not currently planned
 
@@ -67,146 +67,146 @@ The tables below list every Spark built-in expression with its current status.
## agg_funcs -| Function | Status | Notes | -| ----------------------- | ------ | -------------------------------------------------------------------------------------------------------------------------- | -| `any` | ✅ | | -| `any_value` | ✅ | | -| `approx_count_distinct` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `array_agg` | 🔜 | Array aggregate (related to `collect_list`, [#2524](https://github.com/apache/datafusion-comet/issues/2524)) | -| `avg` | ✅ | Interval types fall back | -| `bit_and` | ✅ | | -| `bit_or` | ✅ | | -| `bit_xor` | ✅ | | -| `bool_and` | ✅ | | -| `bool_or` | ✅ | | -| `collect_list` | 🔜 | [#2524](https://github.com/apache/datafusion-comet/issues/2524) | -| `collect_set` | ✅ | | -| `corr` | ✅ | | -| `count` | ✅ | | -| `count_if` | ✅ | | -| `covar_pop` | ✅ | | -| `covar_samp` | ✅ | | -| `every` | ✅ | | -| `first` | ✅ | | -| `first_value` | ✅ | | -| `grouping` | 🔜 | Grouping indicator for ROLLUP/CUBE/GROUPING SETS | -| `grouping_id` | 🔜 | Grouping indicator for ROLLUP/CUBE/GROUPING SETS | -| `kurtosis` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `last` | ✅ | | -| `last_value` | ✅ | | -| `listagg` | 🔜 | String aggregation | -| `max` | ✅ | | -| `max_by` | 🔜 | [#3841](https://github.com/apache/datafusion-comet/issues/3841) | -| `mean` | ✅ | | -| `median` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `min` | ✅ | | -| `min_by` | 🔜 | [#3841](https://github.com/apache/datafusion-comet/issues/3841) | -| `mode` | 🔜 | [#3970](https://github.com/apache/datafusion-comet/issues/3970) | -| `percentile` | 🔜 | [#4542](https://github.com/apache/datafusion-comet/issues/4542) | -| `percentile_cont` | 🔜 | Percentile aggregate | -| `percentile_disc` | 🔜 | Percentile aggregate | -| `regr_avgx` | ✅ | Native: Spark rewrites to `Average` (tests in [#4551](https://github.com/apache/datafusion-comet/issues/4551)) | -| `regr_avgy` | ✅ | Native: Spark 
rewrites to `Average` (tests in [#4551](https://github.com/apache/datafusion-comet/issues/4551)) | -| `regr_count` | ✅ | Native: Spark rewrites to `Count` (tests in [#4551](https://github.com/apache/datafusion-comet/issues/4551)) | -| `regr_intercept` | 🔜 | Falls back; can reuse `covar_pop`/`var_pop` accumulators ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | -| `regr_r2` | 🔜 | Falls back; can reuse the `corr` accumulator ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | -| `regr_slope` | 🔜 | Falls back; can reuse `covar_pop`/`var_pop` accumulators ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | -| `regr_sxx` | 🔜 | Falls back; can reuse `var_pop` accumulator ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | -| `regr_sxy` | 🔜 | Falls back; can reuse `covar_pop` accumulator ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | -| `regr_syy` | 🔜 | Falls back; can reuse `var_pop` accumulator ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | -| `skewness` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `some` | ✅ | | -| `std` | ✅ | | -| `stddev` | ✅ | | -| `stddev_pop` | ✅ | | -| `stddev_samp` | ✅ | | -| `string_agg` | 🔜 | String aggregation (alias of `listagg`) | -| `sum` | ✅ | | -| `try_avg` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `try_sum` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `var_pop` | ✅ | | -| `var_samp` | ✅ | | -| `variance` | ✅ | | +| Function | Status | Notes | +| --- | --- | --- | +| `any` | ✅ | | +| `any_value` | ✅ | | +| `approx_count_distinct` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `array_agg` | 🔜 | Array aggregate (related to `collect_list`, [#2524](https://github.com/apache/datafusion-comet/issues/2524)) | +| `avg` | ✅ | Interval types fall back | +| `bit_and` | ✅ | | +| `bit_or` 
| ✅ | | +| `bit_xor` | ✅ | | +| `bool_and` | ✅ | | +| `bool_or` | ✅ | | +| `collect_list` | 🔜 | [#2524](https://github.com/apache/datafusion-comet/issues/2524) | +| `collect_set` | ✅ | | +| `corr` | ✅ | | +| `count` | ✅ | | +| `count_if` | ✅ | | +| `covar_pop` | ✅ | | +| `covar_samp` | ✅ | | +| `every` | ✅ | | +| `first` | ✅ | | +| `first_value` | ✅ | | +| `grouping` | 🔜 | Grouping indicator for ROLLUP/CUBE/GROUPING SETS | +| `grouping_id` | 🔜 | Grouping indicator for ROLLUP/CUBE/GROUPING SETS | +| `kurtosis` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `last` | ✅ | | +| `last_value` | ✅ | | +| `listagg` | 🔜 | String aggregation | +| `max` | ✅ | | +| `max_by` | 🔜 | [#3841](https://github.com/apache/datafusion-comet/issues/3841) | +| `mean` | ✅ | | +| `median` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `min` | ✅ | | +| `min_by` | 🔜 | [#3841](https://github.com/apache/datafusion-comet/issues/3841) | +| `mode` | 🔜 | [#3970](https://github.com/apache/datafusion-comet/issues/3970) | +| `percentile` | 🔜 | [#4542](https://github.com/apache/datafusion-comet/issues/4542) | +| `percentile_cont` | 🔜 | Percentile aggregate | +| `percentile_disc` | 🔜 | Percentile aggregate | +| `regr_avgx` | ✅ | Native: Spark rewrites to `Average` (tests in [#4551](https://github.com/apache/datafusion-comet/issues/4551)) | +| `regr_avgy` | ✅ | Native: Spark rewrites to `Average` (tests in [#4551](https://github.com/apache/datafusion-comet/issues/4551)) | +| `regr_count` | ✅ | Native: Spark rewrites to `Count` (tests in [#4551](https://github.com/apache/datafusion-comet/issues/4551)) | +| `regr_intercept` | 🔜 | Falls back; can reuse `covar_pop`/`var_pop` accumulators ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | +| `regr_r2` | 🔜 | Falls back; can reuse the `corr` accumulator ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | +| `regr_slope` | 🔜 | Falls back; can reuse 
`covar_pop`/`var_pop` accumulators ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | +| `regr_sxx` | 🔜 | Falls back; can reuse `var_pop` accumulator ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | +| `regr_sxy` | 🔜 | Falls back; can reuse `covar_pop` accumulator ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | +| `regr_syy` | 🔜 | Falls back; can reuse `var_pop` accumulator ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | +| `skewness` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `some` | ✅ | | +| `std` | ✅ | | +| `stddev` | ✅ | | +| `stddev_pop` | ✅ | | +| `stddev_samp` | ✅ | | +| `string_agg` | 🔜 | String aggregation (alias of `listagg`) | +| `sum` | ✅ | | +| `try_avg` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `try_sum` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `var_pop` | ✅ | | +| `var_samp` | ✅ | | +| `variance` | ✅ | | --- ## array_funcs -| Function | Status | Notes | -| ----------------- | ------ | ----------------------------------------------------------------------------------- | -| `array` | ✅ | | -| `array_append` | ✅ | | -| `array_compact` | ✅ | | -| `array_contains` | ✅ | NaN/signed-zero handling may differ ([details](compatibility/floating-point.md)) | -| `array_distinct` | ✅ | NaN/signed-zero handling may differ ([details](compatibility/floating-point.md)) | -| `array_except` | ✅ | Incompatible; falls back by default ([details](compatibility/expressions/array.md)) | -| `array_insert` | ✅ | | -| `array_intersect` | ✅ | Incompatible; falls back by default ([details](compatibility/expressions/array.md)) | -| `array_join` | ✅ | Incompatible; falls back by default ([details](compatibility/expressions/array.md)) | -| `array_max` | ✅ | NaN ordering may differ ([details](compatibility/floating-point.md)) | -| `array_min` | ✅ | NaN ordering may differ 
([details](compatibility/floating-point.md)) | -| `array_position` | ✅ | Binary/struct/map/null elements fall back | -| `array_prepend` | 🔜 | Sibling of `array_append` | -| `array_remove` | ✅ | | -| `array_repeat` | ✅ | | -| `array_union` | ✅ | NaN/signed-zero handling may differ ([details](compatibility/floating-point.md)) | -| `arrays_overlap` | ✅ | | -| `arrays_zip` | ✅ | | -| `element_at` | ✅ | MapType input falls back | -| `flatten` | ✅ | Binary/struct/map elements fall back | -| `get` | ✅ | | -| `sequence` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `shuffle` | 🔜 | Random array shuffle | -| `slice` | ✅ | Native ([#4149](https://github.com/apache/datafusion-comet/issues/4149)) | -| `sort_array` | ✅ | Nested struct/null arrays fall back | +| Function | Status | Notes | +| --- | --- | --- | +| `array` | ✅ | | +| `array_append` | ✅ | | +| `array_compact` | ✅ | | +| `array_contains` | ✅ | NaN/signed-zero handling may differ ([details](compatibility/floating-point.md)) | +| `array_distinct` | ✅ | NaN/signed-zero handling may differ ([details](compatibility/floating-point.md)) | +| `array_except` | ✅ | Incompatible; falls back by default ([details](compatibility/expressions/array.md)) | +| `array_insert` | ✅ | | +| `array_intersect` | ✅ | Incompatible; falls back by default ([details](compatibility/expressions/array.md)) | +| `array_join` | ✅ | Incompatible; falls back by default ([details](compatibility/expressions/array.md)) | +| `array_max` | ✅ | NaN ordering may differ ([details](compatibility/floating-point.md)) | +| `array_min` | ✅ | NaN ordering may differ ([details](compatibility/floating-point.md)) | +| `array_position` | ✅ | Binary/struct/map/null elements fall back | +| `array_prepend` | 🔜 | Sibling of `array_append` | +| `array_remove` | ✅ | | +| `array_repeat` | ✅ | | +| `array_union` | ✅ | NaN/signed-zero handling may differ ([details](compatibility/floating-point.md)) | +| `arrays_overlap` | ✅ | | +| `arrays_zip` | ✅ | | 
+| `element_at` | ✅ | MapType input falls back | +| `flatten` | ✅ | Binary/struct/map elements fall back | +| `get` | ✅ | | +| `sequence` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `shuffle` | 🔜 | Random array shuffle | +| `slice` | ✅ | Native ([#4149](https://github.com/apache/datafusion-comet/issues/4149)) | +| `sort_array` | ✅ | Nested struct/null arrays fall back | --- ## bitwise_funcs -| Function | Status | Notes | -| -------------------- | ------ | ---------------------------------------------------- | -| `&` | ✅ | | -| `<<` | ✅ | | -| `>>` | ✅ | | -| `>>>` | ✅ | Operator alias for `shiftrightunsigned` (Spark 4.0+) | -| `^` | ✅ | | -| `bit_count` | ✅ | | -| `bit_get` | ✅ | | -| `getbit` | ✅ | | -| `shiftright` | ✅ | | -| `shiftrightunsigned` | ✅ | | -| `\|` | ✅ | | -| `~` | ✅ | | +| Function | Status | Notes | +| --- | --- | --- | +| `&` | ✅ | | +| `<<` | ✅ | | +| `>>` | ✅ | | +| `>>>` | ✅ | Operator alias for `shiftrightunsigned` (Spark 4.0+) | +| `^` | ✅ | | +| `bit_count` | ✅ | | +| `bit_get` | ✅ | | +| `getbit` | ✅ | | +| `shiftright` | ✅ | | +| `shiftrightunsigned` | ✅ | | +| `\|` | ✅ | | +| `~` | ✅ | | --- ## collection_funcs -| Function | Status | Notes | -| ------------- | ------ | ---------------------------------------------------------------------------------------------- | -| `array_size` | ✅ | | -| `cardinality` | ✅ | MapType input falls back | -| `concat` | ✅ | Binary/array children fall back | -| `reverse` | ✅ | Binary-element arrays fall back (Incompatible) ([details](compatibility/expressions/array.md)) | -| `size` | ✅ | MapType input falls back | +| Function | Status | Notes | +| --- | --- | --- | +| `array_size` | ✅ | | +| `cardinality` | ✅ | MapType input falls back | +| `concat` | ✅ | Binary/array children fall back | +| `reverse` | ✅ | Binary-element arrays fall back (Incompatible) ([details](compatibility/expressions/array.md)) | +| `size` | ✅ | MapType input falls back | --- ## conditional_funcs -| 
Function | Status | Notes | -| ------------ | ------ | --------------------------------------------------------------- | -| `coalesce` | ✅ | | -| `if` | ✅ | | -| `ifnull` | ✅ | | -| `nanvl` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `nullif` | ✅ | | -| `nullifzero` | ✅ | Lowers to `if`/`=` (Spark 4.0+) | -| `nvl` | ✅ | | -| `nvl2` | ✅ | | -| `when` | ✅ | | -| `zeroifnull` | ✅ | Lowers to `coalesce` (Spark 4.0+) | +| Function | Status | Notes | +| --- | --- | --- | +| `coalesce` | ✅ | | +| `if` | ✅ | | +| `ifnull` | ✅ | | +| `nanvl` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `nullif` | ✅ | | +| `nullifzero` | ✅ | Lowers to `if`/`=` (Spark 4.0+) | +| `nvl` | ✅ | | +| `nvl2` | ✅ | | +| `when` | ✅ | | +| `zeroifnull` | ✅ | Lowers to `coalesce` (Spark 4.0+) | --- @@ -214,89 +214,89 @@ The tables below list every Spark built-in expression with its current status. The type-name conversion functions (`bigint`, `binary`, `boolean`, `date`, `decimal`, `double`, `float`, `int`, `smallint`, `string`, `timestamp`, `tinyint`) are SQL aliases for `CAST(... AS )` and share the support and caveats of `cast`. 
-| Function | Status | Notes | -| -------- | ------ | ----------------------------------------------------------------------------------------------- | -| `cast` | ✅ | Some casts fall back; float-to-decimal is opt-in ([details](compatibility/expressions/cast.md)) | +| Function | Status | Notes | +| --- | --- | --- | +| `cast` | ✅ | Some casts fall back; float-to-decimal is opt-in ([details](compatibility/expressions/cast.md)) | --- ## datetime_funcs -| Function | Status | Notes | -| --------------------- | ------ | -------------------------------------------------------------------------------------------------------------------------------------- | -| `add_months` | ✅ | | -| `convert_timezone` | ✅ | | -| `curdate` | ✅ | Constant-folded to a literal (alias of `current_date`) | -| `current_date` | ✅ | Constant-folded to a literal before Comet sees the plan | -| `current_time` | 🔜 | Blocked on Spark 4.1 TIME type support ([#4288](https://github.com/apache/datafusion-comet/issues/4288)) | -| `current_timestamp` | ✅ | Constant-folded to a literal before Comet sees the plan | -| `current_timezone` | ✅ | | -| `date_add` | ✅ | | -| `date_diff` | ✅ | | -| `date_format` | ✅ | | -| `date_from_unix_date` | ✅ | | -| `date_part` | ✅ | | -| `date_sub` | ✅ | | -| `date_trunc` | ✅ | | -| `dateadd` | ✅ | | -| `datediff` | ✅ | | -| `datepart` | ✅ | | -| `day` | ✅ | | -| `dayname` | 🔜 | [#4544](https://github.com/apache/datafusion-comet/issues/4544) | -| `dayofmonth` | ✅ | | -| `dayofweek` | ✅ | | -| `dayofyear` | ✅ | | -| `extract` | ✅ | | -| `from_unixtime` | ✅ | | -| `from_utc_timestamp` | ✅ | Legacy zone forms fall back (Incompatible) ([details](compatibility/expressions/datetime.md)) | -| `hour` | ✅ | | -| `last_day` | ✅ | | -| `localtimestamp` | ✅ | | -| `make_date` | ✅ | | -| `make_dt_interval` | 🔜 | [#4541](https://github.com/apache/datafusion-comet/issues/4541) | -| `make_interval` | 🔜 | Produces legacy CalendarInterval; tracked by 
[#4540](https://github.com/apache/datafusion-comet/issues/4540) | -| `make_time` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | -| `make_timestamp` | ✅ | | -| `make_timestamp_ltz` | ✅ | 2-arg TIME form falls back | -| `make_timestamp_ntz` | ✅ | 2-arg TIME form falls back | -| `make_ym_interval` | 🔜 | [#4541](https://github.com/apache/datafusion-comet/issues/4541) | -| `minute` | ✅ | | -| `month` | ✅ | | -| `monthname` | 🔜 | [#4544](https://github.com/apache/datafusion-comet/issues/4544) | -| `months_between` | ✅ | | -| `next_day` | ✅ | | -| `now` | ✅ | Constant-folded to a literal (alias of `current_timestamp`) | -| `quarter` | ✅ | | -| `second` | ✅ | | -| `session_window` | 🔜 | Time-window grouping; tracked by [#4553](https://github.com/apache/datafusion-comet/issues/4553) | -| `time_diff` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | -| `time_trunc` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | -| `timestamp_micros` | ✅ | | -| `timestamp_millis` | ✅ | | -| `timestamp_seconds` | ✅ | | -| `to_date` | ✅ | Rewrites to `Cast` (or `Cast(GetTimestamp)` with a format) before Comet sees the plan | -| `to_time` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | -| `to_timestamp` | ✅ | Rewrites to `Cast` (or `GetTimestamp` with a format) before Comet sees the plan | -| `to_timestamp_ltz` | ✅ | Rewrites to `to_timestamp` (`TimestampType`) | -| `to_timestamp_ntz` | ✅ | Rewrites to `to_timestamp` (`TimestampNTZType`) | -| `to_unix_timestamp` | ✅ | | -| `to_utc_timestamp` | ✅ | Legacy zone forms fall back (Incompatible) ([details](compatibility/expressions/datetime.md)) | -| `trunc` | ✅ | | -| `try_make_interval` | 🔜 | Produces legacy CalendarInterval; tracked by [#4540](https://github.com/apache/datafusion-comet/issues/4540) | -| `try_make_timestamp` | ⚠️ 
| Returns a wrong value instead of NULL for invalid inputs ([#4554](https://github.com/apache/datafusion-comet/issues/4554)) | -| `try_to_date` | 🔜 | Rewrites to `Cast`/`GetTimestamp` but currently falls back; tracked by [#4556](https://github.com/apache/datafusion-comet/issues/4556) | -| `try_to_time` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | -| `try_to_timestamp` | 🔜 | Rewrites to `Cast`/`GetTimestamp` but currently falls back; tracked by [#4556](https://github.com/apache/datafusion-comet/issues/4556) | -| `unix_date` | ✅ | | -| `unix_micros` | ✅ | | -| `unix_millis` | ✅ | | -| `unix_seconds` | ✅ | | -| `unix_timestamp` | ✅ | | -| `weekday` | ✅ | | -| `weekofyear` | ✅ | | -| `window` | 🔜 | Time-window grouping; tracked by [#4553](https://github.com/apache/datafusion-comet/issues/4553) | -| `window_time` | 🔜 | Time-window grouping; tracked by [#4553](https://github.com/apache/datafusion-comet/issues/4553) | -| `year` | ✅ | | +| Function | Status | Notes | +| --- | --- | --- | +| `add_months` | ✅ | | +| `convert_timezone` | ✅ | | +| `curdate` | ✅ | Constant-folded to a literal (alias of `current_date`) | +| `current_date` | ✅ | Constant-folded to a literal before Comet sees the plan | +| `current_time` | 🔜 | Blocked on Spark 4.1 TIME type support ([#4288](https://github.com/apache/datafusion-comet/issues/4288)) | +| `current_timestamp` | ✅ | Constant-folded to a literal before Comet sees the plan | +| `current_timezone` | ✅ | | +| `date_add` | ✅ | | +| `date_diff` | ✅ | | +| `date_format` | ✅ | | +| `date_from_unix_date` | ✅ | | +| `date_part` | ✅ | | +| `date_sub` | ✅ | | +| `date_trunc` | ✅ | | +| `dateadd` | ✅ | | +| `datediff` | ✅ | | +| `datepart` | ✅ | | +| `day` | ✅ | | +| `dayname` | 🔜 | [#4544](https://github.com/apache/datafusion-comet/issues/4544) | +| `dayofmonth` | ✅ | | +| `dayofweek` | ✅ | | +| `dayofyear` | ✅ | | +| `extract` | ✅ | | +| `from_unixtime` | ✅ | | +| `from_utc_timestamp` | ✅ 
| Legacy zone forms fall back (Incompatible) ([details](compatibility/expressions/datetime.md)) | +| `hour` | ✅ | | +| `last_day` | ✅ | | +| `localtimestamp` | ✅ | | +| `make_date` | ✅ | | +| `make_dt_interval` | 🔜 | [#4541](https://github.com/apache/datafusion-comet/issues/4541) | +| `make_interval` | 🔜 | Produces legacy CalendarInterval; tracked by [#4540](https://github.com/apache/datafusion-comet/issues/4540) | +| `make_time` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | +| `make_timestamp` | ✅ | | +| `make_timestamp_ltz` | ✅ | 2-arg TIME form falls back | +| `make_timestamp_ntz` | ✅ | 2-arg TIME form falls back | +| `make_ym_interval` | 🔜 | [#4541](https://github.com/apache/datafusion-comet/issues/4541) | +| `minute` | ✅ | | +| `month` | ✅ | | +| `monthname` | 🔜 | [#4544](https://github.com/apache/datafusion-comet/issues/4544) | +| `months_between` | ✅ | | +| `next_day` | ✅ | | +| `now` | ✅ | Constant-folded to a literal (alias of `current_timestamp`) | +| `quarter` | ✅ | | +| `second` | ✅ | | +| `session_window` | 🔜 | Time-window grouping; tracked by [#4553](https://github.com/apache/datafusion-comet/issues/4553) | +| `time_diff` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | +| `time_trunc` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | +| `timestamp_micros` | ✅ | | +| `timestamp_millis` | ✅ | | +| `timestamp_seconds` | ✅ | | +| `to_date` | ✅ | Rewrites to `Cast` (or `Cast(GetTimestamp)` with a format) before Comet sees the plan | +| `to_time` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | +| `to_timestamp` | ✅ | Rewrites to `Cast` (or `GetTimestamp` with a format) before Comet sees the plan | +| `to_timestamp_ltz` | ✅ | Rewrites to `to_timestamp` (`TimestampType`) | +| `to_timestamp_ntz` | ✅ | Rewrites to `to_timestamp` 
(`TimestampNTZType`) | +| `to_unix_timestamp` | ✅ | | +| `to_utc_timestamp` | ✅ | Legacy zone forms fall back (Incompatible) ([details](compatibility/expressions/datetime.md)) | +| `trunc` | ✅ | | +| `try_make_interval` | 🔜 | Produces legacy CalendarInterval; tracked by [#4540](https://github.com/apache/datafusion-comet/issues/4540) | +| `try_make_timestamp` | ⚠️ | Returns a wrong value instead of NULL for invalid inputs ([#4554](https://github.com/apache/datafusion-comet/issues/4554)) | +| `try_to_date` | 🔜 | Rewrites to `Cast`/`GetTimestamp` but currently falls back; tracked by [#4556](https://github.com/apache/datafusion-comet/issues/4556) | +| `try_to_time` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | +| `try_to_timestamp` | 🔜 | Rewrites to `Cast`/`GetTimestamp` but currently falls back; tracked by [#4556](https://github.com/apache/datafusion-comet/issues/4556) | +| `unix_date` | ✅ | | +| `unix_micros` | ✅ | | +| `unix_millis` | ✅ | | +| `unix_seconds` | ✅ | | +| `unix_timestamp` | ✅ | | +| `weekday` | ✅ | | +| `weekofyear` | ✅ | | +| `window` | 🔜 | Time-window grouping; tracked by [#4553](https://github.com/apache/datafusion-comet/issues/4553) | +| `window_time` | 🔜 | Time-window grouping; tracked by [#4553](https://github.com/apache/datafusion-comet/issues/4553) | +| `year` | ✅ | | --- @@ -306,43 +306,43 @@ The type-name conversion functions (`bigint`, `binary`, `boolean`, `date`, `deci expression-level). The `outer` variants are wired but marked `Incompatible`; they require `spark.comet.exec.explode.enabled=true` and `allowIncompatible`. 
-| Function | Status | Notes | -| ------------------ | ------ | ----------------------------------------------------------------------------------------------------------------------------- | -| `explode` | ✅ | via `CometExplodeExec` | -| `explode_outer` | ✅ | outer=true falls back (Incompatible) ([audit](../../contributor-guide/expression-audits/generator_funcs.md#explode_outer)) | -| `inline` | 🔜 | Operator-level generator (like `explode`) | -| `inline_outer` | 🔜 | Operator-level generator (like `explode`) | -| `posexplode` | ✅ | via `CometExplodeExec` | -| `posexplode_outer` | ✅ | outer=true falls back (Incompatible) ([audit](../../contributor-guide/expression-audits/generator_funcs.md#posexplode_outer)) | -| `stack` | 🔜 | Operator-level generator | +| Function | Status | Notes | +| --- | --- | --- | +| `explode` | ✅ | via `CometExplodeExec` | +| `explode_outer` | ✅ | outer=true falls back (Incompatible) ([audit](../../contributor-guide/expression-audits/generator_funcs.md#explode_outer)) | +| `inline` | 🔜 | Operator-level generator (like `explode`) | +| `inline_outer` | 🔜 | Operator-level generator (like `explode`) | +| `posexplode` | ✅ | via `CometExplodeExec` | +| `posexplode_outer` | ✅ | outer=true falls back (Incompatible) ([audit](../../contributor-guide/expression-audits/generator_funcs.md#posexplode_outer)) | +| `stack` | 🔜 | Operator-level generator | --- ## hash_funcs -| Function | Status | Notes | -| ---------- | ------ | ----- | -| `crc32` | ✅ | | -| `hash` | ✅ | | -| `md5` | ✅ | | -| `sha` | ✅ | | -| `sha1` | ✅ | | -| `sha2` | ✅ | | -| `xxhash64` | ✅ | | +| Function | Status | Notes | +| --- | --- | --- | +| `crc32` | ✅ | | +| `hash` | ✅ | | +| `md5` | ✅ | | +| `sha` | ✅ | | +| `sha1` | ✅ | | +| `sha2` | ✅ | | +| `xxhash64` | ✅ | | --- ## json_funcs -| Function | Status | Notes | -| ------------------- | ------ | -------------------------------------------------------------------------------------------------------------------------------- | -| 
`from_json` | ✅ | Falls back by default; opt-in via allowIncompatible ([audit](../../contributor-guide/expression-audits/json_funcs.md#from_json)) | -| `get_json_object` | ✅ | Some inputs need allowIncompatible ([audit](../../contributor-guide/expression-audits/json_funcs.md#get_json_object)) | -| `json_array_length` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `json_object_keys` | 🔜 | [#3161](https://github.com/apache/datafusion-comet/issues/3161) | -| `json_tuple` | 🔜 | [#3160](https://github.com/apache/datafusion-comet/issues/3160) | -| `schema_of_json` | 🔜 | [#3163](https://github.com/apache/datafusion-comet/issues/3163) | -| `to_json` | ✅ | Options and map/array inputs fall back ([audit](../../contributor-guide/expression-audits/json_funcs.md#to_json)) | +| Function | Status | Notes | +| --- | --- | --- | +| `from_json` | ✅ | Falls back by default; opt-in via allowIncompatible ([audit](../../contributor-guide/expression-audits/json_funcs.md#from_json)) | +| `get_json_object` | ✅ | Some inputs need allowIncompatible ([audit](../../contributor-guide/expression-audits/json_funcs.md#get_json_object)) | +| `json_array_length` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `json_object_keys` | 🔜 | [#3161](https://github.com/apache/datafusion-comet/issues/3161) | +| `json_tuple` | 🔜 | [#3160](https://github.com/apache/datafusion-comet/issues/3160) | +| `schema_of_json` | 🔜 | [#3163](https://github.com/apache/datafusion-comet/issues/3163) | +| `to_json` | ✅ | Options and map/array inputs fall back ([audit](../../contributor-guide/expression-audits/json_funcs.md#to_json)) | --- @@ -350,269 +350,269 @@ expression-level). The `outer` variants are wired but marked `Incompatible`; the All higher-order functions are planned via [#4224](https://github.com/apache/datafusion-comet/issues/4224). 
-| Function | Status | Notes | -| ------------------ | ------ | --------------------------------------------------------------- | -| `aggregate` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `array_sort` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `exists` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `filter` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `forall` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `map_filter` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `map_zip_with` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `reduce` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `transform` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `transform_keys` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `transform_values` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `zip_with` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| Function | Status | Notes | +| --- | --- | --- | +| `aggregate` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `array_sort` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `exists` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `filter` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `forall` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `map_filter` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `map_zip_with` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `reduce` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `transform` | 🔜 | 
[#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `transform_keys` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `transform_values` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `zip_with` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | --- ## map_funcs -| Function | Status | Notes | -| ------------------ | ------ | -------------------------------------------------------------------------------------------- | -| `element_at` | ✅ | MapType input falls back | -| `map` | 🔜 | Constructs a map | -| `map_concat` | 🔜 | Concatenates maps | -| `map_contains_key` | ✅ | | -| `map_entries` | ✅ | | -| `map_from_arrays` | ✅ | | -| `map_from_entries` | ✅ | BinaryType key/value falls back (Incompatible) ([details](compatibility/expressions/map.md)) | -| `map_keys` | ✅ | | -| `map_values` | ✅ | | -| `str_to_map` | ✅ | | -| `try_element_at` | ✅ | Lowers to `element_at`; array input (MapType falls back) | +| Function | Status | Notes | +| --- | --- | --- | +| `element_at` | ✅ | MapType input falls back | +| `map` | 🔜 | Constructs a map | +| `map_concat` | 🔜 | Concatenates maps | +| `map_contains_key` | ✅ | | +| `map_entries` | ✅ | | +| `map_from_arrays` | ✅ | | +| `map_from_entries` | ✅ | BinaryType key/value falls back (Incompatible) ([details](compatibility/expressions/map.md)) | +| `map_keys` | ✅ | | +| `map_values` | ✅ | | +| `str_to_map` | ✅ | | +| `try_element_at` | ✅ | Lowers to `element_at`; array input (MapType falls back) | --- ## math_funcs -| Function | Status | Notes | -| -------------- | ------ | ---------------------------------------------------------------------------------------------------------------------------- | -| `%` | ✅ | try_mod (TRY mode) falls back | -| `*` | ✅ | Interval multiplication falls back | -| `+` | ✅ | | -| `-` | ✅ | | -| `/` | ✅ | | -| `abs` | ✅ | Interval types fall back | -| `acos` | ✅ | | -| `acosh` | ✅ | | -| `asin` | ✅ | | -| `asinh` | ✅ | 
| -| `atan` | ✅ | | -| `atan2` | ✅ | | -| `atanh` | ✅ | | -| `bin` | ✅ | | -| `bround` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `cbrt` | ✅ | | -| `ceil` | ✅ | Two-arg form falls back | -| `ceiling` | ✅ | | -| `conv` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `cos` | ✅ | | -| `cosh` | ✅ | | -| `cot` | ✅ | | -| `csc` | ✅ | | -| `degrees` | ✅ | | -| `div` | ✅ | | -| `e` | ✅ | Folds to a literal (like `pi`) | -| `exp` | ✅ | | -| `expm1` | ✅ | | -| `factorial` | ✅ | | -| `floor` | ✅ | Two-arg form falls back | -| `greatest` | ✅ | | -| `hex` | ✅ | | -| `hypot` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `least` | ✅ | | -| `ln` | ✅ | | -| `log` | ✅ | | -| `log10` | ✅ | | -| `log1p` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `log2` | ✅ | | -| `mod` | ✅ | | -| `negative` | ✅ | | -| `pi` | ✅ | | -| `pmod` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `positive` | ✅ | | -| `pow` | ✅ | | -| `power` | ✅ | | -| `radians` | ✅ | | -| `rand` | ✅ | | -| `randn` | ✅ | | -| `random` | ✅ | Alias for `rand` (Spark 4.0+); seed must be a literal | -| `randstr` | 🔜 | Random string (Spark 4.0+) | -| `rint` | ✅ | | -| `round` | ✅ | Float/double inputs fall back | -| `sec` | ✅ | | -| `shiftleft` | ✅ | | -| `sign` | ✅ | | -| `signum` | ✅ | | -| `sin` | ✅ | | -| `sinh` | ✅ | | -| `sqrt` | ✅ | | -| `tan` | ✅ | | -| `tanh` | ✅ | | -| `try_add` | ✅ | Datetime/interval form falls back | -| `try_divide` | ✅ | | -| `try_mod` | 🔜 | Lowers to `Remainder` with TRY eval mode, which falls back ([#4484](https://github.com/apache/datafusion-comet/issues/4484)) | -| `try_multiply` | ✅ | | -| `try_subtract` | ✅ | | -| `unhex` | ✅ | | -| `uniform` | ✅ | Constant-folded; literal arguments only (Spark 4.0+) | -| `width_bucket` | ✅ | | +| Function | Status | Notes | +| --- | --- | --- | +| `%` | ✅ | try_mod (TRY mode) falls back | +| `*` | ✅ | Interval 
multiplication falls back | +| `+` | ✅ | | +| `-` | ✅ | | +| `/` | ✅ | | +| `abs` | ✅ | Interval types fall back | +| `acos` | ✅ | | +| `acosh` | ✅ | | +| `asin` | ✅ | | +| `asinh` | ✅ | | +| `atan` | ✅ | | +| `atan2` | ✅ | | +| `atanh` | ✅ | | +| `bin` | ✅ | | +| `bround` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `cbrt` | ✅ | | +| `ceil` | ✅ | Two-arg form falls back | +| `ceiling` | ✅ | | +| `conv` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `cos` | ✅ | | +| `cosh` | ✅ | | +| `cot` | ✅ | | +| `csc` | ✅ | | +| `degrees` | ✅ | | +| `div` | ✅ | | +| `e` | ✅ | Folds to a literal (like `pi`) | +| `exp` | ✅ | | +| `expm1` | ✅ | | +| `factorial` | ✅ | | +| `floor` | ✅ | Two-arg form falls back | +| `greatest` | ✅ | | +| `hex` | ✅ | | +| `hypot` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `least` | ✅ | | +| `ln` | ✅ | | +| `log` | ✅ | | +| `log10` | ✅ | | +| `log1p` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `log2` | ✅ | | +| `mod` | ✅ | | +| `negative` | ✅ | | +| `pi` | ✅ | | +| `pmod` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `positive` | ✅ | | +| `pow` | ✅ | | +| `power` | ✅ | | +| `radians` | ✅ | | +| `rand` | ✅ | | +| `randn` | ✅ | | +| `random` | ✅ | Alias for `rand` (Spark 4.0+); seed must be a literal | +| `randstr` | 🔜 | Random string (Spark 4.0+) | +| `rint` | ✅ | | +| `round` | ✅ | Float/double inputs fall back | +| `sec` | ✅ | | +| `shiftleft` | ✅ | | +| `sign` | ✅ | | +| `signum` | ✅ | | +| `sin` | ✅ | | +| `sinh` | ✅ | | +| `sqrt` | ✅ | | +| `tan` | ✅ | | +| `tanh` | ✅ | | +| `try_add` | ✅ | Datetime/interval form falls back | +| `try_divide` | ✅ | | +| `try_mod` | 🔜 | Lowers to `Remainder` with TRY eval mode, which falls back ([#4484](https://github.com/apache/datafusion-comet/issues/4484)) | +| `try_multiply` | ✅ | | +| `try_subtract` | ✅ | | +| `unhex` | ✅ | | +| `uniform` | ✅ | Constant-folded; 
literal arguments only (Spark 4.0+) | +| `width_bucket` | ✅ | | --- ## misc_funcs -| Function | Status | Notes | -| ----------------------------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------ | -| `aes_decrypt` | 🔜 | Falls back; `StaticInvoke` not allowlisted; planned via codegen dispatch ([#4558](https://github.com/apache/datafusion-comet/issues/4558)) | -| `aes_encrypt` | 🔜 | Falls back; planned via codegen dispatch ([#4558](https://github.com/apache/datafusion-comet/issues/4558)); nondeterministic IV by default | -| `assert_true` | 🔜 | Lowers to `RaiseError`, which falls back | -| `current_catalog` | ✅ | Resolved to a literal by the analyzer (`ReplaceCurrentLike`) | -| `current_database` | ✅ | Resolved to a literal by the analyzer (`ReplaceCurrentLike`) | -| `current_schema` | ✅ | Alias of `current_database`; resolved to a literal by the analyzer | -| `current_user` | ✅ | Resolved to a literal by the analyzer; same as `user` | -| `equal_null` | ✅ | Lowers to `<=>` (`EqualNullSafe`) | -| `is_variant_null` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `monotonically_increasing_id` | ✅ | | -| `parse_json` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `raise_error` | 🔜 | Raises a runtime error | -| `rand` | ✅ | Seed must be a literal | -| `randn` | ✅ | Seed must be a literal | -| `schema_of_variant` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `schema_of_variant_agg` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `session_user` | ✅ | Alias of `current_user`; resolved to a literal by the analyzer | -| `spark_partition_id` | ✅ | | -| `to_variant_object` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `try_aes_decrypt` | 🔜 | Falls back; planned via codegen dispatch 
([#4558](https://github.com/apache/datafusion-comet/issues/4558)) | -| `try_parse_json` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `try_variant_get` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `typeof` | ✅ | Foldable; resolved to a literal before Comet sees the plan | -| `user` | ✅ | Resolved to a literal by the Spark analyzer before reaching Comet | -| `uuid` | 🔜 | Nondeterministic random UUID | -| `variant_get` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| Function | Status | Notes | +| --- | --- | --- | +| `aes_decrypt` | 🔜 | Falls back; `StaticInvoke` not allowlisted; planned via codegen dispatch ([#4558](https://github.com/apache/datafusion-comet/issues/4558)) | +| `aes_encrypt` | 🔜 | Falls back; planned via codegen dispatch ([#4558](https://github.com/apache/datafusion-comet/issues/4558)); nondeterministic IV by default | +| `assert_true` | 🔜 | Lowers to `RaiseError`, which falls back | +| `current_catalog` | ✅ | Resolved to a literal by the analyzer (`ReplaceCurrentLike`) | +| `current_database` | ✅ | Resolved to a literal by the analyzer (`ReplaceCurrentLike`) | +| `current_schema` | ✅ | Alias of `current_database`; resolved to a literal by the analyzer | +| `current_user` | ✅ | Resolved to a literal by the analyzer; same as `user` | +| `equal_null` | ✅ | Lowers to `<=>` (`EqualNullSafe`) | +| `is_variant_null` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `monotonically_increasing_id` | ✅ | | +| `parse_json` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `raise_error` | 🔜 | Raises a runtime error | +| `rand` | ✅ | Seed must be a literal | +| `randn` | ✅ | Seed must be a literal | +| `schema_of_variant` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `schema_of_variant_agg` | 🔜 | tracking 
[#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `session_user` | ✅ | Alias of `current_user`; resolved to a literal by the analyzer | +| `spark_partition_id` | ✅ | | +| `to_variant_object` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `try_aes_decrypt` | 🔜 | Falls back; planned via codegen dispatch ([#4558](https://github.com/apache/datafusion-comet/issues/4558)) | +| `try_parse_json` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `try_variant_get` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `typeof` | ✅ | Foldable; resolved to a literal before Comet sees the plan | +| `user` | ✅ | Resolved to a literal by the Spark analyzer before reaching Comet | +| `uuid` | 🔜 | Nondeterministic random UUID | +| `variant_get` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | --- ## predicate_funcs -| Function | Status | Notes | -| ------------- | ------ | --------------------------------------------------------------------------------------- | -| `!` | ✅ | | -| `<` | ✅ | | -| `<=` | ✅ | | -| `<=>` | ✅ | | -| `=` | ✅ | | -| `==` | ✅ | | -| `>` | ✅ | | -| `>=` | ✅ | | -| `and` | ✅ | | -| `between` | ✅ | | -| `ilike` | ✅ | | -| `in` | ✅ | | -| `isnan` | ✅ | | -| `isnotnull` | ✅ | | -| `isnull` | ✅ | | -| `like` | ✅ | | -| `not` | ✅ | | -| `or` | ✅ | | -| `regexp` | ✅ | Falls back by default; opt-in via allowIncompatible ([details](compatibility/regex.md)) | -| `regexp_like` | ✅ | Falls back by default; opt-in via allowIncompatible ([details](compatibility/regex.md)) | -| `rlike` | ✅ | Falls back by default; opt-in via allowIncompatible ([details](compatibility/regex.md)) | +| Function | Status | Notes | +| --- | --- | --- | +| `!` | ✅ | | +| `<` | ✅ | | +| `<=` | ✅ | | +| `<=>` | ✅ | | +| `=` | ✅ | | +| `==` | ✅ | | +| `>` | ✅ | | +| `>=` | ✅ | | +| `and` | ✅ | | +| `between` | ✅ | | +| `ilike` | ✅ | | +| `in` | ✅ | | 
+| `isnan` | ✅ | | +| `isnotnull` | ✅ | | +| `isnull` | ✅ | | +| `like` | ✅ | | +| `not` | ✅ | | +| `or` | ✅ | | +| `regexp` | ✅ | Falls back by default; opt-in via allowIncompatible ([details](compatibility/regex.md)) | +| `regexp_like` | ✅ | Falls back by default; opt-in via allowIncompatible ([details](compatibility/regex.md)) | +| `rlike` | ✅ | Falls back by default; opt-in via allowIncompatible ([details](compatibility/regex.md)) | --- ## string_funcs -| Function | Status | Notes | -| -------------------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------ | -| `ascii` | ✅ | | -| `base64` | 🔜 | Lowers to `StaticInvoke(encode)` (not allowlisted); falls back | -| `bit_length` | ✅ | | -| `btrim` | ✅ | | -| `char` | ✅ | | -| `char_length` | ✅ | | -| `character_length` | ✅ | | -| `chr` | ✅ | | -| `collate` | 🔜 | Spark collation (umbrella [#2190](https://github.com/apache/datafusion-comet/issues/2190)) | -| `collation` | ✅ | Constant-folded to a literal (Spark 4.0+) | -| `concat_ws` | ✅ | | -| `contains` | ✅ | | -| `decode` | ✅ | | -| `elt` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `encode` | 🔜 | Lowers to `StaticInvoke(encode)` (not allowlisted); falls back | -| `endswith` | ✅ | | -| `find_in_set` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `format_number` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `format_string` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `initcap` | ✅ | | -| `instr` | ✅ | | -| `lcase` | ✅ | | -| `left` | ✅ | | -| `len` | ✅ | | -| `length` | ✅ | | -| `levenshtein` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `locate` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `lower` | ✅ | | -| `lpad` | ✅ | | -| `ltrim` | ✅ | | -| `luhn_check` | ✅ | Native via `StaticInvoke` 
(tests: luhn_check.sql) | -| `mask` | 🔜 | Data masking | -| `octet_length` | ✅ | | -| `overlay` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `position` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `printf` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `regexp_count` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `regexp_extract` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `regexp_extract_all` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `regexp_instr` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `regexp_replace` | ✅ | | -| `regexp_substr` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `repeat` | ✅ | | -| `replace` | ✅ | | -| `right` | ✅ | | -| `rpad` | ✅ | | -| `rtrim` | ✅ | | -| `soundex` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `space` | ✅ | | -| `split` | ✅ | | -| `split_part` | 🔜 | Lowers to `element_at(StringSplitSQL(...))`; `StringSplitSQL` falls back ([#4561](https://github.com/apache/datafusion-comet/issues/4561)) | -| `startswith` | ✅ | | -| `substr` | ✅ | | -| `substring` | ✅ | | -| `substring_index` | ✅ | | -| `to_binary` | ✅ | Hex form accelerated; other formats fall back | -| `to_char` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `to_number` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `to_varchar` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `translate` | ✅ | | -| `trim` | ✅ | | -| `try_to_binary` | 🔜 | Lowers to `TryEval(...)`, which falls back | -| `try_to_number` | 🔜 | TRY variant of `to_number` | -| `ucase` | ✅ | | -| `unbase64` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `upper` | ✅ | | +| Function | Status | Notes | 
+| --- | --- | --- | +| `ascii` | ✅ | | +| `base64` | 🔜 | Lowers to `StaticInvoke(encode)` (not allowlisted); falls back | +| `bit_length` | ✅ | | +| `btrim` | ✅ | | +| `char` | ✅ | | +| `char_length` | ✅ | | +| `character_length` | ✅ | | +| `chr` | ✅ | | +| `collate` | 🔜 | Spark collation (umbrella [#2190](https://github.com/apache/datafusion-comet/issues/2190)) | +| `collation` | ✅ | Constant-folded to a literal (Spark 4.0+) | +| `concat_ws` | ✅ | | +| `contains` | ✅ | | +| `decode` | ✅ | | +| `elt` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `encode` | 🔜 | Lowers to `StaticInvoke(encode)` (not allowlisted); falls back | +| `endswith` | ✅ | | +| `find_in_set` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `format_number` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `format_string` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `initcap` | ✅ | | +| `instr` | ✅ | | +| `lcase` | ✅ | | +| `left` | ✅ | | +| `len` | ✅ | | +| `length` | ✅ | | +| `levenshtein` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `locate` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `lower` | ✅ | | +| `lpad` | ✅ | | +| `ltrim` | ✅ | | +| `luhn_check` | ✅ | Native via `StaticInvoke` (tests: luhn_check.sql) | +| `mask` | 🔜 | Data masking | +| `octet_length` | ✅ | | +| `overlay` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `position` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `printf` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `regexp_count` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `regexp_extract` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `regexp_extract_all` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| 
`regexp_instr` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `regexp_replace` | ✅ | | +| `regexp_substr` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `repeat` | ✅ | | +| `replace` | ✅ | | +| `right` | ✅ | | +| `rpad` | ✅ | | +| `rtrim` | ✅ | | +| `soundex` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `space` | ✅ | | +| `split` | ✅ | | +| `split_part` | 🔜 | Lowers to `element_at(StringSplitSQL(...))`; `StringSplitSQL` falls back ([#4561](https://github.com/apache/datafusion-comet/issues/4561)) | +| `startswith` | ✅ | | +| `substr` | ✅ | | +| `substring` | ✅ | | +| `substring_index` | ✅ | | +| `to_binary` | ✅ | Hex form accelerated; other formats fall back | +| `to_char` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `to_number` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `to_varchar` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `translate` | ✅ | | +| `trim` | ✅ | | +| `try_to_binary` | 🔜 | Lowers to `TryEval(...)`, which falls back | +| `try_to_number` | 🔜 | TRY variant of `to_number` | +| `ucase` | ✅ | | +| `unbase64` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `upper` | ✅ | | --- ## struct_funcs -| Function | Status | Notes | -| -------------- | ------ | ------------------------------- | -| `named_struct` | ✅ | Duplicate field names fall back | -| `struct` | ✅ | | +| Function | Status | Notes | +| --- | --- | --- | +| `named_struct` | ✅ | Duplicate field names fall back | +| `struct` | ✅ | | --- ## url_funcs -| Function | Status | Notes | -| ---------------- | ------ | ----- | -| `parse_url` | ✅ | | -| `try_url_decode` | ✅ | | -| `url_decode` | ✅ | | -| `url_encode` | ✅ | | +| Function | Status | Notes | +| --- | --- | --- | +| `parse_url` | ✅ | | +| `try_url_decode` | ✅ | | +| `url_decode` | ✅ | | +| `url_encode` | ✅ | | --- @@ -625,17 
+625,17 @@ When enabled, `lag` and `lead` are explicitly wired; aggregate window functions `ntile`, `percent_rank`, `cume_dist`, `nth_value`) are not yet wired in the window serde and fall back to Spark. -| Function | Status | Notes | -| -------------- | ------ | ------------------------------------------------------------------------------------------- | -| `cume_dist` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | -| `dense_rank` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | -| `lag` | ✅ | via `CometWindowExec` | -| `lead` | ✅ | via `CometWindowExec` | -| `nth_value` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | -| `ntile` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | -| `percent_rank` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | -| `rank` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | -| `row_number` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | +| Function | Status | Notes | +| --- | --- | --- | +| `cume_dist` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | +| `dense_rank` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | +| `lag` | ✅ | via `CometWindowExec` | +| `lead` | ✅ | via `CometWindowExec` | +| `nth_value` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | +| `ntile` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | +| `percent_rank` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | +| `rank` | 🔜 | Window function; tracked by 
[#2721](https://github.com/apache/datafusion-comet/issues/2721) | +| `row_number` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | ---