From 48fe026dc30c3016a4f5e9d7930ad995b3a9c678 Mon Sep 17 00:00:00 2001
From: Andy Grove
Date: Wed, 3 Jun 2026 10:58:56 -0600
Subject: [PATCH 1/3] build: exempt expressions.md from prettier formatting

prettier re-aligns markdown table columns to the widest cell, so adding a
single expression row rewrites every row in the table. That produces noisy
diffs and frequent merge conflicts between PRs that each add new
expressions. Exempt the file from prettier so future additions stay as
one-line diffs.
---
 .prettierignore | 5 +++++
 1 file changed, 5 insertions(+)
 create mode 100644 .prettierignore

diff --git a/.prettierignore b/.prettierignore
new file mode 100644
index 0000000000..4b81e3fef4
--- /dev/null
+++ b/.prettierignore
@@ -0,0 +1,5 @@
+# prettier re-aligns markdown table columns to the widest cell, so adding a
+# single expression row rewrites every row in the table. That produces noisy
+# diffs and frequent merge conflicts between PRs that each add new expressions.
+# This file is almost entirely tables, so exempt it from prettier formatting.
+docs/source/user-guide/latest/expressions.md

From 6df735af9cc20549f5c1af18c7d9d316649cd83a Mon Sep 17 00:00:00 2001
From: Andy Grove
Date: Wed, 3 Jun 2026 10:58:56 -0600
Subject: [PATCH 2/3] docs: collapse expression tables to single-space padding

With prettier no longer aligning the tables, collapse the existing column
padding so that adding an expression row never shifts the other rows.
Combined with the prettier exemption, every future addition is a true
one-line diff that cannot collide on re-alignment.
---
 docs/source/user-guide/latest/expressions.md | 940 +++++++++----------
 1 file changed, 470 insertions(+), 470 deletions(-)

diff --git a/docs/source/user-guide/latest/expressions.md b/docs/source/user-guide/latest/expressions.md
index 3627766445..916d4a1785 100644
--- a/docs/source/user-guide/latest/expressions.md
+++ b/docs/source/user-guide/latest/expressions.md
@@ -39,12 +39,12 @@ Most expressions can also be disabled with `spark.comet.expression.EXPRNAME.enab
 
 ## Status legend
 
-| Status | Meaning |
-| ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
-| ✅ Supported | Comet produces Spark-compatible results by default. Some inputs or forms may fall back to Spark, and any incompatible behavior is opt-in (off by default). |
-| ⚠️ Incorrect by default | Comet runs natively by default but can return results that differ from Spark (a wrong value, or a native error on valid input). See the linked detail on each row. |
-| 🔜 Planned | Intended; tracked by an open issue or pull request. |
-| 💤 Not currently planned | Not on the current roadmap; falls back to Spark and may be reconsidered later. |
+| Status | Meaning |
+| --- | --- |
+| ✅ Supported | Comet produces Spark-compatible results by default. Some inputs or forms may fall back to Spark, and any incompatible behavior is opt-in (off by default). |
+| ⚠️ Incorrect by default | Comet runs natively by default but can return results that differ from Spark (a wrong value, or a native error on valid input). See the linked detail on each row. |
+| 🔜 Planned | Intended; tracked by an open issue or pull request. |
+| 💤 Not currently planned | Not on the current roadmap; falls back to Spark and may be reconsidered later. |
 
 ## Not currently planned
 
@@ -67,146 +67,146 @@ The tables below list every Spark built-in expression with its current status.
## agg_funcs -| Function | Status | Notes | -| ----------------------- | ------ | -------------------------------------------------------------------------------------------------------------------------- | -| `any` | ✅ | | -| `any_value` | ✅ | | -| `approx_count_distinct` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `array_agg` | 🔜 | Array aggregate (related to `collect_list`, [#2524](https://github.com/apache/datafusion-comet/issues/2524)) | -| `avg` | ✅ | Interval types fall back | -| `bit_and` | ✅ | | -| `bit_or` | ✅ | | -| `bit_xor` | ✅ | | -| `bool_and` | ✅ | | -| `bool_or` | ✅ | | -| `collect_list` | 🔜 | [#2524](https://github.com/apache/datafusion-comet/issues/2524) | -| `collect_set` | ✅ | | -| `corr` | ✅ | | -| `count` | ✅ | | -| `count_if` | ✅ | | -| `covar_pop` | ✅ | | -| `covar_samp` | ✅ | | -| `every` | ✅ | | -| `first` | ✅ | | -| `first_value` | ✅ | | -| `grouping` | 🔜 | Grouping indicator for ROLLUP/CUBE/GROUPING SETS | -| `grouping_id` | 🔜 | Grouping indicator for ROLLUP/CUBE/GROUPING SETS | -| `kurtosis` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `last` | ✅ | | -| `last_value` | ✅ | | -| `listagg` | 🔜 | String aggregation | -| `max` | ✅ | | -| `max_by` | 🔜 | [#3841](https://github.com/apache/datafusion-comet/issues/3841) | -| `mean` | ✅ | | -| `median` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `min` | ✅ | | -| `min_by` | 🔜 | [#3841](https://github.com/apache/datafusion-comet/issues/3841) | -| `mode` | 🔜 | [#3970](https://github.com/apache/datafusion-comet/issues/3970) | -| `percentile` | 🔜 | [#4542](https://github.com/apache/datafusion-comet/issues/4542) | -| `percentile_cont` | 🔜 | Percentile aggregate | -| `percentile_disc` | 🔜 | Percentile aggregate | -| `regr_avgx` | ✅ | Native: Spark rewrites to `Average` (tests in [#4551](https://github.com/apache/datafusion-comet/issues/4551)) | -| `regr_avgy` | ✅ | Native: Spark 
rewrites to `Average` (tests in [#4551](https://github.com/apache/datafusion-comet/issues/4551)) | -| `regr_count` | ✅ | Native: Spark rewrites to `Count` (tests in [#4551](https://github.com/apache/datafusion-comet/issues/4551)) | -| `regr_intercept` | 🔜 | Falls back; can reuse `covar_pop`/`var_pop` accumulators ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | -| `regr_r2` | 🔜 | Falls back; can reuse the `corr` accumulator ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | -| `regr_slope` | 🔜 | Falls back; can reuse `covar_pop`/`var_pop` accumulators ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | -| `regr_sxx` | 🔜 | Falls back; can reuse `var_pop` accumulator ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | -| `regr_sxy` | 🔜 | Falls back; can reuse `covar_pop` accumulator ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | -| `regr_syy` | 🔜 | Falls back; can reuse `var_pop` accumulator ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | -| `skewness` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `some` | ✅ | | -| `std` | ✅ | | -| `stddev` | ✅ | | -| `stddev_pop` | ✅ | | -| `stddev_samp` | ✅ | | -| `string_agg` | 🔜 | String aggregation (alias of `listagg`) | -| `sum` | ✅ | | -| `try_avg` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `try_sum` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `var_pop` | ✅ | | -| `var_samp` | ✅ | | -| `variance` | ✅ | | +| Function | Status | Notes | +| --- | --- | --- | +| `any` | ✅ | | +| `any_value` | ✅ | | +| `approx_count_distinct` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `array_agg` | 🔜 | Array aggregate (related to `collect_list`, [#2524](https://github.com/apache/datafusion-comet/issues/2524)) | +| `avg` | ✅ | Interval types fall back | +| `bit_and` | ✅ | | +| `bit_or` 
| ✅ | | +| `bit_xor` | ✅ | | +| `bool_and` | ✅ | | +| `bool_or` | ✅ | | +| `collect_list` | 🔜 | [#2524](https://github.com/apache/datafusion-comet/issues/2524) | +| `collect_set` | ✅ | | +| `corr` | ✅ | | +| `count` | ✅ | | +| `count_if` | ✅ | | +| `covar_pop` | ✅ | | +| `covar_samp` | ✅ | | +| `every` | ✅ | | +| `first` | ✅ | | +| `first_value` | ✅ | | +| `grouping` | 🔜 | Grouping indicator for ROLLUP/CUBE/GROUPING SETS | +| `grouping_id` | 🔜 | Grouping indicator for ROLLUP/CUBE/GROUPING SETS | +| `kurtosis` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `last` | ✅ | | +| `last_value` | ✅ | | +| `listagg` | 🔜 | String aggregation | +| `max` | ✅ | | +| `max_by` | 🔜 | [#3841](https://github.com/apache/datafusion-comet/issues/3841) | +| `mean` | ✅ | | +| `median` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `min` | ✅ | | +| `min_by` | 🔜 | [#3841](https://github.com/apache/datafusion-comet/issues/3841) | +| `mode` | 🔜 | [#3970](https://github.com/apache/datafusion-comet/issues/3970) | +| `percentile` | 🔜 | [#4542](https://github.com/apache/datafusion-comet/issues/4542) | +| `percentile_cont` | 🔜 | Percentile aggregate | +| `percentile_disc` | 🔜 | Percentile aggregate | +| `regr_avgx` | ✅ | Native: Spark rewrites to `Average` (tests in [#4551](https://github.com/apache/datafusion-comet/issues/4551)) | +| `regr_avgy` | ✅ | Native: Spark rewrites to `Average` (tests in [#4551](https://github.com/apache/datafusion-comet/issues/4551)) | +| `regr_count` | ✅ | Native: Spark rewrites to `Count` (tests in [#4551](https://github.com/apache/datafusion-comet/issues/4551)) | +| `regr_intercept` | 🔜 | Falls back; can reuse `covar_pop`/`var_pop` accumulators ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | +| `regr_r2` | 🔜 | Falls back; can reuse the `corr` accumulator ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | +| `regr_slope` | 🔜 | Falls back; can reuse 
`covar_pop`/`var_pop` accumulators ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | +| `regr_sxx` | 🔜 | Falls back; can reuse `var_pop` accumulator ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | +| `regr_sxy` | 🔜 | Falls back; can reuse `covar_pop` accumulator ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | +| `regr_syy` | 🔜 | Falls back; can reuse `var_pop` accumulator ([#4552](https://github.com/apache/datafusion-comet/issues/4552)) | +| `skewness` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `some` | ✅ | | +| `std` | ✅ | | +| `stddev` | ✅ | | +| `stddev_pop` | ✅ | | +| `stddev_samp` | ✅ | | +| `string_agg` | 🔜 | String aggregation (alias of `listagg`) | +| `sum` | ✅ | | +| `try_avg` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `try_sum` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `var_pop` | ✅ | | +| `var_samp` | ✅ | | +| `variance` | ✅ | | --- ## array_funcs -| Function | Status | Notes | -| ----------------- | ------ | ----------------------------------------------------------------------------------- | -| `array` | ✅ | | -| `array_append` | ✅ | | -| `array_compact` | ✅ | | -| `array_contains` | ✅ | NaN/signed-zero handling may differ ([details](compatibility/floating-point.md)) | -| `array_distinct` | ✅ | NaN/signed-zero handling may differ ([details](compatibility/floating-point.md)) | -| `array_except` | ✅ | Incompatible; falls back by default ([details](compatibility/expressions/array.md)) | -| `array_insert` | ✅ | | -| `array_intersect` | ✅ | Incompatible; falls back by default ([details](compatibility/expressions/array.md)) | -| `array_join` | ✅ | Incompatible; falls back by default ([details](compatibility/expressions/array.md)) | -| `array_max` | ✅ | NaN ordering may differ ([details](compatibility/floating-point.md)) | -| `array_min` | ✅ | NaN ordering may differ 
([details](compatibility/floating-point.md)) | -| `array_position` | ✅ | Binary/struct/map/null elements fall back | -| `array_prepend` | 🔜 | Sibling of `array_append` | -| `array_remove` | ✅ | | -| `array_repeat` | ✅ | | -| `array_union` | ✅ | NaN/signed-zero handling may differ ([details](compatibility/floating-point.md)) | -| `arrays_overlap` | ✅ | | -| `arrays_zip` | ✅ | | -| `element_at` | ✅ | MapType input falls back | -| `flatten` | ✅ | Binary/struct/map elements fall back | -| `get` | ✅ | | -| `sequence` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `shuffle` | 🔜 | Random array shuffle | -| `slice` | ✅ | Native ([#4149](https://github.com/apache/datafusion-comet/issues/4149)) | -| `sort_array` | ✅ | Nested struct/null arrays fall back | +| Function | Status | Notes | +| --- | --- | --- | +| `array` | ✅ | | +| `array_append` | ✅ | | +| `array_compact` | ✅ | | +| `array_contains` | ✅ | NaN/signed-zero handling may differ ([details](compatibility/floating-point.md)) | +| `array_distinct` | ✅ | NaN/signed-zero handling may differ ([details](compatibility/floating-point.md)) | +| `array_except` | ✅ | Incompatible; falls back by default ([details](compatibility/expressions/array.md)) | +| `array_insert` | ✅ | | +| `array_intersect` | ✅ | Incompatible; falls back by default ([details](compatibility/expressions/array.md)) | +| `array_join` | ✅ | Incompatible; falls back by default ([details](compatibility/expressions/array.md)) | +| `array_max` | ✅ | NaN ordering may differ ([details](compatibility/floating-point.md)) | +| `array_min` | ✅ | NaN ordering may differ ([details](compatibility/floating-point.md)) | +| `array_position` | ✅ | Binary/struct/map/null elements fall back | +| `array_prepend` | 🔜 | Sibling of `array_append` | +| `array_remove` | ✅ | | +| `array_repeat` | ✅ | | +| `array_union` | ✅ | NaN/signed-zero handling may differ ([details](compatibility/floating-point.md)) | +| `arrays_overlap` | ✅ | | +| `arrays_zip` | ✅ | | 
+| `element_at` | ✅ | MapType input falls back | +| `flatten` | ✅ | Binary/struct/map elements fall back | +| `get` | ✅ | | +| `sequence` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `shuffle` | 🔜 | Random array shuffle | +| `slice` | ✅ | Native ([#4149](https://github.com/apache/datafusion-comet/issues/4149)) | +| `sort_array` | ✅ | Nested struct/null arrays fall back | --- ## bitwise_funcs -| Function | Status | Notes | -| -------------------- | ------ | ---------------------------------------------------- | -| `&` | ✅ | | -| `<<` | ✅ | | -| `>>` | ✅ | | -| `>>>` | ✅ | Operator alias for `shiftrightunsigned` (Spark 4.0+) | -| `^` | ✅ | | -| `bit_count` | ✅ | | -| `bit_get` | ✅ | | -| `getbit` | ✅ | | -| `shiftright` | ✅ | | -| `shiftrightunsigned` | ✅ | | -| `\|` | ✅ | | -| `~` | ✅ | | +| Function | Status | Notes | +| --- | --- | --- | +| `&` | ✅ | | +| `<<` | ✅ | | +| `>>` | ✅ | | +| `>>>` | ✅ | Operator alias for `shiftrightunsigned` (Spark 4.0+) | +| `^` | ✅ | | +| `bit_count` | ✅ | | +| `bit_get` | ✅ | | +| `getbit` | ✅ | | +| `shiftright` | ✅ | | +| `shiftrightunsigned` | ✅ | | +| `\|` | ✅ | | +| `~` | ✅ | | --- ## collection_funcs -| Function | Status | Notes | -| ------------- | ------ | ---------------------------------------------------------------------------------------------- | -| `array_size` | ✅ | | -| `cardinality` | ✅ | MapType input falls back | -| `concat` | ✅ | Binary/array children fall back | -| `reverse` | ✅ | Binary-element arrays fall back (Incompatible) ([details](compatibility/expressions/array.md)) | -| `size` | ✅ | MapType input falls back | +| Function | Status | Notes | +| --- | --- | --- | +| `array_size` | ✅ | | +| `cardinality` | ✅ | MapType input falls back | +| `concat` | ✅ | Binary/array children fall back | +| `reverse` | ✅ | Binary-element arrays fall back (Incompatible) ([details](compatibility/expressions/array.md)) | +| `size` | ✅ | MapType input falls back | --- ## conditional_funcs -| 
Function | Status | Notes |
-| ------------ | ------ | --------------------------------------------------------------- |
-| `coalesce` | ✅ | |
-| `if` | ✅ | |
-| `ifnull` | ✅ | |
-| `nanvl` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) |
-| `nullif` | ✅ | |
-| `nullifzero` | ✅ | Lowers to `if`/`=` (Spark 4.0+) |
-| `nvl` | ✅ | |
-| `nvl2` | ✅ | |
-| `when` | ✅ | |
-| `zeroifnull` | ✅ | Lowers to `coalesce` (Spark 4.0+) |
+| Function | Status | Notes |
+| --- | --- | --- |
+| `coalesce` | ✅ | |
+| `if` | ✅ | |
+| `ifnull` | ✅ | |
+| `nanvl` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) |
+| `nullif` | ✅ | |
+| `nullifzero` | ✅ | Lowers to `if`/`=` (Spark 4.0+) |
+| `nvl` | ✅ | |
+| `nvl2` | ✅ | |
+| `when` | ✅ | |
+| `zeroifnull` | ✅ | Lowers to `coalesce` (Spark 4.0+) |
 
 ---
 
@@ -214,89 +214,89 @@ The tables below list every Spark built-in expression with its current status.
 
 The type-name conversion functions (`bigint`, `binary`, `boolean`, `date`, `decimal`, `double`, `float`, `int`, `smallint`, `string`, `timestamp`, `tinyint`) are SQL aliases for `CAST(... AS <type>)` and share the support and caveats of `cast`.
-| Function | Status | Notes | -| -------- | ------ | ----------------------------------------------------------------------------------------------- | -| `cast` | ✅ | Some casts fall back; float-to-decimal is opt-in ([details](compatibility/expressions/cast.md)) | +| Function | Status | Notes | +| --- | --- | --- | +| `cast` | ✅ | Some casts fall back; float-to-decimal is opt-in ([details](compatibility/expressions/cast.md)) | --- ## datetime_funcs -| Function | Status | Notes | -| --------------------- | ------ | -------------------------------------------------------------------------------------------------------------------------------------- | -| `add_months` | ✅ | | -| `convert_timezone` | ✅ | | -| `curdate` | ✅ | Constant-folded to a literal (alias of `current_date`) | -| `current_date` | ✅ | Constant-folded to a literal before Comet sees the plan | -| `current_time` | 🔜 | Blocked on Spark 4.1 TIME type support ([#4288](https://github.com/apache/datafusion-comet/issues/4288)) | -| `current_timestamp` | ✅ | Constant-folded to a literal before Comet sees the plan | -| `current_timezone` | ✅ | | -| `date_add` | ✅ | | -| `date_diff` | ✅ | | -| `date_format` | ✅ | | -| `date_from_unix_date` | ✅ | | -| `date_part` | ✅ | | -| `date_sub` | ✅ | | -| `date_trunc` | ✅ | | -| `dateadd` | ✅ | | -| `datediff` | ✅ | | -| `datepart` | ✅ | | -| `day` | ✅ | | -| `dayname` | 🔜 | [#4544](https://github.com/apache/datafusion-comet/issues/4544) | -| `dayofmonth` | ✅ | | -| `dayofweek` | ✅ | | -| `dayofyear` | ✅ | | -| `extract` | ✅ | | -| `from_unixtime` | ✅ | | -| `from_utc_timestamp` | ✅ | Legacy zone forms fall back (Incompatible) ([details](compatibility/expressions/datetime.md)) | -| `hour` | ✅ | | -| `last_day` | ✅ | | -| `localtimestamp` | ✅ | | -| `make_date` | ✅ | | -| `make_dt_interval` | 🔜 | [#4541](https://github.com/apache/datafusion-comet/issues/4541) | -| `make_interval` | 🔜 | Produces legacy CalendarInterval; tracked by 
[#4540](https://github.com/apache/datafusion-comet/issues/4540) | -| `make_time` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | -| `make_timestamp` | ✅ | | -| `make_timestamp_ltz` | ✅ | 2-arg TIME form falls back | -| `make_timestamp_ntz` | ✅ | 2-arg TIME form falls back | -| `make_ym_interval` | 🔜 | [#4541](https://github.com/apache/datafusion-comet/issues/4541) | -| `minute` | ✅ | | -| `month` | ✅ | | -| `monthname` | 🔜 | [#4544](https://github.com/apache/datafusion-comet/issues/4544) | -| `months_between` | ✅ | | -| `next_day` | ✅ | | -| `now` | ✅ | Constant-folded to a literal (alias of `current_timestamp`) | -| `quarter` | ✅ | | -| `second` | ✅ | | -| `session_window` | 🔜 | Time-window grouping; tracked by [#4553](https://github.com/apache/datafusion-comet/issues/4553) | -| `time_diff` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | -| `time_trunc` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | -| `timestamp_micros` | ✅ | | -| `timestamp_millis` | ✅ | | -| `timestamp_seconds` | ✅ | | -| `to_date` | ✅ | Rewrites to `Cast` (or `Cast(GetTimestamp)` with a format) before Comet sees the plan | -| `to_time` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | -| `to_timestamp` | ✅ | Rewrites to `Cast` (or `GetTimestamp` with a format) before Comet sees the plan | -| `to_timestamp_ltz` | ✅ | Rewrites to `to_timestamp` (`TimestampType`) | -| `to_timestamp_ntz` | ✅ | Rewrites to `to_timestamp` (`TimestampNTZType`) | -| `to_unix_timestamp` | ✅ | | -| `to_utc_timestamp` | ✅ | Legacy zone forms fall back (Incompatible) ([details](compatibility/expressions/datetime.md)) | -| `trunc` | ✅ | | -| `try_make_interval` | 🔜 | Produces legacy CalendarInterval; tracked by [#4540](https://github.com/apache/datafusion-comet/issues/4540) | -| `try_make_timestamp` | ⚠️ 
| Returns a wrong value instead of NULL for invalid inputs ([#4554](https://github.com/apache/datafusion-comet/issues/4554)) | -| `try_to_date` | 🔜 | Rewrites to `Cast`/`GetTimestamp` but currently falls back; tracked by [#4556](https://github.com/apache/datafusion-comet/issues/4556) | -| `try_to_time` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | -| `try_to_timestamp` | 🔜 | Rewrites to `Cast`/`GetTimestamp` but currently falls back; tracked by [#4556](https://github.com/apache/datafusion-comet/issues/4556) | -| `unix_date` | ✅ | | -| `unix_micros` | ✅ | | -| `unix_millis` | ✅ | | -| `unix_seconds` | ✅ | | -| `unix_timestamp` | ✅ | | -| `weekday` | ✅ | | -| `weekofyear` | ✅ | | -| `window` | 🔜 | Time-window grouping; tracked by [#4553](https://github.com/apache/datafusion-comet/issues/4553) | -| `window_time` | 🔜 | Time-window grouping; tracked by [#4553](https://github.com/apache/datafusion-comet/issues/4553) | -| `year` | ✅ | | +| Function | Status | Notes | +| --- | --- | --- | +| `add_months` | ✅ | | +| `convert_timezone` | ✅ | | +| `curdate` | ✅ | Constant-folded to a literal (alias of `current_date`) | +| `current_date` | ✅ | Constant-folded to a literal before Comet sees the plan | +| `current_time` | 🔜 | Blocked on Spark 4.1 TIME type support ([#4288](https://github.com/apache/datafusion-comet/issues/4288)) | +| `current_timestamp` | ✅ | Constant-folded to a literal before Comet sees the plan | +| `current_timezone` | ✅ | | +| `date_add` | ✅ | | +| `date_diff` | ✅ | | +| `date_format` | ✅ | | +| `date_from_unix_date` | ✅ | | +| `date_part` | ✅ | | +| `date_sub` | ✅ | | +| `date_trunc` | ✅ | | +| `dateadd` | ✅ | | +| `datediff` | ✅ | | +| `datepart` | ✅ | | +| `day` | ✅ | | +| `dayname` | 🔜 | [#4544](https://github.com/apache/datafusion-comet/issues/4544) | +| `dayofmonth` | ✅ | | +| `dayofweek` | ✅ | | +| `dayofyear` | ✅ | | +| `extract` | ✅ | | +| `from_unixtime` | ✅ | | +| `from_utc_timestamp` | ✅ 
| Legacy zone forms fall back (Incompatible) ([details](compatibility/expressions/datetime.md)) | +| `hour` | ✅ | | +| `last_day` | ✅ | | +| `localtimestamp` | ✅ | | +| `make_date` | ✅ | | +| `make_dt_interval` | 🔜 | [#4541](https://github.com/apache/datafusion-comet/issues/4541) | +| `make_interval` | 🔜 | Produces legacy CalendarInterval; tracked by [#4540](https://github.com/apache/datafusion-comet/issues/4540) | +| `make_time` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | +| `make_timestamp` | ✅ | | +| `make_timestamp_ltz` | ✅ | 2-arg TIME form falls back | +| `make_timestamp_ntz` | ✅ | 2-arg TIME form falls back | +| `make_ym_interval` | 🔜 | [#4541](https://github.com/apache/datafusion-comet/issues/4541) | +| `minute` | ✅ | | +| `month` | ✅ | | +| `monthname` | 🔜 | [#4544](https://github.com/apache/datafusion-comet/issues/4544) | +| `months_between` | ✅ | | +| `next_day` | ✅ | | +| `now` | ✅ | Constant-folded to a literal (alias of `current_timestamp`) | +| `quarter` | ✅ | | +| `second` | ✅ | | +| `session_window` | 🔜 | Time-window grouping; tracked by [#4553](https://github.com/apache/datafusion-comet/issues/4553) | +| `time_diff` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | +| `time_trunc` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | +| `timestamp_micros` | ✅ | | +| `timestamp_millis` | ✅ | | +| `timestamp_seconds` | ✅ | | +| `to_date` | ✅ | Rewrites to `Cast` (or `Cast(GetTimestamp)` with a format) before Comet sees the plan | +| `to_time` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | +| `to_timestamp` | ✅ | Rewrites to `Cast` (or `GetTimestamp` with a format) before Comet sees the plan | +| `to_timestamp_ltz` | ✅ | Rewrites to `to_timestamp` (`TimestampType`) | +| `to_timestamp_ntz` | ✅ | Rewrites to `to_timestamp` 
(`TimestampNTZType`) | +| `to_unix_timestamp` | ✅ | | +| `to_utc_timestamp` | ✅ | Legacy zone forms fall back (Incompatible) ([details](compatibility/expressions/datetime.md)) | +| `trunc` | ✅ | | +| `try_make_interval` | 🔜 | Produces legacy CalendarInterval; tracked by [#4540](https://github.com/apache/datafusion-comet/issues/4540) | +| `try_make_timestamp` | ⚠️ | Returns a wrong value instead of NULL for invalid inputs ([#4554](https://github.com/apache/datafusion-comet/issues/4554)) | +| `try_to_date` | 🔜 | Rewrites to `Cast`/`GetTimestamp` but currently falls back; tracked by [#4556](https://github.com/apache/datafusion-comet/issues/4556) | +| `try_to_time` | 🔜 | Spark 4.1 TIME type; tracked by [#4288](https://github.com/apache/datafusion-comet/issues/4288) | +| `try_to_timestamp` | 🔜 | Rewrites to `Cast`/`GetTimestamp` but currently falls back; tracked by [#4556](https://github.com/apache/datafusion-comet/issues/4556) | +| `unix_date` | ✅ | | +| `unix_micros` | ✅ | | +| `unix_millis` | ✅ | | +| `unix_seconds` | ✅ | | +| `unix_timestamp` | ✅ | | +| `weekday` | ✅ | | +| `weekofyear` | ✅ | | +| `window` | 🔜 | Time-window grouping; tracked by [#4553](https://github.com/apache/datafusion-comet/issues/4553) | +| `window_time` | 🔜 | Time-window grouping; tracked by [#4553](https://github.com/apache/datafusion-comet/issues/4553) | +| `year` | ✅ | | --- @@ -306,43 +306,43 @@ The type-name conversion functions (`bigint`, `binary`, `boolean`, `date`, `deci expression-level). The `outer` variants are wired but marked `Incompatible`; they require `spark.comet.exec.explode.enabled=true` and `allowIncompatible`. 
-| Function | Status | Notes | -| ------------------ | ------ | ----------------------------------------------------------------------------------------------------------------------------- | -| `explode` | ✅ | via `CometExplodeExec` | -| `explode_outer` | ✅ | outer=true falls back (Incompatible) ([audit](../../contributor-guide/expression-audits/generator_funcs.md#explode_outer)) | -| `inline` | 🔜 | Operator-level generator (like `explode`) | -| `inline_outer` | 🔜 | Operator-level generator (like `explode`) | -| `posexplode` | ✅ | via `CometExplodeExec` | -| `posexplode_outer` | ✅ | outer=true falls back (Incompatible) ([audit](../../contributor-guide/expression-audits/generator_funcs.md#posexplode_outer)) | -| `stack` | 🔜 | Operator-level generator | +| Function | Status | Notes | +| --- | --- | --- | +| `explode` | ✅ | via `CometExplodeExec` | +| `explode_outer` | ✅ | outer=true falls back (Incompatible) ([audit](../../contributor-guide/expression-audits/generator_funcs.md#explode_outer)) | +| `inline` | 🔜 | Operator-level generator (like `explode`) | +| `inline_outer` | 🔜 | Operator-level generator (like `explode`) | +| `posexplode` | ✅ | via `CometExplodeExec` | +| `posexplode_outer` | ✅ | outer=true falls back (Incompatible) ([audit](../../contributor-guide/expression-audits/generator_funcs.md#posexplode_outer)) | +| `stack` | 🔜 | Operator-level generator | --- ## hash_funcs -| Function | Status | Notes | -| ---------- | ------ | ----- | -| `crc32` | ✅ | | -| `hash` | ✅ | | -| `md5` | ✅ | | -| `sha` | ✅ | | -| `sha1` | ✅ | | -| `sha2` | ✅ | | -| `xxhash64` | ✅ | | +| Function | Status | Notes | +| --- | --- | --- | +| `crc32` | ✅ | | +| `hash` | ✅ | | +| `md5` | ✅ | | +| `sha` | ✅ | | +| `sha1` | ✅ | | +| `sha2` | ✅ | | +| `xxhash64` | ✅ | | --- ## json_funcs -| Function | Status | Notes | -| ------------------- | ------ | -------------------------------------------------------------------------------------------------------------------------------- | -| 
`from_json` | ✅ | Falls back by default; opt-in via allowIncompatible ([audit](../../contributor-guide/expression-audits/json_funcs.md#from_json)) | -| `get_json_object` | ✅ | Some inputs need allowIncompatible ([audit](../../contributor-guide/expression-audits/json_funcs.md#get_json_object)) | -| `json_array_length` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `json_object_keys` | 🔜 | [#3161](https://github.com/apache/datafusion-comet/issues/3161) | -| `json_tuple` | 🔜 | [#3160](https://github.com/apache/datafusion-comet/issues/3160) | -| `schema_of_json` | 🔜 | [#3163](https://github.com/apache/datafusion-comet/issues/3163) | -| `to_json` | ✅ | Options and map/array inputs fall back ([audit](../../contributor-guide/expression-audits/json_funcs.md#to_json)) | +| Function | Status | Notes | +| --- | --- | --- | +| `from_json` | ✅ | Falls back by default; opt-in via allowIncompatible ([audit](../../contributor-guide/expression-audits/json_funcs.md#from_json)) | +| `get_json_object` | ✅ | Some inputs need allowIncompatible ([audit](../../contributor-guide/expression-audits/json_funcs.md#get_json_object)) | +| `json_array_length` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `json_object_keys` | 🔜 | [#3161](https://github.com/apache/datafusion-comet/issues/3161) | +| `json_tuple` | 🔜 | [#3160](https://github.com/apache/datafusion-comet/issues/3160) | +| `schema_of_json` | 🔜 | [#3163](https://github.com/apache/datafusion-comet/issues/3163) | +| `to_json` | ✅ | Options and map/array inputs fall back ([audit](../../contributor-guide/expression-audits/json_funcs.md#to_json)) | --- @@ -350,269 +350,269 @@ expression-level). The `outer` variants are wired but marked `Incompatible`; the All higher-order functions are planned via [#4224](https://github.com/apache/datafusion-comet/issues/4224). 
-| Function | Status | Notes | -| ------------------ | ------ | --------------------------------------------------------------- | -| `aggregate` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `array_sort` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `exists` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `filter` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `forall` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `map_filter` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `map_zip_with` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `reduce` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `transform` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `transform_keys` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `transform_values` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | -| `zip_with` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| Function | Status | Notes | +| --- | --- | --- | +| `aggregate` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `array_sort` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `exists` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `filter` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `forall` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `map_filter` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `map_zip_with` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `reduce` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `transform` | 🔜 | 
[#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `transform_keys` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `transform_values` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | +| `zip_with` | 🔜 | [#4224](https://github.com/apache/datafusion-comet/issues/4224) | --- ## map_funcs -| Function | Status | Notes | -| ------------------ | ------ | -------------------------------------------------------------------------------------------- | -| `element_at` | ✅ | MapType input falls back | -| `map` | 🔜 | Constructs a map | -| `map_concat` | 🔜 | Concatenates maps | -| `map_contains_key` | ✅ | | -| `map_entries` | ✅ | | -| `map_from_arrays` | ✅ | | -| `map_from_entries` | ✅ | BinaryType key/value falls back (Incompatible) ([details](compatibility/expressions/map.md)) | -| `map_keys` | ✅ | | -| `map_values` | ✅ | | -| `str_to_map` | ✅ | | -| `try_element_at` | ✅ | Lowers to `element_at`; array input (MapType falls back) | +| Function | Status | Notes | +| --- | --- | --- | +| `element_at` | ✅ | MapType input falls back | +| `map` | 🔜 | Constructs a map | +| `map_concat` | 🔜 | Concatenates maps | +| `map_contains_key` | ✅ | | +| `map_entries` | ✅ | | +| `map_from_arrays` | ✅ | | +| `map_from_entries` | ✅ | BinaryType key/value falls back (Incompatible) ([details](compatibility/expressions/map.md)) | +| `map_keys` | ✅ | | +| `map_values` | ✅ | | +| `str_to_map` | ✅ | | +| `try_element_at` | ✅ | Lowers to `element_at`; array input (MapType falls back) | --- ## math_funcs -| Function | Status | Notes | -| -------------- | ------ | ---------------------------------------------------------------------------------------------------------------------------- | -| `%` | ✅ | try_mod (TRY mode) falls back | -| `*` | ✅ | Interval multiplication falls back | -| `+` | ✅ | | -| `-` | ✅ | | -| `/` | ✅ | | -| `abs` | ✅ | Interval types fall back | -| `acos` | ✅ | | -| `acosh` | ✅ | | -| `asin` | ✅ | | -| `asinh` | ✅ | 
| -| `atan` | ✅ | | -| `atan2` | ✅ | | -| `atanh` | ✅ | | -| `bin` | ✅ | | -| `bround` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `cbrt` | ✅ | | -| `ceil` | ✅ | Two-arg form falls back | -| `ceiling` | ✅ | | -| `conv` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `cos` | ✅ | | -| `cosh` | ✅ | | -| `cot` | ✅ | | -| `csc` | ✅ | | -| `degrees` | ✅ | | -| `div` | ✅ | | -| `e` | ✅ | Folds to a literal (like `pi`) | -| `exp` | ✅ | | -| `expm1` | ✅ | | -| `factorial` | ✅ | | -| `floor` | ✅ | Two-arg form falls back | -| `greatest` | ✅ | | -| `hex` | ✅ | | -| `hypot` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `least` | ✅ | | -| `ln` | ✅ | | -| `log` | ✅ | | -| `log10` | ✅ | | -| `log1p` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `log2` | ✅ | | -| `mod` | ✅ | | -| `negative` | ✅ | | -| `pi` | ✅ | | -| `pmod` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `positive` | ✅ | | -| `pow` | ✅ | | -| `power` | ✅ | | -| `radians` | ✅ | | -| `rand` | ✅ | | -| `randn` | ✅ | | -| `random` | ✅ | Alias for `rand` (Spark 4.0+); seed must be a literal | -| `randstr` | 🔜 | Random string (Spark 4.0+) | -| `rint` | ✅ | | -| `round` | ✅ | Float/double inputs fall back | -| `sec` | ✅ | | -| `shiftleft` | ✅ | | -| `sign` | ✅ | | -| `signum` | ✅ | | -| `sin` | ✅ | | -| `sinh` | ✅ | | -| `sqrt` | ✅ | | -| `tan` | ✅ | | -| `tanh` | ✅ | | -| `try_add` | ✅ | Datetime/interval form falls back | -| `try_divide` | ✅ | | -| `try_mod` | 🔜 | Lowers to `Remainder` with TRY eval mode, which falls back ([#4484](https://github.com/apache/datafusion-comet/issues/4484)) | -| `try_multiply` | ✅ | | -| `try_subtract` | ✅ | | -| `unhex` | ✅ | | -| `uniform` | ✅ | Constant-folded; literal arguments only (Spark 4.0+) | -| `width_bucket` | ✅ | | +| Function | Status | Notes | +| --- | --- | --- | +| `%` | ✅ | try_mod (TRY mode) falls back | +| `*` | ✅ | Interval 
multiplication falls back | +| `+` | ✅ | | +| `-` | ✅ | | +| `/` | ✅ | | +| `abs` | ✅ | Interval types fall back | +| `acos` | ✅ | | +| `acosh` | ✅ | | +| `asin` | ✅ | | +| `asinh` | ✅ | | +| `atan` | ✅ | | +| `atan2` | ✅ | | +| `atanh` | ✅ | | +| `bin` | ✅ | | +| `bround` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `cbrt` | ✅ | | +| `ceil` | ✅ | Two-arg form falls back | +| `ceiling` | ✅ | | +| `conv` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `cos` | ✅ | | +| `cosh` | ✅ | | +| `cot` | ✅ | | +| `csc` | ✅ | | +| `degrees` | ✅ | | +| `div` | ✅ | | +| `e` | ✅ | Folds to a literal (like `pi`) | +| `exp` | ✅ | | +| `expm1` | ✅ | | +| `factorial` | ✅ | | +| `floor` | ✅ | Two-arg form falls back | +| `greatest` | ✅ | | +| `hex` | ✅ | | +| `hypot` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `least` | ✅ | | +| `ln` | ✅ | | +| `log` | ✅ | | +| `log10` | ✅ | | +| `log1p` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `log2` | ✅ | | +| `mod` | ✅ | | +| `negative` | ✅ | | +| `pi` | ✅ | | +| `pmod` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `positive` | ✅ | | +| `pow` | ✅ | | +| `power` | ✅ | | +| `radians` | ✅ | | +| `rand` | ✅ | | +| `randn` | ✅ | | +| `random` | ✅ | Alias for `rand` (Spark 4.0+); seed must be a literal | +| `randstr` | 🔜 | Random string (Spark 4.0+) | +| `rint` | ✅ | | +| `round` | ✅ | Float/double inputs fall back | +| `sec` | ✅ | | +| `shiftleft` | ✅ | | +| `sign` | ✅ | | +| `signum` | ✅ | | +| `sin` | ✅ | | +| `sinh` | ✅ | | +| `sqrt` | ✅ | | +| `tan` | ✅ | | +| `tanh` | ✅ | | +| `try_add` | ✅ | Datetime/interval form falls back | +| `try_divide` | ✅ | | +| `try_mod` | 🔜 | Lowers to `Remainder` with TRY eval mode, which falls back ([#4484](https://github.com/apache/datafusion-comet/issues/4484)) | +| `try_multiply` | ✅ | | +| `try_subtract` | ✅ | | +| `unhex` | ✅ | | +| `uniform` | ✅ | Constant-folded; 
literal arguments only (Spark 4.0+) | +| `width_bucket` | ✅ | | --- ## misc_funcs -| Function | Status | Notes | -| ----------------------------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------ | -| `aes_decrypt` | 🔜 | Falls back; `StaticInvoke` not allowlisted; planned via codegen dispatch ([#4558](https://github.com/apache/datafusion-comet/issues/4558)) | -| `aes_encrypt` | 🔜 | Falls back; planned via codegen dispatch ([#4558](https://github.com/apache/datafusion-comet/issues/4558)); nondeterministic IV by default | -| `assert_true` | 🔜 | Lowers to `RaiseError`, which falls back | -| `current_catalog` | ✅ | Resolved to a literal by the analyzer (`ReplaceCurrentLike`) | -| `current_database` | ✅ | Resolved to a literal by the analyzer (`ReplaceCurrentLike`) | -| `current_schema` | ✅ | Alias of `current_database`; resolved to a literal by the analyzer | -| `current_user` | ✅ | Resolved to a literal by the analyzer; same as `user` | -| `equal_null` | ✅ | Lowers to `<=>` (`EqualNullSafe`) | -| `is_variant_null` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `monotonically_increasing_id` | ✅ | | -| `parse_json` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `raise_error` | 🔜 | Raises a runtime error | -| `rand` | ✅ | Seed must be a literal | -| `randn` | ✅ | Seed must be a literal | -| `schema_of_variant` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `schema_of_variant_agg` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `session_user` | ✅ | Alias of `current_user`; resolved to a literal by the analyzer | -| `spark_partition_id` | ✅ | | -| `to_variant_object` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `try_aes_decrypt` | 🔜 | Falls back; planned via codegen dispatch 
([#4558](https://github.com/apache/datafusion-comet/issues/4558)) | -| `try_parse_json` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `try_variant_get` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `typeof` | ✅ | Foldable; resolved to a literal before Comet sees the plan | -| `user` | ✅ | Resolved to a literal by the Spark analyzer before reaching Comet | -| `uuid` | 🔜 | Nondeterministic random UUID | -| `variant_get` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| Function | Status | Notes | +| --- | --- | --- | +| `aes_decrypt` | 🔜 | Falls back; `StaticInvoke` not allowlisted; planned via codegen dispatch ([#4558](https://github.com/apache/datafusion-comet/issues/4558)) | +| `aes_encrypt` | 🔜 | Falls back; planned via codegen dispatch ([#4558](https://github.com/apache/datafusion-comet/issues/4558)); nondeterministic IV by default | +| `assert_true` | 🔜 | Lowers to `RaiseError`, which falls back | +| `current_catalog` | ✅ | Resolved to a literal by the analyzer (`ReplaceCurrentLike`) | +| `current_database` | ✅ | Resolved to a literal by the analyzer (`ReplaceCurrentLike`) | +| `current_schema` | ✅ | Alias of `current_database`; resolved to a literal by the analyzer | +| `current_user` | ✅ | Resolved to a literal by the analyzer; same as `user` | +| `equal_null` | ✅ | Lowers to `<=>` (`EqualNullSafe`) | +| `is_variant_null` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `monotonically_increasing_id` | ✅ | | +| `parse_json` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `raise_error` | 🔜 | Raises a runtime error | +| `rand` | ✅ | Seed must be a literal | +| `randn` | ✅ | Seed must be a literal | +| `schema_of_variant` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `schema_of_variant_agg` | 🔜 | tracking 
[#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `session_user` | ✅ | Alias of `current_user`; resolved to a literal by the analyzer | +| `spark_partition_id` | ✅ | | +| `to_variant_object` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `try_aes_decrypt` | 🔜 | Falls back; planned via codegen dispatch ([#4558](https://github.com/apache/datafusion-comet/issues/4558)) | +| `try_parse_json` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `try_variant_get` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `typeof` | ✅ | Foldable; resolved to a literal before Comet sees the plan | +| `user` | ✅ | Resolved to a literal by the Spark analyzer before reaching Comet | +| `uuid` | 🔜 | Nondeterministic random UUID | +| `variant_get` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | --- ## predicate_funcs -| Function | Status | Notes | -| ------------- | ------ | --------------------------------------------------------------------------------------- | -| `!` | ✅ | | -| `<` | ✅ | | -| `<=` | ✅ | | -| `<=>` | ✅ | | -| `=` | ✅ | | -| `==` | ✅ | | -| `>` | ✅ | | -| `>=` | ✅ | | -| `and` | ✅ | | -| `between` | ✅ | | -| `ilike` | ✅ | | -| `in` | ✅ | | -| `isnan` | ✅ | | -| `isnotnull` | ✅ | | -| `isnull` | ✅ | | -| `like` | ✅ | | -| `not` | ✅ | | -| `or` | ✅ | | -| `regexp` | ✅ | Falls back by default; opt-in via allowIncompatible ([details](compatibility/regex.md)) | -| `regexp_like` | ✅ | Falls back by default; opt-in via allowIncompatible ([details](compatibility/regex.md)) | -| `rlike` | ✅ | Falls back by default; opt-in via allowIncompatible ([details](compatibility/regex.md)) | +| Function | Status | Notes | +| --- | --- | --- | +| `!` | ✅ | | +| `<` | ✅ | | +| `<=` | ✅ | | +| `<=>` | ✅ | | +| `=` | ✅ | | +| `==` | ✅ | | +| `>` | ✅ | | +| `>=` | ✅ | | +| `and` | ✅ | | +| `between` | ✅ | | +| `ilike` | ✅ | | +| `in` | ✅ | | 
+| `isnan` | ✅ | | +| `isnotnull` | ✅ | | +| `isnull` | ✅ | | +| `like` | ✅ | | +| `not` | ✅ | | +| `or` | ✅ | | +| `regexp` | ✅ | Falls back by default; opt-in via allowIncompatible ([details](compatibility/regex.md)) | +| `regexp_like` | ✅ | Falls back by default; opt-in via allowIncompatible ([details](compatibility/regex.md)) | +| `rlike` | ✅ | Falls back by default; opt-in via allowIncompatible ([details](compatibility/regex.md)) | --- ## string_funcs -| Function | Status | Notes | -| -------------------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------ | -| `ascii` | ✅ | | -| `base64` | 🔜 | Lowers to `StaticInvoke(encode)` (not allowlisted); falls back | -| `bit_length` | ✅ | | -| `btrim` | ✅ | | -| `char` | ✅ | | -| `char_length` | ✅ | | -| `character_length` | ✅ | | -| `chr` | ✅ | | -| `collate` | 🔜 | Spark collation (umbrella [#2190](https://github.com/apache/datafusion-comet/issues/2190)) | -| `collation` | ✅ | Constant-folded to a literal (Spark 4.0+) | -| `concat_ws` | ✅ | | -| `contains` | ✅ | | -| `decode` | ✅ | | -| `elt` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `encode` | 🔜 | Lowers to `StaticInvoke(encode)` (not allowlisted); falls back | -| `endswith` | ✅ | | -| `find_in_set` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `format_number` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `format_string` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `initcap` | ✅ | | -| `instr` | ✅ | | -| `lcase` | ✅ | | -| `left` | ✅ | | -| `len` | ✅ | | -| `length` | ✅ | | -| `levenshtein` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `locate` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `lower` | ✅ | | -| `lpad` | ✅ | | -| `ltrim` | ✅ | | -| `luhn_check` | ✅ | Native via `StaticInvoke` 
(tests: luhn_check.sql) | -| `mask` | 🔜 | Data masking | -| `octet_length` | ✅ | | -| `overlay` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `position` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `printf` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `regexp_count` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `regexp_extract` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `regexp_extract_all` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `regexp_instr` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `regexp_replace` | ✅ | | -| `regexp_substr` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | -| `repeat` | ✅ | | -| `replace` | ✅ | | -| `right` | ✅ | | -| `rpad` | ✅ | | -| `rtrim` | ✅ | | -| `soundex` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `space` | ✅ | | -| `split` | ✅ | | -| `split_part` | 🔜 | Lowers to `element_at(StringSplitSQL(...))`; `StringSplitSQL` falls back ([#4561](https://github.com/apache/datafusion-comet/issues/4561)) | -| `startswith` | ✅ | | -| `substr` | ✅ | | -| `substring` | ✅ | | -| `substring_index` | ✅ | | -| `to_binary` | ✅ | Hex form accelerated; other formats fall back | -| `to_char` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `to_number` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `to_varchar` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `translate` | ✅ | | -| `trim` | ✅ | | -| `try_to_binary` | 🔜 | Lowers to `TryEval(...)`, which falls back | -| `try_to_number` | 🔜 | TRY variant of `to_number` | -| `ucase` | ✅ | | -| `unbase64` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | -| `upper` | ✅ | | +| Function | Status | Notes | 
+| --- | --- | --- | +| `ascii` | ✅ | | +| `base64` | 🔜 | Lowers to `StaticInvoke(encode)` (not allowlisted); falls back | +| `bit_length` | ✅ | | +| `btrim` | ✅ | | +| `char` | ✅ | | +| `char_length` | ✅ | | +| `character_length` | ✅ | | +| `chr` | ✅ | | +| `collate` | 🔜 | Spark collation (umbrella [#2190](https://github.com/apache/datafusion-comet/issues/2190)) | +| `collation` | ✅ | Constant-folded to a literal (Spark 4.0+) | +| `concat_ws` | ✅ | | +| `contains` | ✅ | | +| `decode` | ✅ | | +| `elt` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `encode` | 🔜 | Lowers to `StaticInvoke(encode)` (not allowlisted); falls back | +| `endswith` | ✅ | | +| `find_in_set` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `format_number` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `format_string` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `initcap` | ✅ | | +| `instr` | ✅ | | +| `lcase` | ✅ | | +| `left` | ✅ | | +| `len` | ✅ | | +| `length` | ✅ | | +| `levenshtein` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `locate` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `lower` | ✅ | | +| `lpad` | ✅ | | +| `ltrim` | ✅ | | +| `luhn_check` | ✅ | Native via `StaticInvoke` (tests: luhn_check.sql) | +| `mask` | 🔜 | Data masking | +| `octet_length` | ✅ | | +| `overlay` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `position` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `printf` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `regexp_count` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `regexp_extract` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `regexp_extract_all` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| 
`regexp_instr` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `regexp_replace` | ✅ | | +| `regexp_substr` | 🔜 | tracking [#4098](https://github.com/apache/datafusion-comet/issues/4098) | +| `repeat` | ✅ | | +| `replace` | ✅ | | +| `right` | ✅ | | +| `rpad` | ✅ | | +| `rtrim` | ✅ | | +| `soundex` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `space` | ✅ | | +| `split` | ✅ | | +| `split_part` | 🔜 | Lowers to `element_at(StringSplitSQL(...))`; `StringSplitSQL` falls back ([#4561](https://github.com/apache/datafusion-comet/issues/4561)) | +| `startswith` | ✅ | | +| `substr` | ✅ | | +| `substring` | ✅ | | +| `substring_index` | ✅ | | +| `to_binary` | ✅ | Hex form accelerated; other formats fall back | +| `to_char` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `to_number` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `to_varchar` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `translate` | ✅ | | +| `trim` | ✅ | | +| `try_to_binary` | 🔜 | Lowers to `TryEval(...)`, which falls back | +| `try_to_number` | 🔜 | TRY variant of `to_number` | +| `ucase` | ✅ | | +| `unbase64` | 🔜 | [#4538](https://github.com/apache/datafusion-comet/issues/4538) | +| `upper` | ✅ | | --- ## struct_funcs -| Function | Status | Notes | -| -------------- | ------ | ------------------------------- | -| `named_struct` | ✅ | Duplicate field names fall back | -| `struct` | ✅ | | +| Function | Status | Notes | +| --- | --- | --- | +| `named_struct` | ✅ | Duplicate field names fall back | +| `struct` | ✅ | | --- ## url_funcs -| Function | Status | Notes | -| ---------------- | ------ | ----- | -| `parse_url` | ✅ | | -| `try_url_decode` | ✅ | | -| `url_decode` | ✅ | | -| `url_encode` | ✅ | | +| Function | Status | Notes | +| --- | --- | --- | +| `parse_url` | ✅ | | +| `try_url_decode` | ✅ | | +| `url_decode` | ✅ | | +| `url_encode` | ✅ | | --- @@ -625,17 
+625,17 @@ When enabled, `lag` and `lead` are explicitly wired; aggregate window functions `ntile`, `percent_rank`, `cume_dist`, `nth_value`) are not yet wired in the window serde and fall back to Spark. -| Function | Status | Notes | -| -------------- | ------ | ------------------------------------------------------------------------------------------- | -| `cume_dist` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | -| `dense_rank` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | -| `lag` | ✅ | via `CometWindowExec` | -| `lead` | ✅ | via `CometWindowExec` | -| `nth_value` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | -| `ntile` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | -| `percent_rank` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | -| `rank` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | -| `row_number` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | +| Function | Status | Notes | +| --- | --- | --- | +| `cume_dist` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | +| `dense_rank` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | +| `lag` | ✅ | via `CometWindowExec` | +| `lead` | ✅ | via `CometWindowExec` | +| `nth_value` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | +| `ntile` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | +| `percent_rank` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) | +| `rank` | 🔜 | Window function; tracked by 
[#2721](https://github.com/apache/datafusion-comet/issues/2721) |
+| `row_number` | 🔜 | Window function; tracked by [#2721](https://github.com/apache/datafusion-comet/issues/2721) |

 ---

From fe8b5ff7193beeba66bba4d7386468e170b7ec2c Mon Sep 17 00:00:00 2001
From: Andy Grove
Date: Wed, 3 Jun 2026 13:19:28 -0600
Subject: [PATCH 3/3] fix: add ASF license header to .prettierignore

The Apache RAT license check rejects .prettierignore for missing a
license header, failing the Preflight CI job. Add the standard ASF
header in shell-comment style.
---
 .prettierignore | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/.prettierignore b/.prettierignore
index 4b81e3fef4..04933aa70c 100644
--- a/.prettierignore
+++ b/.prettierignore
@@ -1,3 +1,20 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
 # prettier re-aligns markdown table columns to the widest cell, so adding a
 # single expression row rewrites every row in the table. That produces noisy
 # diffs and frequent merge conflicts between PRs that each add new expressions.