[Enhancement](udf) support volatility for udaf && udtf #63611

Open
linrrzqqq wants to merge 2 commits into apache:master from linrrzqqq:volatile-udaf-udtf

@linrrzqqq
Collaborator

@linrrzqqq linrrzqqq commented May 25, 2026

Related PR: #62698

Problem Summary:

Add volatility metadata support for UDAF and UDTF definitions, so user-defined aggregate and table functions can preserve and expose their volatility semantics consistently with UDFs.
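As a rough illustration of what carrying volatility on aggregate and table functions could look like, the sketch below attaches one volatility value to a function definition regardless of its kind. All names here (`FnKind`, `FnVolatility`, `FnDef`) are hypothetical stand-ins, not the classes touched by this PR, and the enum lists only the two values named in this thread.

```java
// Hypothetical stand-ins for illustration; these are not the Doris classes.
enum FnKind { SCALAR, AGGREGATE, TABLE }

// Only the two values mentioned in this thread; the real enum may define more.
enum FnVolatility { IMMUTABLE, VOLATILE }

final class FnDef {
    final String name;
    final FnKind kind;
    final FnVolatility volatility;

    FnDef(String name, FnKind kind, FnVolatility volatility) {
        this.name = name;
        this.kind = kind;
        // Assumed default when nothing was declared: IMMUTABLE.
        this.volatility = volatility == null ? FnVolatility.IMMUTABLE : volatility;
    }

    // One exposure path serves scalar, aggregate, and table functions alike.
    String describe() {
        return name + " [" + kind + ", " + volatility + "]";
    }
}

class FnDefDemo {
    public static void main(String[] args) {
        System.out.println(new FnDef("my_sum", FnKind.AGGREGATE, null).describe());
        System.out.println(new FnDef("my_explode", FnKind.TABLE, FnVolatility.VOLATILE).describe());
    }
}
```

The point of the sketch is only that the volatility field lives on the shared definition, so aggregate and table functions do not need a separate mechanism.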

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@linrrzqqq
Collaborator Author

/review

@linrrzqqq
Collaborator Author

run buildall

Contributor

@github-actions github-actions Bot left a comment

Automated review completed. I did not find blocking correctness issues in the changed UDAF/UDTF volatility paths.

Critical checkpoint conclusions:

  • Goal/test coverage: The PR extends volatility parsing, SHOW/SHOW CREATE emission, and Nereids catalog-function conversion for Java/Python UDAF and UDTF. Added unit tests cover default UDAF volatility, explicit UDTF volatility, SHOW properties, SHOW CREATE SQL, and Nereids conversion preservation.
  • Scope/focus: The implementation is small and localized to function creation, display, replay SQL, and UDF expression wrappers.
  • Concurrency/lifecycle: No new shared mutable state, locks, threads, static initialization, or lifecycle-sensitive ownership were introduced.
  • Config/compatibility: No new config items. Existing functions without persisted volatility still use Function.getVolatility() fallback to IMMUTABLE, and replay SQL now includes a property that CreateFunctionCommand accepts for Java/Python UDAF/UDTF.
  • Parallel paths: Java/Python and UDAF/UDTF conversion paths appear updated consistently; RPC/non-Java/Python paths still reject volatility.
  • Error handling: Invalid volatility continues through FunctionVolatility.fromString and is converted to AnalysisException.
  • Transaction/persistence/data correctness: No transaction or storage visibility paths are touched.
  • Performance/observability: No hot-path or observability concerns found for these metadata-only changes.
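The compatibility and error-handling checkpoints above can be sketched as follows: a missing property falls back to IMMUTABLE, and an unparseable value is turned into an analysis-style error. This is a minimal sketch under assumptions, not the PR's code. The property key `"volatility"`, the case-insensitive parsing, and the names `VolatilityLevel`, `AnalysisError`, and `VolatilityProps` are all hypothetical.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for the volatility enum; only two values are shown.
enum VolatilityLevel {
    IMMUTABLE, VOLATILE;

    // Assumed to be case-insensitive; valueOf throws IllegalArgumentException
    // for unknown names.
    static VolatilityLevel fromString(String raw) {
        return valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }
}

// Hypothetical stand-in for an analysis-phase error (unchecked to keep the sketch short).
class AnalysisError extends RuntimeException {
    AnalysisError(String message) {
        super(message);
    }
}

final class VolatilityProps {
    // Assumed property key; the actual key is not shown in this thread.
    static final String KEY = "volatility";

    static VolatilityLevel resolve(Map<String, String> props) {
        String raw = props.get(KEY);
        if (raw == null) {
            // Functions created without the property keep the old behavior.
            return VolatilityLevel.IMMUTABLE;
        }
        try {
            return VolatilityLevel.fromString(raw);
        } catch (IllegalArgumentException e) {
            // Convert the parse failure into the analysis-level error type.
            throw new AnalysisError("invalid volatility: " + raw);
        }
    }
}

class VolatilityPropsDemo {
    public static void main(String[] args) {
        System.out.println(VolatilityProps.resolve(Map.of()));
        System.out.println(VolatilityProps.resolve(Map.of("volatility", "volatile")));
    }
}
```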

User focus: No additional user-provided review focus was supplied.

@hello-stephen
Contributor

TPC-H: Total hot run time: 31897 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 6785bae36ebd226de7c28e703085ad60fe77a112, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17835	4089	4037	4037
q2	
q3	10811	1441	817	817
q4	4688	524	348	348
q5	7562	2321	2161	2161
q6	243	181	138	138
q7	1001	792	650	650
q8	9359	1692	1638	1638
q9	5154	4996	4988	4988
q10	6404	2203	1903	1903
q11	436	284	257	257
q12	631	428	304	304
q13	18149	3416	2814	2814
q14	270	260	247	247
q15	
q16	823	767	713	713
q17	913	836	938	836
q18	7022	5858	5521	5521
q19	1312	1315	1181	1181
q20	557	471	296	296
q21	6172	2853	2731	2731
q22	459	524	317	317
Total cold run time: 99801 ms
Total hot run time: 31897 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4918	4867	4734	4734
q2	
q3	5007	5322	4704	4704
q4	2161	2238	1456	1456
q5	5186	4723	4652	4652
q6	234	185	135	135
q7	1975	1766	1611	1611
q8	2457	2114	2238	2114
q9	7921	7413	7387	7387
q10	4798	4736	4260	4260
q11	542	385	384	384
q12	751	752	552	552
q13	2987	3405	2804	2804
q14	269	282	269	269
q15	
q16	678	710	618	618
q17	1302	1277	1277	1277
q18	7328	6997	6693	6693
q19	1153	1098	1111	1098
q20	2238	2237	1956	1956
q21	5377	4681	4579	4579
q22	538	472	418	418
Total cold run time: 57820 ms
Total hot run time: 51701 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 172942 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 6785bae36ebd226de7c28e703085ad60fe77a112, data reload: false

query5	4338	673	538	538
query6	337	223	198	198
query7	4277	568	316	316
query8	326	238	227	227
query9	8843	4093	4066	4066
query10	454	343	295	295
query11	5793	2611	2205	2205
query12	179	132	125	125
query13	1258	591	445	445
query14	6171	5510	5246	5246
query14_1	4455	4471	4470	4470
query15	205	202	180	180
query16	1009	469	433	433
query17	962	707	596	596
query18	2459	505	347	347
query19	209	221	162	162
query20	130	134	129	129
query21	220	137	118	118
query22	13623	13684	13439	13439
query23	17399	16554	16373	16373
query23_1	16651	16473	16401	16401
query24	7575	1775	1290	1290
query24_1	1302	1318	1309	1309
query25	549	472	440	440
query26	1309	346	181	181
query27	2671	533	349	349
query28	4489	2059	2008	2008
query29	1018	654	525	525
query30	313	236	200	200
query31	1135	1086	968	968
query32	101	79	75	75
query33	574	359	319	319
query34	1191	1130	659	659
query35	797	814	701	701
query36	1455	1414	1271	1271
query37	157	108	92	92
query38	3215	3179	3078	3078
query39	931	932	928	928
query39_1	876	910	873	873
query40	224	149	130	130
query41	74	71	69	69
query42	118	120	113	113
query43	333	338	299	299
query44	
query45	217	205	199	199
query46	1079	1215	737	737
query47	2433	2424	2256	2256
query48	418	428	312	312
query49	652	514	421	421
query50	1048	354	264	264
query51	4381	4329	4244	4244
query52	107	109	95	95
query53	265	287	224	224
query54	328	298	288	288
query55	102	98	94	94
query56	312	324	312	312
query57	1444	1468	1363	1363
query58	318	284	281	281
query59	1598	1659	1512	1512
query60	338	363	334	334
query61	188	179	208	179
query62	700	645	576	576
query63	247	203	212	203
query64	2398	802	642	642
query65	
query66	1730	481	395	395
query67	30115	30047	29990	29990
query68	
query69	465	345	309	309
query70	1015	1022	941	941
query71	307	270	266	266
query72	2963	2672	2385	2385
query73	816	753	442	442
query74	5097	5049	4838	4838
query75	2697	2604	2265	2265
query76	2298	1144	774	774
query77	406	406	341	341
query78	12396	12414	11859	11859
query79	1489	997	779	779
query80	647	546	452	452
query81	453	287	245	245
query82	1387	164	120	120
query83	389	291	252	252
query84	309	143	111	111
query85	875	533	452	452
query86	400	324	337	324
query87	3413	3370	3255	3255
query88	3627	2748	2742	2742
query89	455	386	345	345
query90	1953	171	184	171
query91	180	165	163	163
query92	84	81	75	75
query93	1511	1466	935	935
query94	528	364	306	306
query95	700	383	444	383
query96	1131	848	340	340
query97	2732	2751	2585	2585
query98	243	225	229	225
query99	1180	1153	1042	1042
Total cold run time: 255221 ms
Total hot run time: 172942 ms

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 66.67% (20/30) 🎉
Increment coverage report
Complete coverage report


@Override
public boolean isDeterministic() {
    return volatility == FunctionVolatility.IMMUTABLE;
}
Contributor

Move this to Udf?
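One way to read this suggestion is to hoist the check into a shared contract so each aggregate or table wrapper stops repeating it. The sketch below uses a default interface method for that. `UdfLike`, `UdfVolatility`, `AggUdfSketch`, and `TableUdfSketch` are hypothetical names, and whether the real common layer is an interface is an assumption.

```java
// Hypothetical stand-in for the volatility enum; only two values are shown.
enum UdfVolatility { IMMUTABLE, VOLATILE }

// Hypothetical shared contract for all user-defined function wrappers.
interface UdfLike {
    UdfVolatility getVolatility();

    // Defined once here instead of being overridden in every wrapper class.
    default boolean isDeterministic() {
        return getVolatility() == UdfVolatility.IMMUTABLE;
    }
}

final class AggUdfSketch implements UdfLike {
    private final UdfVolatility volatility;

    AggUdfSketch(UdfVolatility volatility) {
        this.volatility = volatility;
    }

    @Override
    public UdfVolatility getVolatility() {
        return volatility;
    }
}

final class TableUdfSketch implements UdfLike {
    private final UdfVolatility volatility;

    TableUdfSketch(UdfVolatility volatility) {
        this.volatility = volatility;
    }

    @Override
    public UdfVolatility getVolatility() {
        return volatility;
    }
}

class UdfLikeDemo {
    public static void main(String[] args) {
        System.out.println(new AggUdfSketch(UdfVolatility.IMMUTABLE).isDeterministic());
        System.out.println(new TableUdfSketch(UdfVolatility.VOLATILE).isDeterministic());
    }
}
```

With this shape, each wrapper only stores and exposes its volatility, and the determinism rule has a single definition.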

@linrrzqqq
Collaborator Author

/review

@linrrzqqq
Collaborator Author

run buildall

@linrrzqqq linrrzqqq force-pushed the volatile-udaf-udtf branch from d1c79a3 to 6114612 on May 27, 2026 02:54
@linrrzqqq
Collaborator Author

run buildall

@linrrzqqq
Collaborator Author

/review

Contributor

@github-actions github-actions Bot left a comment

Review result: no blocking findings found in the PR diff.

Critical checkpoint conclusions:

  • Goal and coverage: The PR extends volatility metadata handling from scalar UDFs to UDAF/UDTF creation, Nereids function wrappers, SHOW FUNCTIONS, and SHOW CREATE FUNCTION. The changed paths appear to accomplish that goal, with unit coverage added for parser/default behavior, SQL rendering, SHOW FUNCTIONS properties, and Nereids wrapper propagation.
  • Scope/focus: The implementation is small and focused. The existing review thread about moving deterministic logic to the common UDF layer is already addressed by the final patch, so I did not duplicate it.
  • Concurrency/lifecycle: No new shared mutable state, threads, locks, static initialization dependencies, or non-obvious lifecycle management were introduced.
  • Configuration/compatibility: No new config items or storage-format changes. Function volatility is already represented in function metadata; this PR primarily allows and propagates it for aggregate/table UDF forms.
  • Parallel code paths: Java/Python and UDAF/UDTF paths were updated consistently for create, catalog conversion, and display.
  • Error handling: Existing analysis exception patterns are preserved; invalid volatility still fails through FunctionVolatility parsing.
  • Tests/results: Added FE unit coverage is relevant. I attempted to run mvn -pl fe-core -am test -Dtest=CreateFunctionTest,FunctionToSqlConverterTest,UdfVolatilityTest,ShowFunctionsCommandTest -DfailIfNoTests=false -Dskip.doc=true, but this runner is missing thirdparty/installed/bin/thrift, so Maven failed in fe-thrift before reaching fe-core tests.
  • Observability/performance/data correctness: No runtime data read/write, transaction, memory tracking, or observability-sensitive path is changed; no performance concern found.

User focus: No additional user-provided review focus was specified.

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 63.64% (21/33) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

TPC-H: Total hot run time: 31327 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 611461279f7e9a917ca57f5b567dd9b364d19fe8, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17838	4075	3959	3959
q2	
q3	10748	1356	820	820
q4	4690	471	357	357
q5	7597	2268	2113	2113
q6	241	181	142	142
q7	974	773	662	662
q8	9438	1770	1616	1616
q9	5148	4953	4935	4935
q10	6391	2200	1886	1886
q11	431	273	246	246
q12	630	430	300	300
q13	18112	3360	2750	2750
q14	267	255	240	240
q15	
q16	827	769	710	710
q17	1007	962	966	962
q18	6985	5693	5502	5502
q19	1292	1290	1067	1067
q20	525	426	280	280
q21	6007	2541	2464	2464
q22	440	363	316	316
Total cold run time: 99588 ms
Total hot run time: 31327 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4345	4297	4304	4297
q2	
q3	4535	4944	4383	4383
q4	2102	2241	1400	1400
q5	4412	4287	4317	4287
q6	229	173	129	129
q7	2445	1945	1669	1669
q8	2536	2215	2197	2197
q9	8167	7866	8042	7866
q10	4768	4742	4289	4289
q11	595	410	385	385
q12	761	798	550	550
q13	3347	3547	2999	2999
q14	290	317	287	287
q15	
q16	711	748	641	641
q17	1347	1468	1315	1315
q18	7763	7238	7201	7201
q19	1146	1129	1123	1123
q20	2218	2224	1941	1941
q21	5237	4563	4478	4478
q22	541	454	406	406
Total cold run time: 57495 ms
Total hot run time: 51843 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 171602 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 611461279f7e9a917ca57f5b567dd9b364d19fe8, data reload: false

query5	4304	653	529	529
query6	346	224	205	205
query7	4249	548	334	334
query8	331	240	224	224
query9	8792	4052	4068	4052
query10	452	357	302	302
query11	5775	2585	2233	2233
query12	178	128	124	124
query13	1314	590	448	448
query14	6147	5456	5135	5135
query14_1	4449	4431	4433	4431
query15	223	206	182	182
query16	1032	448	426	426
query17	1153	714	580	580
query18	2707	484	358	358
query19	225	208	164	164
query20	139	130	129	129
query21	213	138	119	119
query22	13701	13599	13322	13322
query23	17107	16376	16191	16191
query23_1	16281	16309	16312	16309
query24	7560	1745	1309	1309
query24_1	1301	1327	1321	1321
query25	594	514	443	443
query26	1324	331	179	179
query27	2679	555	362	362
query28	4382	1962	2015	1962
query29	1020	663	530	530
query30	304	242	203	203
query31	1136	1077	969	969
query32	95	80	76	76
query33	557	352	311	311
query34	1193	1149	664	664
query35	785	792	713	713
query36	1390	1390	1210	1210
query37	154	108	95	95
query38	3258	3155	3080	3080
query39	940	923	896	896
query39_1	892	877	874	874
query40	234	152	129	129
query41	73	69	66	66
query42	115	113	120	113
query43	327	336	284	284
query44	
query45	220	207	210	207
query46	1128	1233	741	741
query47	2346	2352	2219	2219
query48	405	447	304	304
query49	653	513	403	403
query50	954	358	255	255
query51	4427	4294	4205	4205
query52	107	110	97	97
query53	264	285	207	207
query54	333	288	268	268
query55	97	92	91	91
query56	333	320	337	320
query57	1442	1423	1347	1347
query58	308	286	279	279
query59	1594	1650	1442	1442
query60	330	341	328	328
query61	225	157	153	153
query62	697	649	591	591
query63	239	199	202	199
query64	2362	806	634	634
query65	
query66	1671	479	352	352
query67	29113	29610	29523	29523
query68	
query69	480	342	306	306
query70	1019	1014	967	967
query71	296	271	268	268
query72	3032	2712	2464	2464
query73	807	775	444	444
query74	5083	4961	4777	4777
query75	2708	2604	2272	2272
query76	2303	1153	783	783
query77	402	420	338	338
query78	12340	12533	11977	11977
query79	1445	1039	776	776
query80	647	546	462	462
query81	455	279	239	239
query82	1369	152	124	124
query83	369	283	249	249
query84	266	142	112	112
query85	916	558	465	465
query86	405	347	316	316
query87	3480	3350	3248	3248
query88	3603	2732	2697	2697
query89	446	390	355	355
query90	1894	181	182	181
query91	182	197	137	137
query92	80	80	75	75
query93	1409	1479	997	997
query94	524	356	305	305
query95	683	474	344	344
query96	1035	762	352	352
query97	2758	2713	2616	2616
query98	237	231	232	231
query99	1188	1151	1023	1023
Total cold run time: 253449 ms
Total hot run time: 171602 ms

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 19.42% (27/139) 🎉
Increment coverage report
Complete coverage report

Contributor

@yujun777 yujun777 left a comment

LGTM

@github-actions
Contributor

PR approved by anyone and no changes requested.

Contributor

Should Udf also extend VolatileExpression?

@@ -50,6 +51,7 @@ public class JavaUdaf extends AggregateFunction implements ExplicitlyCastableSig
    private final FunctionSignature signature;
    private final DataType intermediateType;
    private final NullableMode nullableMode;
    private final FunctionVolatility volatility;
Contributor

Need to add a VolatileIdentity field for the FunctionVolatility.VOLATILE case.

@@ -49,6 +50,7 @@ public class JavaUdtf extends TableGeneratingFunction implements ExplicitlyCasta
    private final Function.BinaryType binaryType;
    private final FunctionSignature signature;
    private final NullableMode nullableMode;
    private final FunctionVolatility volatility;
Contributor

Need to add a VolatileIdentity field for the FunctionVolatility.VOLATILE case.

@@ -50,6 +51,7 @@ public class PythonUdaf extends AggregateFunction implements ExplicitlyCastableS
    private final FunctionSignature signature;
    private final DataType intermediateType;
    private final NullableMode nullableMode;
    private final FunctionVolatility volatility;
Contributor

Need to add a VolatileIdentity field for the FunctionVolatility.VOLATILE case.
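One reading of this request: if equality between function-call expressions is structural, two calls to a VOLATILE function with the same arguments would compare equal and could be merged, even though each call may return a different value. A per-instance identity avoids that. The sketch below is a guess at the idea, not the `VolatileIdentity` mechanism itself, whose shape is not shown in this thread. `CallSketch`, `CallVolatility`, and the counter are hypothetical.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the volatility enum; only two values are shown.
enum CallVolatility { IMMUTABLE, VOLATILE }

// Hypothetical function-call expression with structural equality.
final class CallSketch {
    private static final AtomicLong NEXT_ID = new AtomicLong();

    final String functionName;
    final String argument;
    final CallVolatility volatility;
    // 0 for non-volatile calls; unique per instance for volatile calls.
    final long identity;

    CallSketch(String functionName, String argument, CallVolatility volatility) {
        this.functionName = functionName;
        this.argument = argument;
        this.volatility = volatility;
        this.identity = volatility == CallVolatility.VOLATILE ? NEXT_ID.incrementAndGet() : 0L;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof CallSketch)) {
            return false;
        }
        CallSketch other = (CallSketch) o;
        // Including the identity keeps distinct volatile calls from comparing equal.
        return identity == other.identity
                && volatility == other.volatility
                && functionName.equals(other.functionName)
                && argument.equals(other.argument);
    }

    @Override
    public int hashCode() {
        return Objects.hash(functionName, argument, volatility, identity);
    }
}

class CallSketchDemo {
    public static void main(String[] args) {
        CallSketch a = new CallSketch("f", "x", CallVolatility.VOLATILE);
        CallSketch b = new CallSketch("f", "x", CallVolatility.VOLATILE);
        System.out.println(a.equals(b));
    }
}
```

Immutable calls with the same name and argument still compare equal in this sketch, so only volatile calls are kept apart.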


3 participants