Skip to content

[fix](be) Preserve collect aggregate limit during merge#63361

Merged
HappenLee merged 1 commit into
apache:masterfrom
mrhhsg:codex/fix-collect-merge-max-size
May 19, 2026
Merged

[fix](be) Preserve collect aggregate limit during merge#63361
HappenLee merged 1 commit into
apache:masterfrom
mrhhsg:codex/fix-collect-merge-max-size

Conversation

@mrhhsg
Copy link
Copy Markdown
Member

@mrhhsg mrhhsg commented May 18, 2026

What problem does this PR solve?

Issue Number: DORIS-25679

Related PR: N/A

Problem Summary: collect_list and collect_set aggregate states with a limit could lose their initialized max_size during merge because several merge paths overwrote the left state max_size with the right state unconditionally. If the right state was still uninitialized, later merge/add operations could ignore the requested limit.

Release note

None

Check List (For Author)

  • Test: Unit Test
    • build-support/clang-format.sh
    • build-support/check-format.sh
    • ./run-be-ut.sh --run --filter=VAggCollectTest.:AggregateFunctionCollectTest.
    • build-support/run-clang-tidy.sh --build-dir be/ut_build_ASAN (failed: local clang-tidy could not analyze due to missing stddef.h in the toolchain include path and an existing unmatched NOLINTEND diagnostic in be/src/core/types.h)
  • Behavior changed: Yes. collect_list/collect_set merge preserves an initialized max_size limit instead of overwriting it with an uninitialized right-side state.
  • Does this need documentation: No

### What problem does this PR solve?

Issue Number: DORIS-25679

Related PR: N/A

Problem Summary: collect_list and collect_set aggregate states with a limit could lose their initialized max_size during merge because several merge paths overwrote the left state max_size with the right state unconditionally. If the right state was still uninitialized, later merge/add operations could ignore the requested limit.

### Release note

None

### Check List (For Author)

- Test: Unit Test
    - build-support/clang-format.sh
    - build-support/check-format.sh
    - ./run-be-ut.sh --run --filter=VAggCollectTest.*:AggregateFunctionCollectTest.*
    - build-support/run-clang-tidy.sh --build-dir be/ut_build_ASAN (failed: local clang-tidy could not analyze due to missing stddef.h in the toolchain include path and an existing unmatched NOLINTEND diagnostic in be/src/core/types.h)
- Behavior changed: Yes. collect_list/collect_set merge preserves an initialized max_size limit instead of overwriting it with an uninitialized right-side state.
- Does this need documentation: No
@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@mrhhsg
Copy link
Copy Markdown
Member Author

mrhhsg commented May 18, 2026

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review result: no blocking issues found.

Critical checkpoint conclusions:

  • Goal and proof: The change preserves an already-initialized collect_list/collect_set limit during aggregate-state merge, and the added BE unit test exercises the affected limited merge variants.
  • Scope: The code change is small and focused; it removes only the unconditional rhs max_size overwrite and keeps surrounding merge behavior unchanged.
  • Concurrency/lifecycle: The touched aggregate state merge paths do not introduce shared state, new threads, static initialization, or lifecycle ownership changes.
  • Config/compatibility/protocol: No configuration, storage format, FE-BE protocol, or rolling-upgrade compatibility concern was introduced. Serialized state layout is unchanged.
  • Parallel paths: The affected string/list/complex limited merge paths were updated consistently; unaffected no-limit paths remain unchanged.
  • Tests: Targeted BE unit coverage was added. I did not run tests in this review environment.
  • Observability/transactions/data writes: Not applicable; this is local aggregate-state behavior with no transaction, persistence, or logging surface change.
  • Performance: No new hot-path overhead was added; the merge paths avoid additional allocations beyond existing insertion behavior.

User focus: No additional user-provided review focus was present.

@mrhhsg
Copy link
Copy Markdown
Member Author

mrhhsg commented May 18, 2026

run buildall

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-H: Total hot run time: 30894 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 729f308c3f91c0c7015ad74c1b53c59c90c261fb, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17929	4107	3905	3905
q2	q3	10761	1393	809	809
q4	4688	473	350	350
q5	7548	2235	2116	2116
q6	238	176	140	140
q7	993	776	637	637
q8	9500	1587	1425	1425
q9	5197	4947	4904	4904
q10	6373	2079	1796	1796
q11	435	271	245	245
q12	642	428	291	291
q13	18095	3381	2777	2777
q14	257	257	234	234
q15	q16	818	783	711	711
q17	1022	1005	984	984
q18	6957	5673	5515	5515
q19	1331	1249	1040	1040
q20	540	407	277	277
q21	5902	2642	2427	2427
q22	442	366	311	311
Total cold run time: 99668 ms
Total hot run time: 30894 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4384	4164	4203	4164
q2	q3	4481	4956	4328	4328
q4	2212	2206	1389	1389
q5	4429	4319	4302	4302
q6	233	176	133	133
q7	2085	2080	1770	1770
q8	2639	2227	2200	2200
q9	8073	7936	7944	7936
q10	4570	4541	4096	4096
q11	595	434	386	386
q12	746	754	522	522
q13	3416	3617	2962	2962
q14	296	311	284	284
q15	q16	708	747	638	638
q17	1397	1352	1476	1352
q18	8091	7251	7313	7251
q19	1206	1095	1090	1090
q20	2193	2206	1922	1922
q21	5345	4753	4471	4471
q22	522	459	412	412
Total cold run time: 57621 ms
Total hot run time: 51608 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-DS: Total hot run time: 168492 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 729f308c3f91c0c7015ad74c1b53c59c90c261fb, data reload: false

query5	4320	658	520	520
query6	341	213	193	193
query7	4259	572	314	314
query8	331	245	221	221
query9	8824	3983	3970	3970
query10	453	343	297	297
query11	5787	2350	2194	2194
query12	182	129	125	125
query13	1295	568	420	420
query14	5920	5343	5011	5011
query14_1	4300	4302	4306	4302
query15	208	201	181	181
query16	974	445	423	423
query17	941	698	606	606
query18	2428	483	355	355
query19	213	203	161	161
query20	135	144	137	137
query21	231	136	113	113
query22	13559	13518	13322	13322
query23	17280	16340	15993	15993
query23_1	16110	16142	16137	16137
query24	7397	1764	1284	1284
query24_1	1305	1304	1315	1304
query25	596	510	436	436
query26	1334	314	185	185
query27	2667	587	343	343
query28	4477	1926	1925	1925
query29	1055	634	517	517
query30	309	244	201	201
query31	1135	1090	951	951
query32	90	80	77	77
query33	570	354	306	306
query34	1144	1133	636	636
query35	777	784	679	679
query36	1339	1378	1164	1164
query37	150	106	95	95
query38	3234	3160	3096	3096
query39	936	931	931	931
query39_1	884	881	888	881
query40	241	151	129	129
query41	72	70	68	68
query42	111	111	111	111
query43	327	325	298	298
query44	
query45	216	203	196	196
query46	1071	1243	738	738
query47	2302	2369	2236	2236
query48	410	414	307	307
query49	649	506	406	406
query50	988	344	246	246
query51	4349	4251	4219	4219
query52	109	109	96	96
query53	267	280	207	207
query54	331	302	264	264
query55	95	94	85	85
query56	304	316	308	308
query57	1424	1426	1347	1347
query58	325	289	275	275
query59	1562	1619	1375	1375
query60	334	370	298	298
query61	161	160	161	160
query62	665	640	558	558
query63	258	204	200	200
query64	2394	799	638	638
query65	
query66	1770	471	361	361
query67	29979	29967	29158	29158
query68	
query69	458	353	305	305
query70	1025	983	981	981
query71	293	277	269	269
query72	2994	2646	2463	2463
query73	862	803	425	425
query74	5107	4913	4748	4748
query75	2684	2608	2285	2285
query76	2318	1160	759	759
query77	390	403	332	332
query78	12194	12294	11680	11680
query79	1445	1037	731	731
query80	651	545	448	448
query81	454	279	240	240
query82	1361	159	122	122
query83	353	273	250	250
query84	312	145	107	107
query85	915	554	449	449
query86	388	331	309	309
query87	3421	3377	3207	3207
query88	3521	2704	2647	2647
query89	433	387	345	345
query90	1985	181	179	179
query91	175	162	137	137
query92	79	76	75	75
query93	1459	1447	867	867
query94	548	350	303	303
query95	676	390	352	352
query96	1039	744	328	328
query97	2711	2685	2589	2589
query98	236	231	226	226
query99	1115	1115	968	968
Total cold run time: 252840 ms
Total hot run time: 168492 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (1/1) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.56% (27809/37804)
Line Coverage 57.51% (301175/523737)
Region Coverage 54.77% (251838/459844)
Branch Coverage 56.28% (108822/193350)

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (1/1) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.59% (27821/37804)
Line Coverage 57.53% (301315/523737)
Region Coverage 54.77% (251873/459844)
Branch Coverage 56.31% (108869/193350)

@mrhhsg mrhhsg marked this pull request as ready for review May 19, 2026 03:29
Copy link
Copy Markdown
Contributor

@HappenLee HappenLee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@HappenLee HappenLee merged commit 8f696e0 into apache:master May 19, 2026
35 of 36 checks passed
github-actions Bot pushed a commit that referenced this pull request May 19, 2026
Problem Summary: collect_list and collect_set aggregate states with a
limit could lose their initialized max_size during merge because several
merge paths overwrote the left state max_size with the right state
unconditionally. If the right state was still uninitialized, later
merge/add operations could ignore the requested limit.

### Release note

None

### Check List (For Author)

- Test: Unit Test
    - build-support/clang-format.sh
    - build-support/check-format.sh
- ./run-be-ut.sh --run
--filter=VAggCollectTest.*:AggregateFunctionCollectTest.*
- build-support/run-clang-tidy.sh --build-dir be/ut_build_ASAN (failed:
local clang-tidy could not analyze due to missing stddef.h in the
toolchain include path and an existing unmatched NOLINTEND diagnostic in
be/src/core/types.h)
- Behavior changed: Yes. collect_list/collect_set merge preserves an
initialized max_size limit instead of overwriting it with an
uninitialized right-side state.
- Does this need documentation: No
@mrhhsg mrhhsg deleted the codex/fix-collect-merge-max-size branch May 20, 2026 02:14
yiguolei pushed a commit that referenced this pull request May 20, 2026
…63361 (#63383)

Cherry-picked from #63361

Co-authored-by: Jerry Hu <hushenggang@selectdb.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants