Skip to content

[Fix](topn) Reject non-positive topn count argument#63350

Open
linrrzqqq wants to merge 2 commits into
apache:masterfrom
linrrzqqq:fix-topn-negative-arg
Open

[Fix](topn) Reject non-positive topn count argument#63350
linrrzqqq wants to merge 2 commits into
apache:masterfrom
linrrzqqq:fix-topn-negative-arg

Conversation

@linrrzqqq
Copy link
Copy Markdown
Collaborator

@linrrzqqq linrrzqqq commented May 18, 2026

Problem Summary:

According to Doris doc: topn count must be a positive integer, but before this, negative numbers and 0 were not handled specially, returning {}, leading to unexpected results.

before

Doris> SELECT TOPN(page_id, 0) as top_zero_pages FROM page_visits;
+----------------+
| top_zero_pages |
+----------------+
| {}             |
+----------------+

Doris> SELECT TOPN(page_id, -1) as top_neg_one_pages FROM page_visits;
+-------------------+
| top_neg_one_pages |
+-------------------+
| {}                |
+-------------------+

now:

Doris> SELECT TOPN(page_id, 0) as top_pages FROM page_visits;
ERROR 1105 (HY000): errCode = 2, detailMessage = topn requires second parameter must be a constant positive integer: topn(cast(page_id as VARCHAR(65533)), 0)

Doris> SELECT TOPN(page_id, -1) as top_pages FROM page_visits;
ERROR 1105 (HY000): errCode = 2, detailMessage = topn requires second parameter must be a constant positive integer: topn(cast(page_id as VARCHAR(65533)), -1)

@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

morrySnow
morrySnow previously approved these changes May 18, 2026
@morrySnow
Copy link
Copy Markdown
Contributor

/review

@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

run buildall

@github-actions github-actions Bot added the approved Indicates a PR has been approved by one committer. label May 18, 2026
@github-actions
Copy link
Copy Markdown
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Copy Markdown
Contributor

PR approved by anyone and no changes requested.

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found one blocking issue. The new validation is in the right area for Nereids and the added negative/zero coverage is useful, but it runs after signature coercion to IntegerType. That makes an existing positive out-of-range literal test fail as a non-positive value after Java int narrowing.

Critical checkpoint conclusions:

  • Goal/test: partially accomplished for negative and zero Nereids topn counts, but existing topn_weighted type-coercion coverage is not updated and appears to regress.
  • Scope/focus: the change is small and focused.
  • Concurrency/lifecycle/config/compatibility/transaction/write paths: not involved.
  • Parallel paths: no active original-planner path found; Nereids path is the relevant path.
  • Conditional checks: the new condition needs to account for post-coercion narrowing/out-of-range literals.
  • Test coverage/results: added tests cover -1 and 0, but an existing regression test with 2147483648 is now impacted and no expected result/error update is included.
  • Observability/performance: not applicable.

User focus: no additional user-provided focus points were supplied.

@hello-stephen
Copy link
Copy Markdown
Contributor

FE UT Coverage Report

Increment line coverage 13.33% (2/15) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-H: Total hot run time: 31554 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 134b05fcf0121a6eb4fbe45e264e749069154431, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17677	3956	3943	3943
q2	q3	10794	1383	816	816
q4	4683	477	362	362
q5	7642	2238	2072	2072
q6	235	181	145	145
q7	977	777	633	633
q8	9447	1687	1585	1585
q9	5111	4981	4984	4981
q10	6382	2063	1787	1787
q11	429	279	251	251
q12	629	424	302	302
q13	18196	3337	2783	2783
q14	261	256	235	235
q15	q16	839	775	701	701
q17	996	984	960	960
q18	6931	5670	5563	5563
q19	1367	1259	1073	1073
q20	540	543	305	305
q21	6308	2841	2751	2751
q22	465	380	306	306
Total cold run time: 99909 ms
Total hot run time: 31554 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4925	4691	4634	4634
q2	q3	4916	5269	4649	4649
q4	2170	2292	1434	1434
q5	5035	4914	4657	4657
q6	232	181	134	134
q7	1848	1801	1547	1547
q8	2413	2186	2051	2051
q9	7716	7599	7282	7282
q10	4487	4439	4059	4059
q11	540	415	384	384
q12	699	723	526	526
q13	3031	3380	2833	2833
q14	264	277	255	255
q15	q16	681	700	608	608
q17	1290	1251	1256	1251
q18	7324	6940	6999	6940
q19	1145	1116	1101	1101
q20	2243	2219	1935	1935
q21	5415	4824	4661	4661
q22	535	476	411	411
Total cold run time: 56909 ms
Total hot run time: 51352 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-DS: Total hot run time: 169217 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 134b05fcf0121a6eb4fbe45e264e749069154431, data reload: false

query5	4335	654	510	510
query6	323	218	199	199
query7	4246	569	322	322
query8	333	236	230	230
query9	8837	4092	4096	4092
query10	452	344	298	298
query11	5808	2449	2182	2182
query12	195	126	127	126
query13	1285	601	430	430
query14	6049	5382	5043	5043
query14_1	4382	4348	4337	4337
query15	210	202	183	183
query16	993	469	381	381
query17	1035	708	597	597
query18	2442	486	347	347
query19	220	199	168	168
query20	135	138	143	138
query21	233	148	116	116
query22	13643	13560	13370	13370
query23	17254	16376	16039	16039
query23_1	16103	16267	16185	16185
query24	7378	1768	1290	1290
query24_1	1307	1311	1320	1311
query25	582	506	457	457
query26	1309	313	173	173
query27	2705	581	342	342
query28	4422	1942	1941	1941
query29	1001	656	514	514
query30	306	236	200	200
query31	1127	1078	947	947
query32	90	77	77	77
query33	544	361	302	302
query34	1184	1128	657	657
query35	777	804	681	681
query36	1366	1347	1138	1138
query37	154	106	93	93
query38	3227	3190	3089	3089
query39	974	944	903	903
query39_1	893	889	878	878
query40	233	154	131	131
query41	75	69	67	67
query42	111	116	113	113
query43	331	341	297	297
query44	
query45	216	204	197	197
query46	1074	1177	768	768
query47	2370	2416	2220	2220
query48	411	457	306	306
query49	643	514	410	410
query50	1047	369	249	249
query51	4358	4288	4237	4237
query52	109	108	98	98
query53	267	294	210	210
query54	346	281	274	274
query55	99	99	87	87
query56	310	342	316	316
query57	1446	1408	1333	1333
query58	310	284	273	273
query59	1545	1631	1408	1408
query60	335	343	313	313
query61	211	152	154	152
query62	684	612	565	565
query63	238	208	208	208
query64	2398	811	642	642
query65	
query66	1703	479	354	354
query67	30005	29937	29868	29868
query68	
query69	470	339	305	305
query70	1030	1064	964	964
query71	314	280	267	267
query72	2958	2685	2417	2417
query73	842	766	427	427
query74	5079	4902	4708	4708
query75	2671	2610	2259	2259
query76	2272	1135	767	767
query77	387	418	334	334
query78	12230	12252	11666	11666
query79	1524	1011	744	744
query80	774	543	443	443
query81	471	282	238	238
query82	1382	158	131	131
query83	353	275	251	251
query84	294	144	111	111
query85	895	520	486	486
query86	446	327	298	298
query87	3467	3367	3191	3191
query88	3522	2630	2597	2597
query89	458	383	338	338
query90	1813	178	179	178
query91	178	164	141	141
query92	77	79	77	77
query93	1468	1474	872	872
query94	619	364	321	321
query95	676	373	355	355
query96	1005	757	332	332
query97	2703	2693	2560	2560
query98	268	233	226	226
query99	1133	1111	1016	1016
Total cold run time: 253169 ms
Total hot run time: 169217 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 52.17% (12/23) 🎉
Increment coverage report
Complete coverage report

@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

/review

@linrrzqqq
Copy link
Copy Markdown
Collaborator Author

run buildall

@github-actions github-actions Bot removed the approved Indicates a PR has been approved by one committer. label May 18, 2026
@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-H: Total hot run time: 31381 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c8189b285f9072033f1ce9ecf422aa13723ca304, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17639	3966	3940	3940
q2	q3	10748	1372	822	822
q4	4687	476	344	344
q5	7572	2292	2111	2111
q6	247	180	140	140
q7	927	792	627	627
q8	9431	1821	1575	1575
q9	5213	4951	4947	4947
q10	6398	2085	1788	1788
q11	435	280	249	249
q12	635	414	300	300
q13	18116	3380	2733	2733
q14	272	263	248	248
q15	q16	823	773	713	713
q17	999	973	1001	973
q18	6896	5797	5482	5482
q19	1302	1222	1105	1105
q20	530	555	310	310
q21	6456	2943	2660	2660
q22	465	519	314	314
Total cold run time: 99791 ms
Total hot run time: 31381 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	5098	4706	4677	4677
q2	q3	4855	5247	4661	4661
q4	2112	2207	1613	1613
q5	5027	4679	4627	4627
q6	231	174	134	134
q7	1911	1722	1557	1557
q8	2457	2065	2166	2065
q9	7661	7250	7244	7244
q10	4446	4403	3978	3978
q11	540	383	351	351
q12	713	718	516	516
q13	3037	3424	2812	2812
q14	276	289	250	250
q15	q16	684	701	627	627
q17	1301	1253	1261	1253
q18	7348	6797	6789	6789
q19	1109	1112	1084	1084
q20	2218	2226	1961	1961
q21	5331	4742	4508	4508
q22	522	470	410	410
Total cold run time: 56877 ms
Total hot run time: 51117 ms

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found blocking issues in the actual PR diff. The existing review thread already covers the TopNWeighted overflow/regression-test mismatch, so I did not duplicate it. The same post-coercion overflow problem exists in the newly added TopN and TopNArray checks as distinct instances.

Critical checkpoint conclusions:

  • Goal and tests: rejecting non-positive TopN counts is reasonable, but the implementation does not correctly validate out-of-range positive literals before IntegerType coercion; existing coverage only adds -1/0 cases.
  • Scope/focus: the actual PR diff is small and focused on TopN count validation.
  • Concurrency/lifecycle/configuration/persistence/storage format: not applicable.
  • Compatibility/parallel paths: the same validation pattern was added to all three TopN variants; TopNWeighted is already reported, and TopN/TopNArray need the same fix.
  • Test coverage: missing positive out-of-range count cases for topn and topn_array; the already-known topn_weighted existing regression also needs to be updated or preserved correctly.
  • Observability: not applicable.

User focus: no additional user-provided focus points were supplied.

}
if (!(topNCount instanceof Literal) || ((Literal) topNCount).getDouble() <= 0) {
throw new AnalysisException(
"topn requires second parameter must be a constant positive integer: "
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has the same post-coercion overflow problem as the already reported TopNWeighted case, but for topn. The second argument is coerced to IntegerType before checkLegalityAfterRewrite, and literal coercion uses Number.intValue(), so topn(col, 2147483648) reaches this check as IntegerLiteral(-2147483648) and is reported as a non-positive value instead of being handled as a positive out-of-range count. Please validate the original literal range before narrowing, or explicitly reject out-of-range counts with coverage for topn.

}
if (!(topNCount instanceof Literal) || ((Literal) topNCount).getDouble() <= 0) {
throw new AnalysisException(
"topn_array requires second parameter must be a constant positive integer: "
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a distinct instance of the same overflow bug in topn_array. Because the count argument is coerced to IntegerType before this check, a positive literal such as 2147483648 is narrowed to a negative IntegerLiteral and then rejected as non-positive. Please validate/reject the original literal range before the narrowing cast, and add regression coverage for topn_array positive out-of-range counts.

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-DS: Total hot run time: 170689 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c8189b285f9072033f1ce9ecf422aa13723ca304, data reload: false

query5	4368	663	527	527
query6	351	219	206	206
query7	4261	570	310	310
query8	325	239	229	229
query9	8817	4035	3995	3995
query10	439	338	298	298
query11	5819	2390	2167	2167
query12	183	129	126	126
query13	1292	593	428	428
query14	6078	5366	5064	5064
query14_1	4364	4388	4333	4333
query15	213	206	194	194
query16	1023	463	449	449
query17	1143	734	608	608
query18	2709	496	375	375
query19	231	206	179	179
query20	141	136	134	134
query21	223	142	118	118
query22	13617	13659	13407	13407
query23	16963	16356	16107	16107
query23_1	16211	16239	16194	16194
query24	7453	1754	1299	1299
query24_1	1328	1317	1291	1291
query25	578	504	441	441
query26	1324	331	179	179
query27	2673	575	335	335
query28	4401	1956	1924	1924
query29	1050	613	502	502
query30	305	238	203	203
query31	1138	1053	932	932
query32	94	75	71	71
query33	544	372	281	281
query34	1149	1134	663	663
query35	754	809	673	673
query36	1323	1354	1147	1147
query37	154	106	92	92
query38	3227	3146	3038	3038
query39	932	932	906	906
query39_1	885	866	889	866
query40	235	144	126	126
query41	65	62	72	62
query42	114	112	112	112
query43	318	327	297	297
query44	
query45	211	195	197	195
query46	1073	1210	721	721
query47	2299	2411	2193	2193
query48	359	411	298	298
query49	640	501	397	397
query50	1018	357	250	250
query51	4333	4470	4376	4376
query52	109	109	94	94
query53	265	284	203	203
query54	309	273	266	266
query55	95	91	85	85
query56	313	298	310	298
query57	1408	1380	1320	1320
query58	304	275	274	274
query59	1555	1601	1388	1388
query60	334	326	312	312
query61	166	152	154	152
query62	663	616	570	570
query63	243	206	205	205
query64	2416	826	674	674
query65	
query66	1663	514	358	358
query67	29417	29947	29775	29775
query68	
query69	452	346	310	310
query70	1051	1014	928	928
query71	311	269	267	267
query72	3008	2700	2466	2466
query73	838	740	431	431
query74	5079	4872	4749	4749
query75	2697	2607	2238	2238
query76	2286	1149	792	792
query77	404	407	332	332
query78	12309	12106	11605	11605
query79	1480	1038	753	753
query80	672	572	460	460
query81	466	274	245	245
query82	1365	156	125	125
query83	351	272	248	248
query84	272	139	107	107
query85	873	532	463	463
query86	393	347	310	310
query87	3448	3360	3221	3221
query88	3538	2674	2645	2645
query89	441	382	338	338
query90	1994	184	184	184
query91	180	166	143	143
query92	80	78	77	77
query93	1511	1447	883	883
query94	534	350	320	320
query95	678	380	437	380
query96	1057	795	334	334
query97	2697	2702	2562	2562
query98	236	229	227	227
query99	1117	1106	1013	1013
Total cold run time: 252797 ms
Total hot run time: 170689 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

FE UT Coverage Report

Increment line coverage 9.52% (2/21) 🎉
Increment coverage report
Complete coverage report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants