branch-4.1: [improve](partition) Increase partition limit defaults to 20000 and add near-limit metrics #61511 by dataroaring · Pull Request #61765 · apache/doris

dataroaring · 2026-03-26T08:03:14Z

Summary

Cherry-pick of #61511 to branch-4.1.

Raise max_dynamic_partition_num default from 500 to 20000 and max_auto_partition_num from 2000 to 20000 to match modern production workloads
Add warning logs when partition counts exceed 80% of their configured limits, enabling proactive detection before hard failures
Add Prometheus counter metrics (auto_partition_near_limit_count, dynamic_partition_near_limit_count) for monitoring/alerting

Conflict Resolution

Config.java: Trivial context conflict in max_auto_partition_num description formatting — resolved by taking the incoming change (20000 default + updated English description).

Test plan

Verify existing dynamic partition tests pass with new default
Verify auto-partition limit check still errors correctly when exceeded
Verify warning logs appear when partition count is between 80%-100% of limit
Verify new metrics appear in /metrics Prometheus endpoint

…dd near-limit metrics (#61511)

- Raise `max_dynamic_partition_num` default from 500 to 20000 and `max_auto_partition_num` from 2000 to 20000 to match modern production workloads
- Add warning logs when partition counts exceed 80% of their configured limits, enabling proactive detection before hard failures
- Add Prometheus counter metrics (`auto_partition_near_limit_count`, `dynamic_partition_near_limit_count`) for monitoring/alerting

- [ ] Verify existing dynamic partition tests pass with new default (tests explicitly set config values, so unaffected)
- [ ] Verify auto-partition limit check still errors correctly when exceeded
- [ ] Verify warning logs appear when partition count is between 80%-100% of limit
- [ ] Verify new metrics appear in `/metrics` Prometheus endpoint
- [ ] Test Prometheus alert rule: `rate(doris_fe_auto_partition_near_limit_count[5m]) > 0`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: dataroaring <dataroaring@users.noreply.github.com>

hello-stephen · 2026-03-26T08:03:44Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

Copilot

Pull request overview

This PR cherry-picks improvements to partition limit handling on branch-4.1 by raising default partition limits and adding early-warning signals (logs + Prometheus counters) when partition counts approach configured caps.

Changes:

Increased default max_dynamic_partition_num and max_auto_partition_num to 20000.
Added warning logs when partition counts exceed 80% of the configured limit.
Added Prometheus counter metrics for near-limit events for both auto and dynamic partitions.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File	Description
fe/fe-core/src/main/java/org/apache/doris/service/FrontendServiceImpl.java	Adds near-limit warning log + metric increment for auto-partition count checks.
fe/fe-core/src/main/java/org/apache/doris/metric/MetricRepo.java	Registers two new Prometheus counter metrics for partition near-limit warnings.
fe/fe-core/src/main/java/org/apache/doris/common/util/DynamicPartitionUtil.java	Adds near-limit warning log + metric increment for dynamic partition count checks.
fe/fe-common/src/main/java/org/apache/doris/common/Config.java	Raises default partition limit config values to 20000 (and updates English description for auto partitions).

Copilot · 2026-03-26T08:10:13Z

fe/fe-common/src/main/java/org/apache/doris/common/Config.java

@@ -2966,9 +2966,8 @@ public class Config extends ConfigBase {
    @ConfField(mutable = true, masterOnly = true, description = {
            "对于自动分区表，防止用户意外创建大量分区，每个 OLAP 表允许的分区数量为`max_auto_partition_num`。默认 2000。",


The Chinese description still says the default is 2000 (“默认 2000。”) but the actual default was changed to 20000. Update the Chinese string to match the new default to avoid misleading operators.

Suggested change

"对于自动分区表，防止用户意外创建大量分区，每个 OLAP 表允许的分区数量为`max_auto_partition_num`。默认 2000。",

"对于自动分区表，防止用户意外创建大量分区，每个 OLAP 表允许的分区数量为`max_auto_partition_num`。默认 20000。",

Copilot · 2026-03-26T08:10:14Z

fe/fe-core/src/main/java/org/apache/doris/service/FrontendServiceImpl.java

            result.setStatus(errorStatus);
            LOG.warn("send create partition error status: {}", result);
            return result;
+        } else if (partitionNum > autoPartitionLimit * 8 / 10) {


autoPartitionLimit * 8 / 10 is computed using int arithmetic and can overflow if max_auto_partition_num is configured to a large value (this is a mutable config). Use long arithmetic (cast before multiply) or compare via partitionNum * 10 > autoPartitionLimit * 8 using long to avoid overflow/truncation pitfalls.

Suggested change

} else if (partitionNum > autoPartitionLimit * 8 / 10) {

} else if ((long) partitionNum * 10 > (long) autoPartitionLimit * 8) {
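To make the overflow concern concrete, here is a minimal standalone sketch comparing the two forms of the check. The class name, method names, and the 300,000,000 limit are hypothetical and chosen only for illustration; they are not part of the Doris code. Any limit above `Integer.MAX_VALUE / 8` makes `limit * 8` wrap in `int` arithmetic.

```java
// Hypothetical demo class; not part of the Doris code base.
final class NearLimitOverflowDemo {
    // Mirrors the int expression from the diff: limit * 8 can wrap past Integer.MAX_VALUE.
    static boolean nearLimitInt(int partitionNum, int limit) {
        return partitionNum > limit * 8 / 10;
    }

    // Mirrors the suggested form: widen both sides to long before multiplying.
    static boolean nearLimitLong(int partitionNum, int limit) {
        return (long) partitionNum * 10 > (long) limit * 8;
    }

    public static void main(String[] args) {
        int hugeLimit = 300_000_000; // illustrative value above Integer.MAX_VALUE / 8
        // With one partition the table is nowhere near the limit, yet the int
        // form reports "near limit" because limit * 8 wrapped to a negative value.
        System.out.println("int form:  " + nearLimitInt(1, hugeLimit));
        System.out.println("long form: " + nearLimitLong(1, hugeLimit));
        // With the default limit of 20000 both forms agree.
        System.out.println("default:   " + nearLimitInt(16001, 20000)
                + " / " + nearLimitLong(16001, 20000));
    }
}
```

For limits that do not overflow, the two forms give the same answer for every integer partition count, so the `long` version changes behavior only in the overflow case.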

Copilot · 2026-03-26T08:10:14Z

fe/fe-core/src/main/java/org/apache/doris/service/FrontendServiceImpl.java

+        } else if (partitionNum > autoPartitionLimit * 8 / 10) {
+            LOG.warn("Table {}.{} auto partition count {} is approaching limit {} (>80%)."
+                        + " Consider increasing max_auto_partition_num.",
+                    db.getFullName(), olapTable.getName(), partitionNum, autoPartitionLimit);
+            if (MetricRepo.isInit) {
+                MetricRepo.COUNTER_AUTO_PARTITION_NEAR_LIMIT.increase(1L);
+            }


This warning (and counter increment) will trigger on every createPartition call once the table is above the 80% threshold, which can produce noisy logs and rapidly increasing counters in busy clusters. Consider adding throttling/deduping (e.g., log at most once per table per time window, or only when crossing the threshold) and similarly gate the metric increment to threshold-crossing events rather than per-request.

Suggested change

        } else if (partitionNum > autoPartitionLimit * 8 / 10) {
            LOG.warn("Table {}.{} auto partition count {} is approaching limit {} (>80%)."
                        + " Consider increasing max_auto_partition_num.",
                    db.getFullName(), olapTable.getName(), partitionNum, autoPartitionLimit);
            if (MetricRepo.isInit) {
                MetricRepo.COUNTER_AUTO_PARTITION_NEAR_LIMIT.increase(1L);
            }

        } else {
            // Only emit the warning and increment the metric when crossing the 80% threshold.
            // Estimate the partition count before this request by subtracting the number of
            // partitions requested to be added. This avoids noisy logs/metrics when the table
            // is already above the threshold.
            int prevPartitionNumEstimate = partitionNum - addPartitionClauseMap.size();
            if (prevPartitionNumEstimate < 0) {
                prevPartitionNumEstimate = 0;
            }
            int threshold80 = autoPartitionLimit * 8 / 10;
            if (partitionNum > threshold80 && prevPartitionNumEstimate <= threshold80) {
                LOG.warn("Table {}.{} auto partition count {} is approaching limit {} (>80%)."
                            + " Consider increasing max_auto_partition_num.",
                        db.getFullName(), olapTable.getName(), partitionNum, autoPartitionLimit);
                if (MetricRepo.isInit) {
                    MetricRepo.COUNTER_AUTO_PARTITION_NEAR_LIMIT.increase(1L);
                }
            }
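The threshold-crossing idea in the suggestion above can be isolated into a small pure function, which makes it easy to unit test. This is only a sketch: the class and method names are hypothetical, and it computes the threshold in `long` arithmetic, which the quoted suggestion does not do.

```java
// Hypothetical helper; not part of the Doris code base.
final class ThresholdCrossing {
    // Returns true only when this request moves the count from at-or-below
    // the 80% threshold to above it, so a table that is already above the
    // threshold does not trigger again on every request.
    static boolean crossedNearLimit(int previousCount, int currentCount, int limit) {
        long threshold = (long) limit * 8 / 10; // long math avoids int overflow
        return currentCount > threshold && previousCount <= threshold;
    }
}
```

With a limit of 20000 the threshold is 16000, so a request that takes a table from 15990 to 16001 partitions reports a crossing, while a later request from 16001 to 16010 does not.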

Copilot · 2026-03-26T08:10:15Z

fe/fe-core/src/main/java/org/apache/doris/common/util/DynamicPartitionUtil.java

+                LOG.warn("Dynamic partition count {} is approaching limit {} (>80%)."
+                        + " Consider increasing max_dynamic_partition_num.",
+                        expectCreatePartitionNum, dynamicPartitionLimit);
+                if (MetricRepo.isInit) {
+                    MetricRepo.COUNTER_DYNAMIC_PARTITION_NEAR_LIMIT.increase(1L);
+                }


Similar to the auto-partition path, this will warn and increment the counter on every analysis call above the 80% threshold, which can be very frequent (DDL validations and retries). Consider throttling/deduping, or incrementing only on threshold crossing to keep logs/metrics actionable and avoid alert fatigue.
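One way to implement the throttling mentioned in these comments is a per-table time window. The sketch below is an assumption about how that could look, not code from this PR: the class name, the `tableKey` format, and the caller-supplied clock value are all hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical throttle; not part of the Doris code base.
final class NearLimitWarnThrottle {
    private final long windowMillis;
    // Timestamp of the last emitted warning, keyed by table (e.g. "db.table").
    private final ConcurrentHashMap<String, Long> lastWarnAt = new ConcurrentHashMap<>();

    NearLimitWarnThrottle(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Returns true at most once per table per window. The caller passes the
    // current time so the logic stays deterministic and testable.
    boolean shouldWarn(String tableKey, long nowMillis) {
        boolean[] allowed = {false};
        // compute() runs atomically per key, so concurrent callers for the
        // same table cannot both win within one window.
        lastWarnAt.compute(tableKey, (key, last) -> {
            if (last == null || nowMillis - last >= windowMillis) {
                allowed[0] = true;
                return nowMillis;
            }
            return last;
        });
        return allowed[0];
    }
}
```

A caller would guard both the `LOG.warn` and the counter increment with `shouldWarn(...)`. The trade-off is that the map grows with the number of tables that have ever been near their limit, so a real implementation would likely need some form of eviction.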

dataroaring · 2026-03-26T08:44:35Z

run buildall

hello-stephen · 2026-03-26T10:08:07Z

FE UT Coverage Report

Increment line coverage 40.91% (9/22) 🎉
Increment coverage report
Complete coverage report

dataroaring requested a review from yiguolei as a code owner March 26, 2026 08:03

Copilot AI review requested due to automatic review settings March 26, 2026 08:03

Copilot started reviewing on behalf of dataroaring March 26, 2026 08:11

dataroaring wants to merge 1 commit into branch-4.1 from pick/61511-branch-4.1

3 participants
