50 changes: 50 additions & 0 deletions docs/cloud/features/04_automatic_scaling/01_overview.md
@@ -0,0 +1,50 @@
---
sidebar_position: 1
sidebar_label: 'Overview'
slug: /manage/scaling
description: 'Overview of automatic scaling in ClickHouse Cloud'
keywords: ['autoscaling', 'auto scaling', 'scaling', 'horizontal', 'vertical', 'bursts']
title: 'Automatic scaling'
doc_type: 'guide'
---

import ScalePlanFeatureBadge from '@theme/badges/ScalePlanFeatureBadge'

# Automatic scaling

Scaling is the ability to adjust available resources to meet client demands. Scale and Enterprise (with standard 1:4 profile) tier services can be scaled horizontally, either programmatically through an API call or by changing settings in the UI. These services can also be **autoscaled** vertically to meet application demands.

<ScalePlanFeatureBadge feature="Automatic vertical scaling"/>

:::note
Scale and Enterprise tiers support both single and multi-replica services, whereas the Basic tier supports only single replica services. Single replica services are meant to be fixed in size and don't allow vertical or horizontal scaling. You can upgrade to the Scale or Enterprise tier to scale your services.
:::

## How scaling works in ClickHouse Cloud {#how-scaling-works-in-clickhouse-cloud}

Currently, ClickHouse Cloud supports vertical autoscaling and manual horizontal scaling for Scale tier services.

For Enterprise tier services scaling works as follows:

- **Horizontal scaling**: Manual horizontal scaling will be available across all standard and custom profiles on the Enterprise tier.
- **Vertical scaling**:
  - Standard profiles (1:4) will support vertical autoscaling.
  - Custom profiles (`highMemory` and `highCPU`) don't support vertical autoscaling or manual vertical scaling. However, these services can be scaled vertically by contacting support.

:::note
Scaling in ClickHouse Cloud happens in what we call a ["Make Before Break" (MBB)](/cloud/features/mbb) approach.
This adds one or more replicas of the new size before removing the old replicas, preventing any loss of capacity during scaling operations.
By eliminating the gap between removing existing replicas and adding new ones, MBB creates a more seamless and less disruptive scaling process.
It is especially beneficial in scale-up scenarios, where high resource utilization triggers the need for additional capacity, since removing replicas prematurely would only exacerbate the resource constraints.
As part of this approach, we wait up to an hour to let any existing queries complete on the older replicas before removing them.
This balances the need for existing queries to complete, while at the same time ensuring that older replicas don't linger around for too long.
:::

## Learn more {#learn-more}

- [Vertical autoscaling](/cloud/features/autoscaling/vertical) — Automatic CPU and memory scaling based on usage
- [Horizontal scaling](/cloud/features/autoscaling/horizontal) — Manual replica scaling via API or UI
- [Make Before Break (MBB)](/cloud/features/mbb) — How ClickHouse Cloud performs seamless scaling operations
- [Automatic idling](/cloud/features/autoscaling/idling) — Cost savings through automatic service suspension
- [Scaling recommendations](/cloud/features/autoscaling/scaling-recommendations) — Understanding scaling recommendations
- [Scheduled scaling](/cloud/features/autoscaling/scaling-recommendations) — Understanding the Scheduled Scaling feature, which lets you define exactly when your service should scale up or down, independent of real-time metrics
@@ -0,0 +1,37 @@
---
sidebar_position: 2
sidebar_label: 'Vertical autoscaling'
slug: /cloud/features/autoscaling/vertical
description: 'Configuring vertical autoscaling in ClickHouse Cloud'
keywords: ['autoscaling', 'auto scaling', 'vertical', 'scaling', 'CPU', 'memory']
title: 'Vertical autoscaling'
doc_type: 'guide'
---

import Image from '@theme/IdealImage';
import auto_scaling from '@site/static/images/cloud/manage/AutoScaling.png';
import ScalePlanFeatureBadge from '@theme/badges/ScalePlanFeatureBadge'

<ScalePlanFeatureBadge feature="Automatic vertical scaling"/>

Scale and Enterprise tier services support autoscaling based on CPU and memory usage. Service usage is constantly monitored over a lookback window to make scaling decisions. If the usage rises above or falls below certain thresholds, the service is scaled appropriately to match the demand.

## Configuring vertical auto scaling {#configuring-vertical-auto-scaling}

The scaling of ClickHouse Cloud Scale or Enterprise services can be adjusted by organization members with the **Admin** role. To configure vertical autoscaling, go to the **Settings** tab for your service and adjust the minimum and maximum memory, along with CPU settings as shown below.

:::note
Single replica services can't be scaled, regardless of tier.
:::

<Image img={auto_scaling} size="lg" alt="Scaling settings page" border/>

Set the **Maximum memory** for your replicas at a higher value than the **Minimum memory**. The service will then scale as needed within those bounds. These settings are also available during the initial service creation flow. Each replica in your service will be allocated the same memory and CPU resources.

You can also choose to set these values the same, essentially "pinning" the service to a specific configuration. Doing so will immediately force scaling to the desired size you picked.

It's important to note that this will disable any auto scaling on the cluster, and your service won't be protected against increases in CPU or memory usage beyond these settings.
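As a rough illustration of how the minimum and maximum settings bound scaling, the sketch below clamps a recommended per-replica memory size to the configured range. The function name `clamp_memory_gib` and its parameters are hypothetical and not part of any ClickHouse Cloud API; the actual autoscaler may apply further constraints.

```python
def clamp_memory_gib(recommended_gib: float, min_gib: float, max_gib: float) -> float:
    """Bound a recommended per-replica memory size by the configured limits."""
    if min_gib > max_gib:
        raise ValueError("minimum memory must not exceed maximum memory")
    return min(max(recommended_gib, min_gib), max_gib)


# Autoscaling range of 16 to 64 GiB: a recommendation inside the range is kept.
print(clamp_memory_gib(40, 16, 64))  # 40
# Pinned service (minimum == maximum): every recommendation maps to the pinned size.
print(clamp_memory_gib(40, 32, 32))  # 32
```

The second call shows why pinning effectively disables autoscaling: with equal bounds, no recommendation can change the size.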

:::note
For Enterprise tier services, standard 1:4 profiles will support vertical autoscaling. Custom profiles don’t support vertical autoscaling or manual vertical scaling. However, these services can be scaled vertically by contacting support.
:::
@@ -0,0 +1,54 @@
---
sidebar_position: 3
sidebar_label: 'Horizontal scaling'
slug: /cloud/features/autoscaling/horizontal
description: 'Manual horizontal scaling in ClickHouse Cloud'
keywords: ['horizontal scaling', 'scaling', 'replicas', 'manual scaling', 'spikes', 'bursts']
title: 'Horizontal scaling'
doc_type: 'guide'
---

import Image from '@theme/IdealImage';
import scaling_patch_request from '@site/static/images/cloud/manage/scaling-patch-request.png';
import scaling_patch_response from '@site/static/images/cloud/manage/scaling-patch-response.png';
import scaling_configure from '@site/static/images/cloud/manage/scaling-configure.png';
import scaling_memory_allocation from '@site/static/images/cloud/manage/scaling-memory-allocation.png';
import ScalePlanFeatureBadge from '@theme/badges/ScalePlanFeatureBadge'

## Manual horizontal scaling {#manual-horizontal-scaling}

<ScalePlanFeatureBadge feature="Manual horizontal scaling"/>

You can scale your service by updating its scaling settings through the ClickHouse Cloud [public APIs](https://clickhouse.com/docs/cloud/manage/api/swagger#/paths/~1v1~1organizations~1:organizationId~1services~1:serviceId~1scaling/patch), or by adjusting the number of replicas from the cloud console.

**Scale** and **Enterprise** tiers also support single-replica services. Once scaled out, a service can be scaled back in to a minimum of a single replica. Note that single replica services have reduced availability and aren't recommended for production usage.

:::note
Services can scale horizontally to a maximum of 20 replicas. If you need additional replicas, please contact our support team.
:::

### Horizontal scaling via API {#horizontal-scaling-via-api}

To horizontally scale a cluster, issue a `PATCH` request via the API to adjust the number of replicas. The screenshots below show an API call to scale out a `3` replica cluster to `6` replicas, and the corresponding response.

<Image img={scaling_patch_request} size="lg" alt="Scaling PATCH request" border/>

*`PATCH` request to update `numReplicas`*

<Image img={scaling_patch_response} size="md" alt="Scaling PATCH response" border/>

*Response from `PATCH` request*

If you issue a new scaling request, or multiple requests in succession, while one is already in progress, the scaling service ignores the intermediate states and converges on the final replica count.
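For readers who prefer code to screenshots, here is a minimal sketch of the `PATCH` request using only the Python standard library. The path and the `numReplicas` field come from the API reference linked above. The base URL `https://api.clickhouse.cloud`, the use of HTTP Basic auth with an API key ID and secret, and the helper name `build_scaling_request` are assumptions to verify against the API documentation. The example builds the request without sending it.

```python
import base64
import json
import urllib.request

# Assumed base URL; confirm it against the API reference before use.
API_BASE = "https://api.clickhouse.cloud"


def build_scaling_request(org_id, service_id, num_replicas, key_id, key_secret):
    """Build (but do not send) a PATCH request that sets numReplicas."""
    url = f"{API_BASE}/v1/organizations/{org_id}/services/{service_id}/scaling"
    body = json.dumps({"numReplicas": num_replicas}).encode("utf-8")
    # Assumption: the API key ID and secret are sent via HTTP Basic auth.
    token = base64.b64encode(f"{key_id}:{key_secret}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder identifiers; substitute your own organization, service, and key.
request = build_scaling_request("my-org-id", "my-service-id", 6, "my-key-id", "my-key-secret")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```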

### Horizontal scaling via UI {#horizontal-scaling-via-ui}

To scale a service horizontally from the UI, you can adjust the number of replicas for the service on the **Settings** page.

<Image img={scaling_configure} size="md" alt="Scaling configuration settings" border/>

*Service scaling settings from the ClickHouse Cloud console*

Once the service has scaled, the metrics dashboard in the cloud console should show the correct allocation to the service. The screenshot below shows the cluster having scaled to total memory of `96 GiB`, which is `6` replicas, each with `16 GiB` memory allocation.

<Image img={scaling_memory_allocation} size="md" alt="Scaling memory allocation" border />
30 changes: 30 additions & 0 deletions docs/cloud/features/04_automatic_scaling/05_automatic_idling.md
@@ -0,0 +1,30 @@
---
sidebar_position: 5
sidebar_label: 'Automatic idling'
slug: /cloud/features/autoscaling/idling
description: 'Automatic idling and adaptive idling in ClickHouse Cloud'
keywords: ['idling', 'automatic idling', 'adaptive idling', 'cost savings', 'pause']
title: 'Automatic idling'
doc_type: 'guide'
---

## Automatic idling {#automatic-idling}
On the **Settings** page, you can also choose whether to allow automatic idling of your service when it has been inactive for a certain duration (that is, when the service isn't executing any user-submitted queries). Automatic idling reduces the cost of your service, as you're not billed for compute resources when the service is paused.

### Adaptive Idling {#adaptive-idling}
ClickHouse Cloud implements adaptive idling to prevent disruptions while optimizing cost savings. The system evaluates several conditions before transitioning a service to idle. Adaptive idling overrides the idling duration setting when any of the following conditions are met:
- When the number of parts exceeds the maximum idle parts threshold (default: 10,000), the service isn't idled so that background maintenance can continue.
- When there are ongoing merge operations, the service isn't idled until those merges complete, to avoid interrupting critical data consolidation.
- The service also adapts the idle timeout based on server initialization time:
  - If server initialization time is less than 15 minutes, no adaptive timeout is applied and the customer-configured idle timeout is used.
  - If server initialization time is between 15 and 30 minutes, the idle timeout is set to 15 minutes.
  - If server initialization time is between 30 and 60 minutes, the idle timeout is set to 30 minutes.
  - If server initialization time is more than 60 minutes, the idle timeout is set to 1 hour.
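The conditions above can be sketched as two small functions. This is an illustration only: the function names are hypothetical, and the handling of values that fall exactly on 15, 30, or 60 minutes is an assumption, since the list does not say which bracket a boundary value belongs to.

```python
def can_idle(part_count: int, merges_running: bool, max_idle_parts: int = 10_000) -> bool:
    """Return True when neither blocking condition applies."""
    return part_count <= max_idle_parts and not merges_running


def effective_idle_timeout_minutes(init_minutes: float, configured_minutes: float) -> float:
    """Pick the idle timeout from the server initialization time."""
    if init_minutes < 15:
        return configured_minutes  # no adaptive timeout applied
    if init_minutes < 30:          # assumption: exactly 30 falls in the next bracket
        return 15
    if init_minutes <= 60:         # assumption: exactly 60 stays in this bracket
        return 30
    return 60


print(can_idle(part_count=12_000, merges_running=False))  # False: too many parts
print(effective_idle_timeout_minutes(init_minutes=20, configured_minutes=5))  # 15
```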

:::note
The service may enter an idle state where it suspends refreshes of [refreshable materialized views](/materialized-view/refreshable-materialized-view), consumption from [S3Queue](/engines/table-engines/integrations/s3queue), and scheduling of new merges. Existing merge operations will complete before the service transitions to the idle state. To ensure continuous operation of refreshable materialized views and S3Queue consumption, disable the idle state functionality.
:::

:::danger When not to use automatic idling
Use automatic idling only if your use case can handle a delay before responding to queries, because when a service is paused, connections to the service will time out. Automatic idling is ideal for services that are used infrequently and where a delay can be tolerated. It isn't recommended for services that power customer-facing features that are used frequently.
:::
@@ -0,0 +1,88 @@
---
sidebar_position: 6
sidebar_label: 'Scaling recommendations'
slug: /cloud/features/autoscaling/scaling-recommendations
description: 'Understanding scaling recommendations in ClickHouse Cloud'
keywords: ['scaling recommendations', 'recommender', '2-window', 'autoscaling', 'optimization']
title: 'Scaling recommendations'
doc_type: 'guide'
---

import Image from '@theme/IdealImage';
import two_window_recommender from '@site/static/images/cloud/features/autoscaling/two-window-recommender.png';

## Introduction {#introduction}

Auto-scaling database resources requires careful balance: scaling up too slowly can risk performance degradation while scaling down too aggressively can trigger constant oscillations.

By pairing a two-window recommendation framework with a target-tracking CPU recommendation system, ClickHouse Cloud enables faster scale-downs, fewer scaling oscillations, and substantial infrastructure cost reduction for variable workloads, while maintaining the stability needed for production databases.

## CPU-based Scaling {#cpu-based-scaling}

CPU scaling is based on target tracking, which calculates the exact CPU allocation needed to keep utilization at a target level. A scaling action is triggered only if current CPU utilization falls outside a defined band:

| Parameter | Value | Meaning |
|---|---|---|
| Target utilization | 53% | The utilization level ClickHouse aims to maintain |
| High watermark | 75% | Triggers scale-up when CPU exceeds this threshold |
| Low watermark | 37.5% | Triggers scale-down when CPU falls below this threshold |

The recommender evaluates CPU utilization based on historical usage, and determines a recommended CPU size using this formula:

```text
recommended_cpu = max_cpu_usage / target_utilization
```

If CPU utilization is between 37.5% and 75% of allocated capacity, no scaling action is taken. Outside that band, the recommender computes the exact size needed to land back at 53% utilization, and the service is scaled accordingly.

### Example {#cpu-scaling-example}

A service allocated 4 vCPU experiences a spike to 3.8 vCPU usage (~95% utilization), crossing the 75% high watermark.
The recommender calculates `3.8 / 0.53 ≈ 7.2 vCPU` and rounds up to the next available size (8 vCPU). Once load subsides and usage drops below the 37.5% low watermark of the new allocation (3 vCPU on an 8 vCPU service), the recommender scales back down proportionally.
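The target-tracking rule and the example above can be sketched in a few lines. The watermarks and target come from the table. The function names and the list of available sizes are hypothetical, and treating the watermark values themselves as inside the band is an assumption.

```python
TARGET_UTILIZATION = 0.53
HIGH_WATERMARK = 0.75
LOW_WATERMARK = 0.375


def recommend_cpu(allocated_vcpu: float, max_cpu_usage: float):
    """Return the recommended vCPU count, or None when no scaling is needed."""
    utilization = max_cpu_usage / allocated_vcpu
    if LOW_WATERMARK <= utilization <= HIGH_WATERMARK:
        return None  # inside the band: no scaling action
    return max_cpu_usage / TARGET_UTILIZATION


def round_up_to_size(vcpu: float, sizes):
    """Pick the smallest size that covers the recommendation (hypothetical size list)."""
    for size in sorted(sizes):
        if size >= vcpu:
            return size
    return max(sizes)


recommended = recommend_cpu(allocated_vcpu=4, max_cpu_usage=3.8)
print(round(recommended, 1))                        # 7.2
print(round_up_to_size(recommended, [2, 4, 8, 16]))  # 8
```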

## Memory-based recommendations {#memory-based-recommendations}

ClickHouse Cloud automatically recommends memory sizes based on your service's actual usage patterns.
The recommender analyzes usage over a lookback window and adds headroom to handle spikes and prevent out-of-memory (OOM) errors.

The recommender looks at three signals:
- **Query memory**: The peak memory used during query execution
- **Resident memory**: The peak memory held by the process overall
- **OOM events**: Whether queries or replicas have recently run out of memory

### How headroom is calculated {#how-headroom-is-calculated}

For query and resident memory, the amount of headroom added depends on how predictable your usage is:

- **Stable usage (low variation)**: 1.25x multiplier — more headroom, since usage is consistent and unlikely to spike unexpectedly
- **Spiky usage (high variation)**: 1.1x multiplier — less headroom, to avoid over-provisioning for workloads that already vary widely

If OOM events are detected, the recommender applies a more aggressive **1.5x multiplier** to ensure the service has enough memory to recover.

### Final recommendation {#final-recommendation}

The system takes the highest value across all signals:

```text
desired_memory = max(
    query_memory × skew_multiplier,
    resident_memory × skew_multiplier,
    resident_memory × 1.5,   // if query OOMs detected
    rss_at_crash × 1.5       // if pod OOMs detected
)
```
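The pseudocode above translates into the following runnable sketch. The multipliers are the ones given in this section. The function signature is hypothetical, and how the recommender classifies usage as stable or spiky is not specified here, so that classification is passed in as a flag.

```python
from typing import Optional

STABLE_MULTIPLIER = 1.25  # low variation
SPIKY_MULTIPLIER = 1.1    # high variation
OOM_MULTIPLIER = 1.5


def desired_memory(
    query_memory: float,
    resident_memory: float,
    stable_usage: bool,
    query_oom_detected: bool = False,
    rss_at_crash: Optional[float] = None,
) -> float:
    """Take the highest value across all memory signals."""
    skew_multiplier = STABLE_MULTIPLIER if stable_usage else SPIKY_MULTIPLIER
    candidates = [
        query_memory * skew_multiplier,
        resident_memory * skew_multiplier,
    ]
    if query_oom_detected:
        candidates.append(resident_memory * OOM_MULTIPLIER)
    if rss_at_crash is not None:  # a pod OOM was detected
        candidates.append(rss_at_crash * OOM_MULTIPLIER)
    return max(candidates)


# 10 GiB peak query memory, 12 GiB peak resident memory, stable usage.
print(desired_memory(10, 12, stable_usage=True))                           # 15.0
print(desired_memory(10, 12, stable_usage=True, query_oom_detected=True))  # 18.0
```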

## Two-window recommender {#two-window-recommender}

Instead of using a single window, ClickHouse Cloud uses two lookback windows with different time ranges:
- **Small Window (3 hours)**: Captures recent usage patterns, enables faster scale-down
- **Large Window (30 hours)**: Ensures we scale up in a single step to the maximum usage seen in the longer lookback window, rather than through multiple gradual scale-ups. This is critical because scaling takes time and invalidates local caches, so it is safer to scale up in a single step.

Each window independently generates a recommendation using both memory and CPU analysis.
The system then merges these recommendations based on the scaling direction each window suggests, as shown in the figure below:

<Image img={two_window_recommender} size="lg" alt="Two-window recommender merging logic" />

For a deep dive into the design decisions of the recommender, see ["Smarter Auto-Scaling for ClickHouse: The Two-Window Approach"](https://clickhouse.com/blog/smarter-auto-scaling#the-two-window-solution).