// modules/migrate/pages/kubernetes/helm-to-operator.adoc
You should see your overrides in YAML format.

TIP: Before implementing any changes in your production environment, Redpanda Data recommends testing the migration in a non-production environment.

== Rolling restart during migration

When you apply the Redpanda custom resource, the operator triggers a rolling restart of all broker pods. The rolling restart is unavoidable during migration, regardless of how closely your `clusterSpec` values match the existing Helm values.

=== How the rolling restart works

The migration triggers the following sequence:

. *Resource adoption*: The operator uses server-side apply to take ownership of the existing Helm-managed StatefulSet. It patches the StatefulSet with new labels (generation, config version, and ownership) and sets the Redpanda custom resource as the owner reference.
. *StatefulSet spec update*: The operator re-renders the StatefulSet from its own templates. Any difference, even metadata or label changes, creates a new `ControllerRevision`.
. *Pod rolling*: The operator compares each pod's `StatefulSetRevisionLabel` against the latest `ControllerRevision`. Because the existing pods were created under the old Helm-managed revision, the operator flags every pod for rolling.
. *One-at-a-time deletion*: For each pod, the operator:
+
--
* Checks cluster health through the admin API.
* Deletes the pod if the cluster is healthy, then requeues after 10 seconds.
* Waits for the cluster to stabilize before rolling the next pod.
* Skips deletion and requeues if the cluster is unhealthy.
--
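The roll decision described above can be sketched as pure logic: compare each pod's revision label to the latest `ControllerRevision`, and delete at most one stale pod per reconcile, gated on cluster health. This is an illustrative sketch with hypothetical names (`pods_needing_roll`, `next_action`); the real operator implements this in Go against the Kubernetes and Redpanda admin APIs.

```python
# Sketch of the operator's rolling-restart decision loop.
# Hypothetical names; not the operator's actual code.

def pods_needing_roll(pods: dict[str, str], latest_revision: str) -> list[str]:
    """Pods whose StatefulSet revision label lags the latest ControllerRevision."""
    return [name for name, rev in pods.items() if rev != latest_revision]


def next_action(pods: dict[str, str], latest_revision: str, cluster_healthy: bool) -> str:
    stale = pods_needing_roll(pods, latest_revision)
    if not stale:
        return "done"
    if not cluster_healthy:
        return "requeue"          # skip deletion, try again later
    return f"delete {stale[0]}"   # one pod at a time, then requeue


# During migration, every pod still carries the old Helm-managed revision,
# so the operator flags all of them and rolls them one by one:
pods = {"redpanda-0": "helm-rev", "redpanda-1": "helm-rev", "redpanda-2": "helm-rev"}
print(next_action(pods, "operator-rev", cluster_healthy=True))  # delete redpanda-0
```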

=== Impact by cluster configuration

[cols="2a,4a"]
|===
| Scenario | Impact

| 3+ brokers, replication factor (RF) ≥ 3
| No data loss and no downtime for consumers or producers configured with `acks=all`. Individual broker restarts cause brief partition leader elections, typically a few seconds each. Redpanda uses Raft consensus, so writes require a majority quorum. With RF=3 and one broker restarting, the remaining two brokers maintain quorum and writes continue.

| 3 brokers, RF = 1
| Partitions on the restarting broker are unavailable for the duration of that broker's restart.

| Single broker
| Full outage for the duration of the restart.
|===
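The availability column follows directly from Raft majority arithmetic. A minimal sketch (plain arithmetic, no Redpanda API) of why RF=3 tolerates one restarting broker while RF=1 does not:

```python
def partition_available(rf: int, replicas_down: int) -> bool:
    """A Raft group accepts writes while a majority of its replicas are up."""
    majority = rf // 2 + 1
    return (rf - replicas_down) >= majority


# One broker restarting at a time:
print(partition_available(rf=3, replicas_down=1))  # True: 2 of 3 is still a majority
print(partition_available(rf=1, replicas_down=1))  # False: the only replica is down
```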

=== Recommended producer settings

To avoid message loss during the rolling restart, configure producers with the following settings:

* `acks=all` (or `-1`): Ensures the Raft quorum (majority of replicas) commits the write before acknowledging it.
* `retries`: Handles `NOT_LEADER_FOR_PARTITION` errors during leader elections. Set this to a high value so transient errors are retried rather than surfaced to the application.
* `enable.idempotence=true`: Prevents duplicate messages from retries.
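As one example, with a librdkafka-based client such as confluent-kafka-python, these settings map to the following configuration. The broker address is a placeholder for your deployment; key names follow librdkafka conventions.

```python
# Producer settings recommended above, in librdkafka key names
# (as used by confluent-kafka-python). Broker address is a placeholder.
producer_conf = {
    "bootstrap.servers": "redpanda.example.svc.cluster.local:9093",  # placeholder
    "acks": "all",               # wait for the Raft majority to commit each write
    "retries": 2147483647,       # keep retrying through leader elections
    "enable.idempotence": True,  # retries cannot introduce duplicates
}
# producer = confluent_kafka.Producer(producer_conf)  # not constructed here
print(producer_conf["acks"])
```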

== Migrate to the Redpanda Operator and Helm

To migrate to the latest Redpanda Operator and use it to manage your Helm deployment, follow these steps.