// modules/migrate/pages/kubernetes/helm-to-operator.adoc
You should see your overrides in YAML format.

TIP: Before implementing any changes in your production environment, Redpanda Data recommends testing the migration in a non-production environment.

== Rolling restart during migration

When you apply the Redpanda custom resource, the operator triggers a rolling restart of all broker pods. The rolling restart is unavoidable during migration, regardless of how closely your `clusterSpec` values match the existing Helm values.

=== How the rolling restart works

The migration triggers the following sequence:

. *Resource adoption*: The operator uses server-side apply to take ownership of the existing Helm-managed StatefulSet. It patches the StatefulSet with new labels (generation, config version, and ownership) and sets the Redpanda custom resource as the owner reference.
. *StatefulSet spec update*: The operator re-renders the StatefulSet from its own templates. Any difference, even metadata or label changes, creates a new `ControllerRevision`.
. *Pod rolling*: The operator compares each pod's `StatefulSetRevisionLabel` against the latest `ControllerRevision`. Because the existing pods were created under the old Helm-managed revision, the operator flags every pod for rolling.
. *One-at-a-time deletion*: For each pod, the operator:
+
--
* Checks cluster health through the admin API.
* Deletes the pod if the cluster is healthy, then requeues after 10 seconds.
* Waits for the cluster to stabilize before rolling the next pod.
* Skips deletion and requeues if the cluster is unhealthy.
--
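The roll decision described above can be sketched as pure logic: compare each pod's revision label to the latest `ControllerRevision`, and delete at most one stale pod per reconcile, gated on cluster health. This is an illustrative sketch with hypothetical names (`pods_needing_roll`, `next_action`); the real operator implements this in Go against the Kubernetes and Redpanda admin APIs.

```python
# Sketch of the operator's rolling-restart decision loop.
# Hypothetical names; not the operator's actual code.

def pods_needing_roll(pods: dict[str, str], latest_revision: str) -> list[str]:
    """Pods whose StatefulSet revision label lags the latest ControllerRevision."""
    return [name for name, rev in pods.items() if rev != latest_revision]


def next_action(pods: dict[str, str], latest_revision: str, cluster_healthy: bool) -> str:
    stale = pods_needing_roll(pods, latest_revision)
    if not stale:
        return "done"
    if not cluster_healthy:
        return "requeue"          # skip deletion, try again later
    return f"delete {stale[0]}"   # one pod at a time, then requeue


# During migration, every pod still carries the old Helm-managed revision,
# so the operator flags all of them and rolls them one by one:
pods = {"redpanda-0": "helm-rev", "redpanda-1": "helm-rev", "redpanda-2": "helm-rev"}
print(next_action(pods, "operator-rev", cluster_healthy=True))  # delete redpanda-0
```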

=== Impact by cluster configuration

[cols="2a,4a"]
|===
| Scenario | Impact

| 3+ brokers, replication factor (RF) ≥ 3
| No data loss and no downtime for consumers or producers configured with `acks=all`. Individual broker restarts cause brief partition leader elections, typically a few seconds each. Redpanda uses Raft consensus, so writes require a majority quorum. With RF=3 and one broker restarting, the remaining two brokers maintain quorum and writes continue.

| 3 brokers, RF = 1
| Partitions on the restarting broker are unavailable for the duration of that broker's restart.

| Single broker
| Full outage for the duration of the restart.
|===
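The availability column follows directly from Raft majority arithmetic. A minimal sketch (plain arithmetic, no Redpanda API) of why RF=3 tolerates one restarting broker while RF=1 does not:

```python
def partition_available(rf: int, replicas_down: int) -> bool:
    """A Raft group accepts writes while a majority of its replicas are up."""
    majority = rf // 2 + 1
    return (rf - replicas_down) >= majority


# One broker restarting at a time:
print(partition_available(rf=3, replicas_down=1))  # True: 2 of 3 is still a majority
print(partition_available(rf=1, replicas_down=1))  # False: the only replica is down
```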

=== Recommended producer settings

To avoid message loss during the rolling restart, configure producers with the following settings:

* `acks=all` (or `-1`): Ensures the Raft quorum (majority of replicas) commits the write before acknowledging it.
* `retries`: Handles `NOT_LEADER_FOR_PARTITION` errors during leader elections. Set this to a high value so transient errors are retried rather than surfaced to the application.
* `enable.idempotence=true`: Prevents duplicate messages from retries.
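As one example, with a librdkafka-based client such as confluent-kafka-python, these settings map to the following configuration. The broker address is a placeholder for your deployment; key names follow librdkafka conventions.

```python
# Producer settings recommended above, in librdkafka key names
# (as used by confluent-kafka-python). Broker address is a placeholder.
producer_conf = {
    "bootstrap.servers": "redpanda.example.svc.cluster.local:9093",  # placeholder
    "acks": "all",               # wait for the Raft majority to commit each write
    "retries": 2147483647,       # keep retrying through leader elections
    "enable.idempotence": True,  # retries cannot introduce duplicates
}
# producer = confluent_kafka.Producer(producer_conf)  # not constructed here
print(producer_conf["acks"])
```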

== Migrate to the Redpanda Operator and Helm

To migrate to the latest Redpanda Operator and use it to manage your Helm deployment, follow these steps.