
## Restore

Hopsworks supports two restore modes:

- **New cluster restore**: Install a fresh cluster and restore data from a backup during installation.
- **In-place restore**: Restore data onto an existing running cluster via `helm upgrade`.


!!! Note
    Use the exact Hopsworks version that was used to create the backup.

### New Cluster Restore

The new cluster restore process has two phases:

- Restore Kubernetes objects required for the cluster restore.
- Install the cluster with Helm using the correct backup IDs.

#### Restore Kubernetes objects

Restore the Kubernetes objects that were backed up using Velero.

```bash

# Restores the latest backup - if a specific backup is needed, set backupName instead
echo "=== Creating Velero Restore object for k8s-backups-main ==="
kubectl apply -f - <<EOF
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: k8s-backups-main
  namespace: velero
spec:
  scheduleName: k8s-backups-main
EOF

echo "=== Waiting for Velero restore to finish ==="
until [ "$(kubectl get restore k8s-backups-main -n velero -o jsonpath='{.status.phase}' 2>/dev/null)" = "Completed" ]; do
  echo "Still waiting..."; sleep 5;
done

kubectl apply -f - <<EOF
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: k8s-backups-users-resources
  namespace: velero
spec:
  scheduleName: k8s-backups-users-resources
EOF

echo "=== Waiting for Velero restore to finish ==="
until [ "$(kubectl get restore k8s-backups-users-resources -n velero -o jsonpath='{.status.phase}' 2>/dev/null)" = "Completed" ]; do
  echo "Still waiting..."; sleep 5;
done
```
```bash
kubectl get configmap opensearch-backups-metadata -n hopsworks -o json \
  | jq -r '.data | to_entries[] | select(.value | fromjson | .state == "SUCCESS") | .key' \
  | sort -nr
```

#### Restore on Cluster installation

To restore a cluster during installation, configure the backup ID in the values YAML file:

```yaml
global:
  _hopsworks:
    restoreFromBackup:
      backupId: "254811200"
```

##### Customizations

!!! Warning
    Even if you override the backup IDs for RonDB and Opensearch, you must still set `.global._hopsworks.restoreFromBackup.backupId` to ensure HopsFS is restored.
```yaml
olk:
  payload:
    indices: "-myindex"
```

### In-Place Restore

!!! Note
    In-place restore is available from Hopsworks version 4.8.0.

In-place restore restores data onto an existing running cluster using `helm upgrade`. Unlike a new cluster restore, it does not require provisioning a fresh cluster: the existing stateful services are shut down, wiped if necessary, and restored from the backup.

!!! Warning
    In-place restore **replaces all existing data** in the cluster with the backup data. Any data written after the backup was taken will be lost.

!!! Info
    After a fresh install from backup (new cluster restore), in-place restores can only use backups taken **after** that fresh install, because the cluster certificates are regenerated during installation. To restore from a backup taken **before** the fresh install, perform another new cluster restore from that backup instead of an in-place restore.

#### In-place restore prerequisites

- A running Hopsworks cluster deployed via Helm.
- A previously created backup with a known backup ID.
- Object storage configured and accessible with the backup data.
- Velero installed and configured as described in the [prerequisites](#prerequisites).

#### Identify the backup ID

Get the backup ID from the **Cluster Settings > Backup** tab or by using the following commands.

```bash
# RonDB backup IDs (newest first)
kubectl get configmap rondb-backups-metadata -n hopsworks -o json \
  | jq -r '.data | to_entries[] | select(.value | fromjson | .state == "SUCCESS") | .key' \
  | sort -nr

# Opensearch backup IDs (newest first)
kubectl get configmap opensearch-backups-metadata -n hopsworks -o json \
  | jq -r '.data | to_entries[] | select(.value | fromjson | .state == "SUCCESS") | .key' \
  | sort -nr

# Velero backup IDs for the main schedule (newest first)
kubectl get backups -n velero -o json \
  | jq -r '[.items[] | select(.spec.storageLocation == "hopsworks-bsl" and .metadata.labels["velero.io/schedule-name"] == "k8s-backups-main" and .status.phase == "Completed")] | sort_by(.status.completionTimestamp) | reverse[] | .metadata.name'

# Velero backup IDs for the users schedule (newest first)
kubectl get backups -n velero -o json \
  | jq -r '[.items[] | select(.spec.storageLocation == "hopsworks-bsl" and .metadata.labels["velero.io/schedule-name"] == "k8s-backups-users-resources" and .status.phase == "Completed")] | sort_by(.status.completionTimestamp) | reverse[] | .metadata.name'
```
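The RonDB and Opensearch backup IDs are numeric, so the `sort -nr` in the commands above lists them newest first. A small self-contained illustration (the IDs below are made up for the example):

```shell
# Numeric reverse sort: the largest (newest) backup ID comes out first.
# These IDs are fabricated for the example.
printf '%s\n' 254811200 254897600 254724800 | sort -nr
```

Piping the result through `head -n 1` yields the most recent successful backup ID.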

#### Run the in-place restore

Configure the restore in the values file and run `helm upgrade`:

```yaml
global:
  _hopsworks:
    backups:
      enabled: true
      schedule: "@weekly"
    restoreFromBackup:
      backupId: "254811200"
      inPlace: true
      forceDataClear: true

# Optional: specify Velero backup IDs. If not set, the latest completed backup is used.
hopsworks:
  velero:
    restore:
      mainScheduleBackupId: "k8s-backups-main-20260213T153627Z"
      usersScheduleBackupId: "k8s-backups-users-resources-20260213T153627Z"
```

Then run:

```bash
helm upgrade hopsworks hopsworks/hopsworks --version <CHART_VERSION> \
  --namespace hopsworks \
  -f values.yaml \
  --timeout 1200s
```

You can also pass the restore flags directly on the command line:

```bash
helm upgrade hopsworks hopsworks/hopsworks --version <CHART_VERSION> \
  --namespace hopsworks \
  --set-string global._hopsworks.restoreFromBackup.backupId="254811200" \
  --set global._hopsworks.restoreFromBackup.inPlace=true \
  --set global._hopsworks.restoreFromBackup.forceDataClear=true \
  --set-string hopsworks.velero.restore.mainScheduleBackupId="k8s-backups-main-20260213T153627Z" \
  --set-string hopsworks.velero.restore.usersScheduleBackupId="k8s-backups-users-resources-20260213T153627Z" \
  --timeout 1200s
```

The required flags are:

| Parameter | Description |
| --------- | ----------- |
| `global._hopsworks.restoreFromBackup.backupId` | The backup ID to restore from. |
| `global._hopsworks.restoreFromBackup.inPlace` | Must be `true` to enable in-place restore mode. |
| `global._hopsworks.restoreFromBackup.forceDataClear` | Must be `true` to confirm that existing data will be replaced. This is a safety mechanism to prevent accidental data loss. |

The following flags are optional. If not set, the latest available Velero backup will be used:

| Parameter | Description |
| --------- | ----------- |
| `hopsworks.velero.restore.mainScheduleBackupId` | The Velero backup ID for the main schedule (`k8s-backups-main`). |
| `hopsworks.velero.restore.usersScheduleBackupId` | The Velero backup ID for the users schedule (`k8s-backups-users-resources`). |

!!! Important
    After a successful restore, remove the `restoreFromBackup` blocks from your values file and run `helm upgrade` to apply the change.
    If left in place, these blocks can cause subsequent upgrades to fail or behave unexpectedly.
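For reference, a minimal values fragment after this cleanup might look like the following, assuming the configuration shown earlier (only the restore-related keys are removed; everything else stays):

```yaml
global:
  _hopsworks:
    backups:
      enabled: true
      schedule: "@weekly"
    # restoreFromBackup was removed after the successful restore
```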

#### Re-running an in-place restore

In-place restore creates marker resources to prevent accidental re-runs. If you need to run the restore again with the same backup ID, delete the marker resources first:

```bash
# Delete the HopsFS restore job
kubectl delete job hopsfs-inplace-restore-<BACKUP_ID> -n hopsworks --ignore-not-found=true

# Delete the RonDB restore jobs
kubectl delete job restore-native-backup-<BACKUP_ID> -n hopsworks --ignore-not-found=true
kubectl delete job setup-mysqld-dont-remove-<BACKUP_ID> -n hopsworks --ignore-not-found=true

# Delete the Opensearch restore job
kubectl delete job opensearch-restore-default-default-<BACKUP_ID> -n hopsworks --ignore-not-found=true

# Delete the Velero restore objects; use the exact restore object names
kubectl delete restore.velero.io k8s-backups-main -n velero --ignore-not-found=true
kubectl delete restore.velero.io k8s-backups-users-resources -n velero --ignore-not-found=true
```
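When scripting this cleanup, the marker names can be derived from the backup ID. The sketch below only *prints* the delete commands so they can be reviewed before running; the name patterns follow the commands above, and `print_restore_cleanup` is a hypothetical helper, not part of the Hopsworks tooling:

```shell
# Hypothetical helper: print (not run) the cleanup commands for one backup ID.
print_restore_cleanup() {
  backup_id="$1"
  # Job names follow the patterns used by the in-place restore jobs above.
  for job in \
    "hopsfs-inplace-restore-${backup_id}" \
    "restore-native-backup-${backup_id}" \
    "setup-mysqld-dont-remove-${backup_id}" \
    "opensearch-restore-default-default-${backup_id}"; do
    echo "kubectl delete job ${job} -n hopsworks --ignore-not-found=true"
  done
  # The Velero restore objects are named after the schedules.
  for restore in k8s-backups-main k8s-backups-users-resources; do
    echo "kubectl delete restore.velero.io ${restore} -n velero --ignore-not-found=true"
  done
}

print_restore_cleanup "254811200"
```

Once reviewed, the printed commands can be piped to `sh` to execute them.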

#### In-place restore customizations

The same customization options for [RonDB and Opensearch](#customizations) backup IDs apply to in-place restore. You can override individual service backup IDs while keeping the global backup ID for HopsFS.
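For example, a values sketch that keeps the global backup ID for HopsFS while excluding an index from the Opensearch restore, reusing the `olk` payload override from the new-cluster customizations (the key paths are assumed to be identical for in-place restore):

```yaml
global:
  _hopsworks:
    restoreFromBackup:
      backupId: "254811200"
      inPlace: true
      forceDataClear: true

olk:
  payload:
    indices: "-myindex"
```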