Merged
98 commits
5c9506a
wip: add fencing mechanism
mayankshah1607 Jan 21, 2026
c021fb2
rename to suspended-instances
mayankshah1607 Jan 21, 2026
6043b7a
add BackupSnapshots feature gate
mayankshah1607 Jan 21, 2026
93c732b
implement reconciler logic
mayankshah1607 Jan 21, 2026
bb96b8a
implement offline executor
mayankshah1607 Jan 21, 2026
22cb071
naming improvements
mayankshah1607 Jan 21, 2026
478f8d9
add to scheme
mayankshah1607 Jan 21, 2026
8bff157
fix reconcile state
mayankshah1607 Jan 21, 2026
0c906ea
bug fixes and improvements
mayankshah1607 Jan 21, 2026
77a96b6
naming improvements and fixes
mayankshah1607 Jan 21, 2026
7f0587f
ran make generate
mayankshah1607 Jan 21, 2026
7b33275
fix PGBackup field validations
mayankshah1607 Jan 22, 2026
7e412fe
refactors and stability improvements
mayankshah1607 Jan 22, 2026
d30df00
update cr.yaml examples
mayankshah1607 Jan 22, 2026
ff0059c
Merge branch 'main' into K8SPG-771
mayankshah1607 Jan 22, 2026
0e2537c
organize imports
mayankshah1607 Jan 22, 2026
1a551b1
misspells
mayankshah1607 Jan 22, 2026
d4e09b5
improve fencing logic
mayankshah1607 Jan 22, 2026
cdb8f43
add extra validation
mayankshah1607 Jan 22, 2026
c53a00a
improvements to suspended logic
mayankshah1607 Jan 23, 2026
77bb6a8
finalizer renaming
mayankshah1607 Jan 23, 2026
c8a2daa
remove enabled field
mayankshah1607 Jan 23, 2026
808fd85
implement snapshot schedules
mayankshah1607 Jan 23, 2026
ef509d7
linting
mayankshah1607 Jan 23, 2026
2d93ca5
implement in-place restore
mayankshah1607 Jan 23, 2026
78032c4
update cr.yaml example
mayankshah1607 Jan 23, 2026
bf04199
linting
mayankshah1607 Jan 23, 2026
581a629
no need to use configmap to track
mayankshah1607 Jan 23, 2026
834d27c
more improvements
mayankshah1607 Jan 23, 2026
645cf65
typo
mayankshah1607 Jan 27, 2026
e9196e4
wip: PiTR
mayankshah1607 Jan 27, 2026
2b4981f
support snapshots in WALWatcher
mayankshah1607 Jan 27, 2026
1cbdcab
bug fix
mayankshah1607 Jan 27, 2026
f2fdd02
update comments
mayankshah1607 Jan 27, 2026
0566dcf
linting
mayankshah1607 Jan 27, 2026
85d9783
add PGO_FEATURE_GATES env variable to deploy
mayankshah1607 Jan 27, 2026
4f48769
add retry mechanism
mayankshah1607 Jan 28, 2026
6e4c7b8
implement checkpointing
mayankshah1607 Jan 29, 2026
38857f3
add restore_command wrapper for snapshots
mayankshah1607 Jan 30, 2026
aa53460
add logic for creating snapshot signal file
mayankshah1607 Jan 30, 2026
bcd5d04
handle leader ep reconcile
mayankshah1607 Jan 30, 2026
047b653
more improvements
mayankshah1607 Jan 30, 2026
6dd9948
add e2e test
mayankshah1607 Jan 30, 2026
e906e4c
wip: fix pitr
mayankshah1607 Jan 30, 2026
598038a
in-place restore improvements
mayankshah1607 Feb 9, 2026
ecb41dc
update e2e test
mayankshah1607 Feb 9, 2026
e0f155f
Merge branch 'main' into K8SPG-771
mayankshah1607 Feb 9, 2026
04039f7
update test runs
mayankshah1607 Feb 9, 2026
1f58190
test fixes
mayankshah1607 Feb 9, 2026
3888409
Update build/postgres-operator/restore_command.sh
mayankshah1607 Feb 9, 2026
ba9feef
support for WAL and Tablespace volumes
mayankshah1607 Feb 9, 2026
9f9620d
fix tests
mayankshah1607 Feb 10, 2026
a48e58a
Merge branch 'main' into K8SPG-771
mayankshah1607 Feb 10, 2026
04dcc56
workaround improvements
mayankshah1607 Feb 10, 2026
4d0d91b
Merge branch 'main' into K8SPG-771
mayankshah1607 Feb 10, 2026
05dd8fc
update examples
mayankshah1607 Feb 10, 2026
bdb8cab
fix unit test
mayankshah1607 Feb 10, 2026
99bf9e2
remove unused code
mayankshah1607 Feb 10, 2026
457231b
add unit test shouldFailSnapshot
mayankshah1607 Feb 10, 2026
d25ce94
add unit tests for backup helpers
mayankshah1607 Feb 10, 2026
de310a9
linting
mayankshah1607 Feb 10, 2026
b523626
cleanup unused code
mayankshah1607 Feb 11, 2026
3da1a10
formatting
mayankshah1607 Feb 11, 2026
7b12244
code cleanup
mayankshah1607 Feb 11, 2026
3f78ad8
status improvements
mayankshah1607 Feb 11, 2026
05361f7
add more unit tests
mayankshah1607 Feb 11, 2026
c9942f9
update e2e test assertions
mayankshah1607 Feb 11, 2026
ab2c612
fix inconsistencies & address copilot comments
mayankshah1607 Feb 11, 2026
0e9caea
remove sh prefix
mayankshah1607 Feb 11, 2026
54603df
checkpoint timeout default to 5m
mayankshah1607 Feb 11, 2026
6679366
linting
mayankshah1607 Feb 11, 2026
a99b660
POSIX-compliant script
mayankshah1607 Feb 11, 2026
f9294d6
fix potential nil-ptr
mayankshah1607 Feb 11, 2026
c7d3340
fix error messag
mayankshah1607 Feb 11, 2026
34a95a2
fix invalid status update
mayankshah1607 Feb 11, 2026
abd599a
fix retries to get latest objecgt
mayankshah1607 Feb 11, 2026
a7fe133
bugfix: restore reconcile can use incorrect volume spec
mayankshah1607 Feb 11, 2026
620c3de
add unit test
mayankshah1607 Feb 11, 2026
de31ed9
fix linting suggestions
mayankshah1607 Feb 11, 2026
21b71da
prepare job should work when using separate wal volumes
mayankshah1607 Feb 11, 2026
4fb1ef9
update unit tests
mayankshah1607 Feb 11, 2026
de9c305
spelling errors
mayankshah1607 Feb 11, 2026
df353e1
allow configuring checkpoint timeout
mayankshah1607 Feb 11, 2026
0e8879c
fix restore wrapper
mayankshah1607 Feb 11, 2026
26ed3f8
consistent field naming
mayankshah1607 Feb 11, 2026
71c5430
fix retry loop
mayankshah1607 Feb 11, 2026
1028d23
no need to check PITR while unsuspending
mayankshah1607 Feb 11, 2026
5357c50
error message consistency
mayankshah1607 Feb 11, 2026
b4910f5
remove duplicate code
mayankshah1607 Feb 11, 2026
3d314a2
use fmt.Errorf
mayankshah1607 Feb 11, 2026
e20a3cb
catch potential nil error
mayankshah1607 Feb 11, 2026
4e2203c
allow skipping checkpointing
mayankshah1607 Feb 12, 2026
05386e1
Merge branch 'main' into K8SPG-771
mayankshah1607 Feb 12, 2026
df83311
Merge branch 'main' into K8SPG-771
mayankshah1607 Feb 13, 2026
685d172
address copilot comments
mayankshah1607 Feb 13, 2026
239bba7
Merge branch 'main' into K8SPG-771
mayankshah1607 Feb 13, 2026
e7279c9
fix error message
mayankshah1607 Feb 13, 2026
216ebbe
Merge branch 'main' into K8SPG-771
mayankshah1607 Feb 13, 2026
@@ -70,6 +70,13 @@ spec:
type: object
spec:
properties:
method:
default: pgbackrest
description: Method with which to perform the backup
enum:
- pgbackrest
- volumeSnapshot
type: string
options:
description: |-
Command line options to include when running the pgBackRest backup command.
@@ -80,14 +87,17 @@ spec:
pgCluster:
type: string
repoName:
description: The name of the pgBackRest repo to run the backup command
against.
description: |-
The name of the pgBackRest repo to run the backup command against.
This is required when method is 'pgbackrest'.
pattern: ^repo[1-4]
type: string
required:
- pgCluster
- repoName
type: object
x-kubernetes-validations:
- message: repoName is required when method is 'pgbackrest'
rule: self.method == "volumeSnapshot" || has(self.repoName)
status:
properties:
backupName:
@@ -391,6 +401,24 @@ spec:
required:
- name
type: object
snapshot:
properties:
dataVolumeSnapshotRef:
description: Name of the VolumeSnapshot containing data volume
contents.
type: string
tablespaceVolumeSnapshotRefs:
additionalProperties:
type: string
description: |-
Names of the VolumeSnapshots containing tablespace volume contents.
Key is the name of the tablespace, value is the name of the VolumeSnapshot.
type: object
walVolumeSnapshotRef:
description: Name of the VolumeSnapshot containing WAL volume
contents.
type: string
type: object
state:
type: string
storageType:
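For illustration, a backup taken with `method: volumeSnapshot` might report the new `snapshot` status block roughly as follows. This is a sketch that assumes the block sits directly under `status`, as the hunk context suggests. Every VolumeSnapshot name and the tablespace key `ts1` are hypothetical; the diff only defines the field names and types.

```yaml
# Hypothetical status fragment; all names below are made up for illustration.
status:
  snapshot:
    dataVolumeSnapshotRef: backup1-data      # VolumeSnapshot holding the data volume contents
    walVolumeSnapshotRef: backup1-wal        # VolumeSnapshot holding the WAL volume contents
    tablespaceVolumeSnapshotRefs:
      ts1: backup1-tablespace-ts1            # key: tablespace name, value: VolumeSnapshot name
```

Presumably `walVolumeSnapshotRef` and `tablespaceVolumeSnapshotRefs` are only populated when the cluster uses separate WAL or tablespace volumes, but the schema shown here does not say so.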
@@ -7110,6 +7110,51 @@ spec:
trackLatestRestorableTime:
description: Enable tracking latest restorable time
type: boolean
volumeSnapshots:
description: VolumeSnapshots configuration
properties:
className:
description: Name of the VolumeSnapshotClass to use.
type: string
mode:
default: offline
description: Mode of the VolumeSnapshot.
enum:
- offline
type: string
offlineConfig:
description: |-
Configuration for offline snapshot operations.
Ignored if mode is not offline.
properties:
checkpoint:
description: Checkpoint configuration for offline snapshot
operations.
properties:
enabled:
default: true
description: If set, a checkpoint is requested.
type: boolean
timeoutSeconds:
default: 300
description: |-
Timeout for the checkpoint operation.
Ignored if checkpoint is not enabled.
format: int32
minimum: 30
type: integer
type: object
type: object
schedule:
description: |-
Defines the Cron schedule for a VolumeSnapshot.
Follows the standard Cron schedule syntax:
https://k8s.io/docs/concepts/workloads/controllers/cron-jobs/#cron-schedule-syntax
minLength: 6
type: string
required:
- className
type: object
type: object
x-kubernetes-validations:
- message: At least one repository must be configured when backups
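A cluster manifest using the new `volumeSnapshots` block might look like the sketch below. It assumes the block lives under `spec.backups`, which is inferred from its position next to `trackLatestRestorableTime` in the schema. The class name `csi-snapclass` and the schedule are hypothetical values; the `mode`, `enabled`, and `timeoutSeconds` values are the schema defaults.

```yaml
# Sketch only: placement under spec.backups is inferred from the hunk context.
spec:
  backups:
    volumeSnapshots:
      className: csi-snapclass      # hypothetical VolumeSnapshotClass name (the only required field)
      mode: offline                 # default, and the only value the enum allows
      offlineConfig:
        checkpoint:
          enabled: true             # default
          timeoutSeconds: 300       # default; the schema enforces a minimum of 30
      schedule: "0 3 * * *"         # hypothetical cron expression (schema minLength is 6)
```

Because `offlineConfig` is documented as ignored when `mode` is not `offline`, and `offline` is the only allowed mode, the whole `offlineConfig` block can be left out to accept the defaults.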
@@ -67,17 +67,33 @@ spec:
pgCluster:
description: The name of the PerconaPGCluster to perform restore.
type: string
x-kubernetes-validations:
- message: pgCluster is an immutable field
rule: self == oldSelf
repoName:
description: |-
The name of the pgBackRest repo within the source PostgresCluster that contains the backups
that should be utilized to perform a pgBackRest restore when initializing the data source
for the new PostgresCluster.
pattern: ^repo[1-4]
type: string
x-kubernetes-validations:
- message: repoName is an immutable field
rule: self == oldSelf
volumeSnapshotBackupName:
description: The name of the backup to perform in-place volume snapshot
restores from.
type: string
x-kubernetes-validations:
- message: volumeSnapshotBackupName is an immutable field
rule: self == oldSelf
required:
- pgCluster
- repoName
type: object
x-kubernetes-validations:
- message: either repoName or volumeSnapshotBackupName must be set
rule: ((has(self.repoName) && self.repoName != "") || (has(self.volumeSnapshotBackupName)
&& self.volumeSnapshotBackupName != ""))
status:
properties:
completed:
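An in-place restore from a snapshot-based backup could be requested with a manifest along these lines. This is a sketch: the `apiVersion` and `kind` follow the operator's existing examples, the object and backup names are hypothetical, and omitting `repoName` is assumed to be valid based on the new "either repoName or volumeSnapshotBackupName must be set" rule.

```yaml
# Sketch of an in-place restore from a volume snapshot backup.
apiVersion: pgv2.percona.com/v2
kind: PerconaPGRestore
metadata:
  name: restore1                      # hypothetical name
spec:
  pgCluster: cluster1                 # immutable once set
  volumeSnapshotBackupName: backup1   # hypothetical backup name; also immutable
```

All three spec fields carry a `self == oldSelf` rule, so changing the restore source after creation is rejected; a new restore object would be needed instead.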
1 change: 1 addition & 0 deletions build/postgres-operator/Dockerfile
@@ -69,6 +69,7 @@ COPY build/postgres-operator/init-entrypoint.sh /usr/local/bin
COPY build/postgres-operator/postgres-entrypoint.sh /usr/local/bin
COPY build/postgres-operator/postgres-liveness-check.sh /usr/local/bin
COPY build/postgres-operator/postgres-readiness-check.sh /usr/local/bin
COPY build/postgres-operator/restore-command-wrapper.sh /usr/local/bin
COPY hack/tools/queries /opt/crunchy/conf

RUN chgrp -R 0 /opt/crunchy/conf && chmod -R g=u opt/crunchy/conf
1 change: 1 addition & 0 deletions build/postgres-operator/init-entrypoint.sh
@@ -10,3 +10,4 @@ install -o "$(id -u)" -g "$(id -g)" -m 0755 -D "/usr/local/bin/postgres-entrypoi
install -o "$(id -u)" -g "$(id -g)" -m 0755 -D "/usr/local/bin/postgres-liveness-check.sh" "${CRUNCHY_BINDIR}/bin/postgres-liveness-check.sh"
install -o "$(id -u)" -g "$(id -g)" -m 0755 -D "/usr/local/bin/postgres-readiness-check.sh" "${CRUNCHY_BINDIR}/bin/postgres-readiness-check.sh"
install -o "$(id -u)" -g "$(id -g)" -m 0755 -D "/usr/local/bin/relocate-extensions.sh" "${CRUNCHY_BINDIR}/bin/relocate-extensions.sh"
install -o "$(id -u)" -g "$(id -g)" -m 0755 -D "/usr/local/bin/restore-command-wrapper.sh" "${CRUNCHY_BINDIR}/bin/restore-command-wrapper.sh"
10 changes: 10 additions & 0 deletions build/postgres-operator/restore-command-wrapper.sh
@@ -0,0 +1,10 @@
#!/bin/sh
Review comment (Contributor):
any reason to do this?

Reply (Member Author):
We don't need any bash-specific features in this script, so using /bin/sh is better for portability (even though our images are expected to support bash). Do you think we need bash?

set -e

# When this marker exists (e.g. after a snapshot restore), skip all WAL recovery by
# exiting non-zero. Do not remove the file so every restore_command call is skipped.
if [ -f "${PGDATA}/skip-wal-recovery" ]; then
    exit 1
fi

exec "$@"
3 changes: 3 additions & 0 deletions cmd/postgres-operator/main.go
@@ -14,6 +14,7 @@ import (
"time"
"unicode"

volumesnapshotv1 "github.com/kubernetes-csi/external-snapshotter/client/v8/apis/volumesnapshot/v1"
"github.com/pkg/errors"
"go.opentelemetry.io/otel"
uzap "go.uber.org/zap"
@@ -125,6 +126,8 @@ func main() {
	// Add Percona custom resource types to scheme
	assertNoError(v2.AddToScheme(mgr.GetScheme()))

	assertNoError(volumesnapshotv1.AddToScheme(mgr.GetScheme()))

	// add all PostgreSQL Operator controllers to the runtime manager
	err = addControllersToManager(ctx, mgr)
	assertNoError(err)
97 changes: 93 additions & 4 deletions config/crd/bases/pgv2.percona.com_perconapgclusters.yaml
@@ -69,6 +69,13 @@ spec:
type: object
spec:
properties:
method:
default: pgbackrest
description: Method with which to perform the backup
enum:
- pgbackrest
- volumeSnapshot
type: string
options:
description: |-
Command line options to include when running the pgBackRest backup command.
@@ -79,14 +86,17 @@ spec:
pgCluster:
type: string
repoName:
description: The name of the pgBackRest repo to run the backup command
against.
description: |-
The name of the pgBackRest repo to run the backup command against.
This is required when method is 'pgbackrest'.
pattern: ^repo[1-4]
type: string
required:
- pgCluster
- repoName
type: object
x-kubernetes-validations:
- message: repoName is required when method is 'pgbackrest'
rule: self.method == "volumeSnapshot" || has(self.repoName)
status:
properties:
backupName:
@@ -390,6 +400,24 @@ spec:
required:
- name
type: object
snapshot:
properties:
dataVolumeSnapshotRef:
description: Name of the VolumeSnapshot containing data volume
contents.
type: string
tablespaceVolumeSnapshotRefs:
additionalProperties:
type: string
description: |-
Names of the VolumeSnapshots containing tablespace volume contents.
Key is the name of the tablespace, value is the name of the VolumeSnapshot.
type: object
walVolumeSnapshotRef:
description: Name of the VolumeSnapshot containing WAL volume
contents.
type: string
type: object
state:
type: string
storageType:
@@ -7515,6 +7543,51 @@ spec:
trackLatestRestorableTime:
description: Enable tracking latest restorable time
type: boolean
volumeSnapshots:
description: VolumeSnapshots configuration
properties:
className:
description: Name of the VolumeSnapshotClass to use.
type: string
mode:
default: offline
description: Mode of the VolumeSnapshot.
enum:
- offline
type: string
offlineConfig:
description: |-
Configuration for offline snapshot operations.
Ignored if mode is not offline.
properties:
checkpoint:
description: Checkpoint configuration for offline snapshot
operations.
properties:
enabled:
default: true
description: If set, a checkpoint is requested.
type: boolean
timeoutSeconds:
default: 300
description: |-
Timeout for the checkpoint operation.
Ignored if checkpoint is not enabled.
format: int32
minimum: 30
type: integer
type: object
type: object
schedule:
description: |-
Defines the Cron schedule for a VolumeSnapshot.
Follows the standard Cron schedule syntax:
https://k8s.io/docs/concepts/workloads/controllers/cron-jobs/#cron-schedule-syntax
minLength: 6
type: string
required:
- className
type: object
type: object
x-kubernetes-validations:
- message: At least one repository must be configured when backups
@@ -21968,17 +22041,33 @@ spec:
pgCluster:
description: The name of the PerconaPGCluster to perform restore.
type: string
x-kubernetes-validations:
- message: pgCluster is an immutable field
rule: self == oldSelf
repoName:
description: |-
The name of the pgBackRest repo within the source PostgresCluster that contains the backups
that should be utilized to perform a pgBackRest restore when initializing the data source
for the new PostgresCluster.
pattern: ^repo[1-4]
type: string
x-kubernetes-validations:
- message: repoName is an immutable field
rule: self == oldSelf
volumeSnapshotBackupName:
description: The name of the backup to perform in-place volume snapshot
restores from.
type: string
x-kubernetes-validations:
- message: volumeSnapshotBackupName is an immutable field
rule: self == oldSelf
required:
- pgCluster
- repoName
type: object
x-kubernetes-validations:
- message: either repoName or volumeSnapshotBackupName must be set
rule: ((has(self.repoName) && self.repoName != "") || (has(self.volumeSnapshotBackupName)
&& self.volumeSnapshotBackupName != ""))
status:
properties:
completed:
2 changes: 2 additions & 0 deletions config/manager/default/manager.yaml
@@ -39,6 +39,8 @@ spec:
value: "1"
- name: PPROF_BIND_ADDRESS
value: "0"
- name: PGO_FEATURE_GATES
value: ""
ports:
- containerPort: 8080
name: metrics
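The manifest ships with an empty `PGO_FEATURE_GATES` value. To turn the snapshot feature on, one would presumably enable the `BackupSnapshots` gate named in the commit list. The `Name=true` syntax below is an assumption based on the usual Kubernetes feature-gate convention and is not shown anywhere in this diff.

```yaml
# Sketch of the operator container's env entry with the gate enabled.
# Both the gate name (from the commit list) and the Name=true syntax are assumptions.
env:
  - name: PGO_FEATURE_GATES
    value: "BackupSnapshots=true"
```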
1 change: 1 addition & 0 deletions deploy/backup.yaml
@@ -5,5 +5,6 @@ metadata:
spec:
pgCluster: cluster1
repoName: repo1
# method: volumeSnapshot
# options:
# - --type=full
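With the commented-out `method` line enabled, an on-demand snapshot backup might look like the sketch below. The `apiVersion` and `kind` follow the operator's existing examples, and the object name is hypothetical. Dropping `repoName` is assumed to be accepted, based on the CEL rule `self.method == "volumeSnapshot" || has(self.repoName)` added to the CRD.

```yaml
# Sketch of a PerconaPGBackup that uses volume snapshots instead of pgBackRest.
apiVersion: pgv2.percona.com/v2
kind: PerconaPGBackup
metadata:
  name: backup1                # hypothetical name
spec:
  pgCluster: cluster1
  method: volumeSnapshot       # default is pgbackrest, which still requires repoName
```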