Repo-based Standby - Cross-cluster HA/DR strategy with pgBackRest and internal S3 storage dependencies #4485

@zohebk8s

Overview: Repo-based Standby Cluster

I am designing a cross-cluster PostgreSQL architecture using the Crunchy Postgres for Kubernetes (PGO) operator. I am currently evaluating the resiliency of our deployment in a multi-cluster scenario.

Architecture:

Cluster 1 (OpenShift): Hosts the Primary Postgres instance and the local S3-compatible Object Storage (ODF/NooBaa) used for WAL archiving and backups.

Cluster 2 (OpenShift): Hosts a Standby instance and a secondary pgBackRest instance configured to pull from the S3 bucket in Cluster 1.
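To make the dependency concrete, here is a minimal sketch of the standby-related part of the Cluster 2 PostgresCluster spec, built as a Python dict. The field names (`standby.enabled`, `standby.repoName`, `backups.pgbackrest.repos[].s3`) follow my reading of the PostgresCluster CRD; the repo name, bucket, endpoint, and region are hypothetical placeholders, not our real values.

```python
import json


def standby_cluster_spec(repo_name: str, bucket: str, endpoint: str, region: str) -> dict:
    """Return the standby-related portion of a PostgresCluster spec as a dict.

    All argument values are placeholders; only the key layout is meant to
    mirror the PostgresCluster CRD.
    """
    return {
        # Repo-based standby: the cluster replays WAL from the named repo.
        "standby": {"enabled": True, "repoName": repo_name},
        "backups": {
            "pgbackrest": {
                "repos": [
                    {
                        "name": repo_name,
                        "s3": {"bucket": bucket, "endpoint": endpoint, "region": region},
                    }
                ]
            }
        },
    }


# Hypothetical values: the endpoint points at the object store inside Cluster 1.
spec = standby_cluster_spec(
    repo_name="repo1",
    bucket="pg-backups",
    endpoint="s3.cluster1.example.internal",
    region="us-east-1",
)
print(json.dumps(spec, indent=2))
```

The only repository the standby knows about resolves to an endpoint inside Cluster 1, which is the single point of failure the questions below are about.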

Scenario:

We are evaluating the blast radius of a total control-plane and storage-plane failure in Cluster 1.

Questions:

1. Storage Dependency: Given that our pgBackRest repository is pinned to the S3 bucket inside Cluster 1, does this architecture inherently violate our Disaster Recovery (DR) RTO? In the event of a total Cluster 1 failure, the Standby in Cluster 2 loses both its replication stream and access to its WAL/backup repository. Is there a "native" PGO/pgBackRest configuration to handle S3 repository failover, or is external, independent object storage the only production-grade path forward?

2. Replica Topology: For a production-grade deployment targeting high availability and resilience against node-level failures, what is the recommended instance count per cluster? We are debating between 1 Primary + 1 Replica versus 1 Primary + 2 Replicas. Does PGO specifically benefit from the quorum provided by the 3-node configuration in terms of preventing split-brain scenarios during network partitions in an OpenShift environment?
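For context on the second question, this is the generic majority-quorum arithmetic behind the 2-versus-3 instance debate. It is a sketch that assumes a simple majority vote among the Postgres instances themselves; it does not describe PGO's actual leader-election mechanism, and whether instance count matters there is exactly what I am asking.

```python
def majority(members: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can be lost while a majority is still reachable."""
    return members - majority(members)


for n in (2, 3):
    print(f"{n} instances: majority={majority(n)}, tolerated failures={tolerated_failures(n)}")
```

Under that assumption, 1 Primary + 1 Replica tolerates no member loss before majority is gone, while 1 Primary + 2 Replicas tolerates one.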

Context:

Our environment is built on OpenShift with ODF/NooBaa, and we are aiming for a high-security, sovereign infrastructure design. Any guidance on achieving storage-level redundancy during a failover would be greatly appreciated.

Regards,
Zoheb Shaik
