73 changes: 55 additions & 18 deletions docs/configuration/pgdog.toml/general.md
@@ -141,6 +141,28 @@ Delay running idle healthchecks at PgDog startup to give databases (and pools) t

Default: **`5_000`** (5s)

### `connection_recovery`

Controls whether server connections are recovered or dropped when a client abruptly disconnects.

Available options:

- `recover` (default)
- `rollback_only`
- `drop`

`rollback_only` issues a `ROLLBACK` for any unfinished transaction but does not attempt to resynchronize the connection. `drop` closes the connection without attempting recovery.

### `client_connection_recovery`

Controls whether to disconnect clients upon encountering connection pool errors (e.g., checkout timeout). Set this to `drop` if your clients are async / use pipelining mode.

Available options:

- `recover` (default)
- `drop`


## Timeouts

These settings control how long PgDog waits for maintenance tasks to complete. These timeouts make sure PgDog can recover
@@ -261,21 +283,6 @@ Enable load balancer [HTTP health checks](../../features/load-balancer/healthche

Default: **none** (disabled)

## Service discovery

### `broadcast_address`

Send multicast packets to this address on the local network. Configuring this setting enables
mutual service discovery. Instances of PgDog running on the same network will be able to see
each other.

Default: **none** (disabled)

### `broadcast_port`

The port used for sending and receiving broadcast messages.

Default: **`6433`**

## Monitoring

@@ -410,11 +417,41 @@ Available options:

Default: **`auto`**

### `system_catalogs_omnisharded`
### `system_catalogs`

Enables sticky routing for system catalog tables and treats them as [omnisharded](../../features/sharding/omnishards.md) tables. This makes tools like `psql` work out of the box.
Changes how system catalog tables (like `pg_database`, `pg_class`, etc.) are treated by the query router. Default behavior is to assume they are the same on all shards and send queries referencing them to a random shard. This makes tools like `psql` work out of the box.

Default: **`true`** (enabled)
Available options:

- `omnisharded`
- `omnisharded_sticky` (default)
- `sharded`

Default: **`omnisharded_sticky`**

### `omnisharded_sticky`

If turned on, queries touching [omnisharded](../../features/sharding/omnishards.md) tables are always sent to the same shard for any given client connection. The shard is determined at random on connection creation.

Default: **`false`**
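
The difference between sticky and non-sticky routing can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not PgDog's implementation; the `ClientConnection` class, its constructor arguments, and the `route_omnisharded` method are hypothetical names.

```python
import random


class ClientConnection:
    """Hypothetical model of one client connection to a sharded cluster."""

    def __init__(self, shard_count, sticky, rng=None):
        self._rng = rng or random.Random()
        self._shard_count = shard_count
        # With sticky routing, the shard is picked once, at random,
        # when the client connection is created.
        self._sticky_shard = self._rng.randrange(shard_count) if sticky else None

    def route_omnisharded(self):
        """Return the shard index for a query touching an omnisharded table."""
        if self._sticky_shard is not None:
            return self._sticky_shard
        # Without sticky routing, every query may land on a different shard.
        return self._rng.randrange(self._shard_count)
```

With `sticky=True`, repeated calls to `route_omnisharded()` on the same object always return the same shard, so a tool that issues several related catalog queries sees one consistent view.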

### `resharding_copy_format`

Which format to use for `COPY` statements during [resharding](../../features/sharding/resharding/index.md).

Available options:

- `binary` (default)
- `text`

`text` format is required when migrating from `INTEGER` to `BIGINT` primary keys during resharding.

### `reload_schema_on_ddl`

!!! warning
    This setting is intended for local development / CI / single node PgDog deployments.

Automatically reload the schema cache used by PgDog to route queries upon detecting DDL statements (e.g., `CREATE TABLE`, `ALTER TABLE`, etc.).

## Logging

106 changes: 106 additions & 0 deletions docs/features/connection-recovery.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,106 @@
---
icon: material/connection
---

# Connection recovery

PostgreSQL database connections are expensive to create, so PgDog does its best not to close them unless absolutely necessary. If a client disconnects before fully processing a query response, PgDog will attempt to preserve the server connection using several recovery steps.

## Abandoned transactions

If a client disconnects abruptly while inside a transaction, the transaction is considered abandoned and PgDog will automatically execute a `ROLLBACK`, making sure none of its changes are persisted in the database.

This is a common occurrence if there is a bug that causes the application to crash while executing multiple statements inside a manually started transaction, for example:

=== "Rails"
    ```ruby
    ActiveRecord::Base.transaction do
      user = User.find(5)
      # crash happens here.
    end
    ```
=== "SQLAlchemy"
    ```python
    with session.begin():
        user = session.get(User, 5)
        # crash happens here.
    ```
=== "Go"
    ```go
    tx, _ := db.Begin()
    row := tx.QueryRow("SELECT * FROM users WHERE id = $1", 5)
    // crash happens here.
    ```

### Connection storms

By preserving connections, PgDog protects the database against connection storms. Other connection poolers like PgBouncer close server connections without attempting any recovery.

When the application restarts, the pooler must recreate all of these connections at once, causing thousands of server connections to be opened and closed in rapid succession. This leads to unnecessary contention on database resources and can cause 100% CPU spikes on the database.

## Abandoned queries

A client can abruptly disconnect while receiving query response data from the server. This can happen due to out-of-memory errors or hardware failure, for example:

=== "Rails"
    ```ruby
    orders = Order.where(user_id: 5)
    # ^ crash happens inside `pg`,
    #   while receiving multiple rows
    ```
=== "SQLAlchemy"
    ```python
    orders = session.execute(
        select(Order).where(Order.user_id == 5)
    ).all()
    # ^ crash happens while receiving multiple rows
    ```
=== "Go"
    ```go
    rows, _ := db.Query("SELECT * FROM orders WHERE user_id = $1", 5)
    for rows.Next() {
        // crash happens here while iterating over rows
    }
    ```

PgDog will detect this and drain the affected server connections, restoring them to their normal state before returning them to the connection pool. The drain mechanism works by receiving and discarding `DataRow` messages and sending [`Sync`](https://www.postgresql.org/docs/current/protocol-message-formats.html#PROTOCOL-MESSAGE-FORMATS-SYNC) to the server to resynchronize the extended protocol state.

Just like [abandoned transactions](#abandoned-transactions), this protects PostgreSQL databases from connection storms caused by unreliable clients. If the client was executing a transaction, it will be rolled back as well.
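
To make the drain step more concrete, here is a minimal Python sketch of the receiving side. It is not PgDog's code: it assumes the leftover server response is already available as an in-memory byte string, and the `drain` and `message` helpers are hypothetical. It relies only on the PostgreSQL wire format, where each backend message is a one-byte type followed by a big-endian `int32` length that counts itself, and where `ReadyForQuery` (`Z`) carries a one-byte transaction status.

```python
import struct

# Backend message type bytes from the PostgreSQL wire protocol.
DATA_ROW = b"D"
READY_FOR_QUERY = b"Z"


def message(msg_type, payload=b""):
    """Encode one backend message: type byte, int32 length, payload."""
    return msg_type + struct.pack("!i", len(payload) + 4) + payload


def drain(buffer):
    """Discard messages until ReadyForQuery.

    Returns (number of DataRow messages discarded, transaction status byte).
    Raises ValueError if the buffer ends before ReadyForQuery is seen.
    """
    offset = 0
    discarded_rows = 0
    while offset + 5 <= len(buffer):
        msg_type = buffer[offset:offset + 1]
        (length,) = struct.unpack("!i", buffer[offset + 1:offset + 5])
        if length < 4:
            raise ValueError("invalid message length")
        payload = buffer[offset + 5:offset + 1 + length]
        offset += 1 + length
        if msg_type == DATA_ROW:
            discarded_rows += 1
        elif msg_type == READY_FOR_QUERY:
            # Status is b"I" (idle), b"T" (in transaction) or b"E" (failed transaction).
            return discarded_rows, payload
    raise ValueError("stream ended before ReadyForQuery")
```

A status of `T` or `E` after draining indicates the connection is still inside a transaction, which is the case where a `ROLLBACK` is needed before the connection can be reused.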

### Configuration

Connection recovery is an optional feature, enabled by default. You can change how it behaves through configuration:

```toml
[general]
connection_recovery = "recover"
```

| Configuration value | Description |
|-|-|
| `recover` | Attempt full connection recovery, including rollback and resynchronization. This is the default. |
| `rollback_only` | Roll back abandoned transactions but drop the connection if a query was abandoned mid-response. |
| `drop` | Disable connection recovery and close the server connection (identical to PgBouncer). |
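
The table above can be read as a small decision function. The Python sketch below is one interpretation of it, not PgDog's actual logic: `plan_recovery`, its arguments, and the action names are hypothetical, and it assumes the function is only consulted for connections that a client abandoned.

```python
def plan_recovery(mode, in_transaction, mid_response):
    """Return the ordered list of actions for an abandoned server connection."""
    if mode == "drop":
        # No recovery at all: the server connection is closed.
        return ["close"]
    if mode == "rollback_only":
        if mid_response:
            # Resynchronization is not attempted in this mode.
            return ["close"]
        actions = ["rollback"] if in_transaction else []
        return actions + ["return_to_pool"]
    if mode == "recover":
        actions = []
        if mid_response:
            actions.append("drain")  # discard leftover rows and resynchronize
        if in_transaction:
            actions.append("rollback")
        return actions + ["return_to_pool"]
    raise ValueError(f"unknown connection_recovery mode: {mode}")
```

For example, a client that crashed mid-response inside a transaction yields drain, rollback, and return to pool under `recover`, but a closed connection under `rollback_only` or `drop`.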

To make sure abandoned server connections don't block normal operations, PgDog supports a configurable timeout on the recovery operation. If connection recovery doesn't complete in time, the connection will be closed:

```toml
[general]
rollback_timeout = 5_000
```
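
The effect of the timeout can be sketched with `asyncio.wait_for` from the Python standard library. This is an illustration only: `recover_or_close` and its callbacks are hypothetical, and the sketch assumes `rollback_timeout` is expressed in milliseconds.

```python
import asyncio


async def recover_or_close(recover, close, rollback_timeout_ms):
    """Run `recover()`; if it exceeds the timeout, call `close()` instead.

    Returns True if the connection was recovered, False if it was closed.
    """
    try:
        await asyncio.wait_for(recover(), timeout=rollback_timeout_ms / 1000)
        return True
    except asyncio.TimeoutError:
        # Recovery took too long: give up and close the server connection
        # so it cannot block other clients waiting on the pool.
        await close()
        return False
```

Bounding recovery this way trades an occasional reconnect for a guarantee that a stuck server connection is not held indefinitely.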

## Client connections

Just like server connections, PgDog can maintain client connections (application --> PgDog) during incidents. This helps preserve application-side connection pools and avoids re-creating thousands of connections unnecessarily.

This feature is enabled by default, but some applications don't behave well when their queries return errors instead of results. It is therefore configurable and can be disabled:

```toml
[general]
client_connection_recovery = "drop"
```

| Configuration value | Description |
|-|-|
| `recover` | Attempt to keep client connections open after database-related errors, like `checkout timeout`. |
| `drop` | Disable connection recovery and close the client connection (identical to PgBouncer). |
6 changes: 3 additions & 3 deletions docs/features/sharding/omnishards.md
@@ -106,11 +106,11 @@ tables = [
]
```

This is configurable with the `system_catalogs_omnisharded` setting in [`pgdog.toml`](../../configuration/pgdog.toml/general.md#system_catalogs_omnisharded):
This is configurable with the `system_catalogs` setting in [`pgdog.toml`](../../configuration/pgdog.toml/general.md#system_catalogs):

```toml
[general]
system_catalogs_omnisharded = true
system_catalogs = "omnisharded_sticky"
```

If enabled (it is by default), commands like `\d`, `\d+` and others sent from `psql` will start to return correct results.
With the default value (`omnisharded_sticky`), commands like `\d`, `\d+` and others sent from `psql` will return correct results.