Commit a5ba406

feat(webapp,redis): handle UNBLOCKED during ElastiCache role change (#3549)
## Summary

When ElastiCache demotes a primary to replica (during a Multi-AZ failover or a vertical node-type change), the demoting primary issues an `UNBLOCKED` reply to any in-flight blocking commands (`BLPOP`, `BRPOP`, `BLMOVE`, `XREADGROUP ... BLOCK`, etc.) to clear them before the role flips. ioredis surfaces these as `ReplyError` to caller code.

The shared `defaultReconnectOnError` added in #3548 only matches `READONLY` and `LOADING`. This extends it to `UNBLOCKED` so the disconnect-reconnect-retry cycle handles BLPOP-shaped errors the same way the existing two cases handle non-blocking-command errors.

## Fix

```ts
export function defaultReconnectOnError(err: Error): boolean | 1 | 2 {
  const msg = err.message ?? "";
  if (
    msg.startsWith("READONLY") ||
    msg.startsWith("LOADING") ||
    msg.startsWith("UNBLOCKED")
  ) {
    return 2;
  }
  return false;
}
```

Returning `2` tells ioredis to disconnect, reconnect, and re-issue the command. For a BLPOP that means a fresh BLPOP against the new primary instead of the `UNBLOCKED` error escaping to the caller.

## Test plan

- [ ] CI green
- [ ] Trigger a Multi-AZ failover or a vertical scale event on an ElastiCache replication group whose clients are running blocking commands and confirm no `UNBLOCKED` errors surface to caller code during the cutover.
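
To make the prefix match concrete, here is a minimal sketch that repeats the predicate from this commit (without the `export`, so it runs standalone) and applies it to a few reply-error messages. The sample messages are assumptions modeled on typical Redis error text, not captured from the incident, and `classify` is a hypothetical helper that exists only for this illustration.

```ts
// Predicate repeated from this commit so the sketch is self-contained.
function defaultReconnectOnError(err: Error): boolean | 1 | 2 {
  const msg = err.message ?? "";
  if (
    msg.startsWith("READONLY") ||
    msg.startsWith("LOADING") ||
    msg.startsWith("UNBLOCKED")
  ) {
    return 2;
  }
  return false;
}

// Hypothetical helper: turns the hook's return value into a readable label.
function classify(message: string): "reconnect-and-retry" | "surface-to-caller" {
  return defaultReconnectOnError(new Error(message)) === 2
    ? "reconnect-and-retry"
    : "surface-to-caller";
}

// Assumed sample messages; real text may differ between Redis versions.
const sampleMessages: string[] = [
  "UNBLOCKED force unblock from blocking operation, instance state changed (master -> replica?)",
  "READONLY You can't write against a read only replica.",
  "LOADING Redis is loading the dataset in memory",
  "WRONGTYPE Operation against a key holding the wrong kind of value",
  // Hypothetical wrapped message: the keyword is not at the start, so it does not match.
  "ERR wrapped: UNBLOCKED force unblock",
];

for (const message of sampleMessages) {
  console.log(`${classify(message)} <- ${message}`);
}
```

Because the check uses `startsWith`, only errors whose message begins with one of the three keywords trigger a reconnect; every other reply error still reaches the caller unchanged.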
1 parent 567e2a2 commit a5ba406

2 files changed

Lines changed: 19 additions & 1 deletion

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+---
+area: webapp
+type: improvement
+---
+
+Extend the shared ioredis `reconnectOnError` hook (PR #3548) to also match `UNBLOCKED` reply errors so blocking commands like BLPOP transparently reconnect-and-retry when the ElastiCache primary forces them to unblock during a node role change.
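
The commit does not show how callers attach the shared hook, so the following is only a sketch of one plausible wiring. `ClientOptions` is a local stand-in for the few ioredis options used here (`host`, `port`, `reconnectOnError`), and `withDefaultReconnect` is a hypothetical helper, not something defined in `internal-packages/redis`.

```ts
type ReconnectDecision = boolean | 1 | 2;

// Local stand-in for the subset of ioredis options this sketch touches.
interface ClientOptions {
  host?: string;
  port?: number;
  reconnectOnError?: ((err: Error) => ReconnectDecision) | null;
}

// Predicate repeated from this commit so the sketch is self-contained.
function defaultReconnectOnError(err: Error): ReconnectDecision {
  const msg = err.message ?? "";
  if (
    msg.startsWith("READONLY") ||
    msg.startsWith("LOADING") ||
    msg.startsWith("UNBLOCKED")
  ) {
    return 2;
  }
  return false;
}

// Hypothetical helper: apply the shared hook unless the caller already chose
// one (an explicit null, meaning "no hook", is respected as a choice).
function withDefaultReconnect(options: ClientOptions): ClientOptions {
  if (options.reconnectOnError !== undefined) return options;
  return { ...options, reconnectOnError: defaultReconnectOnError };
}

// Assumed usage: the result would be passed to the ioredis constructor.
const clientOptions = withDefaultReconnect({ host: "redis.example.invalid", port: 6379 });
console.log(clientOptions.reconnectOnError === defaultReconnectOnError);
```

Keeping the default in one place means a new error prefix, such as `UNBLOCKED` here, only has to be added once for every client that opts in.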

internal-packages/redis/src/index.ts

Lines changed: 13 additions & 1 deletion
@@ -9,6 +9,12 @@ export { Redis, type Callback, type RedisOptions, type Result, type RedisCommand
  * reply errors to caller code over a healthy TCP/TLS connection (the
  * client keeps talking to a node whose role swapped underneath it).
  *
+ * UNBLOCKED is the BLPOP-shaped case: the Redis primary forcibly
+ * unblocks any blocking command on a connection whose node is about
+ * to be demoted, returning an UNBLOCKED reply. Surfaced 65 times on
+ * engine/v1/worker-actions/dequeue at the cutover instant during the
+ * TRI-8873 test-cloud scale-up dry-run.
+ *
  * Returning 2 tells ioredis to disconnect, reconnect, and retry the
  * command that triggered the error. After reconnect, DNS / SG routing
  * should land on a writable primary.
@@ -18,7 +24,13 @@ export { Redis, type Callback, type RedisOptions, type Result, type RedisCommand
  */
 export function defaultReconnectOnError(err: Error): boolean | 1 | 2 {
   const msg = err.message ?? "";
-  if (msg.startsWith("READONLY") || msg.startsWith("LOADING")) return 2;
+  if (
+    msg.startsWith("READONLY") ||
+    msg.startsWith("LOADING") ||
+    msg.startsWith("UNBLOCKED")
+  ) {
+    return 2;
+  }
   return false;
 }
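
The disconnect, reconnect, and retry cycle described in the comment can be illustrated with a small simulation. This is not ioredis code: `FakeEndpoint` and `runWithReconnect` are hypothetical stand-ins, and the model assumes the first connection reaches a node that is being demoted while later connections reach the new primary.

```ts
type ReconnectDecision = boolean | 1 | 2;

// Predicate repeated from this commit so the sketch is self-contained.
function defaultReconnectOnError(err: Error): ReconnectDecision {
  const msg = err.message ?? "";
  if (
    msg.startsWith("READONLY") ||
    msg.startsWith("LOADING") ||
    msg.startsWith("UNBLOCKED")
  ) {
    return 2;
  }
  return false;
}

// Hypothetical endpoint: the first connection lands on the demoting node,
// every later connection lands on a healthy primary.
class FakeEndpoint {
  connections = 0;

  connect(): (command: string) => string {
    this.connections += 1;
    const demoting = this.connections === 1;
    return (command: string): string => {
      if (demoting) {
        throw new Error("UNBLOCKED force unblock from blocking operation");
      }
      return `reply to ${command}`;
    };
  }
}

// Hypothetical, simplified model of what a return value of 2 asks the client
// to do: drop the connection, open a new one, and re-issue the same command.
function runWithReconnect(
  endpoint: FakeEndpoint,
  command: string,
  hook: (err: Error) => ReconnectDecision,
  maxAttempts = 3,
): string {
  let send = endpoint.connect();
  for (let attempt = 1; ; attempt++) {
    try {
      return send(command);
    } catch (e) {
      const err = e instanceof Error ? e : new Error(String(e));
      // Anything other than 2 (or running out of attempts) surfaces the error.
      if (hook(err) !== 2 || attempt >= maxAttempts) throw err;
      send = endpoint.connect();
    }
  }
}

const demoEndpoint = new FakeEndpoint();
console.log(runWithReconnect(demoEndpoint, "BLPOP queue 0", defaultReconnectOnError));
console.log(`connections opened: ${demoEndpoint.connections}`);
```

In this model the caller of `runWithReconnect` never sees the `UNBLOCKED` error; it only observes the reply from the second connection, which mirrors the behavior the test plan asks to confirm during a real cutover.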
