Commit 4ed6523

fix: handle connection drops (UND_ERR_SOCKET) and prevent process crash
- Add .catch() on response.json() in 200 and non-200 branches to handle body-read failures
- Retry on socket/abort errors (terminated, UND_ERR_SOCKET, UND_ERR_ABORTED) via onError()
- Treat fetch-level and body-read socket errors consistently; reject with actual error when not retrying
- Add SDK engineering investigation doc for UND_ERR_SOCKET handling

Made-with: Cursor
1 parent 57bb919 commit 4ed6523


2 files changed: +140 -4 lines changed
Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
# SDK Engineering Investigation: Connection Drops / UND_ERR_SOCKET Handling

**Context:** Customer (Berlitz) experienced intermittent Node.js build crashes when the Contentstack SDK (v3.17.1) fetched from the CDA during AWS CodeBuild. The process terminated with `TypeError: terminated` and `[cause]: SocketError: other side closed (code: UND_ERR_SOCKET)`.

**Scope:** Investigate how the Contentstack SDK and Node 22’s undici fetch layer handle connection drops/socket closures, and whether the SDK can catch these errors to retry or return a formatted error and prevent process crash.

---

## 1. Request Flow: SDK → Fetch → Undici

| Layer | Component | Role |
|-------|-----------|------|
| App | Customer code (e.g. Astro build) | Calls `Stack.ContentType(...).Query().find()` or `.fetch()` |
| SDK | `src/core/lib/request.js` → `fetchRetry()` | Builds URL/options, calls `fetch()`, handles response and retries |
| Runtime | `src/runtime/node/http.js` | Re-exports global `fetch` (Node 18+ built-in) |
| Node | Built-in `fetch` | Implemented by **undici** (bundled in Node 18+) |
| Undici | Fetch / TLSSocket | Performs HTTP, surfaces errors via rejected promise and `error.cause` |

- In **Node 22**, the global `fetch` is provided by Node’s bundled **undici**. The SDK does not import undici directly; it uses whatever `fetch` the Node runtime exposes (`runtime/http.js` → global `fetch`).
- When the **remote server closes the TLS connection** (e.g. CDN/edge closes the socket), undici:
  - Emits the error internally (e.g. `Fetch.onAborted`, `Fetch.terminate`).
  - Rejects the **fetch promise** with a `TypeError('terminated', { cause: SocketError })`, where `cause.code === 'UND_ERR_SOCKET'`.
- Alternatively, if the connection closes **after** the response object is returned but **during** body consumption, the promise returned by **`response.json()`** (or `response.text()` / body read) rejects with the same kind of error.

So:
- **Fetch-level:** The `fetch(url, options)` promise rejects with `TypeError: terminated` and `error.cause.code === 'UND_ERR_SOCKET'` (or `UND_ERR_ABORTED`).
- **Body-read-level:** The `response.json()` promise rejects with the same when the socket is closed while reading the body.
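
The two rejection points above can be told apart with a small sketch. Everything here is illustrative: `makeTerminatedError`, `fetchDropsEarly`, `fetchDropsDuringBody`, and `whereDidItFail` are hypothetical names, and the error is only a simulation of the shape undici produces, so the example runs without a network.

```javascript
// Simulates the error shape described above; the real one is created inside Node's fetch.
function makeTerminatedError() {
  const cause = Object.assign(new Error('other side closed'), { code: 'UND_ERR_SOCKET' });
  return new TypeError('terminated', { cause });
}

// Hypothetical stand-ins for fetch: the first fails before a response exists,
// the second returns a response whose body read fails.
const fetchDropsEarly = () => Promise.reject(makeTerminatedError());
const fetchDropsDuringBody = () =>
  Promise.resolve({ ok: true, status: 200, json: () => Promise.reject(makeTerminatedError()) });

// Reports which of the two promises rejected.
async function whereDidItFail(fetchImpl) {
  let response;
  try {
    response = await fetchImpl();
  } catch (err) {
    return { stage: 'fetch', code: err.cause && err.cause.code };
  }
  try {
    await response.json();
    return { stage: 'none' };
  } catch (err) {
    return { stage: 'body', code: err.cause && err.cause.code };
  }
}
```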

---

## 2. Previous SDK Behavior (Gaps)

### 2.1 Where the crash came from

- **Unhandled rejection in the 200 branch**
  For `response.ok && response.status === 200`, the SDK did:
  - `const data = response.json();`
  - `data.then(json => { ... resolve(json); });`
  - **No `.catch()`** on that `data` promise.
  If the remote closed the connection **during** body read, `response.json()` rejected with `TypeError: terminated`. That rejection was **unhandled** and could trigger Node’s unhandled-rejection behavior and **terminate the process**.

- **Fetch-level rejection was caught but not retried**
  The outer `fetch(...).catch((error) => { reject(error); })` did catch fetch-level errors (e.g. connection closed before/during response). So the **fetch** rejection itself did not leave an unhandled rejection. However:
  - Socket/abort errors were **not** retried; only HTTP status–based retries (e.g. 408, 429) were done via `retryCondition`.
  - So a single UND_ERR_SOCKET led to one rejected promise. If the **caller** did not handle that rejection (e.g. missing `.catch()` on a parallel or fire-and-forget call), it could still crash the process.

- **Non-200 branch**
  The non-200 path had `.catch(() => reject({ status, statusText }))` on `data.then(...)`, so body-read failures were caught, but:
  - The real error was discarded (no retry for socket/abort, and the rejected value was a generic `{ status, statusText }`).

### 2.2 Summary of previous behavior

| Scenario | Handled? | Retried? | Result |
|----------|----------|----------|--------|
| Fetch rejects (e.g. socket closed before/during response) | Yes (outer .catch) | No | Reject once → crash if caller doesn’t handle |
| `response.json()` rejects in 200 branch (socket closed during body) | **No** | No | **Unhandled rejection → process crash** |
| `response.json()` rejects in non-200 branch | Yes | No | Reject with generic object |

---

## 3. Current SDK Behavior (After Fix)

The following is implemented in **`src/core/lib/request.js`** (same behavior for SDK versions that include this fix).

### 3.1 Detecting socket/abort errors

The SDK treats an error as a **socket/abort** error when:

- `error.message === 'terminated'`, or
- `error.cause && (error.cause.code === 'UND_ERR_SOCKET' || error.cause.code === 'UND_ERR_ABORTED')`

This matches how Node 22 / undici surface connection drops and aborts.
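
As a minimal sketch, the check above could be factored into a standalone predicate. The committed code computes the same expression inline at each catch site; the function name `isSocketOrAbort` is borrowed from the local variable used there, and extracting it into a helper is only an illustration.

```javascript
// Returns true when the error looks like a connection drop or abort,
// using the two conditions listed above.
function isSocketOrAbort(err) {
  if (!err) return false;
  if (err.message === 'terminated') return true;
  const code = err.cause && err.cause.code;
  return code === 'UND_ERR_SOCKET' || code === 'UND_ERR_ABORTED';
}
```

Because the helper tolerates `undefined` and plain objects such as `{ status, statusText }`, it is safe to call on whatever value reaches a `.catch()` handler.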

### 3.2 Catching and handling

- **200 branch**
  - `data.then(...).catch((err) => { ... })` is attached to the promise from `response.json()`.
  - If that promise rejects (e.g. UND_ERR_SOCKET during body read):
    - The error is **caught** (no unhandled rejection).
    - If it is a socket/abort error and `retryLimit > 0`, the SDK calls `onError(err)` and **retries** with the existing backoff.
    - Otherwise it **rejects** the Request promise with the same `err`, so the caller gets a proper rejection they can handle.

- **Non-200 branch**
  - `.catch((err) => { ... })` is used on the `data.then(...)` chain.
  - Same logic: socket/abort → retry when `retryLimit > 0`, else reject with `err` (or `{ status, statusText }` if `err` is missing).

- **Fetch-level**
  - The outer `fetch(...).catch((error) => { ... })` still catches when the **fetch** promise rejects (e.g. connection closed before or during response).
  - If the error is socket/abort and `retryLimit > 0`, the SDK calls `onError(error)` and **retries**.
  - Otherwise it **rejects** with the same `error`.
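
The three catch sites above share one decision, which the following simplified sketch makes explicit. It is not the SDK's `fetchRetry()`: `attempt`, `request`, and `handleFailure` are hypothetical names, `fetchImpl` is injected so the sketch runs offline, the retry happens immediately rather than after the configured delay, and the non-200 branch rejects with the parsed body instead of building the SDK's error details.

```javascript
function isSocketOrAbort(err) {
  if (!err) return false;
  if (err.message === 'terminated') return true;
  const code = err.cause && err.cause.code;
  return code === 'UND_ERR_SOCKET' || code === 'UND_ERR_ABORTED';
}

function attempt(fetchImpl, retryLimit, resolve, reject) {
  // Shared decision: retry socket/abort errors while retries remain, else reject.
  const handleFailure = (err, fallback) => {
    if (isSocketOrAbort(err) && retryLimit > 0) {
      attempt(fetchImpl, retryLimit - 1, resolve, reject); // the SDK's onError() also waits first
    } else {
      reject(err || fallback);
    }
  };

  fetchImpl().then((response) => {
    const data = response.json();
    if (response.ok && response.status === 200) {
      data.then(resolve).catch((err) => handleFailure(err)); // 200 branch
    } else {
      const { status, statusText } = response;
      data.then((json) => reject(json)) // simplified error handling for the sketch
        .catch((err) => handleFailure(err, { status, statusText })); // non-200 branch
    }
  }).catch((err) => handleFailure(err)); // fetch-level
}

function request(fetchImpl, retryLimit = 5) {
  return new Promise((resolve, reject) => attempt(fetchImpl, retryLimit, resolve, reject));
}
```

With this shape, a body-read failure in either branch can no longer escape as an unhandled rejection: every promise created inside `attempt` ends in a `.catch()` that either retries or settles the outer promise.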

### 3.3 Retry behavior

- Retries use the existing **fetchOptions**: `retryLimit` (default 5), `retryDelay` (default 300 ms), and optional `retryDelayOptions` (e.g. base or customBackoff).
- No change to the existing retry contract; socket/abort errors are now **eligible** for the same retry path as other retriable failures.
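
For illustration, a `fetchOptions` object using the settings named above might look like the sketch below. The option names come from this document; the helper `delayFor` is hypothetical and is not SDK code. It assumes that `customBackoff(retryCount, error)` returns a delay in milliseconds and that `base` scales linearly with the attempt number, both of which should be checked against `onError()` in `request.js`.

```javascript
const fetchOptions = {
  retryLimit: 5,   // default per this document
  retryDelay: 300, // milliseconds, default per this document
  retryDelayOptions: {
    // Example policy only: doubles the delay per attempt, capped at 5 seconds.
    customBackoff: (retryCount, error) => Math.min(300 * 2 ** (retryCount - 1), 5000),
  },
};

// Hypothetical helper showing how a delay could be derived from the options.
// Assumption: customBackoff wins over base, and base grows linearly.
function delayFor(retryCount, error, options) {
  const delayOptions = options.retryDelayOptions || {};
  if (typeof delayOptions.customBackoff === 'function') {
    return delayOptions.customBackoff(retryCount, error);
  }
  if (typeof delayOptions.base === 'number') {
    return delayOptions.base * retryCount;
  }
  return options.retryDelay;
}
```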

---

## 4. Conclusion

| Question | Answer |
|----------|--------|
| How does the SDK interact with undici? | Via the global `fetch` in Node (Node runtime). The SDK does not use undici directly. |
| How does Node 22 / undici surface connection drops? | By rejecting the `fetch` promise or the `response.json()` (body) promise with `TypeError('terminated', { cause: SocketError })` and `cause.code === 'UND_ERR_SOCKET'` (or `UND_ERR_ABORTED`). |
| Can the SDK catch Fetch.onAborted / UND_ERR_SOCKET? | **Yes.** Both the fetch-level rejection and the body-read (e.g. `response.json()`) rejection are caught in `request.js`. |
| Does the SDK initiate a retry for these errors? | **Yes.** When the error is identified as socket/abort and `retryLimit > 0`, the SDK uses the existing `onError()` path and retries with the configured delay/backoff. |
| Does the SDK return a formatted error instead of crashing? | **Yes.** If retries are exhausted or the error is not socket/abort, the SDK **rejects** the Request promise with the same error object (so the caller can inspect `error.message`, `error.cause`, and `error.cause.code`). The process does not crash from an unhandled rejection in the SDK. |

**Summary:** The SDK now catches connection drops and socket closures (Fetch.onAborted / UND_ERR_SOCKET) at both fetch and body-read level, retries them when possible using the existing retry mechanism, and otherwise rejects the returned promise with the underlying error. This prevents unhandled exceptions from crashing the user’s Node.js build process while keeping errors identifiable (e.g. for logging or 422 handling).
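
On the caller side, the rejected error can be inspected as the table describes. The sketch below is one possible way to turn it into a loggable record; `describeRequestError` is a hypothetical helper, and the `{ status, statusText }` shape for HTTP failures is assumed from the fallback value mentioned in section 3.2.

```javascript
// Classifies a rejection from the SDK for logging. Hypothetical helper, not part of the SDK.
function describeRequestError(err) {
  const code = err && err.cause && err.cause.code;
  const dropped = code === 'UND_ERR_SOCKET' || code === 'UND_ERR_ABORTED' ||
    Boolean(err && err.message === 'terminated');
  if (dropped) {
    return { kind: 'connection-drop', code: code || null, message: err.message };
  }
  if (err && typeof err.status === 'number') {
    // Assumed shape for HTTP failures, e.g. { status, statusText }.
    return { kind: 'http', code: err.status, message: err.statusText || '' };
  }
  return { kind: 'unknown', code: null, message: String(err && err.message ? err.message : err) };
}

// Usage with the call named in section 1 (the content type UID is a placeholder):
// Stack.ContentType('example_type').Query().find()
//   .catch((err) => console.error(describeRequestError(err)));
```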

---

## 5. References

- **Request implementation:** `src/core/lib/request.js` (fetchRetry, 200/non-200 branches, outer fetch .catch).
- **Node runtime:** `src/runtime/node/http.js` (re-exports global `fetch`).
- **Customer error:** `TypeError: terminated` with `[cause]: SocketError: other side closed`, `code: 'UND_ERR_SOCKET'` (e.g. from Node `internal/deps/undici`).
- **Node 22:** Uses bundled undici for `fetch`; socket errors are surfaced as above.

src/core/lib/request.js

Lines changed: 22 additions & 4 deletions
@@ -112,6 +112,14 @@ function fetchRetry (stack, queryParams, fetchOptions, resolve, reject, retryDel
         for (let index = 0; index < plugins.length && typeof plugins[index].onResponse === 'function'; index++) { json = plugins[index].onResponse(stack, request, response, json); }
 
         resolve(json);
+      }).catch((err) => {
+        if (fetchOptions.debug) fetchOptions.logHandler('error', err);
+        const isSocketOrAbort = (err && (err.message === 'terminated' || (err.cause && (err.cause.code === 'UND_ERR_SOCKET' || err.cause.code === 'UND_ERR_ABORTED'))));
+        if (isSocketOrAbort && retryLimit > 0) {
+          onError(err);
+        } else {
+          reject(err);
+        }
       });
     } else {
       const { status, statusText } = response;
@@ -124,13 +132,23 @@ function fetchRetry (stack, queryParams, fetchOptions, resolve, reject, retryDel
           if (fetchOptions.debug) fetchOptions.logHandler('error', errorDetails);
           reject(errorDetails);
         }
-      }).catch(() => {
-        if (fetchOptions.debug) fetchOptions.logHandler('error', { status, statusText });
-        reject({ status, statusText });
+      }).catch((err) => {
+        if (fetchOptions.debug) fetchOptions.logHandler('error', err);
+        const isSocketOrAbort = (err && (err.message === 'terminated' || (err.cause && (err.cause.code === 'UND_ERR_SOCKET' || err.cause.code === 'UND_ERR_ABORTED'))));
+        if (isSocketOrAbort && retryLimit > 0) {
+          onError(err);
+        } else {
+          reject(err || { status, statusText });
+        }
       });
     }
   }).catch((error) => {
     if (fetchOptions.debug) fetchOptions.logHandler('error', error);
-    reject(error);
+    const isSocketOrAbort = (error && (error.message === 'terminated' || (error.cause && (error.cause.code === 'UND_ERR_SOCKET' || error.cause.code === 'UND_ERR_ABORTED'))));
+    if (isSocketOrAbort && retryLimit > 0) {
+      onError(error);
+    } else {
+      reject(error);
+    }
   });
 }
