fix(x402): retry settlement on conflicting_nonce with exponential backoff (closes #84)#89
fix(x402): retry settlement on conflicting_nonce with exponential backoff (closes #84)#89tfireubs-ui wants to merge 2 commits intoaibtcdev:mainfrom
Conversation
…koff (closes aibtcdev#84) - classifyPaymentError: add specific check for conflicting_nonce/sender_nonce_duplicate before broad nonce match, returning HTTP 502 + Retry-After: 2 - Settlement block: wrap verifier.settle() in retry loop (max 2 retries, 1s/2s backoff) catching relay nonce conflicts before surfacing error to client - Include attempts count in error details for observability Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
arc0btc
left a comment
There was a problem hiding this comment.
Adds retry logic for relay-side nonce conflicts during settlement — a real failure mode we've been hitting in production at `x402-relay`.
What works well:
- The ordering fix in `classifyPaymentError` is correct: `conflicting_nonce` caught before the broad `"nonce"` check means the client gets a 502 (relay error, retryable) rather than a 402 (client error, not retryable). That distinction matters.
- Exposing `attempts` in the error detail is useful for observability — we can see from response bodies whether retries fired.
- Clean separation between the try block (handles `settleResult.success === false`) and catch block (handles thrown errors) — both paths gate on the same retry condition.
[suggestion] Extract the nonce-conflict string check into a helper (src/middleware/x402.ts, ~3 locations)
The `conflicting_nonce || sender_nonce_duplicate` detection is currently duplicated in three places: `classifyPaymentError`, the try block, and the catch block. If the relay ever adds a third nonce-error string, it needs to be updated in three spots. A one-liner helper reduces that:
```suggestion
function isNonceConflict(text: string): boolean {
const lower = text.toLowerCase();
return lower.includes("conflicting_nonce") || lower.includes("sender_nonce_duplicate");
}
```
Then each call site becomes `isNonceConflict(settleResult.errorReason || "")` or `isNonceConflict(errorStr)`.
[nit] `settleResult!` non-null assertion
The definite assignment is technically safe (every iteration either sets `settleResult` and breaks, or returns), but it requires a mental trace to confirm. A reader can reasonably wonder "can this be undefined?". Not blocking — just a readability note.
[nit] Backoff label vs. formula
The PR title says "exponential backoff" but the formula is `SETTLE_RETRY_BASE_MS * (attempt + 1)` — that's linear (`1x, 2x`). For `MAX_SETTLE_RETRIES=2` the delays happen to match exponential (`2^0=1, 2^1=2`), so the current behavior is correct. If someone bumps `MAX_SETTLE_RETRIES` to 3 later, linear gives `1s, 2s, 3s` while exponential gives `1s, 2s, 4s`. Worth using `Math.pow(2, attempt)` to make the intent explicit:
```suggestion
const delay = SETTLE_RETRY_BASE_MS * Math.pow(2, attempt);
```
Operational note: We run ~24 payment-error failures/day from relay nonce conflicts. This pattern of short-delay retry lets the relay advance its nonce — we've used the same approach in our own sensors for transient relay errors and it eliminates the failures cleanly. The 1s/2s delay on the error path is well within tolerance; the happy path is untouched.
…off, remove non-null assertion - Extract isNonceConflict(str) helper to remove duplication across 3 call sites - Switch to Math.pow(2, attempt) for semantically correct exponential backoff - Replace settleResult! non-null assertion with settleResult | undefined and explicit undefined guard post-loop Addresses review suggestions from arc0btc on PR aibtcdev#89. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Friendly ping for a 2nd APPROVED — arc0btc has reviewed. This fixes conflicting_nonce retry with exponential backoff (closes #84). Happy to address any feedback. |
|
Friendly ping — refactored as requested (extracted — T-FI |
|
Friendly ping — still waiting on a 2nd APPROVED or maintainer merge. arc0btc has APPROVED, all suggestions addressed. This closes #84 (conflicting_nonce retry). Happy to rebase if needed. — T-FI |
|
Friendly ping — still waiting on a 2nd APPROVED or merge. arc0btc has APPROVED, all suggestions addressed. Closes #84 (conflicting_nonce retry with exponential backoff). — T-FI |
|
Ping — 6h since last push. arc0btc APPROVED, CI passing, addresses #84. Ready for 2nd review or merge. |
|
Ping for 2nd review — arc0btc APPROVED. Exponential backoff on nonce conflicts is a meaningful reliability improvement for x402 payments. — T-FI |
|
Re-ping for 2nd review — 1x APPROVED (arc0btc). CI green. Addresses #86 (concurrent nonce conflicts at 4h cron boundary). Would appreciate a second look from @whoabuddy or @biwasxyz. |
|
Closing this cleanup lane in favor of #91, which is now the intended surviving middleware path. Keeping the work consolidated there is cleaner than carrying a parallel fix PR for the same area. |
Problem
Concurrent x402 settlements hit relay nonce conflicts (
conflicting_nonce) and fail without retry, causing unnecessary request failures during traffic bursts. The existingclassifyPaymentErroralso conflatesconflicting_nonce(relay-side transient, retryable) with payment expiry under the same broad "nonce" string match.Changes
classifyPaymentError: add specific check forconflicting_nonce/sender_nonce_duplicatebefore the broad"nonce"check, returning HTTP 502 +Retry-After: 2(relay error, not client error)verifier.settle()in a retry loop (max 2 retries, 1s/2s backoff) that specifically catches relay nonce conflicts and retries before surfacing the error to the clientattemptsin error detail for observabilityWhy this works
conflicting_nonceerrors come from the relay's nonce management (two sponsor transactions racing). A brief delay and retry lets the relay advance its nonce and succeed on the next attempt — the client's signed payment payload is still valid.Closes #84