Skip to content

feat(e2e): add retry for transient BAPI errors in integration tests#8081

Open
jacekradko wants to merge 12 commits intomainfrom
jacek/rate-limit-retry-bapi
Open

feat(e2e): add retry for transient BAPI errors in integration tests#8081
jacekradko wants to merge 12 commits intomainfrom
jacek/rate-limit-retry-bapi

Conversation

@jacekradko
Copy link
Member

@jacekradko jacekradko commented Mar 14, 2026

Summary

  • Adds a transparent retry wrapper (withRetry) around the ClerkClient used in integration test setup/teardown
  • Wraps all async BAPI calls (user creation, org management, etc.) with automatic retry on transient HTTP errors
  • Retries on 429 (rate limit), 502 (bad gateway), 503 (service unavailable), and 504 (gateway timeout)
  • Uses exponential backoff with jitter, respects Retry-After headers (capped at 30s), max 5 retries
  • Applied at the single createClerkClient call site in testUtils/index.ts — zero changes to individual service files or tests

Motivation

Multiple concurrent CI runs share the same Clerk development instances, causing 429 rate limit errors and occasional 5xx transient failures (e.g. 503 Service Unavailable) that fail integration tests. Instead of failing, tests will now automatically retry with backoff, logging [Retry] 503 for users.createUser, attempt 1/5, waiting 1234ms for CI observability.

Test plan

  • pnpm build passes
  • Monitor CI logs for [Retry] messages to confirm the wrapper activates on real errors
  • Verify reduction in transient-error-related test failures over the next few days

Summary by CodeRabbit

  • Tests
    • Added automatic retries for integration test API calls with exponential backoff and jitter to improve resilience.
    • Applied retry behavior broadly to async operations used during test runs.
    • Integrated retry tracking and a summary output at test teardown to surface retry statistics and outcomes.

@changeset-bot
Copy link

changeset-bot bot commented Mar 14, 2026

⚠️ No Changeset found

Latest commit: da7d936

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@vercel
Copy link

vercel bot commented Mar 14, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
clerk-js-sandbox Ready Ready Preview, Comment Mar 17, 2026 2:03am

Request Review

@pkg-pr-new
Copy link

pkg-pr-new bot commented Mar 14, 2026

Open in StackBlitz

@clerk/agent-toolkit

npm i https://pkg.pr.new/@clerk/agent-toolkit@8081

@clerk/astro

npm i https://pkg.pr.new/@clerk/astro@8081

@clerk/backend

npm i https://pkg.pr.new/@clerk/backend@8081

@clerk/chrome-extension

npm i https://pkg.pr.new/@clerk/chrome-extension@8081

@clerk/clerk-js

npm i https://pkg.pr.new/@clerk/clerk-js@8081

@clerk/dev-cli

npm i https://pkg.pr.new/@clerk/dev-cli@8081

@clerk/expo

npm i https://pkg.pr.new/@clerk/expo@8081

@clerk/expo-passkeys

npm i https://pkg.pr.new/@clerk/expo-passkeys@8081

@clerk/express

npm i https://pkg.pr.new/@clerk/express@8081

@clerk/fastify

npm i https://pkg.pr.new/@clerk/fastify@8081

@clerk/hono

npm i https://pkg.pr.new/@clerk/hono@8081

@clerk/localizations

npm i https://pkg.pr.new/@clerk/localizations@8081

@clerk/nextjs

npm i https://pkg.pr.new/@clerk/nextjs@8081

@clerk/nuxt

npm i https://pkg.pr.new/@clerk/nuxt@8081

@clerk/react

npm i https://pkg.pr.new/@clerk/react@8081

@clerk/react-router

npm i https://pkg.pr.new/@clerk/react-router@8081

@clerk/shared

npm i https://pkg.pr.new/@clerk/shared@8081

@clerk/tanstack-react-start

npm i https://pkg.pr.new/@clerk/tanstack-react-start@8081

@clerk/testing

npm i https://pkg.pr.new/@clerk/testing@8081

@clerk/ui

npm i https://pkg.pr.new/@clerk/ui@8081

@clerk/upgrade

npm i https://pkg.pr.new/@clerk/upgrade@8081

@clerk/vue

npm i https://pkg.pr.new/@clerk/vue@8081

commit: da7d936

@jacekradko jacekradko changed the title feat(integration): add rate-limit retry for BAPI calls feat(e2e): add rate-limit retry for backend API calls Mar 14, 2026
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 14, 2026

📝 Walkthrough

Walkthrough

A new retry utility was added at integration/testUtils/retryableClerkClient.ts that returns a proxied ClerkClient applying automatic retry logic to asynchronous API calls. The wrapper implements configurable retryable HTTP status codes (429, 502, 503, 504), honors server retryAfter, uses exponential backoff with jitter and a retry cap, and tracks retry statistics. withRetry() is used in integration/testUtils/index.ts when constructing the Clerk client, and printRetrySummary() is invoked in integration/tests/global.teardown.ts.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(e2e): add retry for transient BAPI errors in integration tests' accurately and specifically summarizes the main change—adding a retry mechanism for transient API errors in integration tests.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

@jacekradko jacekradko changed the title feat(e2e): add rate-limit retry for backend API calls feat(e2e): add retry for transient BAPI errors in integration tests Mar 16, 2026
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration/testUtils/retryableClerkClient.ts`:
- Around line 1-92: Add unit tests for the retry utility focusing on behavior in
createProxy/retryOnFailure/getRetryDelay/withRetry: write tests that mock a
ClerkClient method to return a rejected Promise with status 429/502/503/504 and
assert it is retried up to MAX_RETRIES (use jest.spyOn on the mocked method to
count calls), tests that an error with a non-retryable status does not retry, a
test that when isClerkAPIResponseError-style error includes retryAfter the delay
uses retryAfter seconds (use jest.useFakeTimers and advanceTimersByTime to
validate wait), a test that once MAX_RETRIES is reached the original error is
thrown, and a test that synchronous (non-Promise) methods are passed through
unwrapped; also assert console.warn/log calls (or printRetrySummary output) to
verify retry reporting. Use the exported withRetry to wrap a thin mocked client
and reference the functions/classes by name: withRetry, createProxy,
retryOnFailure, getRetryDelay, and printRetrySummary.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: 653cffb3-c354-4e65-b206-18f8dce194e6

📥 Commits

Reviewing files that changed from the base of the PR and between 61c7ec6 and da7d936.

📒 Files selected for processing (1)
  • integration/testUtils/retryableClerkClient.ts

Comment on lines +1 to +92
import type { ClerkClient } from '@clerk/backend';
import { isClerkAPIResponseError } from '@clerk/shared/error';

const MAX_RETRIES = 5;
const BASE_DELAY_MS = 1000;
const JITTER_MAX_MS = 500;
const MAX_RETRY_DELAY_MS = 30_000;
const RETRYABLE_STATUS_CODES = new Set([429, 502, 503, 504]);

const retryStats = { totalRetries: 0, callsRetried: new Set<string>() };

function sleep(ms: number): Promise<void> {
return new Promise(resolve => setTimeout(resolve, ms));
}

function getRetryDelay(error: unknown, attempt: number): number {
if (isClerkAPIResponseError(error) && typeof error.retryAfter === 'number') {
return Math.min(error.retryAfter * 1000, MAX_RETRY_DELAY_MS);
}
return BASE_DELAY_MS * Math.pow(2, attempt) + Math.random() * JITTER_MAX_MS;
}

function recordRetry(path: string): void {
retryStats.totalRetries++;
retryStats.callsRetried.add(path);
}

export function printRetrySummary(): void {
if (retryStats.totalRetries === 0) {
console.log('[Retry] No retries occurred during this run.');
return;
}
const methods = [...retryStats.callsRetried].join(', ');
console.warn(
`[Retry] Summary: ${retryStats.totalRetries} retries across ${retryStats.callsRetried.size} API calls (${methods})`,
);
}

async function retryOnFailure<T>(firstAttempt: Promise<T>, fn: () => Promise<T>, path: string): Promise<T> {
for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
try {
return attempt === 0 ? await firstAttempt : await fn();
} catch (error) {
const isRetryable = isClerkAPIResponseError(error) && RETRYABLE_STATUS_CODES.has(error.status);
if (!isRetryable || attempt === MAX_RETRIES) {
throw error;
}
recordRetry(path);
const delayMs = getRetryDelay(error, attempt);
console.warn(
`[Retry] ${error.status} for ${path}, attempt ${attempt + 1}/${MAX_RETRIES}, waiting ${Math.round(delayMs)}ms`,
);
await sleep(delayMs);
}
}
// Unreachable, but satisfies TypeScript
throw new Error('Unreachable');
}

function createProxy(target: unknown, path: string[] = []): unknown {
if (target === null || (typeof target !== 'object' && typeof target !== 'function')) {
return target;
}

return new Proxy(target as object, {
get(obj, prop, receiver) {
if (typeof prop === 'symbol') {
return Reflect.get(obj, prop, receiver);
}
const value = Reflect.get(obj, prop, receiver);
if (typeof value === 'function') {
return (...args: unknown[]) => {
const result = value.apply(obj, args);
// Only wrap promises (async API calls), pass through sync returns
if (result && typeof result === 'object' && typeof result.then === 'function') {
const fullPath = [...path, prop].join('.');
return retryOnFailure(result, () => value.apply(obj, args), fullPath);
}
return result;
};
}
if (typeof value === 'object' && value !== null) {
return createProxy(value, [...path, prop]);
}
return value;
},
});
}

export function withRetry(client: ClerkClient): ClerkClient {
return createProxy(client) as ClerkClient;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if tests exist for retryableClerkClient
fd -t f 'retryableClerkClient' --extension ts --extension test.ts
rg -l 'retryableClerkClient|withRetry' --glob '*test*' --glob '*spec*'

Repository: clerk/javascript

Length of output: 105


Add tests for the retry utility.

This new utility contains significant retry logic (exponential backoff, retry conditions, status code filtering, proxy-based interception) that should have test coverage. Consider tests for:

  • Retry behavior on 429/502/503/504 status codes
  • No retry on other error codes
  • retryAfter header handling
  • Max retry limit enforcement
  • Proxy correctly wraps async calls but passes through sync calls
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@integration/testUtils/retryableClerkClient.ts` around lines 1 - 92, Add unit
tests for the retry utility focusing on behavior in
createProxy/retryOnFailure/getRetryDelay/withRetry: write tests that mock a
ClerkClient method to return a rejected Promise with status 429/502/503/504 and
assert it is retried up to MAX_RETRIES (use jest.spyOn on the mocked method to
count calls), tests that an error with a non-retryable status does not retry, a
test that when isClerkAPIResponseError-style error includes retryAfter the delay
uses retryAfter seconds (use jest.useFakeTimers and advanceTimersByTime to
validate wait), a test that once MAX_RETRIES is reached the original error is
thrown, and a test that synchronous (non-Promise) methods are passed through
unwrapped; also assert console.warn/log calls (or printRetrySummary output) to
verify retry reporting. Use the exported withRetry to wrap a thin mocked client
and reference the functions/classes by name: withRetry, createProxy,
retryOnFailure, getRetryDelay, and printRetrySummary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants