feat: add TokenProvider for composable bearer-token auth#1710
feat: add TokenProvider for composable bearer-token auth#1710felixweinberger wants to merge 5 commits intomainfrom
Conversation
🦋 Changeset detectedLatest commit: 3aa0cd6 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
@modelcontextprotocol/client
@modelcontextprotocol/server
@modelcontextprotocol/express
@modelcontextprotocol/hono
@modelcontextprotocol/node
commit: |
|
📌 Commit It's presented as a delta on top of the Who breaks: only users who hand-implement async token() { return (await this.tokens())?.access_token; }Built-in providers ( What stays scoped out: Happy to drop this commit if we prefer the additive path, or drop the earlier commits if we take this one. |
| * For OAuth flows, use {@linkcode OAuthClientProvider} which extends this interface, | ||
| * or one of the built-in providers ({@linkcode index.ClientCredentialsProvider | ClientCredentialsProvider} etc.). | ||
| */ | ||
| export interface AuthProvider { |
There was a problem hiding this comment.
The core new abstraction
Adds a minimal `() => Promise<string | undefined>` function type as a lightweight alternative to OAuthClientProvider, for scenarios where bearer tokens are managed externally (gateway/proxy patterns, service accounts, API keys). - New TokenProvider type + withBearerAuth(getToken, fetchFn?) helper - New tokenProvider option on StreamableHTTPClientTransport and SSEClientTransport, used as fallback after authProvider in _commonHeaders(). authProvider takes precedence when both set. - On 401 with tokenProvider (no authProvider), transports throw UnauthorizedError — no retry, since tokenProvider() is already called before every request and would likely return the same rejected token. Callers catch UnauthorizedError, invalidate external cache, reconnect. - Exported previously-internal auth helpers for building custom flows: applyBasicAuth, applyPostAuth, applyPublicAuth, executeTokenRequest. - Tests, example, docs, changeset. Zero breakage. Bughunter fleet review: 28 findings submitted, 2 confirmed, both addressed.
689e8c4 to
3f4d125
Compare
Transports now accept AuthProvider { token(), onUnauthorized() } instead
of being typed as OAuthClientProvider. OAuthClientProvider extends
AuthProvider, so built-in providers work unchanged — custom
implementations add two methods (both TypeScript-enforced).
Core changes:
- New AuthProvider interface — transports only need token() +
onUnauthorized(), not the full 21-member OAuth interface
- OAuthClientProvider extends AuthProvider; onUnauthorized() is
required (not optional) on OAuthClientProvider since OAuth providers
that omit it lose all 401 recovery. The 4 built-in providers
implement both methods, delegating to new handleOAuthUnauthorized
helper.
- Transports call authProvider.token() in _commonHeaders() — one code
path, no precedence rules
- Transports call authProvider.onUnauthorized() on 401, retry once —
~50 lines of inline OAuth orchestration removed per transport.
Circuit breaker via _authRetryInFlight (reset in outer catch so
transient onUnauthorized failures don't permanently disable
retries).
- Response body consumption deferred until after the onUnauthorized
branch so custom implementations can read ctx.response.text()
- WWW-Authenticate extraction guarded with headers.has() check
(pre-existing inconsistency; the SSE connect path already did this)
- finishAuth() and 403 upscoping gated on isOAuthClientProvider()
- TokenProvider type + tokenProvider option deleted — subsumed by
{ token: async () => ... } as authProvider
Simple case: { authProvider: { token: async () => apiKey } } — no
class needed, TypeScript structural typing.
auth() and authInternal() (227 LOC of OAuth orchestration) untouched.
They still take OAuthClientProvider. Only the transport/provider
boundary moved.
See docs/migration.md and docs/migration-SKILL.md for before/after.
3f4d125 to
2961101
Compare
travisbreaks
left a comment
There was a problem hiding this comment.
Thorough and well-documented refactoring. The AuthProvider abstraction is a clear improvement: the one-liner bearer token pattern ({ token: async () => key }) eliminates the 8-member stub problem that every non-OAuth user hit. Migration docs are excellent.
A few observations on the design:
1. token() called before every request
The docs say transports call token() before every request. If token() involves async work (cache lookup, refresh check, remote call), this adds latency to every single MCP request. Consider:
- Documenting that
token()should be fast (return cached value, refresh in background) - Or adding a
tokenCacheDurationoption so the transport can skip callingtoken()on rapid successive requests
2. Concurrent 401 handling
The _authRetryInFlight circuit breaker prevents infinite retry loops, but what happens when multiple concurrent requests all get 401 simultaneously? If onUnauthorized does a token refresh, N concurrent requests could trigger N parallel refresh flows. Only the first should refresh; others should wait for the result.
This was likely also a problem with the old _hasCompletedAuthFlow approach, but worth addressing in this refactoring if possible. A shared promise that coalesces concurrent onUnauthorized calls would prevent token endpoint flooding.
3. Breaking change surface area
The changeset says @modelcontextprotocol/client: major. Custom OAuthClientProvider implementations must add token() and onUnauthorized(). The migration guide covers this well, but worth considering: could token() have a default implementation on OAuthClientProvider that calls this.tokens()?.access_token? That would reduce the breaking surface to zero for the common case.
4. Type guard pattern
isOAuthClientProvider() is a runtime type guard. Since OAuthClientProvider extends AuthProvider, this is the right approach for gating OAuth-specific features (like finishAuth()). Clean.
5. Exported auth helpers
Exposing applyBasicAuth, applyPostAuth, applyPublicAuth, executeTokenRequest is a good move for custom flow builders. These were previously internal, so worth noting in the docs that they are now part of the public API surface and subject to semver.
Strong PR. The main concern is the concurrent 401 coalescing question.
Alternative to the breaking 'extends AuthProvider' approach. Instead of
requiring OAuthClientProvider implementations to add token() +
onUnauthorized(), the transport constructor classifies the authProvider
option once and adapts OAuth providers via adaptOAuthProvider().
- OAuthClientProvider interface is unchanged from v1
- Transport option: authProvider?: AuthProvider | OAuthClientProvider
- Constructor: if OAuth, store both original (for finishAuth/403) and
adapted (for _commonHeaders/401) — classification happens once, no
runtime type guards in the hot path
- 4 built-in providers no longer need token()/onUnauthorized()
- migration.md/migration-SKILL.md entries removed — nothing to migrate
- Changeset downgraded to minor
Net -142 lines vs the breaking approach. Same transport simplification,
zero migration burden. Duck-typing via isOAuthClientProvider()
('tokens' + 'clientMetadata' in provider) at construction only.
Four fixes from claude[bot] review on the AuthProvider approach: 1. Drain 401 response body after onUnauthorized() succeeds, before the retry. Unconsumed bodies block socket recycling in undici. All three 401 sites now drain before return. 2. _startOrAuthSse() 401 retry was return await, causing onerror to fire twice (recursive call's catch + outer catch both fire). Changed to return (not awaited) matching the send() pattern. Removed the try/finally, added flag reset to success path + outer catch instead. 3. Migration docs still referenced SdkErrorCode.ClientHttpAuthentication for the 401-after-auth case, but that throw site was replaced by _authRetryInFlight which throws UnauthorizedError. Updated both migration.md and migration-SKILL.md. 4. Pre-existing: 403 upscoping auth() call passed this._fetch instead of this._fetchWithInit, dropping custom requestInit options during token requests. All other auth() calls in this transport already used _fetchWithInit.
The 401-after-re-auth case (circuit breaker trips) should throw a distinct error from the normal 'token rejected' case: - First 401 with no onUnauthorized → UnauthorizedError — caller re-auths externally and reconnects - Second 401 after onUnauthorized succeeded → SdkError with ClientHttpAuthentication — server is misbehaving, don't blindly retry, escalate The previous commit collapsed these into UnauthorizedError, which risks callers catching it, re-authing, and looping. Restored the SdkError throw at all three 401 sites when _authRetryInFlight is already set. Reverted migration doc changes — ClientHttpAuthentication is not dead code.
| return this._startOrAuthSse(options); | ||
| } | ||
| await response.text?.().catch(() => {}); | ||
| if (this._authRetryInFlight) { | ||
| throw new SdkError(SdkErrorCode.ClientHttpAuthentication, 'Server returned 401 after re-authentication', { | ||
| status: 401 | ||
| }); | ||
| } | ||
| if (this._authProvider) { | ||
| throw new UnauthorizedError(); | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
🔴 _authRetryInFlight is never reset when _startOrAuthSse() gets 401, onUnauthorized succeeds, and the recursive retry gets 405. The 405 path returns at line 243 without hitting either reset point (line 252 requires response.ok, line 255 requires a thrown error), permanently disabling onUnauthorized for subsequent send() 401s. Fix: add this._authRetryInFlight = false before the return at line 243, or use try/finally like the SSE transport does at line 158.
Extended reasoning...
Bug description
In _startOrAuthSse(), when a 401 triggers a successful onUnauthorized call, _authRetryInFlight is set to true at line 222 and the function recurses via return this._startOrAuthSse(options) at line 230. If the recursive call receives a 405 (server does not support GET SSE), it enters if (!response.ok) at line 213, skips the 401 block, drains the body, and hits if (response.status === 405) { return; } at line 242-243.
Why neither reset point fires
The 405 code path exits via a normal return inside the if (!response.ok) block. The two places that reset _authRetryInFlight are:
- Line 252: this._authRetryInFlight = false — only reachable when response.ok is true (outside the if (!response.ok) block)
- Line 255: in the catch block — only reachable when an error is thrown
Since 405 is !response.ok but not an error (it returns normally), neither reset fires.
Step-by-step proof
- Client connects via StreamableHTTP to a server that requires auth but does not support GET SSE.
- _startOrAuthSse() fetches GET — server returns 401.
- onUnauthorized is present and _authRetryInFlight is false, so the code enters the retry branch.
- _authRetryInFlight is set to true (line 222). onUnauthorized() succeeds (refreshes token).
- return this._startOrAuthSse(options) recurses (line 230).
- Recursive _startOrAuthSse() fetches GET with the new token — server returns 405 (no GET SSE support).
- Code enters if (!response.ok), skips the 401 check, drains body, hits if (response.status === 405) { return; } at line 243.
- _authRetryInFlight is never reset — stays permanently true.
- Later, the access token expires. Client calls send(message) — server returns 401.
- send() checks !this._authRetryInFlight (line ~502) — it is false because the flag is stuck true.
- The onUnauthorized branch is skipped entirely — UnauthorizedError is thrown immediately, even though onUnauthorized could have recovered.
Contrast with the SSE transport
The SSE transport's onerror handler (lines 154-160) correctly uses .finally(() => { this._authRetryInFlight = false; }), which always resets the flag regardless of the return path. The _startOrAuthSse path lacks this protection.
Impact
This is a realistic scenario — many MCP servers support POST but return 405 on GET SSE. After a 401-then-405 sequence during connect, the transport permanently loses its ability to recover from 401 errors on subsequent send() calls. The flag does self-heal when the send() catch block at line 621 resets _authRetryInFlight, but only after one unnecessary UnauthorizedError is thrown and propagated to the caller. The existing test at tokenProvider.test.ts line 156-178 exercises exactly this 401-to-405 flow but does not verify the flag is reset afterward.
Fix
Add this._authRetryInFlight = false before the return at line 243, or restructure to use try/finally around the flag like the SSE transport does.
| if (this._authProvider.onUnauthorized && this._last401Response && !this._authRetryInFlight) { | ||
| this._authRetryInFlight = true; | ||
| const response = this._last401Response; | ||
| this._authProvider | ||
| .onUnauthorized({ response, serverUrl: this._url, fetchFn: this._fetchWithInit }) | ||
| .then(() => this._startOrAuth()) | ||
| .then(resolve, reject) | ||
| .finally(() => { | ||
| this._authRetryInFlight = false; | ||
| }); | ||
| return; |
There was a problem hiding this comment.
🟡 When onUnauthorized() rejects in the SSE connect onerror handler (e.g., network error during token refresh), the error propagates via .then(resolve, reject) but this.onerror?.(error) is never called. Every other error path in both transports (lines 163-165, 169-171, send() catch blocks) consistently calls both reject/throw AND this.onerror. This is a regression from the pre-PR _authThenStart() which called this.onerror in its catch block.
Extended reasoning...
What the bug is
In the SSE transport onerror handler (lines 154-160), when onUnauthorized() rejects, the promise chain .onUnauthorized(...).then(() => this._startOrAuth()).then(resolve, reject) routes the rejection to reject(error), but never calls this.onerror?.(error). This is inconsistent with every other error path in the same handler and across both transports.
The specific code path
Lines 154-161 show the retry path:
this._authProvider
.onUnauthorized({ response, serverUrl: this._url, fetchFn: this._fetchWithInit })
.then(() => this._startOrAuth())
.then(resolve, reject) // reject is called, but onerror is not
.finally(() => { this._authRetryInFlight = false; });Compare with the non-retry 401 path at lines 163-165:
const error = new UnauthorizedError();
reject(error);
this.onerror?.(error); // onerror IS calledAnd the SseError path at lines 169-171:
const error = new SseError(event.code, event.message, event);
reject(error);
this.onerror?.(error); // onerror IS calledThe same pattern (both reject/throw AND onerror) is also used in SSE send() catch (line 296-298), StreamableHTTP _startOrAuthSse catch (line 254-256), and StreamableHTTP send() catch (line 618-620).
Step-by-step proof
transport.start()is called, which calls_startOrAuth().- The EventSource connects; the server returns 401.
- The
onerrorhandler fires withevent.code === 401. onUnauthorizedis defined,_last401Responseis set,_authRetryInFlightis false — we enter the retry branch.onUnauthorized()is called but throws (e.g., DNS failure reaching the auth server during token refresh).- The rejection flows through
.then(resolve, reject)—reject(error)is called. - The caller of
start()sees the error via promise rejection (good). - However,
this.onerror?.(error)is never invoked — any monitoring or logging wired through theonerrorcallback misses this failure entirely.
Why this is a regression
The pre-PR _authThenStart() method explicitly called this.onerror?.(error as Error) in its catch block before rethrowing, so the onerror notification channel was always used. The new promise chain structure dropped this call.
Impact
The error IS properly propagated to the caller of start() via promise rejection, so the error is not lost. The impact is limited to users who rely on the onerror callback for monitoring/logging — they will not see this specific failure mode. The SSE transport is also deprecated in favor of StreamableHTTP.
Fix
Note that a naive fix like .then(resolve, (error) => { reject(error); this.onerror?.(error); }) could cause double onerror notifications when the recursive _startOrAuth() itself fails (since its own error paths call onerror). The onUnauthorized() rejection case is safe to handle this way since no inner _startOrAuth() runs, but the _startOrAuth() rejection case needs care. A clean fix would separate the two failure modes, e.g.:
this._authProvider
.onUnauthorized({ response, serverUrl: this._url, fetchFn: this._fetchWithInit })
.then(
() => this._startOrAuth().then(resolve, reject),
(error) => { reject(error); this.onerror?.(error); }
)
.finally(() => { this._authRetryInFlight = false; });This calls onerror only when onUnauthorized itself fails, while letting _startOrAuth handle its own onerror notifications internally.
| await response.text?.().catch(() => {}); | ||
| if (this._authRetryInFlight) { | ||
| throw new SdkError(SdkErrorCode.ClientHttpAuthentication, 'Server returned 401 after re-authentication', { | ||
| status: 401 | ||
| }); | ||
| } | ||
| if (this._authProvider) { | ||
| throw new UnauthorizedError(); | ||
| } | ||
|
|
||
| // Purposely _not_ awaited, so we don't call onerror twice | ||
| return this.send(message); | ||
| } | ||
|
|
||
| const text = await response.text?.().catch(() => null); | ||
| throw new Error(`Error POSTing to endpoint (HTTP ${response.status}): ${text}`); |
There was a problem hiding this comment.
🟡 When no authProvider is configured and the server returns 401, the code enters the 401 block (line 264/503 checks only response.status === 401), drains the body at line 282/522 via await response.text(), then falls through all conditional throws (none apply without an authProvider). The generic error handler at line 293/533 calls response.text() again on the already-consumed body, producing Error POSTing to endpoint (HTTP 401): null instead of including the actual response body. Fix: move the body drain after the 401 block, or gate the entire 401 block with && this._authProvider like the old code did.
Extended reasoning...
What the bug is
When no authProvider is configured on the transport and the server returns HTTP 401, the response body is consumed inside the 401 handling block but the code falls through without throwing, causing the generic error handler to read an already-consumed body.
Code flow (SSE send, no authProvider, 401 response)
- Line 264:
if (response.status === 401)enters unconditionally (the old code had&& this._authProviderhere) - Line 271:
this._authProvider?.onUnauthorizedis falsy (no authProvider), skip the retry branch - Line 282:
await response.text?.().catch(() => {})drains the response body - Line 283:
this._authRetryInFlightisfalse, skip - Line 288:
this._authProviderisundefined, skip - Falls out of the 401 block without throwing
- Line 293:
const text = await response.text?.().catch(() => null)body already consumed,textisnull - Line 294: throws
Error POSTing to endpoint (HTTP 401): null-- actual 401 body content is lost
The identical pattern exists in StreamableHTTP send() at lines 522-533.
Why existing code does not prevent it
The old code (before this PR) read response.text() once at the top of the !response.ok block, before the 401 check, and also gated the 401 block with && this._authProvider. Both guards have been removed in this refactoring. The new code enters the 401 block unconditionally and drains the body on a path that does not throw when there is no authProvider.
Step-by-step concrete example
A user creates a StreamableHTTPClientTransport with no authProvider (connecting to a server that unexpectedly requires auth). The server returns 401 with body {"error": "authentication_required", "message": "Please provide an API key"}. In the old code, the error message would include this body text. In the new code, it becomes Error POSTing to endpoint (HTTP 401): null because the body was already drained inside the 401 block at line 282/522.
Impact
The error is still thrown correctly with the right HTTP status code, so behavior is functionally correct. The impact is limited to error message quality: developers debugging 401 errors without an authProvider will see null instead of the server's actual error message. This is a regression from the pre-PR behavior where response bodies were preserved in error messages.
How to fix
Either:
- Move the unconditional
await response.text?.().catch(() => {})at line 282/522 to only execute on paths that throw within the 401 block - Or re-add
&& this._authProviderto the outer 401 check so the block is skipped entirely when no authProvider is configured, letting the generic error handler read the body first
Three approaches to making client auth composable, presented as three commits for side-by-side review. They are mutually exclusive — pick one shape before merge.
TokenProvider9aea20fbAuthProviderviaextends29611017AuthProvidervia adapter65b5099dApproach A: additive
TokenProvider(9aea20fb)Adds
TokenProvider = () => Promise<string | undefined>alongside the existingOAuthClientProvider. Transports gain atokenProvideroption;authProvidertakes precedence when both are set. On 401 with onlytokenProvider, throwsUnauthorizedError(no retry). Zero breakage, but two knobs with a precedence rule.Approach B:
AuthProviderviaextends(29611017)Transports take only
AuthProvider. Built-in providers work unchanged. CustomOAuthClientProviderimplementations must addtoken()+onUnauthorized()(TypeScript-enforced, ~5 lines, delegating tohandleOAuthUnauthorized). Transports drop ~50 lines of inline OAuth orchestration each.Approach C:
AuthProvidervia adapter (65b5099d)Same transport simplification as B, but
OAuthClientProvideris completely unchanged — zero migration. Transport classifies once at construction; the adapter synthesizestoken()fromtokens()?.access_tokenandonUnauthorized()fromhandleOAuthUnauthorized(). Simple case is still{ token: async () => apiKey }.Motivation and Context
OAuthClientProviderbundles token-production with OAuth-flow orchestration (21 members, 8 required). Gateway/proxy patterns, service accounts, pre-provisioned API keys, enterprise SSO — none of these fit the interactive browser-redirect model, and today have to stub the redirect machinery or wrapfetchmanually.Key finding: transports only ever call
.tokens()onOAuthClientProviderdirectly — all 20 other members are consumed byauth(). That made the decomposition surgical:auth()/authInternal()(227 LOC) remain untouched in all three approaches.How Has This Been Tested?
claude[bot]review on B/C (all addressed:_authRetryInFlightreset,migration-SKILL.md, response body consumption,onUnauthorizedrequired-ness,WWW-Authenticateguard consistency)finishAuthguard) filled with targeted testsBreaking Changes
A: none. B: custom
OAuthClientProviderimplementations add two methods. C: none.Types of changes
Checklist
Additional context
Also exported (all approaches):
applyBasicAuth,applyPostAuth,applyPublicAuth,executeTokenRequest.Review shortcuts:
git show 9aea20fbgit show 29611017git show 65b5099d(−142 lines vs B: removes theextends+ required methods + migration docs, adds the adapter)