Implement Client Secret Expiry and Renewal for Dynamic Client Registration#5377
Open
Sanskarzz wants to merge 8 commits into
Open
Implement Client Secret Expiry and Renewal for Dynamic Client Registration#5377Sanskarzz wants to merge 8 commits into
Sanskarzz wants to merge 8 commits into
Conversation
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Contributor
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #5377 +/- ##
==========================================
+ Coverage 68.74% 68.77% +0.02%
==========================================
Files 626 627 +1
Lines 63496 63711 +215
==========================================
+ Hits 43650 43815 +165
- Misses 16602 16639 +37
- Partials 3244 3257 +13 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
ToolHive already persisted DCR
client_idandclient_secret, but did not preserve the RFC 7591 / RFC 7592 metadata needed to renew expiring client secrets. Long-running remote-auth workloads could therefore keep using cached credentials until an upstream such as Keycloak rejected token refreshes after the DCR client secret expired.This PR adds the missing DCR client-secret lifecycle:
client_secret_expires_at,registration_access_token,registration_client_uri, token endpoint auth method, and the callback port actually registered during DCR.PUT registration_client_uri) using the stored registration access token.This also addresses the previous PR review feedback:
slog.Infocalls withslog.Debug.oauthproto.ToolHiveMCPClientNameand the actual registered callback port in the RFC 7592 update body.[]byte -> string -> readerallocation.gosecsuppression on the Authorization header.registration_client_urivalidation to reject root-path URLs.SaveStatein DCR credential persistence.Fixes #3631
Type of change
Test plan
task test)task test-e2e)task lint-fix)Targeted package tests run:
go test ./pkg/auth/remote ./pkg/auth/discovery ./pkg/auth/oauth ./pkg/runner ./pkg/oauthproto ./pkg/auth/dcrResult:
Targeted lint command run:
Result:
Manual Keycloak validation is documented in
manual-testing-ratelimit-vmcp.mdand was used to validate the full renewal flow locally:localhostfor browser-based OAuth.mcp-server-audiencedefault client scope to the DCR client.cached_reg_token_ref,cached_reg_client_uri,cached_token_auth_method, andcached_dcr_callback_port.cached_secret_expiryinto the renewal window and verified startup performed exactly one RFC 7592 renewal before restoring cached OAuth tokens.client_secret_expires_at: 0response clearscached_secret_expiryback to the zero time value.API Compatibility
v1beta1API, OR theapi-break-allowedlabel is applied and the migration guidance is described above.Changes
pkg/auth/discovery/discovery.goOAuthFlowResult, including the registered callback port.pkg/auth/remote/config.gopkg/auth/remote/handler.gopkg/auth/remote/secret_renewal.gopkg/runner/runner.gopkg/auth/remote/secret_renewal_test.goDoes this introduce a user-facing change?
Yes. Remote-auth workloads that use DCR can now continue operating when an OAuth provider issues expiring client secrets, provided the provider also returns RFC 7592 registration management metadata.
Providers that return
client_secret_expires_at: 0or omit RFC 7592 metadata remain backward compatible: ToolHive treats the secret as non-expiring or logs a warning and falls back to the existing secret until re-authentication is required.Special notes for reviewers
ClientCredentialsPersistercallback is internal to this repository and now carries DCR renewal metadata plus the registered callback port.registration_client_uriis stored in config because it is endpoint metadata, not a secret.Manual Testing: DCR Client Secret Expiry and RFC 7592 Renewal
This file is intentionally named
manual-testing-ratelimit-vmcp.mdbecause that was the requested path. The steps below validate theautosecretDCRbranch, not vMCP rate limiting.What This Validates
cached_secret_expirycached_reg_token_refcached_reg_client_uricached_token_auth_methodcached_dcr_callback_portPUT.Reference: Keycloak documents its client registration endpoint as
/realms/<realm>/clients-registrations/<provider>and the OpenID Connect DCR provider as/realms/<realm>/clients-registrations/openid-connect[/<client id>]: https://www.keycloak.org/securing-apps/client-registrationPrerequisites
dockerorpodmankindkubectlhelmkotaskjqUse separate terminals for long-running port-forward and
thv runcommands.1. Create a Kind Cluster and Deploy ToolHive Operator
Wait for the operator:
Expected result:
Running.2. Deploy Keycloak
For this local browser-based OAuth flow, make Keycloak advertise
localhostinstead of the in-clusterkeycloakhostname. Without this, Keycloak redirects the browser tokeycloak:<port>and can lose its restart-login cookie.If the hostname patch restarts Keycloak, the development H2 realm data can be reset. Re-run the realm setup once after the patched Keycloak is ready. The setup script currently expects Keycloak on local port
8080, so start a temporary8080port-forward in another terminal while this script runs:Then run:
Start a persistent Keycloak port-forward in a separate terminal and keep it running.
The example uses local port
8081because8080is often already occupied:In another terminal, fetch an admin token. Keycloak admin tokens are short-lived, so re-run this block whenever a later admin API command returns
401or an empty response:Expected result:
ADMIN_TOKENis non-empty.localhostconsistently for Keycloak admin, DCR, and ToolHive OAuth calls. Mixinglocalhostandkeycloakin browser flows can lose Keycloak's restart-login cookie.3. Deploy an Authenticated Remote MCP Server
Start the MCP server port-forward in a separate terminal:
Expected result:
http://localhost:9090.4. Create a Confidential DCR Client in Keycloak
ToolHive's normal CLI DCR path registers a public PKCE client. To exercise client-secret renewal with Keycloak, seed ToolHive with a confidential DCR client created through Keycloak's OpenID Connect client registration endpoint.
Create a Keycloak client-registration initial access token:
Use that initial access token for DCR:
Expected result:
client_id,client_secret,registration_access_token, andregistration_client_uri.registration_client_uriuseshttp://localhost:8081/.... If it still useshttp://keycloak:8081/..., Keycloak is still advertising the old hostname; recheck the hostname patch above before continuing.If this returns
403, rerun without-fto see Keycloak's error body:5. Give the DCR Client the MCP Server Audience
The sample
mcpserver-with-auth.yamlrequires themcp-serveraudience. Assign the existingmcp-server-audienceclient scope to the DCR client.Expected result:
401 Unauthorizedfrom this admin endpoint meansADMIN_TOKENexpired; rerun this step's first command to refresh it.6. Configure ToolHive Secrets
Use the encrypted provider for writable local secrets.
Expected result:
7. Create an Initial OAuth Session
Run a local ToolHive proxy against the authenticated remote MCP server.
Complete the browser login:
toolhive-useruser123After authentication succeeds and the workload starts, stop it without deleting the saved run config:
Expected result:
8. Seed Cached DCR Renewal Metadata
Force the cached secret into the proactive renewal window by setting the expiry to one hour from now.
Linux:
EXPIRY="$(date -u -d '+1 hour' +%Y-%m-%dT%H:%M:%SZ)"macOS:
EXPIRY="$(date -u -v+1H +%Y-%m-%dT%H:%M:%SZ)"Patch the saved run config:
Expected result:
cached_secret_expiryis about one hour in the future.9. Trigger Automatic Renewal
Save the old registration token:
OLD_REG_TOKEN="$(bin/thv secret get OAUTH_REG_TOKEN_dcr_manual)"Start the saved workload again so ToolHive reads the patched run config:
Expected debug logs:
Cached client secret is expiring or expired; attempting renewal before token restoreAttempting RFC 7592 client secret renewalSuccessfully renewed client secret via RFC 7592Inspect proxy logs:
After startup succeeds, stop it before patching the config again:
Validate persistence:
Expected result:
OAUTH_CLIENT_SECRET_*orOAUTH_REG_TOKEN_*reference in the run config.NEW_REG_TOKENdiffers fromOLD_REG_TOKEN.cached_secret_expiryis no longer within the forced one-hour renewal window, or is cleared if Keycloak omitsclient_secret_expires_at.10. Check for Double Renewal
Search the debug output from step 9.
Expected result:
Attempting RFC 7592 client secret renewallog appears for a single startup.CachedSecretExpirywas not updated in memory correctly.11. Validate Soft Failure While Expiring Soon
Patch the registration URI to a failing non-root path and keep the expiry in the future:
Start the workload again:
Expected result:
Client secret renewal failed.Stop the workload before restoring the config:
Restore the valid registration URI:
12. Validate Hard Failure After Expiry
Patch the expiry into the past and use the failing URI again:
Start the workload again:
Expected result:
client secret expired atandrenewal failed.Restore the valid registration URI if you want to keep testing:
13. Cleanup
Stop all port-forwards, then remove the test cluster:
Optional local cleanup:
Notes and Limitations
client_secretorclient_secret_expires_atin that path.client_secret_expires_atduring ToolHive-managed DCR, step 9 is not needed. Wait until the cached expiry is within 24 hours, then restart the workload.