Skip to content

Service API layer with OAuth2 client credentials for service APIs. Onboarding dummy tool, LFX Lens prod tool#39

Merged
emsearcy merged 9 commits intomainfrom
service-api-auth-layer
Apr 8, 2026
Merged

Service API layer with OAuth2 client credentials for service APIs. Onboarding dummy tool, LFX Lens prod tool#39
emsearcy merged 9 commits intomainfrom
service-api-auth-layer

Conversation

@josep-reyero
Copy link
Copy Markdown
Contributor

@josep-reyero josep-reyero commented Mar 24, 2026

Summary

Replace shared API keys with OAuth2 client credentials (M2M JWTs) for service API authentication (LFX Lens, Member Onboarding). Each service gets its own Auth0 resource server — no more shared secrets.

See docs/service-api-architecture.md for the full design: token flows, diagrams, tool inventory, and API schemas.

What changed

  • Auth layer: ClientCredentialsClient for OAuth2 client credentials with token caching; TokenSource interface replaces APIKey in the service API client
  • LFX Lens tool: lfx_lens_query now calls the real Lens backend with user and session tracking. Lens backend not yet deployed — the tool is ready but will fail until the backend is live.
  • Onboarding tool: onboarding_list_memberships registered with correct args/types but returns dummy response (actual API call commented out inline)
  • Config: LFXMCP_LF_AI_API_KEY replaced with per-service LFXMCP_LENS_API_AUDIENCE and LFXMCP_ONBOARDING_API_AUDIENCE

Dependencies

This PR depends on two auth0-terraform PRs:

  1. auth0-terraform#249 — Creates the Auth0 resource servers for LFX Lens and Member Onboarding. Required before the MCP server can obtain client credentials tokens, and before the services can validate them.

  2. auth0-terraform#248 — Adds the lf_staff custom claim to MCP API access tokens. Required for lfx_lens_query specifically — this tool checks the lf_staff JWT claim to enforce staff-only access. Until #248 is deployed, the Lens tool will reject all requests with "this tool is available to Linux Foundation staff only".

onboarding_list_memberships does not depend on #248 (no staff check).

No new env vars needed to deploy this PR

The service tools only activate when both URL and audience are configured. With current deployment config unchanged, everything works as before.

Test plan

  • go build ./... passes
  • go test ./... passes
  • No remaining references to ApiKey / LF_AI_API_KEY
  • Existing V2 tools unaffected
  • LFX Lens end-to-end tested locally against live Lens backend

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings March 24, 2026 15:10
Copy link
Copy Markdown

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review.

Tip: disable this comment in your organization's Code Review settings.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR migrates internal service API authentication from shared API keys to Auth0-issued OAuth2 client-credentials tokens, and introduces shared service-tool authorization plumbing (slug→UUID resolution + V2 access-check) to enforce project-level authorization before proxying to Lens/Onboarding.

Changes:

  • Added OAuth2 client-credentials token client (cached) and a generic service API HTTP client (TokenSource-based).
  • Added V2 authorization helpers: access-check client, slug resolver (cached), and a ServiceAuth flow used by new service tools.
  • Wired new tools/config into main.go, plus added architecture + env var documentation.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
internal/tools/service.go Shared service-tool infra (ServiceAuth, access relations, staff-claim helpers).
internal/tools/onboarding.go New onboarding tool (currently returns stubbed/dummy response).
internal/tools/lens.go New Lens tool (staff-gated; currently returns stubbed/dummy response).
internal/serviceapi/client.go Generic HTTP client for service APIs using bearer tokens + optional debug wire logging.
internal/serviceapi/client_test.go Tests for GET/POST JSON/multipart behavior and auth header handling.
internal/lfxv2/token_exchange.go Extracted shared generateClientAssertion() helper for reuse.
internal/lfxv2/client_credentials.go New client-credentials grant client with token caching.
internal/lfxv2/access_check.go New access-check client for project relation checks.
internal/lfxv2/access_check_test.go Tests for access-check request/response handling and edge cases.
internal/lfxv2/slug_resolver.go New in-memory slug→UUID resolver for V2 query service.
internal/lfxv2/client.go Added helper to retrieve exchanged V2 token as a string (GetExchangedToken).
cmd/lfx-mcp-server/main.go Config + wiring for per-service audiences/URLs and new tool registration; copies lf_staff claim into TokenInfo.Extra.
docs/service-api-architecture.md Architecture doc describing the 3-token model and authorization flow.
AGENTS.md Documented new environment variables for Lens/Onboarding service configuration.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +104 to +107
c.cachedToken = token
// Cache with 5-minute buffer before actual expiry.
c.expiry = time.Now().Add(time.Duration(expiresIn)*time.Second - 5*time.Minute)

Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Token cache expiry subtracts a fixed 5-minute buffer from expires_in. If expires_in is less than 300 seconds (or is 0/negative due to an upstream issue), this sets c.expiry in the past and can cause repeated token fetches in a tight loop. Consider clamping the buffer (e.g., min(5m, expiresIn/2) or max(expiresIn-buffer, 0)) and validating expires_in > 0 before caching.

Suggested change
c.cachedToken = token
// Cache with 5-minute buffer before actual expiry.
c.expiry = time.Now().Add(time.Duration(expiresIn)*time.Second - 5*time.Minute)
if expiresIn <= 0 {
return "", fmt.Errorf("token endpoint returned non-positive expires_in: %d", expiresIn)
}
c.cachedToken = token
lifetime := time.Duration(expiresIn) * time.Second
// Use a buffer before actual expiry to proactively refresh the token.
// Clamp the buffer to avoid setting expiry in the past for short-lived tokens.
maxBuffer := 5 * time.Minute
buffer := maxBuffer
if lifetime < maxBuffer {
buffer = lifetime / 2
}
c.expiry = time.Now().Add(lifetime - buffer)

Copilot uses AI. Check for mistakes.
Comment thread internal/tools/service.go Outdated
Comment on lines +42 to +47
// This separation exists because:
// - Service APIs use shared API keys with no per-user authorization
// - The MCP server IS the authorization gateway for these tools
// - The V2 access-check (OpenFGA-backed) is the authoritative decision
// - MCP scopes (read:all / manage:all) are for V2 API tools where
// Heimdal handles per-resource authorization
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc comment here still says service APIs use shared API keys, but this PR replaces them with OAuth2 client credentials (M2M bearer tokens). Updating this bullet would avoid confusion for future readers and keep the in-code documentation aligned with the new auth model.

Copilot uses AI. Check for mistakes.
Comment thread internal/tools/service.go Outdated
// RequireLFStaff checks that the user has the lf_staff custom claim. Returns
// an MCP error result if the user is not staff, or nil if they are.
func RequireLFStaff(req *mcp.CallToolRequest) *mcp.CallToolResult {
if req.Extra != nil && req.Extra.TokenInfo != nil && !IsLFStaff(req.Extra.TokenInfo) {
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RequireLFStaff currently only denies access when TokenInfo is present AND IsLFStaff returns false. If TokenInfo/Extra is missing (e.g., auth misconfiguration or unauthenticated calls), this returns nil and allows access to a staff-only tool. This should deny by default when the staff claim cannot be verified (i.e., require TokenInfo+claim=true).

Suggested change
if req.Extra != nil && req.Extra.TokenInfo != nil && !IsLFStaff(req.Extra.TokenInfo) {
// Deny by default: require a token and a true lf_staff claim.
if req.Extra == nil || req.Extra.TokenInfo == nil || !IsLFStaff(req.Extra.TokenInfo) {

Copilot uses AI. Check for mistakes.
Comment on lines +164 to +205
// wrapWithDebugTransport wraps an HTTP client with wire-level request/response logging.
func wrapWithDebugTransport(client *http.Client, logger *slog.Logger) *http.Client {
base := client.Transport
if base == nil {
base = http.DefaultTransport
}
return &http.Client{
Transport: &serviceDebugTransport{
transport: base,
logger: logger,
},
Timeout: client.Timeout,
}
}

// serviceDebugTransport logs the full HTTP wire dump of every request and response.
type serviceDebugTransport struct {
transport http.RoundTripper
logger *slog.Logger
}

// RoundTrip implements http.RoundTripper.
func (dt *serviceDebugTransport) RoundTrip(req *http.Request) (*http.Response, error) {
reqDump, err := httputil.DumpRequestOut(req, true)
if err != nil {
dt.logger.Error("failed to dump outbound request", "error", err)
} else {
dt.logger.Debug("serviceapi outbound request", "dump", string(reqDump))
}

resp, err := dt.transport.RoundTrip(req)
if err != nil {
dt.logger.Error("serviceapi outbound request failed", "error", err, "url", req.URL.String())
return nil, err
}

respDump, err := httputil.DumpResponse(resp, true)
if err != nil {
dt.logger.Error("failed to dump inbound response", "error", err)
} else {
dt.logger.Debug("serviceapi inbound response", "dump", string(respDump))
}
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The debug transport logs full request/response dumps via httputil.DumpRequestOut/DumpResponse. This will include the Authorization header (M2M bearer token) and potentially sensitive payloads, which is risky even behind a debug flag because logs often end up in centralized systems. Consider redacting Authorization (and other secrets) before dumping, or logging only safe metadata (method, URL, status, duration) with an opt-in body dump that explicitly strips credentials.

Copilot uses AI. Check for mistakes.
@josep-reyero josep-reyero changed the title Replace shared API keys with OAuth2 client credentials for service APIs Service API layer with OAuth2 client credentials for service APIs and dummy tools Mar 24, 2026
@josep-reyero josep-reyero requested a review from emsearcy March 24, 2026 15:54
Copy link
Copy Markdown
Contributor

@emsearcy emsearcy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

largest changes would be the access_check response payload, and of course implementing the upstream calls that are currently commented out.

Also, I didn't flag it, but I assume we'll need to change the custom-claim prefix (per my comment in the Auth0 Action to adopt our newer, more-terse domain prefix).

Comment thread internal/lfxv2/access_check.go Outdated
Comment thread internal/lfxv2/access_check.go Outdated
Comment thread internal/tools/service.go Outdated
Comment thread internal/tools/service.go Outdated
Comment thread internal/tools/service.go Outdated
@josep-reyero
Copy link
Copy Markdown
Contributor Author

Also changed the custom-claim prefix lfxPrefix namespace as per the Auth0 Action comment.

@josep-reyero josep-reyero requested a review from emsearcy March 25, 2026 13:11
josep-reyero and others added 7 commits March 30, 2026 17:30
…ens tools

Implement Option 4 (V2 access-check) authorization for internal service APIs
that use shared API keys with no per-user auth. The MCP server acts as the
authorization gateway by calling the V2 access-check endpoint (OpenFGA-backed)
before proxying requests.

- Add AccessCheckClient for V2 access-check with # separator format
- Add SlugResolver with in-memory cache for slug→UUID translation
- Add serviceapi.Client for API-key-authenticated service calls
- Add ServiceAuth shared struct and AuthorizeProject flow
- Add onboarding_list_memberships tool (writer relation)
- Add lfx_lens_query tool (lf_staff claim + auditor relation)
- Add lf_staff custom claim extraction in JWT verifier
- Document new env vars in AGENTS.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
The Lens workflow endpoint no longer expects a project name in the
payload. Remove project_name from the tool args, request struct,
and handler.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
Switch service API authentication (LFX Lens, Member Onboarding) from shared
API keys to Auth0 client credentials (M2M JWTs). Each service gets its own
Auth0 resource server, so tokens are scoped per-service and validated
cryptographically via JWKS — no more shared secrets.

Changes:

Auth layer:
- Add ClientCredentialsClient (internal/lfxv2/client_credentials.go) for
  OAuth2 client credentials grants with automatic token caching
- Extract generateClientAssertion to shared package-level function for reuse
  by both token exchange and client credentials flows
- Replace APIKey field with TokenSource interface in serviceapi.Client —
  do() now sets Authorization: Bearer instead of ApiKey

Wiring:
- Replace LFXMCP_LF_AI_API_KEY with per-service audience env vars:
  LFXMCP_LENS_API_AUDIENCE and LFXMCP_ONBOARDING_API_AUDIENCE
- Each service gets its own ClientCredentialsClient using the same M2M
  credentials but a different Auth0 audience

Tools (stubbed):
- lfx_lens_query and onboarding_list_memberships return dummy responses
  until the Auth0 resource servers are deployed and the service APIs are
  updated to validate JWTs

Docs:
- Unify architecture-service-api-auth.md and service-api-auth-flow.md into
  service-api-architecture.md with full flow diagrams, tool inventory,
  naming conventions, and API schemas
- Reflect feedback: onboarding_run_agent supports a preview flag instead of
  a separate onboarding_preview_agent tool

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
…pes on service tools

- Fix V2 access-check response parsing to handle the real tab-delimited
  format (request@user\ttrue|false) instead of the incorrect allow/deny
- Handle unordered results via map-based matching instead of positional indexing
- Enforce MCP JWT scopes (read:all/manage:all) on service tools via
  AddToolWithScopes, so reduced-scope tokens are respected
- Fix RequireLFStaff to deny by default when token info is missing
- Replace manual toolError helper with SDK error propagation for
  forward-compatibility with MCP spec changes
- Update architecture doc to reflect dual-layer auth model and correct
  access-check response format

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
…laims/)

Aligns with auth0-terraform PR #248 which moves the lf_staff custom claim
from the legacy lfPrefix (https://sso.linuxfoundation.org/claims/) to the
preferred lfxPrefix (http://lfx.dev/claims/) per Eric's review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
Replace the dry-run stub with the real Lens service call via the new
/lfx-lens/mcp/query endpoint. Send user_id (JWT sub claim) and a
per-request session_id for query attribution. Parse the slim JSON
response to return only the content field. Update the tool description
to better convey Lens capabilities to MCP clients.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
@josep-reyero josep-reyero force-pushed the service-api-auth-layer branch from 2f40c93 to 3a7ef2f Compare March 30, 2026 15:33
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
@josep-reyero josep-reyero changed the title Service API layer with OAuth2 client credentials for service APIs and dummy tools Service API layer with OAuth2 client credentials for service APIs. Onboarding dummy tool, LFX Lens prod tool Mar 30, 2026
Include both service-api-auth-layer tools (onboarding_list_memberships,
lfx_lens_query) and main tools (search_b2b_orgs, list_b2b_org_memberships)
in defaultTools and server registration.

🤖 Generated with [GitHub Copilot](https://github.com/features/copilot) (via OpenCode)

Signed-off-by: Eric Searcy <eric@linuxfoundation.org>
@emsearcy emsearcy merged commit 8aa1ec6 into main Apr 8, 2026
7 checks passed
@emsearcy emsearcy deleted the service-api-auth-layer branch April 8, 2026 16:07
emsearcy added a commit that referenced this pull request Apr 14, 2026
Replace the internal working doc docs/service-api-architecture.md (committed
unintentionally in the implementation PR #39) with a canonical ARCHITECTURE.md
at the repo root describing the current state of the system.

Covers:
- Overview with high-level elk flowchart
- Client authn/authz: end-user OAuth2 JWT (PRM discovery, JWKS verify, scope
  gating), M2M client credentials, static API key stop-gap; stateless HTTP
  mode and per-request newServer() factory
- Upstream authn/authz: CTE (RFC 8693) for end-user callers; MCP-server M2M
  V2 token for M2M/API-key callers; native V2 pass-through vs. MCP-brokered
  service APIs (OpenFGA gate + per-service M2M token)
- Four end-to-end sequence diagrams covering all caller/upstream combinations

Closes #27 (superseded by PR #39 / commit 8de2df8).

🤖 Generated with [GitHub Copilot](https://github.com/features/copilot) (via OpenCode)

Signed-off-by: Eric Searcy <eric@linuxfoundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants