Wire rate limit middleware into proxy runner chain#4652
Open
Wire rate limit middleware into proxy runner chain#4652
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4652 +/- ##
==========================================
- Coverage 68.88% 68.84% -0.05%
==========================================
Files 506 509 +3
Lines 52449 52642 +193
==========================================
+ Hits 36130 36240 +110
- Misses 13525 13607 +82
- Partials 2794 2795 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
15f2cd5 to
90ebabc
Compare
90ebabc to
7063692
Compare
7063692 to
d28ef3e
Compare
d28ef3e to
4d1d6aa
Compare
Implement the rate limit middleware factory and handler that wires into the proxy runner middleware chain. The handler: - Errors if MCP parser context is missing (middleware ordering bug) - Only rate-limits tools/call requests; other methods pass through - Fails open on Redis errors (logs warning, allows request) - Returns HTTP 429 with JSON-RPC -32029 error and Retry-After header The factory pings Redis at startup to fail fast on misconfiguration rather than silently failing open on every request. Tests use a dummy Limiter implementation to verify handler behavior without Redis: allowed, rejected, fail-open, missing context, and non-tools/call passthrough. Part of #4551 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Register the rate limit middleware factory and add it to the proxy runner middleware chain, positioned after MCP parser (needs tool name from context) and before validating webhooks. The operator reconciler maps spec.rateLimiting directly to the RunConfig, carrying the CRD type without intermediate config types. The middleware factory creates a Redis client from session storage config and builds the Limiter at startup. Move populateScalingConfig before PopulateMiddlewareConfigs in the reconciler so SessionRedis config is available when the rate limit middleware reads it. Validate that Redis session storage is configured when rate limiting is enabled, with a clear error message. Part of #4551 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ginkgo-based E2E tests that deploy Redis and an MCPServer with rate limiting in a Kind cluster and verify runtime enforcement: - AC7: Send requests exceeding the shared limit, verify HTTP 429 with JSON-RPC error code -32029 and retryAfterSeconds in error data - AC8: Deploy MCPServer with both shared and per-tool rate limit config, verify the CRD is accepted and the server reaches Running phase Creates a NodePort service targeting the MCPServer proxy pods for test access. Uses the correct Accept header for streamable-http. Part of #4551 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
4d1d6aa to
0f476a5
Compare
3 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Rate limiting CRD types and the token-bucket Limiter package were
merged in #4577. This PR wires the limiter into the proxy runner
middleware chain so rate limiting is enforced at runtime.
rate-limits only
tools/callrequests, returns HTTP 429 withJSON-RPC error code
-32029andRetry-Afterheader on rejection,and fails open on Redis errors
parser in the middleware chain
spec.rateLimitingfrom the CRD to the RunConfig in theoperator reconciler, carrying the CRD type directly
Kind cluster and verify rate limit enforcement
Closes #4551
Type of change
Test plan
task test)task test-e2e)task lint-fix)Unit tests (12 new):
missing MCP context (500), non-tools/call passthrough
presence/absence
E2E tests (2 new):
429 with JSON-RPC -32029 and retryAfterSeconds
reaches Running phase
Changes
pkg/ratelimit/middleware.gopkg/runner/middleware.gopkg/runner/config.goRateLimitConfigandRateLimitNamespacefieldspkg/runner/config_builder.goWithRateLimitConfigbuilder optioncmd/thv-operator/controllers/mcpserver_runconfig.gospec.rateLimitingto RunConfigtest/e2e/thv-operator/acceptance_tests/Special notes for reviewers
ParsedMCPRequestis missingfrom context — this catches middleware ordering bugs at runtime
tools/callrequests are rate-limited; other MCP methods passthrough unconditionally
SessionRedisConfig+ theTHV_SESSION_REDIS_PASSWORDenv var, sharing connection info withsession storage
test/e2e/thv-operator/virtualmcp/patternsLarge PR Justification
1140 lines across 11 files, but 844 lines are tests (unit + E2E
infrastructure). The 296 lines of production code span 5 files that
form a single logical change: middleware + wiring + operator mapping.
Generated with Claude Code