Add rate limiting CRD types and token-bucket package#4577
Add rate limiting CRD types and token-bucket package#4577
Conversation
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4577 +/- ##
==========================================
+ Coverage 68.87% 68.92% +0.05%
==========================================
Files 505 507 +2
Lines 52380 52456 +76
==========================================
+ Hits 36076 36155 +79
+ Misses 13516 13510 -6
- Partials 2788 2791 +3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review. |
Large PR justification has been provided. Thank you!
|
@claude review using the code-review skill in the repo. |
|
Claude finished @jerm-dro's task in 4m 13s —— View job Code Review: Add rate limiting CRD types and token-bucket package
SummaryWell-structured foundation. The package layout, interface design, Redis hash tag for cluster slot affinity, and Lua TIME clock are all solid choices. Tests are thorough and well-organized. Three issues require changes before merge; three are suggestions. Required changes1. Token leak: server token consumed before tool-level check ( In The cleanest fix is a two-phase approach or an atomic multi-key Lua script. A minimal acceptable fix is a code comment explicitly documenting this as a known limitation with a reference to the tracking issue. 2. -- all three HMSET calls should be HSET:
redis.call('HSET', key, 'tokens', maxTokens - 1, 'last_refill', now)
3. Missing The CRD has if b.MaxTokens <= 0 {
return nil, fmt.Errorf("maxTokens must be positive, got %d", b.MaxTokens)
}Suggestions4. When a request is rejected ( 5. Document The current godoc says "minimum wait time" but the function returns 6. Key name comment ( The "global" segment in Minor observations
|
There was a problem hiding this comment.
Pull request overview
This PR introduces the initial building blocks for Redis-backed token-bucket rate limiting for MCPServer, including new CRD fields, a pkg/ratelimit package, and JSON-RPC/HTTP 429 error helpers.
Changes:
- Add
rateLimitingconfiguration to the MCPServer CRD (global/perUser/per-tool buckets). - Implement a Redis Lua-script-backed token bucket and a
Limiterinterface to evaluate server + per-tool limits. - Add helpers for JSON-RPC rate-limit errors (
-32029) and HTTP 429 responses withRetry-After.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| cmd/thv-operator/api/v1alpha1/mcpserver_types.go | Adds RateLimit API types and spec.rateLimiting field |
| cmd/thv-operator/api/v1alpha1/mcpserver_types_test.go | Adds JSON roundtrip tests for new rate limit types/fields |
| cmd/thv-operator/api/v1alpha1/zz_generated.deepcopy.go | Deepcopy generation for new CRD types/field |
| deploy/charts/operator-crds/templates/toolhive.stacklok.dev_mcpservers.yaml | CRD schema update for rateLimiting |
| deploy/charts/operator-crds/files/crds/toolhive.stacklok.dev_mcpservers.yaml | Packaged CRD schema update for rateLimiting |
| docs/operator/crd-api.md | Generated CRD API docs include new rate limit types/field |
| pkg/ratelimit/config.go | Parses CRD rate limit config into runtime config + key derivation |
| pkg/ratelimit/config_test.go | Unit tests for config parsing/validation |
| pkg/ratelimit/limiter.go | Implements Limiter that checks server + per-tool buckets |
| pkg/ratelimit/limiter_test.go | Unit tests for limiter behavior and Redis failure mode |
| pkg/ratelimit/errors.go | JSON-RPC error body + HTTP 429 helpers for rate limiting |
| pkg/ratelimit/errors_test.go | Unit tests for error helpers |
| pkg/ratelimit/internal/bucket/bucket.go | Redis-backed token bucket with atomic Lua script + key prefix helper |
| pkg/ratelimit/internal/bucket/bucket_test.go | Unit tests for bucket consume/refill/expiry/retry-after |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
8559673 to
d1c76e7
Compare
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
d1c76e7 to
0fdb684
Compare
0fdb684 to
9495ef1
Compare
Large PR justification has been provided. Thank you!
|
✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review. |
9495ef1 to
3b8ae2e
Compare
3b8ae2e to
6b4a933
Compare
6b4a933 to
5fbf2bf
Compare
Add RateLimitConfig, RateLimitBucket, and ToolRateLimitConfig CRD types to support configurable token-bucket rate limiting on MCPServer. The rateLimiting field accepts global and per-tool bucket configs with maxTokens and refillPeriod (metav1.Duration) parameters. CEL validation ensures: - At least one of global or tools is configured - Each tool entry has a global bucket set - Redis session storage is required when rateLimiting is set Part of #4551 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Introduce pkg/ratelimit/ with a public Limiter interface and internal token bucket implementation backed by a Redis Lua script. Public API surface is minimal: - Limiter interface with Allow(ctx, toolName, userID) - Decision result type with Allowed and RetryAfter - NewLimiter constructor accepting CRD types directly Internal (pkg/ratelimit/internal/bucket/): - Atomic multi-key Lua script that checks all buckets before consuming from any, preventing rejected per-tool calls from draining the server-level budget - Uses Redis TIME for clock consistency and miniredis testability - Key format and derivation fully encapsulated in the internal package Tests cover all acceptance criteria for global rate limits: - AC3: requests exceeding maxTokens are rejected - AC4: Retry-After computed from the global bucket refill rate - AC5: per-tool limit on one tool does not affect other tools - AC6: request must pass both server-level and per-tool global limits - Atomic multi-bucket: tool rejection does not drain server budget - Redis unavailability returns error (fail-open is caller's job) - No-op limiter when config is nil Part of #4551 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
5fbf2bf to
bfb7eee
Compare
ChrisJBurns
left a comment
There was a problem hiding this comment.
Nice work on the atomic multi-bucket Lua script and the clean internal/public package split. Three issues found — two medium, one low.
| # | Severity | Issue |
|---|---|---|
| 1 | Medium | ToolRateLimitConfig.Global marked +optional but CEL-validated as required |
| 2 | Medium | No duplicate tool name protection on tools array |
| 3 | Low | Negative elapsed time in Lua script not guarded against clock drift |
|
|
||
| // Global defines a server-wide token bucket for this specific tool. | ||
| // +optional | ||
| Global *RateLimitBucket `json:"global,omitempty"` |
There was a problem hiding this comment.
Medium: +optional contradicts CEL-required validation
The +optional marker and omitempty JSON tag tell the API server, kubectl explain, and generated docs that Global is not required, but the CEL rule on line 516 (has(self.global)) immediately rejects objects without it. This creates a confusing UX.
If global is always required (which makes sense since a ToolRateLimitConfig without a bucket is meaningless), remove +optional and omitempty, and use +kubebuilder:validation:Required instead — this eliminates the CEL rule entirely:
| Global *RateLimitBucket `json:"global,omitempty"` | |
| Global *RateLimitBucket `json:"global"` |
Alternatively, if keeping +optional for forward-compatibility with a future perUser field (#4550), add a comment explaining that reasoning.
| // Each entry applies additional rate limits to calls targeting a specific tool name. | ||
| // A request must pass both the server-level limit and the per-tool limit. | ||
| // +optional | ||
| Tools []ToolRateLimitConfig `json:"tools,omitempty"` |
There was a problem hiding this comment.
Medium: No duplicate tool name validation
If a user specifies two entries with the same name (e.g., name: "search" twice), the second silently overwrites the first in the limiter's map[string]*bucket.TokenBucket. Adding +listType=map and +listMapKey=name enables the API server to reject duplicates at admission and makes strategic merge patch work correctly on individual tool entries:
| Tools []ToolRateLimitConfig `json:"tools,omitempty"` | |
| // +listType=map | |
| // +listMapKey=name | |
| // +optional | |
| Tools []ToolRateLimitConfig `json:"tools,omitempty"` |
(This requires removing the existing // +optional line above.)
| if tokens == nil then | ||
| refilled[i] = maxTokens | ||
| else | ||
| local elapsedSec = (now - lastRefill) / 1000000 |
There was a problem hiding this comment.
Low: Negative elapsed time not guarded
If redis.call('TIME') returns a value less than lastRefill (NTP adjustment, leap second, or clock drift during failover), elapsedSec goes negative, which would subtract tokens from the bucket rather than refilling.
Add a guard:
| local elapsedSec = (now - lastRefill) / 1000000 | |
| local elapsedSec = math.max(0, (now - lastRefill) / 1000000) |
Summary
Rate limiting is needed to protect downstream MCP servers from traffic
spikes. RFC THV-0057 specifies a Redis-backed token bucket approach.
This PR lays the foundation for global rate limits on MCPServer:
rateLimitingfield toMCPServerSpecwithglobalandper-tool bucket configs (
maxTokens+refillPeriod)pkg/ratelimit/with a publicLimiterinterface andinternal token bucket backed by an atomic Redis Lua script
config.DurationforrefillPeriodper RFC review feedbackNewLimiterconstructor accepts CRD types directly — no intermediatepublic config types
Middleware wiring, JSON-RPC error response helpers, and per-user rate
limits will follow in stacked PRs (#4550).
Part of #4551
Type of change
Test plan
task test)task lint-fix)14 tests cover:
no-op on nil config, Redis unavailability returns error, userID no-op
Changes
cmd/thv-operator/api/v1alpha1/mcpserver_types.goRateLimitConfig,RateLimitBucket,ToolRateLimitConfigtypes andRateLimitingfieldpkg/ratelimit/limiter.goLimiterinterface,Decisiontype,NewLimiter()constructor accepting CRD types directlypkg/ratelimit/internal/bucket/bucket.goDoes this introduce a user-facing change?
New
rateLimitingfield onMCPServerCRD. Not yet functional — middlewarewiring will follow in a subsequent PR.
Special notes for reviewers
redis.call('TIME')for clock consistency across replicas(and to enable
miniredis.FastForward()in tests)preventing tool rejections from draining the server budget
pkg/ratelimit/internal/bucket/is intentionally internal — only theLimiterinterface and
Decisiontype are publicLarge PR Justification
This exceeds the 400-line guideline but is a single cohesive new package (
pkg/ratelimit/) that cannot be meaningfully split further — the limiter and internal bucket form one logical unit.Generated with Claude Code