Route MCP sessions to specific StatefulSet pods via headless DNS#4638
Route MCP sessions to specific StatefulSet pods via headless DNS#4638
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4638 +/- ##
==========================================
- Coverage 68.88% 68.80% -0.09%
==========================================
Files 505 505
Lines 52437 52521 +84
==========================================
+ Hits 36121 36135 +14
- Misses 13525 13589 +64
- Partials 2791 2797 +6 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
When backendReplicas > 1, the proxy was storing the ClusterIP service
URL as backend_url on session initialize. After a proxy-runner restart,
Redis-recovered sessions were routed through the ClusterIP, which sent
requests to a random pod — one that had never seen the initialize and
returned -32001 ("session not found").
Two root causes fixed:
1. StatefulSet serviceName was set to containerName instead of
mcp-<name>-headless, so Kubernetes never created pod DNS records.
2. On initialize, the proxy now picks a random pod ordinal and stores
its headless DNS URL (e.g. myserver-0.mcp-myserver-headless.default
.svc.cluster.local) as backend_url instead of the ClusterIP. The
Rewrite closure already reads backend_url to route follow-up
requests, so sessions now survive proxy-runner restarts.
The operator populates a new HeadlessServiceConfig (StatefulSet name,
headless service name, namespace, replica count) in ScalingConfig when
backendReplicas > 1. Single-replica and non-Kubernetes deployments are
unaffected.
Fixes #4575
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
Large PR justification has been provided. Thank you!
|
✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review. |
There was a problem hiding this comment.
Pull request overview
This PR fixes MCP session routing in Kubernetes when backendReplicas > 1 by ensuring sessions can be routed back to a specific StatefulSet pod (via headless DNS) after a proxy-runner restart, instead of being re-randomized through the ClusterIP service.
Changes:
- Fix StatefulSet
serviceNameto match the created headless service name so per-pod DNS records are generated. - Add headless-service routing config to the operator/runconfig path and transport layer, and attempt to store a pod-specific headless DNS URL in session metadata.
- Add unit + E2E coverage around multi-replica session routing and proxy-runner restart behavior.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| test/e2e/thv-operator/virtualmcp/mcpserver_scaling_test.go | New E2E tests for Redis-backed sessions across replicas and proxy-runner restart routing. |
| pkg/transport/types/transport.go | Adds pluggable SessionStorage to transport config for non-local session stores (e.g., Redis). |
| pkg/transport/proxy/transparent/transparent_proxy.go | Adds headless-service routing option and stores per-session backend_url metadata. |
| pkg/transport/proxy/transparent/backend_routing_test.go | Unit tests for storing pod-specific backend_url when headless routing is configured. |
| pkg/transport/http.go | Wires session storage + headless routing options into the HTTP transport’s proxy creation. |
| pkg/transport/factory.go | Threads SessionStorage into transports; adds a transport option for headless routing. |
| pkg/runner/runner.go | Creates Redis session storage from scaling config and passes headless routing settings into transports. |
| pkg/runner/config.go | Extends scaling config with HeadlessService metadata for building per-pod DNS URLs. |
| pkg/container/kubernetes/client.go | Fixes StatefulSet serviceName to mcp-<name>-headless to enable StatefulSet pod DNS. |
| cmd/thv-operator/controllers/mcpserver_runconfig.go | Operator populates new headless routing config into the generated RunConfig. |
| .codespellrc | Adds clientA to ignore list for the new tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } | ||
| } | ||
|
|
||
| // WithPodHeadlessService configures pod-specific routing for multi-replica StatefulSet deployments. | ||
| // When set, each new MCP session is pinned to a randomly selected pod via its headless DNS name | ||
| // (e.g. myserver-0.mcp-myserver-headless.default.svc.cluster.local) so that session routing | ||
| // survives proxy-runner restarts without being sent to the wrong pod. | ||
| // The option is a no-op when replicas <= 1 or when any required field is empty. | ||
| func WithPodHeadlessService(statefulSetName, serviceName, namespace string, replicas int32) Option { | ||
| return func(p *TransparentProxy) { | ||
| if statefulSetName == "" || serviceName == "" || namespace == "" || replicas <= 1 { | ||
| return | ||
| } | ||
| p.headlessService = &podHeadlessService{ | ||
| statefulSetName: statefulSetName, | ||
| serviceName: serviceName, | ||
| namespace: namespace, | ||
| replicas: replicas, | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
blocker: Make headless DNS routing the default for all Kubernetes deployments, and remove the headlessSetter type assertion pattern.
Two related issues:
1. Headless routing should be unconditional for Kubernetes. For a single-replica StatefulSet, myserver-0.mcp-myserver-headless.ns.svc.cluster.local is deterministic (always ordinal 0) and the headless service already exists because the operator creates it. This would eliminate the if val > 1 guard in populateScalingConfig, remove the replicas <= 1 no-op here, and give you one routing code path instead of two. pickPodBackendURL degenerates correctly — rand.Int(rand.Reader, big.NewInt(1)) always returns 0. Non-Kubernetes deployments are unaffected since HeadlessServiceConfig stays nil.
2. The headlessSetter anonymous interface in factory.go is unsafe. If a transport doesn't implement the unexported method, the option silently no-ops — no compile-time check, no runtime warning, no error returned. This means a misconfiguration (e.g. applying the option to a transport type that doesn't support it) is invisible. Put this on types.Config like SessionStorage, so it flows through the same plumbing and doesn't rely on type assertions against unexported methods.
Summary
When backendReplicas > 1, the proxy was storing the ClusterIP service URL as backend_url on session initialize. After a proxy-runner restart, Redis-recovered sessions were routed through the ClusterIP, which sent requests to a random pod — one that had never seen the initialize and returned -32001 ("session not found").
Two root causes fixed:
StatefulSet serviceName was set to containerName instead of mcp--headless, so Kubernetes never created pod DNS records.
On initialize, the proxy now picks a random pod ordinal and stores its headless DNS URL (e.g. myserver-0.mcp-myserver-headless.default .svc.cluster.local) as backend_url instead of the ClusterIP. The Rewrite closure already reads backend_url to route follow-up requests, so sessions now survive proxy-runner restarts.
The operator populates a new HeadlessServiceConfig (StatefulSet name, headless service name, namespace, replica count) in ScalingConfig when backendReplicas > 1. Single-replica and non-Kubernetes deployments are unaffected.
Fixes #4575
Type of change
Test plan
task test)task test-e2e)task lint-fix)Changes
Does this introduce a user-facing change?
Special notes for reviewers
Large PR Justification
This is a complete PR that solves a bug, and is an isolated functionality. Cannot be split.