RFC-0060: Starlark Session Initialization for vMCP by jerm-dro · Pull Request #60 · stacklok/toolhive-rfcs

jerm-dro · 2026-03-25T03:52:01Z

Introduce a Starlark-based session initialization script for vMCP. A single script runs once per session, receives discovered backends and their capabilities, and calls publish() to declare what the agent sees — optionally wrapping handlers with additional logic. Existing config knobs remain fully supported, but customization of vMCP behavior can now be exactly tailored to the use case without adding more knobs. Increasing configurability no longer means decreasing maintainability.
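To make the model above concrete, here is a minimal sketch in Python (standing in for Starlark, which shares most of its syntax). It assumes `backends()` yields a dict of backend name to a dict of tool handlers and that `publish()` takes a name and a handler; both shapes, the `with_logging` wrapper, and `run_session_init` are hypothetical illustrations, not the RFC's actual signatures.

```python
def with_logging(name, handler, log):
    """Wrap a handler so every call is recorded before being forwarded."""
    def wrapped(args):
        log.append(("call", name, dict(args)))
        return handler(args)
    return wrapped


def run_session_init(backends):
    """Hypothetical stand-in for one run of the session init script.

    `backends` maps backend name -> {tool name -> handler}; each handler
    takes a single dict argument.
    """
    published = {}
    log = []

    def publish(name, handler):
        # Declares a tool the agent will see for this session.
        published[name] = handler

    for backend_name in sorted(backends):
        for tool_name, handler in sorted(backends[backend_name].items()):
            full_name = backend_name + "." + tool_name
            publish(full_name, with_logging(full_name, handler, log))
    return published, log
```

The script body runs once; only the wrapped handlers run per call, which is what lets a wrapper add behavior without a new config knob.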

rfcs/THV-0059-starlark-programmable-middleware.md

rfcs/THV-0060-starlark-programmable-middleware.md

yrobla

Strong RFC overall — the problem framing, prior art, and alternatives section are all excellent. The backends() + publish() model is clean, and the default preset sketch makes backward compatibility concrete rather than aspirational.

Filing 3 blockers, 3 should-address items, 2 questions, and 2 nits.

rfcs/THV-0060-starlark-programmable-middleware.md

JAORMX

I like the proposal and think that, overall, this is a good step toward more customizability and programmability in vMCP. I'd be wary of relying fully on this mechanism for vMCP's programmability. API gateways tend to offer scripting languages as pluggability options (as you mentioned, Envoy's use of Lua), but they often don't recommend them for critical paths and instead opt for other types of programmability or extensions, e.g. ext_proc via a sidecar, which is quite standard to see in the wild, or even C++ extensions, since they support those too. For us it's a little tricky to know without prior data. So I'd say let's start gathering it by adding this mechanism, but still opt for built-in middleware for extensibility, since that's the paved path we have today. For example, rate limiting should still be implemented via dedicated middleware, IMO.

yrobla

Good RFC overall — the core argument (config knob interactions grow quadratically, Starlark + built-ins inverts this) is solid and the prior art section grounds it well. Comments below on a few specific areas.

yrobla · 2026-03-31T10:58:48Z

rfcs/THV-0060-starlark-programmable-middleware.md

+- Production-quality `backends()`, `publish()`, `metadata()` built-ins
+- Name resolution, filtering, and overrides via the `default` preset
+- Rate limiting integration
+- `thv vmcp show-preset` command to inspect built-in presets


Restructuring to ship optimizer + authz together makes sense — the bypass isn't a "we'll fix it later" trade-off, it's a correctness bug. find_tool's dispatch table hands out access to tools Cedar would block (#4374). The interim fix in #4385 plugs one hole but the model is still broken, so shipping the optimizer alone would be knowingly shipping a privilege escalation.

On rate limiting: have you considered just implementing THV-0057 as a check_rate_limit() built-in rather than standalone middleware? Mechanism is identical (Redis token bucket), the default preset reads the same rateLimiting config block so non-script users see zero difference, and you avoid writing middleware that you're going to replace anyway. The main gap is that without current_user() you can only key on tool name — no per-user limits yet. That's fine as long as the Phase 2 scope note is explicit about it, otherwise it'll land as a disappointment.
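As a rough illustration of the built-in being suggested here, a minimal Python sketch follows. It swaps the Redis backing for an in-memory token bucket so it runs standalone, and it keys only on tool name, matching the gap noted above. The `check_rate_limit` signature, the `rate_limited` wrapper, and the default limits are assumptions for illustration, not THV-0057's API.

```python
import time


class TokenBucket:
    """In-memory token bucket; a stand-in for the Redis-backed one."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def take(self):
        # Refill based on elapsed time, capped at capacity, then spend one token.
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


_buckets = {}


def check_rate_limit(tool_name, capacity=2, refill_per_sec=1.0, clock=time.monotonic):
    """Hypothetical built-in: one bucket per tool name, no per-user key."""
    bucket = _buckets.get(tool_name)
    if bucket is None:
        bucket = _buckets[tool_name] = TokenBucket(capacity, refill_per_sec, clock)
    return bucket.take()


def rate_limited(tool_name, handler, **limit):
    """Wrap a handler so calls over the limit return an isError response dict."""
    def wrapped(args):
        if not check_rate_limit(tool_name, **limit):
            return {"isError": True,
                    "content": [{"type": "text", "text": "rate limit exceeded"}]}
        return handler(args)
    return wrapped
```

Per-user limits would need the bucket key to include the caller's identity, which is why the absence of `current_user()` matters.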

Thanks for the thorough follow-up. On rate limiting — we discussed this in the gateway channel and landed on shipping THV-0057 as standalone middleware first. The main reason is that rate limiting needs to cover both MCPServers and MCPRemoteProxy, not just vMCP sessions. The middleware approach gives us production data on usage patterns before we commit to the Starlark model. The plan is to benchmark both implementations (middleware vs. Starlark built-in with the same Redis backing) once the middleware is stable. I've updated the "Interaction with rate limiting" section to reflect this.

On phases — restructured the implementation plan into three phases:

Phase 1: POC — validates the programming model and benchmarks Starlark vs. native Go middleware for rate limiting.

Phase 2: Backwards compatibility — full feature parity (including optimizer + authz shipped together, current_user(), composite tool deprecation). This phase is substantial and will ship incrementally.

Phase 3: New functionality — speculative capabilities like scrub_pii(), code mode, and future built-ins.

yrobla · 2026-03-31T10:58:48Z

rfcs/THV-0060-starlark-programmable-middleware.md

+
+| Built-in | Signature | Description |
+|----------|-----------|-------------|
+| `current_user()` | `current_user() → struct(sub, email, groups)` | Returns the authenticated user's identity. The user is known at init time, but this built-in is deferred to a future version. |


Worth nailing down the semantics before this ships: is current_user() meant to be called at script top-level only (gives you the session-creation user), or also inside handler closures (gives you the request-time user)? Use Case 5 calls it inside a wrapper function that runs per-request — that only works if it reflects the current request's identity. Those are two pretty different contracts and threading request context into handler calls has real implementation implications. If you leave this ambiguous now you'll probably end up with a breaking change to handler semantics when current_user() actually lands.

current_user() returns the same user for the lifetime of the session — the user who created it. There is no distinction between top-level and handler calls; it is the same value everywhere. This simplifies the implementation (no request context threading) and matches the session-scoped execution model. Updated the built-in description to clarify this.
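A small Python sketch of that contract, with Python standing in for Starlark: the user is bound once when the session is built, so a top-level call and a call inside a handler closure return the same value. `make_session`, `init_script`, and the `whoami` tool are hypothetical names used only for illustration.

```python
from collections import namedtuple

# Mirrors the struct(sub, email, groups) shape from the built-ins table.
User = namedtuple("User", ["sub", "email", "groups"])


def make_session(user):
    """Build the built-ins for one session; `user` is fixed at creation."""
    published = {}

    def current_user():
        # No request context is consulted: same value for the session lifetime.
        return user

    def publish(name, handler):
        published[name] = handler

    return current_user, publish, published


def init_script(current_user, publish):
    """Stand-in for a session init script."""
    top_level = current_user()  # evaluated once, at session creation

    def whoami(args):
        inside = current_user()  # evaluated on every tool call
        return {"sub": inside.sub, "same_as_top_level": inside == top_level}

    publish("whoami", whoami)
```

Under this contract a handler can never observe a different caller than the one who created the session.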

yrobla · 2026-03-31T10:58:48Z

rfcs/THV-0060-starlark-programmable-middleware.md

+- `tools/list` responses are filtered to remove tools the caller isn't authorized to use
+- `tools/call` requests are gated — unauthorized calls return 403
+
+This means publishing a tool via `publish()` does not bypass authorization. The script controls *what tools exist and how they behave*; the authz middleware controls *who can see and use them*.


The RFC is careful to separate "script controls what tools exist" from "Cedar controls who can use them" — but current_user() makes that line leaky. An admin can write if "admin" in current_user().groups: publish(admin_tool, fn) and now you've got authz logic in two places. That might be totally fine given the trust model (admins write scripts), but it's worth one sentence in the security section saying whether identity-based tool filtering in scripts is intentional or something you want to steer people away from in favor of Cedar.

Fair point. Identity-based tool filtering in scripts is intentional and complementary to Cedar — there's no one-size-fits-all answer here. Administrators who prefer the declarative Cedar policy model can continue using it for access control. Those who need something simpler or prefer an imperative approach can express it in the script. Both are valid and coexist. Added a note to the security section clarifying this.
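A minimal sketch of how the two layers could coexist, in Python standing in for Starlark. The script decides what it publishes based on group membership, and a separate policy check (a stub here, in place of Cedar) still filters what the caller sees. The tool names, `visible_tools`, and the `policy_allows` callback are hypothetical.

```python
from collections import namedtuple

User = namedtuple("User", ["sub", "email", "groups"])


def session_init(current_user, publish):
    """Imperative, identity-based filtering inside the script."""
    publish("search", lambda args: {"content": "results"})
    if "admin" in current_user().groups:
        publish("purge_cache", lambda args: {"content": "purged"})


def visible_tools(user, policy_allows):
    """Run the script, then apply the authz layer on top of what it published."""
    published = {}

    def publish(name, handler):
        published[name] = handler

    session_init(lambda: user, publish)
    # policy_allows stands in for the Cedar decision made by the authz middleware.
    return sorted(name for name in published if policy_allows(user, name))
```

Either layer can hide a tool, so a tool is visible only if the script published it and the policy allows it.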

yrobla · 2026-03-31T10:58:48Z

rfcs/THV-0060-starlark-programmable-middleware.md

+|--------|----------|--------------------|
+| `default` | Reads existing config knobs (`aggregation`, `optimizer`, etc.) and produces identical behavior to the current config-driven system. Applies filtering, renaming, conflict resolution, and optimizer behavior based on what's configured. | All existing config |
+
+A single `default` preset handles all existing config knobs. When no `sessionInit` block is present, vMCP uses the `default` preset, which reads the existing config fields and produces identical behavior. There is no separate legacy code path — the Starlark engine is the single implementation.


Replacing the existing code paths with the default preset is clean, but it also means the preset is now a single point of failure for everyone who hasn't written a custom script. The equivalence tests cover correctness, but what about a bug that ships mid-upgrade? Is there a way to pin to a preset version, or is "fork it via show-preset" the intended escape hatch? Just worth thinking through — a subtle behavior change in the preset silently affects all existing deployments at once.

show-preset + fork is the intended escape hatch. If the default preset has a bug, an admin can show-preset default, copy the source, and use sessionInit.scriptFile to pin to their known-good copy. Preset versioning adds complexity we don't need — the equivalence tests gate the transition, and the fork path provides a recovery mechanism without introducing a versioning scheme.

yrobla · 2026-03-31T10:58:48Z

rfcs/THV-0060-starlark-programmable-middleware.md

+
+## Open Questions
+
+1. **Sessionless MCP requests**: What happens when MCP supports requests without sessions? Do we have to run this heavy script on every request? We could actually run the script once at startup, since it does not depend on request-time information. However, if we fold in authz concerns from above, then `current_user()` will be request-time information. We could cheat around this by recommending all logic which depends on `current_user()` be placed at the end of the script. When that's encountered during startup, we block and restore the state on each request. Alternatively, we could support two different scripts. One for initialization and one per-request.


The "put current_user() logic at the end" heuristic is going to confuse people. If sessionless MCP is a real possibility, I'd just commit to a two-hook model now: on_session_init() for what tools exist and how they're shaped, on_request() for per-request concerns like rate limiting and user-specific filtering. It maps directly onto the distinction the RFC already makes, survives the sessionless transition cleanly, and removes a footgun where script ordering determines correctness in a non-obvious way.

I like the two-hook model idea. We don't need to commit to it now since the single-script model covers Phase 2, but it's a strong candidate if sessionless MCP lands. Updated the open question to note on_session_init() + on_request() as a promising direction and removed the "put current_user() at the end" heuristic.
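For reference, one way the two-hook split could look, sketched in Python as a stand-in for Starlark. Only the hook names `on_session_init()` and `on_request()` come from the discussion above; their parameters, the `Gateway` class, and the `admin_` naming rule are assumptions made for illustration.

```python
def on_session_init(backends, publish):
    """Decides what tools exist and how they are named; no request-time data."""
    for backend_name, tools in backends.items():
        for tool_name, handler in tools.items():
            publish(backend_name + "_" + tool_name, handler)


def on_request(user, tool_name):
    """Per-request concerns. Returns None to allow, or an isError dict to reject."""
    if tool_name.startswith("admin_") and "admin" not in user["groups"]:
        return {"isError": True,
                "content": [{"type": "text", "text": "forbidden"}]}
    return None


class Gateway:
    """Hypothetical host: runs the init hook once, the request hook on every call."""

    def __init__(self, backends):
        self.tools = {}
        on_session_init(backends, self.tools.__setitem__)

    def call(self, user, tool_name, args):
        rejection = on_request(user, tool_name)
        if rejection is not None:
            return rejection
        return self.tools[tool_name](args)
```

Because `on_session_init()` never touches the user, its result could be computed once at startup, which is the property that matters if sessionless MCP lands.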

Proposes extending vMCP's Starlark engine from composite-tool-only scripting into a unified programmable middleware surface, replacing the growing set of independent config knobs (optimizer, filter, rate limiting, PII scrubbing) with a single script per session.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Jeremy Drouillard <jeremy@stacklok.com>

- Rename "middleware script" → "session initialization script" throughout
- backends() returns dict[string, Backend] instead of flat tools() list
- Handler functions take single dict arg instead of **kwargs
- metadata() requires all fields (name, description, parameters, annotations)
- Rewrite presets section: no config nesting, auto-generation from existing config
- Update sequence diagram: remove Starlark Engine, use MultiSession
- Restructure implementation phases: feature parity first, then new capabilities
- Simplify Cedar interaction section
- Remove THV-0058 reference
- Update open questions (remove resolved, add error handling)
- Rename CRD to VirtualMCPSessionInitScript
- Rename config struct to SessionInitConfig

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Rewrite authz model: scripts see all backends, authz filters at runtime
- Add Background section explaining how authorization works today
- Replace call_tool() with saved handler dict pattern in all examples
- Collapse three presets into single `default` preset with ~80-line sketch
- Add handler timeout support (optional kwarg)
- Move current_user() to future built-ins (user known at init, deferred)
- Fix scrub_pii() as future built-in example, not v0 deliverable
- Add config() built-in for preset access to persona config
- Expand Alternatives Considered: configuration approaches + language considerations (Starlark, Risor, Wasm, Lua, OPA/Rego)
- Add 4-phase implementation plan (POC → production → deprecate → ship)
- Update Why Now with cost equation argument
- Clarify scope: enabling new capabilities, not shipping them

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Reference stacklok/toolhive#4373 as a concrete example of feature interaction pain in the Problem Statement. Add open question about whether authz decisions should move into Starlark to unify the "who sees what?" model.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The optimizer/Cedar issues (#4373, #4374) are about the authz boundary, not config knob combinations. Move them to Open Question 2 where they motivate pulling authz into Starlark. Use #4287 as the Problem Statement example instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add "Prior Art: Gateway Configurability Patterns" covering Envoy, Kong, and the Configuration Complexity Clock. Add vMCP feature dependency diagram to the Problem Statement.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…tion Split monolithic "Config knob combinations" into two focused sections: "The configurability problem" (user-facing interaction examples) and "The maintainability problem" (developer-facing quadratic cost). Move Background (prior art + authz) under Proposed Solution where it provides context for the design.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Reframe problem as configurability vs maintainability tension. Move bug evidence and dependency diagram to maintainability section. Remove feature table (duplicative with diagram). Update summary to mirror problem framing. Add links to Envoy, Kong, and Configuration Complexity Clock resources.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Elaborate rate limiting interaction questions (composite tools, groups). Add optimizer × authz bugs (#4373, #4374) to maintainability section. Replace PII hypothetical with concrete framing. Add Envoy and Kong source code snippets with GitHub links to prior art section.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace hand-written code snippets with examples from official documentation. Link to doc pages instead of source files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Move Motivating Use Cases to Appendix A
- Update Last Updated date to 2026-03-27
- Replace scrub_pii decorator example with v0 logging decorator
- Restore missing filter decorator in architecture diagram
- Remove duplicate composite tools mentions from presets/config sections
- Note composite tools support is optional in compatibility section
- Fix Open Question 3 bold formatting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The CI validator checks all new files under rfcs/ for the THV-#### naming convention. Move the image to images/ at the repo root.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add Alternative 2 covering Go code refactoring as a competing approach. Acknowledge its value while noting it doesn't address configurability or cross-cutting concerns. Include AI-driven development observation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Fix priority_order.index() crash for unranked backends in default preset
- Inline FIND_TOOL_SCHEMA and CALL_TOOL_SCHEMA in default preset sketch
- Add when_unavailable parameter to elicit() for non-elicitation clients
- Make preset/script/scriptFile mutual exclusion a hard validation error
- Resolve error handling open question: MCP-standard isError response dicts
- Keep existing decorators in Phase 1 POC, delete in later phases
- Clarify session scope: runs once per session creation or Redis restore
- Restructure rollout: safe capabilities first, optimizer + authz together
- Add warning about handler dispatch bypassing Cedar authz (#4374)
- Reference PR #4385 as interim fix for optimizer + authz bypass
- Resolve authz open question: ship with optimizer, defer exact design
- Add thv vmcp list-presets command

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Restructure implementation plan into 3 phases: POC (with rate limiting benchmark), backwards compatibility, new functionality
- Update rate limiting interaction section: THV-0057 ships as standalone middleware first, Starlark built-in benchmarked later
- Clarify current_user() returns same value for session lifetime
- Add security section note on identity-based filtering being intentional and complementary to Cedar
- Rewrite Open Question 1: note two-hook model as future direction for sessionless MCP, remove ordering heuristic

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

jerm-dro force-pushed the jerm/2026-03-24-starlark-programmable-middleware branch from b57f13b to 1fde39a Compare March 25, 2026 03:54

jerm-dro commented Mar 25, 2026

jerm-dro changed the title ~~RFC-0059: Starlark as programmable middleware for vMCP~~ RFC-0060: Starlark Session Initialization for vMCP Mar 26, 2026

jerm-dro requested review from JAORMX and reyortiz3 March 28, 2026 02:24

yrobla requested changes Mar 30, 2026

JAORMX reviewed Mar 31, 2026

yrobla reviewed Mar 31, 2026

jerm-dro mentioned this pull request Mar 31, 2026

RFC-0058: Inline aggregator into session factory and extract filter decorator #58

Closed

jerm-dro and others added 19 commits April 7, 2026 14:06

Add current_user() to future built-ins list in use cases intro

e52c4ce

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

my edits

cf7b52d

Rename RFC from THV-0059 to THV-0060 to match PR number

69f0ba1

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add prior art section and feature dependency diagram

54174d1

Use verbatim doc examples for Envoy and Kong prior art

d6e907b

Move image to repo-level images/ to fix naming validation

780b4b2

Move image to assets/0060/ per repo conventions

be01816

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

jerm-dro force-pushed the jerm/2026-03-24-starlark-programmable-middleware branch from 4df7d2f to a4dcada Compare April 7, 2026 22:02


3 participants
