
RFC-0060: Starlark Session Initialization for vMCP #60

Draft
jerm-dro wants to merge 19 commits into main from jerm/2026-03-24-starlark-programmable-middleware

@jerm-dro (Contributor) commented Mar 25, 2026

Introduce a Starlark-based session initialization script for vMCP. A single script runs once per session, receives discovered backends and their capabilities, and calls publish() to declare what the agent sees — optionally wrapping handlers with additional logic. Existing config knobs remain fully supported, but customization of vMCP behavior can now be exactly tailored to the use case without adding more knobs. Increasing configurability no longer means decreasing maintainability.
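To make the model above concrete, here is a minimal Python sketch (Starlark is a Python dialect, so the shape carries over). The `backends()` and `publish()` stubs, the `github` backend, the name-prefixing scheme, and the `with_logging` wrapper are all illustrative assumptions, not the RFC's actual built-ins or signatures.

```python
# Hypothetical stand-ins for the vMCP built-ins; the real ones are defined by the RFC.
PUBLISHED = {}  # tool name -> (metadata, handler)


def backends():
    """Stub: discovered backends as a dict of name -> backend with tool handlers."""
    return {
        "github": {
            "tools": {
                "create_issue": lambda args: {"content": "created " + args["title"]},
            }
        },
    }


def publish(meta, handler):
    """Stub: declare one tool that the agent will see."""
    PUBLISHED[meta["name"]] = (meta, handler)


def with_logging(name, handler, log):
    """Wrap a handler with additional logic; handlers take a single dict argument."""
    def wrapped(args):
        log.append(name)
        return handler(args)
    return wrapped


def session_init(log):
    """Runs once per session: walk the backends and publish what the agent should see."""
    for backend_name, backend in backends().items():
        for tool_name, handler in backend["tools"].items():
            full_name = backend_name + "_" + tool_name  # assumed naming scheme
            meta = {
                "name": full_name,
                "description": "",
                "parameters": {},
                "annotations": {},
            }
            publish(meta, with_logging(full_name, handler, log))


calls = []
session_init(calls)
result = PUBLISHED["github_create_issue"][1]({"title": "bug"})
```

The script decides the published surface once; the wrapper then runs on every call to the published handler.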

@jerm-dro jerm-dro force-pushed the jerm/2026-03-24-starlark-programmable-middleware branch from b57f13b to 1fde39a Compare March 25, 2026 03:54
@jerm-dro jerm-dro changed the title RFC-0059: Starlark as programmable middleware for vMCP RFC-0060: Starlark Session Initialization for vMCP Mar 26, 2026
@jerm-dro jerm-dro requested review from JAORMX and reyortiz3 March 28, 2026 02:24
@yrobla (Contributor) left a comment

Strong RFC overall — the problem framing, prior art, and alternatives section are all excellent. The backends() + publish() model is clean, and the default preset sketch makes backward compatibility concrete rather than aspirational.

Filing 3 blockers, 3 should-address items, 2 questions, and 2 nits.

@JAORMX (Contributor) left a comment

I like the proposal and think that, overall, this is a good step toward more customizability and programmability in vMCP. That said, I'd be wary of relying on this mechanism alone for vMCP's programmability. API gateways tend to offer scripting languages as a pluggability option (as you mentioned, Envoy's use of Lua), but they often don't recommend them for critical paths and instead steer users toward other kinds of extensions, e.g. ext_proc via a sidecar, which is quite standard to see in the wild, or even C++ extensions, since those are supported too. For us it's a little tricky to know without previous data. So I'd say let's start gathering that data by adding this, but still opt for built-in middleware for extensibility, since that's the paved path we have today. For example, rate limiting should still be implemented via dedicated middleware, IMO.

@yrobla (Contributor) left a comment

Good RFC overall — the core argument (config knob interactions grow quadratically, Starlark + built-ins inverts this) is solid and the prior art section grounds it well. Comments below on a few specific areas.

- Production-quality `backends()`, `publish()`, `metadata()` built-ins
- Name resolution, filtering, and overrides via the `default` preset
- Rate limiting integration
- `thv vmcp show-preset` command to inspect built-in presets

Contributor:

Restructuring to ship optimizer + authz together makes sense — the bypass isn't a "we'll fix it later" trade-off, it's a correctness bug. find_tool's dispatch table hands out access to tools Cedar would block (#4374). The interim fix in #4385 plugs one hole but the model is still broken, so shipping the optimizer alone would be knowingly shipping a privilege escalation.

On rate limiting: have you considered just implementing THV-0057 as a check_rate_limit() built-in rather than standalone middleware? Mechanism is identical (Redis token bucket), the default preset reads the same rateLimiting config block so non-script users see zero difference, and you avoid writing middleware that you're going to replace anyway. The main gap is that without current_user() you can only key on tool name — no per-user limits yet. That's fine as long as the Phase 2 scope note is explicit about it, otherwise it'll land as a disappointment.
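For reference, the token-bucket mechanism mentioned above can be sketched as follows. This is an in-memory stand-in, not the Redis-backed THV-0057 design; the `check_rate_limit` parameters (`capacity`, `refill_per_sec`, `now`) are hypothetical, and the bucket is keyed on tool name only, matching the gap noted about `current_user()`.

```python
import time


class TokenBucket:
    """In-memory token bucket; the proposal would back this with Redis instead."""

    def __init__(self, capacity, refill_per_sec, now=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.now = now
        self.tokens = float(capacity)
        self.last = now()

    def allow(self):
        # Refill in proportion to elapsed time, capped at capacity.
        t = self.now()
        elapsed = t - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = t
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


_buckets = {}  # one bucket per tool name (no per-user key without current_user())


def check_rate_limit(tool_name, capacity=2, refill_per_sec=1.0, now=time.monotonic):
    """Hypothetical built-in: True if the call to tool_name may proceed."""
    if tool_name not in _buckets:
        _buckets[tool_name] = TokenBucket(capacity, refill_per_sec, now)
    return _buckets[tool_name].allow()
```

A wrapped handler would call `check_rate_limit(name)` first and return an error response when it yields `False`.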

Contributor (Author):

Thanks for the thorough follow-up. On rate limiting — we discussed this in the gateway channel and landed on shipping THV-0057 as standalone middleware first. The main reason is that rate limiting needs to cover both MCPServers and MCPRemoteProxy, not just vMCP sessions. The middleware approach gives us production data on usage patterns before we commit to the Starlark model. The plan is to benchmark both implementations (middleware vs. Starlark built-in with the same Redis backing) once the middleware is stable. I've updated the "Interaction with rate limiting" section to reflect this.

On phases — restructured the implementation plan into three phases:

  • Phase 1: POC — validates the programming model and benchmarks Starlark vs. native Go middleware for rate limiting.
  • Phase 2: Backwards compatibility — full feature parity (including optimizer + authz shipped together, current_user(), composite tool deprecation). This phase is substantial and will ship incrementally.
  • Phase 3: New functionality — speculative capabilities like scrub_pii(), code mode, and future built-ins.


| Built-in | Signature | Description |
|----------|-----------|-------------|
| `current_user()` | `current_user() → struct(sub, email, groups)` | Returns the authenticated user's identity. The user is known at init time, but this built-in is deferred to a future version. |

Contributor:

Worth nailing down the semantics before this ships: is current_user() meant to be called at script top-level only (gives you the session-creation user), or also inside handler closures (gives you the request-time user)? Use Case 5 calls it inside a wrapper function that runs per-request — that only works if it reflects the current request's identity. Those are two pretty different contracts and threading request context into handler calls has real implementation implications. If you leave this ambiguous now you'll probably end up with a breaking change to handler semantics when current_user() actually lands.

Contributor (Author):

current_user() returns the same user for the lifetime of the session — the user who created it. There is no distinction between top-level and handler calls; it is the same value everywhere. This simplifies the implementation (no request context threading) and matches the session-scoped execution model. Updated the built-in description to clarify this.
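A small Python sketch of that contract, assuming the built-in is bound once when the session is created. `make_session` and the sample user are hypothetical; only the `sub`, `email`, and `groups` fields come from the built-in table.

```python
from collections import namedtuple

User = namedtuple("User", ["sub", "email", "groups"])


def make_session(user):
    """Build the per-session current_user(); its value is fixed at session creation."""
    def current_user():
        return user
    return current_user


# Hypothetical session created by an authenticated user.
current_user = make_session(User("u-123", "dev@example.com", ["admin"]))

# Called at script top level.
top_level_user = current_user()


def handler(args):
    # Called per request, yet it sees the same session-creation user.
    return {"content": current_user().sub}


handler_user_sub = handler({})["content"]
```

Because the value is captured in a closure, no request context needs to be threaded into handler calls.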

- `tools/list` responses are filtered to remove tools the caller isn't authorized to use
- `tools/call` requests are gated — unauthorized calls return 403

This means publishing a tool via `publish()` does not bypass authorization. The script controls *what tools exist and how they behave*; the authz middleware controls *who can see and use them*.

Contributor:

The RFC is careful to separate "script controls what tools exist" from "Cedar controls who can use them" — but current_user() makes that line leaky. An admin can write if "admin" in current_user().groups: publish(admin_tool, fn) and now you've got authz logic in two places. That might be totally fine given the trust model (admins write scripts), but it's worth one sentence in the security section saying whether identity-based tool filtering in scripts is intentional or something you want to steer people away from in favor of Cedar.

Contributor (Author):

Fair point. Identity-based tool filtering in scripts is intentional and complementary to Cedar — there's no one-size-fits-all answer here. Administrators who prefer the declarative Cedar policy model can continue using it for access control. Those who need something simpler or prefer an imperative approach can express it in the script. Both are valid and coexist. Added a note to the security section clarifying this.
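The two layers can be sketched side by side in Python. Everything here is illustrative: `run_script`, `list_tools`, `call_tool`, and the `is_authorized` callback are hypothetical names, the policy is a stand-in for a Cedar decision, and the `{"status": 403}` dict only models the 403 described in the RFC excerpt.

```python
def run_script(user_groups):
    """Script layer: decides which tools exist for this session."""
    published = {"search": lambda args: {"content": "results"}}
    if "admin" in user_groups:
        # Identity-based filtering expressed imperatively in the script.
        published["admin_tool"] = lambda args: {"content": "done"}
    return published


def list_tools(published, is_authorized):
    """Authz layer: tools/list only shows tools the caller may use."""
    return sorted(name for name in published if is_authorized(name))


def call_tool(published, is_authorized, name, args):
    """Authz layer: tools/call is gated before the handler runs."""
    if name not in published or not is_authorized(name):
        return {"status": 403}
    return published[name](args)


# Hypothetical policy standing in for Cedar: everything except "search" is allowed.
policy = lambda name: name != "search"

admin_session = run_script(["admin"])
visible = list_tools(admin_session, policy)
```

Publishing a tool in the script does not make it callable on its own; the policy check still runs on both listing and invocation.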

|--------|----------|--------------------|
| `default` | Reads existing config knobs (`aggregation`, `optimizer`, etc.) and produces identical behavior to the current config-driven system. Applies filtering, renaming, conflict resolution, and optimizer behavior based on what's configured. | All existing config |

A single `default` preset handles all existing config knobs. When no `sessionInit` block is present, vMCP uses the `default` preset, which reads the existing config fields and produces identical behavior. There is no separate legacy code path; the Starlark engine is the single implementation.

Contributor:

Replacing the existing code paths with the default preset is clean, but it also means the preset is now a single point of failure for everyone who hasn't written a custom script. The equivalence tests cover correctness, but what about a bug that ships mid-upgrade? Is there a way to pin to a preset version, or is "fork it via show-preset" the intended escape hatch? Just worth thinking through — a subtle behavior change in the preset silently affects all existing deployments at once.

Contributor (Author):

show-preset + fork is the intended escape hatch. If the default preset has a bug, an admin can show-preset default, copy the source, and use sessionInit.scriptFile to pin to their known-good copy. Preset versioning adds complexity we don't need — the equivalence tests gate the transition, and the fork path provides a recovery mechanism without introducing a versioning scheme.
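As a rough sketch of how the fork path might be resolved, the Python below picks the script source from a `sessionInit` block. The `preset`, `script`, and `scriptFile` field names and the hard mutual-exclusion error come from this thread; the function name, the return shape, the file path, and the treatment of an empty block as the default are assumptions.

```python
def resolve_session_init(session_init):
    """Return (kind, value) naming where the session init script comes from."""
    if not session_init:
        # No sessionInit block: fall back to the built-in default preset.
        return ("preset", "default")
    chosen = [key for key in ("preset", "script", "scriptFile") if session_init.get(key)]
    if len(chosen) > 1:
        # Mutual exclusion is a hard validation error, not a precedence rule.
        raise ValueError("sessionInit: " + ", ".join(chosen) + " are mutually exclusive")
    if not chosen:
        return ("preset", "default")  # assumption: an empty block behaves like no block
    return (chosen[0], session_init[chosen[0]])


# An admin pinning a known-good copy of the default preset (hypothetical path).
pinned = resolve_session_init({"scriptFile": "/etc/vmcp/pinned-default.star"})
```

With this shape, pinning is a config change only: the forked file replaces the built-in preset until the admin removes the `scriptFile` entry.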


## Open Questions

1. **Sessionless MCP requests**: What happens when MCP supports requests without sessions? Do we have to run this heavy script on every request? We could actually run the script once at startup, since it does not depend on request-time information. However, if we fold in authz concerns from above, then `current_user()` will be request-time information. We could cheat around this by recommending all logic which depends on `current_user()` be placed at the end of the script. When that's encountered during startup, we block and restore the state on each request. Alternatively, we could support two different scripts. One for initialization and one per-request.

Contributor:

The "put current_user() logic at the end" heuristic is going to confuse people. If sessionless MCP is a real possibility, I'd just commit to a two-hook model now: on_session_init() for what tools exist and how they're shaped, on_request() for per-request concerns like rate limiting and user-specific filtering. It maps directly onto the distinction the RFC already makes, survives the sessionless transition cleanly, and removes a footgun where script ordering determines correctness in a non-obvious way.

Contributor (Author):

I like the two-hook model idea. We don't need to commit to it now since the single-script model covers Phase 2, but it's a strong candidate if sessionless MCP lands. Updated the open question to note on_session_init() + on_request() as a promising direction and removed the "put current_user() at the end" heuristic.
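For discussion, the two-hook split might look roughly like this in Python. Only the hook names `on_session_init()` and `on_request()` come from the thread; the `Script` class, the argument lists, and the `admin_` prefix rule are hypothetical.

```python
class Script:
    """Hypothetical two-hook script: one hook for shape, one for per-request checks."""

    def __init__(self):
        self.init_runs = 0
        self.tools = {}

    def on_session_init(self, backend_tools):
        """What tools exist and how they are shaped; no request-time data here."""
        self.init_runs += 1
        self.tools = dict(backend_tools)

    def on_request(self, user_groups, tool_name, args):
        """Per-request concerns such as user-specific filtering."""
        if tool_name.startswith("admin_") and "admin" not in user_groups:
            return {"isError": True, "content": "not permitted"}
        return self.tools[tool_name](args)


script = Script()
# Runs once, e.g. at startup in a sessionless deployment.
script.on_session_init({"admin_reset": lambda args: {"content": "reset"}})

# Runs on every request, with that request's identity.
as_admin = script.on_request(["admin"], "admin_reset", {})
as_dev = script.on_request(["dev"], "admin_reset", {})
```

Since identity only enters through `on_request`, correctness no longer depends on where identity checks sit in the script.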

jerm-dro and others added 19 commits April 7, 2026 14:06
Proposes extending vMCP's Starlark engine from composite-tool-only
scripting into a unified programmable middleware surface, replacing
the growing set of independent config knobs (optimizer, filter,
rate limiting, PII scrubbing) with a single script per session.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Jeremy Drouillard <jeremy@stacklok.com>
- Rename "middleware script" → "session initialization script" throughout
- backends() returns dict[string, Backend] instead of flat tools() list
- Handler functions take single dict arg instead of **kwargs
- metadata() requires all fields (name, description, parameters, annotations)
- Rewrite presets section: no config nesting, auto-generation from existing config
- Update sequence diagram: remove Starlark Engine, use MultiSession
- Restructure implementation phases: feature parity first, then new capabilities
- Simplify Cedar interaction section
- Remove THV-0058 reference
- Update open questions (remove resolved, add error handling)
- Rename CRD to VirtualMCPSessionInitScript
- Rename config struct to SessionInitConfig

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Rewrite authz model: scripts see all backends, authz filters at runtime
- Add Background section explaining how authorization works today
- Replace call_tool() with saved handler dict pattern in all examples
- Collapse three presets into single `default` preset with ~80-line sketch
- Add handler timeout support (optional kwarg)
- Move current_user() to future built-ins (user known at init, deferred)
- Fix scrub_pii() as future built-in example, not v0 deliverable
- Add config() built-in for preset access to persona config
- Expand Alternatives Considered: configuration approaches + language
  considerations (Starlark, Risor, Wasm, Lua, OPA/Rego)
- Add 4-phase implementation plan (POC → production → deprecate → ship)
- Update Why Now with cost equation argument
- Clarify scope: enabling new capabilities, not shipping them

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reference stacklok/toolhive#4373 as a concrete example of feature
interaction pain in the Problem Statement. Add open question about
whether authz decisions should move into Starlark to unify the
"who sees what?" model.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The optimizer/Cedar issues (#4373, #4374) are about the authz boundary,
not config knob combinations. Move them to Open Question 2 where they
motivate pulling authz into Starlark. Use #4287 as the Problem Statement
example instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add "Prior Art: Gateway Configurability Patterns" covering Envoy, Kong,
and the Configuration Complexity Clock. Add vMCP feature dependency
diagram to the Problem Statement.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tion

Split monolithic "Config knob combinations" into two focused sections:
"The configurability problem" (user-facing interaction examples) and
"The maintainability problem" (developer-facing quadratic cost). Move
Background (prior art + authz) under Proposed Solution where it provides
context for the design.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reframe problem as configurability vs maintainability tension. Move bug
evidence and dependency diagram to maintainability section. Remove
feature table (duplicative with diagram). Update summary to mirror
problem framing. Add links to Envoy, Kong, and Configuration Complexity
Clock resources.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Elaborate rate limiting interaction questions (composite tools, groups).
Add optimizer × authz bugs (#4373, #4374) to maintainability section.
Replace PII hypothetical with concrete framing. Add Envoy and Kong
source code snippets with GitHub links to prior art section.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace hand-written code snippets with examples from official
documentation. Link to doc pages instead of source files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Move Motivating Use Cases to Appendix A
- Update Last Updated date to 2026-03-27
- Replace scrub_pii decorator example with v0 logging decorator
- Restore missing filter decorator in architecture diagram
- Remove duplicate composite tools mentions from presets/config sections
- Note composite tools support is optional in compatibility section
- Fix Open Question 3 bold formatting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The CI validator checks all new files under rfcs/ for the THV-####
naming convention. Move the image to images/ at the repo root.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add Alternative 2 covering Go code refactoring as a competing approach.
Acknowledge its value while noting it doesn't address configurability
or cross-cutting concerns. Include AI-driven development observation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix priority_order.index() crash for unranked backends in default preset
- Inline FIND_TOOL_SCHEMA and CALL_TOOL_SCHEMA in default preset sketch
- Add when_unavailable parameter to elicit() for non-elicitation clients
- Make preset/script/scriptFile mutual exclusion a hard validation error
- Resolve error handling open question: MCP-standard isError response dicts
- Keep existing decorators in Phase 1 POC, delete in later phases
- Clarify session scope: runs once per session creation or Redis restore
- Restructure rollout: safe capabilities first, optimizer + authz together
- Add warning about handler dispatch bypassing Cedar authz (#4374)
- Reference PR #4385 as interim fix for optimizer + authz bypass
- Resolve authz open question: ship with optimizer, defer exact design
- Add thv vmcp list-presets command

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Restructure implementation plan into 3 phases: POC (with rate
  limiting benchmark), backwards compatibility, new functionality
- Update rate limiting interaction section: THV-0057 ships as
  standalone middleware first, Starlark built-in benchmarked later
- Clarify current_user() returns same value for session lifetime
- Add security section note on identity-based filtering being
  intentional and complementary to Cedar
- Rewrite Open Question 1: note two-hook model as future direction
  for sessionless MCP, remove ordering heuristic

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jerm-dro jerm-dro force-pushed the jerm/2026-03-24-starlark-programmable-middleware branch from 4df7d2f to a4dcada Compare April 7, 2026 22:02