feat(054-B): output sanitisation enforcement (Spec 054 Track B) #535
Merged
Make mcpproxy's existing-but-discarded content-trust classification (Spec 035) and secret detector (Spec 026) actually contain untrusted tool output, instead of only logging. All behaviour is fully opt-in; default config forwards every response byte-for-byte.

- spotlight: wrap untrusted (open-world) tool text in source-identifying «untrusted:server/tool» delimiters, escaping the sentinel so content cannot forge the wrapper (FR-B1/B2). Applied post-truncation, not cached.
- redact: mask detected secrets as [REDACTED:<category>], reusing the Spec 026 detector (FR-B3).
- strip: neutralise ANSI / C0-C1 / zero-width / bidi sequences on untrusted text, with per-class toggles (FR-B4).
- block: replace the payload with a remediation error on a critical detection (FR-B7).

Redact/strip/block run on the raw result BEFORE forwardContentResult truncates and caches it, so read_cache never stores an unredacted secret and a blocked response is never cached. Non-text blocks are untouched (FR-B5). Mutating actions emit a policy_decision activity record.

New OutputSanitisationConfig mirrors OutputValidationConfig (Track A).

Verified end-to-end: curl/MCP roundtrip (spotlight/redact/strip/block + read_cache), API E2E (65/65), and the Web UI activity view.

Relates to Spec 054 Track B (#521).
Summary
Implements Spec 054 Track B — output sanitisation enforcement — the next track of the security-gateway umbrella (#521) after Track A (output-schema validation, #525/056). Makes mcpproxy's already-computed-but-discarded content-trust classification (Spec 035) and secret detector (Spec 026) actually contain untrusted tool output before it reaches the agent, at the single response chokepoint.
Fully opt-in: with no output_sanitisation block, every response is forwarded byte-for-byte (pre-feature behaviour; keeps the API E2E suite green).

Behaviours (all opt-in)

| Setting | Effect |
| spotlight_untrusted: true | Wraps untrusted tool text in «untrusted:server/tool» delimiters, escaping the sentinel so content can't forge the wrapper |
| response_action: redact | Masks detected secrets as [REDACTED:<category>] (reuses Spec 026 detector) |
| strip_control_chars: true | Neutralises ANSI / C0-C1 / zero-width / bidi sequences on untrusted text, with per-class toggles |
| response_action: block | Replaces the payload with a remediation error on a critical detection |

Non-text blocks (image/audio/embedded) are never modified (B5). Mutating actions emit a policy_decision activity record.

Ordering / read_cache

Redact / strip / block run on the raw result before forwardContentResult truncates and caches it, so read_cache never stores an unredacted secret and a blocked response is never cached. The lossless spotlight wrapper is applied post-truncation and is not cached.

Verification

Verified end-to-end (artifacts kept local, not committed):

- curl/MCP roundtrip for spotlight/redact/strip/block and read_cache ([REDACTED:cloud_credentials], raw secret count 0)
- ./scripts/test-api-e2e.sh → 65/65, no regression
- -race, both editions build, lint clean

Spec artifacts

Speckit set under specs/059-output-sanitisation/ (spec, plan, research, data-model, tasks, quickstart).

Relates to Spec 054 Track B (#521).