feat(sandbox): log connection attempts that bypass proxy path #326
Merged
johntmyers merged 6 commits into main on Mar 16, 2026
Add iptables LOG + REJECT rules inside the sandbox network namespace to detect and diagnose direct connection attempts that bypass the HTTP CONNECT proxy. This provides two improvements:

1. Fast-fail UX: applications get immediate ECONNREFUSED instead of a 30-second timeout when they bypass the proxy
2. Diagnostics: a /dev/kmsg monitor emits structured BYPASS_DETECT tracing events with destination, protocol, process identity, and actionable hints

Both TCP and UDP bypass attempts are covered (UDP catches DNS bypass). The feature degrades gracefully if iptables or /dev/kmsg are unavailable.

Closes #268
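The rule set described above can be sketched as a function that builds the iptables argument lists in evaluation order. This is an illustration only, not the PR's actual `install_bypass_rules()`: the function name `bypass_rule_args`, the `LOG_PREFIX` value, and the choice of `tcp-reset` for the TCP reject are assumptions. The iptables flags themselves (`--log-prefix`, `--log-uid`, `--reject-with`, `-m conntrack --ctstate`) are standard.

```rust
/// Hypothetical prefix written into each kernel log line. iptables limits
/// `--log-prefix` to 29 characters.
const LOG_PREFIX: &str = "openshell:bypass: ";

/// Build the argument lists for the OUTPUT-chain rules, in evaluation order.
/// Each inner Vec is what would follow `iptables` on the command line.
fn bypass_rule_args(proxy_ip: &str, proxy_port: u16) -> Vec<Vec<String>> {
    let port = proxy_port.to_string();
    let rules: Vec<Vec<&str>> = vec![
        // 1. Traffic to the proxy itself is allowed.
        vec!["-A", "OUTPUT", "-p", "tcp", "-d", proxy_ip, "--dport", port.as_str(), "-j", "ACCEPT"],
        // 2. Loopback traffic is allowed.
        vec!["-A", "OUTPUT", "-o", "lo", "-j", "ACCEPT"],
        // 3. Packets belonging to already-established connections are allowed.
        vec!["-A", "OUTPUT", "-m", "conntrack", "--ctstate", "ESTABLISHED", "-j", "ACCEPT"],
        // 4. Any other TCP: log first, then reject so connect() fails immediately.
        vec!["-A", "OUTPUT", "-p", "tcp", "-j", "LOG", "--log-prefix", LOG_PREFIX, "--log-uid"],
        vec!["-A", "OUTPUT", "-p", "tcp", "-j", "REJECT", "--reject-with", "tcp-reset"],
        // 5. Any other UDP (for example direct DNS): log, then reject.
        vec!["-A", "OUTPUT", "-p", "udp", "-j", "LOG", "--log-prefix", LOG_PREFIX, "--log-uid"],
        vec!["-A", "OUTPUT", "-p", "udp", "-j", "REJECT", "--reject-with", "icmp-port-unreachable"],
    ];
    rules
        .into_iter()
        .map(|rule| rule.into_iter().map(String::from).collect())
        .collect()
}
```

Order matters here: LOG is a non-terminating target, so each LOG rule must precede its REJECT, and the ACCEPT rules must come first so proxied traffic is never logged as a bypass. A caller would presumably run each list through `iptables` (and a mirrored set through `ip6tables`) inside the namespace.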
The sandbox base image runs Python 3.13. A stale venv on 3.12 causes all exec_python E2E tests to fail because cloudpickle bytecode is not compatible across minor versions.
The fast deploy's helm upgrade was missing the hostGatewayIP value that the bootstrap entrypoint injects into the HelmChart CR. This caused host.openshell.internal hostAliases to be lost from the gateway pod and sandbox pods after any fast deploy, breaking host gateway routing. Read the IP from the HelmChart CR and pass it through to helm upgrade.
The Drop impl for NetworkNamespace was accidentally deleted during the bypass detection refactor, which would cause network namespaces and veth interfaces to leak on every sandbox shutdown. Also removes dead kmsg volume/mount code (bypass monitor uses dmesg instead of direct /dev/kmsg access) and removes an accidentally committed session transcript file.
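To illustrate why losing the Drop impl leaks resources, here is a minimal sketch of RAII-style cleanup for a namespace handle. The field names (`name`, `veth_host`) and the `cleanup_commands` helper are hypothetical and are not taken from the actual `NetworkNamespace` type. The sketch only assumes that cleanup amounts to running `ip link delete` and `ip netns delete`.

```rust
use std::process::Command;

/// Hypothetical handle for a sandbox network namespace and its host-side veth.
struct NetworkNamespace {
    name: String,      // namespace name as known to `ip netns` (assumed field)
    veth_host: String, // host-side veth interface name (assumed field)
}

impl NetworkNamespace {
    /// Argument lists for the `ip` invocations that undo namespace setup.
    fn cleanup_commands(&self) -> Vec<Vec<String>> {
        vec![
            // Deleting one end of a veth pair removes both ends.
            vec![String::from("link"), String::from("delete"), self.veth_host.clone()],
            vec![String::from("netns"), String::from("delete"), self.name.clone()],
        ]
    }
}

impl Drop for NetworkNamespace {
    fn drop(&mut self) {
        for args in self.cleanup_commands() {
            // Best effort: Drop cannot return an error, so failures are ignored.
            // `output()` captures stdout/stderr instead of inheriting them.
            let _ = Command::new("ip").args(&args).output();
        }
    }
}
```

Without the `Drop` impl, nothing runs when the handle goes out of scope, so every sandbox shutdown would leave a namespace and a veth pair behind on the host.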
pimlock approved these changes on Mar 16, 2026
drew pushed a commit that referenced this pull request on Mar 16, 2026
* feat(sandbox): log connection attempts that bypass proxy path

  Add iptables LOG + REJECT rules inside the sandbox network namespace to detect and diagnose direct connection attempts that bypass the HTTP CONNECT proxy. This provides two improvements:

  1. Fast-fail UX: applications get immediate ECONNREFUSED instead of a 30-second timeout when they bypass the proxy
  2. Diagnostics: a /dev/kmsg monitor emits structured BYPASS_DETECT tracing events with destination, protocol, process identity, and actionable hints

  Both TCP and UDP bypass attempts are covered (UDP catches DNS bypass). The feature degrades gracefully if iptables or /dev/kmsg are unavailable.

  Closes #268

* chore: track .python-version to pin Python 3.13.12 for uv

  The sandbox base image runs Python 3.13. A stale venv on 3.12 causes all exec_python E2E tests to fail because cloudpickle bytecode is not compatible across minor versions.

* fix(cluster): preserve hostGatewayIP across fast deploys

  The fast deploy's helm upgrade was missing the hostGatewayIP value that the bootstrap entrypoint injects into the HelmChart CR. This caused host.openshell.internal hostAliases to be lost from the gateway pod and sandbox pods after any fast deploy, breaking host gateway routing. Read the IP from the HelmChart CR and pass it through to helm upgrade.

* wip: fix iptables path resolution, use dmesg for kmsg, add CAP_SYSLOG

* fix(sandbox): restore NetworkNamespace Drop impl, remove dead kmsg code

  The Drop impl for NetworkNamespace was accidentally deleted during the bypass detection refactor, which would cause network namespaces and veth interfaces to leak on every sandbox shutdown. Also removes dead kmsg volume/mount code (bypass monitor uses dmesg instead of direct /dev/kmsg access) and removes an accidentally committed session transcript file.
Summary
Add iptables LOG + REJECT rules inside the sandbox network namespace to detect and diagnose direct connection attempts that bypass the HTTP CONNECT proxy. Applications now get immediate ECONNREFUSED instead of a 30-second timeout, and a /dev/kmsg monitor emits structured BYPASS_DETECT tracing events with destination, protocol, process identity, and actionable hints.

Related Issue
Closes #268
Changes
* crates/openshell-sandbox/src/sandbox/linux/netns.rs: Add install_bypass_rules() method with iptables OUTPUT chain rules (ACCEPT proxy, ACCEPT loopback, ACCEPT established, LOG+REJECT TCP, LOG+REJECT UDP). Includes run_iptables_netns() helper and iptables_available() check. IPv6 rules mirrored via ip6tables.
* crates/openshell-sandbox/src/bypass_monitor.rs (NEW): Background /dev/kmsg reader that parses iptables LOG lines, resolves process identity via procfs::resolve_tcp_peer_identity(), emits structured tracing::warn!() events, and feeds DenialEvent with denial_stage: "bypass" to the denial aggregator.
* crates/openshell-sandbox/src/lib.rs: Register bypass_monitor module. Call install_bypass_rules() after namespace creation. Clone denial channel sender for the monitor. Spawn bypass monitor after proxy startup.
* crates/openshell-sandbox/src/denial_aggregator.rs: Updated denial_stage documentation to include "bypass".
* examples/bring-your-own-container/Dockerfile: Add iptables to system packages.
* architecture/sandbox.md: Added bypass detection section with rules, monitor lifecycle, event format, and graceful degradation.
* architecture/sandbox-custom-containers.md: Documented iptables as optional dependency.

Deviations from Plan
None — implemented as planned.
External Dependency
The sandbox base image in the openshell-community repo also needs iptables installed. This was handled in NVIDIA/OpenShell-Community#36 (merged).

Testing
* mise run pre-commit passes
* Tests added:
  * crates/openshell-sandbox/src/bypass_monitor.rs: 11 tests covering kmsg line parsing (TCP, UDP, IPv6, missing fields, wrong namespace, unrelated messages), field extraction, and protocol-specific hint generation
  * crates/openshell-sandbox/src/sandbox/linux/netns.rs: existing namespace test preserved; iptables rule tests require root (#[ignore])
  * e2e/

Checklist
* Documentation updated:
  * architecture/sandbox.md: Added bypass detection section
  * architecture/sandbox-custom-containers.md: Added iptables as optional dependency
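
The kmsg line parsing and protocol-specific hints exercised by the tests listed under Testing could look roughly like the sketch below. It is not the code in bypass_monitor.rs: `BypassEvent`, `KMSG_PREFIX`, `field`, `parse_bypass_line`, `hint_for`, and the hint wording are all hypothetical. The sketch relies only on the standard `KEY=value` layout of iptables LOG lines (`DST=`, `DPT=`, `PROTO=`). Namespace filtering and process identity resolution are left out.

```rust
/// Hypothetical log prefix used to pick bypass lines out of the kernel log.
const KMSG_PREFIX: &str = "openshell:bypass:";

/// Minimal, illustrative view of one logged bypass attempt.
#[derive(Debug, PartialEq)]
struct BypassEvent {
    dst: String,
    dst_port: Option<u16>,
    proto: String, // lowercased, e.g. "tcp" or "udp"
}

/// Return the value of a `KEY=value` token in an iptables LOG line.
fn field<'a>(line: &'a str, key: &str) -> Option<&'a str> {
    line.split_whitespace()
        .find_map(|tok| tok.strip_prefix(key)?.strip_prefix('='))
}

/// Parse one kernel log line. Returns None for unrelated messages or for
/// lines missing the destination or protocol fields.
fn parse_bypass_line(line: &str) -> Option<BypassEvent> {
    let (_, rest) = line.split_once(KMSG_PREFIX)?;
    Some(BypassEvent {
        dst: field(rest, "DST")?.to_string(),
        dst_port: field(rest, "DPT").and_then(|p| p.parse().ok()),
        proto: field(rest, "PROTO")?.to_ascii_lowercase(),
    })
}

/// Pick an actionable hint based on protocol and destination port.
fn hint_for(event: &BypassEvent) -> &'static str {
    match (event.proto.as_str(), event.dst_port) {
        ("udp", Some(53)) => "direct DNS lookup blocked; name resolution must go through the proxy",
        ("tcp", _) => "direct TCP connection blocked; configure the client to use the HTTP CONNECT proxy",
        _ => "direct connection blocked; only traffic through the proxy is allowed",
    }
}
```

Because the parser splits on the prefix rather than on a fixed column, it accepts both dmesg-style lines (`[ 12.345] ...`) and raw /dev/kmsg records, and it returns `None` for anything that does not carry the prefix.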