
PR4: SHM leak fixes, sandbox detection, 0.9.1 release prep#5

Merged
pollockjj merged 4 commits into Comfy-Org:main from pollockjj:pr4
Feb 25, 2026

Conversation


@pollockjj pollockjj commented Feb 25, 2026

Summary

  • SHM leak fixes and tensor transport hardening — Optimized CPU tensor shared memory strategy to prevent cross-process reference pinning, added transport cache purging during tensor keeper flush, explicit JSON transport for RemoteObjectHandle, and improved RPC error handling in UDS entrypoint.
  • Sandbox mode detection and CI updates — Added runtime sandbox capability detection, updated CI workflows for compatibility, expanded test coverage for bwrap/sandbox/memory leaks.
  • 0.9.1 packaging — Version bump, modernized license metadata, added MANIFEST.in pruning, added build and twine to dev extras.

Changes

22 files changed, +425 / -148

Test plan

  • CI passes (pytest, ruff, mypy)
  • Integration tests pass for sandbox detection
  • Memory leak tests confirm zero SHM leakage
  • python -m build produces clean wheel and sdist
  • twine check passes on built artifacts

Copilot AI review requested due to automatic review settings February 25, 2026 11:04

socket-security bot commented Feb 25, 2026

Review the following changes in direct dependencies.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Added | build@1.4.0 | 98 | 100 | 100 | 100 | 100 |
| Added | twine@6.2.0 | 98 | 100 | 100 | 100 | 100 |
| Added | myst-parser@5.0.0 | 100 | 100 | 100 | 100 | 100 |



Copilot AI left a comment


Pull request overview

This pull request prepares version 0.9.1 of pyisolate with three main areas of improvement: shared memory leak fixes in tensor transport, runtime sandbox capability detection with degraded mode fallback, and packaging modernization.

Changes:

  • Implemented SHM leak prevention via borrowed storage strategy, transport cache purging, and cleanup hooks (atexit/signal handlers)
  • Added runtime sandbox detection with degraded mode fallback for systems blocking unprivileged user namespaces (e.g., Ubuntu AppArmor)
  • Updated packaging metadata to modern pyproject.toml conventions, added build tooling to dev dependencies, and pruned test artifacts from distributions

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| pyisolate/_internal/tensor_serializer.py | Added flush/purge functions, borrowed SHM strategy, atexit/signal cleanup handlers |
| pyisolate/_internal/sandbox_detect.py | Added degraded mode test for sandbox fallback when user namespaces blocked |
| pyisolate/_internal/rpc_transports.py | Added explicit RemoteObjectHandle serialization, registry lookup before dict fallback, deprecation warnings |
| pyisolate/_internal/model_serialization.py | Reordered RemoteObjectHandle check after adapter serializer lookup |
| pyisolate/_internal/uds_client.py | Changed extension load failures to keep RPC alive for graceful error handling |
| pyisolate/_internal/sandbox.py | Added /opt to default read-only bind mounts for hosted toolchains |
| pyisolate/_internal/environment.py | Added editable dependency parsing, pip availability fallback, uv --seed flag |
| tests/test_sandbox_detect.py | Added tests for degraded bwrap mode |
| tests/test_memory_leaks.py | Reformatted monkeypatch lambda expression |
| tests/test_bwrap_command.py | Added Linux-only skipif marker |
| tests/integration_v2/test_isolation.py | Added sandbox availability detection and skipif marker |
| tests/harness/host.py | Changed ImportError to Exception, added sandbox detection |
| pyproject.toml | Version bump to 0.9.1, modernized license metadata, removed classifier, added build/twine deps |
| pyisolate/__init__.py | Version bump, exported flush_tensor_keeper and purge_orphan_sender_shm_files |
| example/host.py | Added sandbox detection, removed numpy version check from test |
| docs/conf.py | Version bump to 0.9.1 |
| MANIFEST.in | Added prune directive for test temps |
| .github/workflows/windows.yml | Removed PyTorch 2.3.0 matrix, migrated to integration_v2 tests |
| .github/workflows/pytorch.yml | Migrated to integration_v2 test paths |
| .github/workflows/docs.yml | Added repository check for deployment guard |
| .github/workflows/ci.yml | Removed Debian 11 and Rocky 9, added --python python3 to uv venv |
| .coderabbit.yaml | Deleted entire file |


Comment on lines +227 to +229:

```python
_install_signal_cleanup_handlers()
```

Copilot AI Feb 25, 2026


Installing signal handlers at module import time (line 227) can have unintended side effects. If this module is imported in a context where signal handlers are already configured (e.g., in a web server, test runner, or GUI application), this could overwrite existing handlers unexpectedly. Consider moving this to an explicit initialization function that users can call when they want this behavior, or at minimum document this side effect clearly in the module docstring.

Suggested change:

```diff
-_install_signal_cleanup_handlers()
+def install_signal_cleanup_handlers() -> None:
+    """Install optional signal cleanup handlers.
+
+    This is not invoked automatically on import; call explicitly when desired.
+    """
+    _install_signal_cleanup_handlers()
```

Copilot uses AI. Check for mistakes.
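A minimal sketch of the opt-in, chain-preserving pattern the review suggests (standalone illustration with hypothetical names; pyisolate's actual handler internals are not shown here):

```python
import signal

def _cleanup(signum, frame):
    # Placeholder for the real cleanup work (e.g. releasing SHM segments).
    pass

def install_signal_cleanup_handlers() -> None:
    """Opt-in installation: chain our cleanup in front of any handler
    that was already registered, instead of silently overwriting it."""
    previous = signal.getsignal(signal.SIGTERM)

    def handler(signum, frame):
        _cleanup(signum, frame)
        if callable(previous):
            previous(signum, frame)  # preserve pre-existing behavior

    signal.signal(signal.SIGTERM, handler)
```

Because installation is an explicit call rather than an import side effect, web servers and test runners keep full control over their own signal setup.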
Comment on lines +273 to +275:

```python
logger.warning(
    "⚠️ GENERIC SERIALIZER USED ⚠️ Serializing %s via __dict__ fallback. "
    "This is a SECURITY RISK and will be removed. Register a proper serializer!",
```

Copilot AI Feb 25, 2026


The warning messages about "GENERIC SERIALIZER USED" and "GENERIC DESERIALIZER USED" with "⚠️" emoji and "SECURITY RISK" language are quite alarming and may cause unnecessary concern for users. While it's good to encourage proper serializer registration, these warnings will appear in normal operation for many legitimate use cases. Consider downgrading these to debug level or making the language less alarming (e.g., "Consider registering a custom serializer for better performance and security"). The warnings should guide users without causing panic, especially since the PR description states this is used as a fallback mechanism.

Suggested change:

```diff
-logger.warning(
-    "⚠️ GENERIC SERIALIZER USED ⚠️ Serializing %s via __dict__ fallback. "
-    "This is a SECURITY RISK and will be removed. Register a proper serializer!",
+logger.info(
+    "Using generic __dict__-based serializer fallback for %s. "
+    "Consider registering a custom serializer for better performance and security.",
```

Comment on lines +412 to +414:

```python
logger.warning(
    "⚠️ GENERIC DESERIALIZER USED ⚠️ Deserializing %s via __pyisolate_object__. "
    "This is a SECURITY RISK and will be removed. Register a proper deserializer!",
```

Copilot AI Feb 25, 2026


The warning messages about "GENERIC DESERIALIZER USED" with "⚠️" emoji and "SECURITY RISK" language are quite alarming and may cause unnecessary concern for users. While it's good to encourage proper deserializer registration, these warnings will appear in normal operation for many legitimate use cases. Consider downgrading these to debug level or making the language less alarming (e.g., "Consider registering a custom deserializer for better performance and security"). The warnings should guide users without causing panic.

Suggested change:

```diff
-logger.warning(
-    "⚠️ GENERIC DESERIALIZER USED ⚠️ Deserializing %s via __pyisolate_object__. "
-    "This is a SECURITY RISK and will be removed. Register a proper deserializer!",
+logger.debug(
+    "Using generic object deserialization for %s via __pyisolate_object__. "
+    "Consider registering a custom deserializer for improved performance and safety.",
```

```diff
@@ -18,7 +19,6 @@ maintainers = [
 classifiers = [
     "Development Status :: 3 - Alpha",
```

Copilot AI Feb 25, 2026


The removal of the "License :: OSI Approved :: MIT License" classifier is inconsistent with keeping the MIT license declaration. While the license and license-files fields are now the modern way to specify licenses in pyproject.toml, the classifier should still be included for discoverability on PyPI and compatibility with tools that search for licenses via classifiers. The classifier provides metadata that helps users filter packages by license on PyPI.

Suggested change:

```diff
-"Development Status :: 3 - Alpha",
+"Development Status :: 3 - Alpha",
+"License :: OSI Approved :: MIT License",
```

Comment on lines +58 to 59:

```python
except Exception:
    pass
```

Copilot AI Feb 25, 2026


Changing from except ImportError to except Exception is overly broad and could mask real errors. The original code specifically caught ImportError for when torch is unavailable, which is the expected failure mode when torch isn't installed. Catching all exceptions could hide bugs in the import path or in the tensor serializer module itself. Consider reverting to except ImportError or at minimum catching (ImportError, ModuleNotFoundError) if you need to handle both cases.

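The narrow-catch pattern the reviewer asks for can be sketched as follows (function name hypothetical); note that ModuleNotFoundError is already a subclass of ImportError, so a single ImportError clause covers both failure modes:

```python
def torch_available() -> bool:
    """Treat only a missing torch as 'unavailable'; let real bugs propagate."""
    try:
        import torch  # noqa: F401
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return False
    return True

# ModuleNotFoundError does not need a separate except clause:
assert issubclass(ModuleNotFoundError, ImportError)
```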
```diff
-    and ext2_result.get("array_sum") == 17.5
-    and ext2_result.get("numpy_version").startswith("2.")
-):
+if ext2_result and ext2_result.get("extension") == "extension2" and ext2_result.get("array_sum") == 17.5:
```

Copilot AI Feb 25, 2026


The removal of the numpy version check (and ext2_result.get("numpy_version").startswith("2.")) from the Extension2 test assertion could allow the test to pass even when the wrong numpy version is installed. This removes valuable validation that was checking whether Extension2 was actually using numpy 2.x as expected. If this check was removed because it's unreliable or unnecessary, consider adding a comment explaining why, otherwise consider keeping it to maintain test coverage of the dependency isolation.

Suggested change:

```diff
-if ext2_result and ext2_result.get("extension") == "extension2" and ext2_result.get("array_sum") == 17.5:
+if (
+    ext2_result
+    and ext2_result.get("extension") == "extension2"
+    and ext2_result.get("array_sum") == 17.5
+    and ext2_result.get("numpy_version").startswith("2.")
+):
```

```python
"--ro-bind",
"/lib",
"/lib",
"--ro-bind",
```

Copilot AI Feb 25, 2026


The _test_bwrap_degraded function uses --ro-bind for /lib64, which will fail on systems where /lib64 doesn't exist (e.g., some 32-bit or ARM systems). This could cause false negatives in sandbox detection. Consider using --ro-bind-try instead of --ro-bind for optional paths like /lib64, or check if the path exists before including it in the command. The same issue exists in the _test_bwrap function at line 162.

Suggested change:

```diff
-"--ro-bind",
+"--ro-bind-try",
```

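One way to apply this is a small helper (hypothetical name, standalone sketch) that emits bubblewrap's `--ro-bind-try` for paths that may not exist on every architecture:

```python
def optional_ro_binds(paths):
    """Build bwrap arguments that tolerate missing paths such as /lib64."""
    args = []
    for path in paths:
        # --ro-bind-try silently skips the bind if the source path is absent,
        # unlike --ro-bind, which makes bwrap fail outright.
        args += ["--ro-bind-try", path, path]
    return args

print(optional_ro_binds(["/lib", "/lib64"]))
# → ['--ro-bind-try', '/lib', '/lib', '--ro-bind-try', '/lib64', '/lib64']
```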

coderabbitai bot commented Feb 25, 2026

📝 Walkthrough

Release 0.9.1: removes a code-review config, tightens CI matrices and deploy guard, adds sandbox-detection with degraded-bwrap fallback, exposes tensor cleanup APIs and lifecycle hooks, adjusts RPC/model serialization ordering and JSON handling, and updates packaging metadata and tests for sandbox awareness.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Version & Metadata**: docs/conf.py, pyisolate/__init__.py, pyproject.toml | Bumped package/docs version to 0.9.1; updated pyproject metadata and build requirements; simplified license and added license-files. |
| **Removed Review Config**: .coderabbit.yaml | Removed AI/review configuration file containing path-specific review/instruction blocks and guidelines. |
| **Distribution Manifest**: MANIFEST.in | Added prune tests/.test_temps to exclude test temp files from distributions. |
| **CI Matrix & Commands**: .github/workflows/ci.yml, .github/workflows/pytorch.yml, .github/workflows/windows.yml | Reduced test matrices (removed some distro/PyTorch entries), changed test targets to integration_v2 modules, and made venv creation use explicit --python python3. |
| **Docs Deploy Guard**: .github/workflows/docs.yml | Added repository guard to deploy job condition (requires github.repository == 'Comfy-Org/pyisolate') and reformatted condition block. |
| **Sandbox Detection & Paths**: pyisolate/_internal/sandbox_detect.py, pyisolate/_internal/sandbox.py | Added degraded-bwrap probe _test_bwrap_degraded and fallback logic to report available sandbox capability; added "/opt" to SANDBOX_SYSTEM_PATHS. |
| **Host & Test Harness Sandbox Integration**: example/host.py, tests/harness/host.py, tests/integration_v2/test_isolation.py | Propagated sandbox capability detection to host/test harness: add sandbox_mode to ExtensionConfig, disable sandbox when unavailable, and skip tests when sandbox not available. |
| **Tensor Lifecycle & Cleanup APIs**: pyisolate/_internal/tensor_serializer.py, pyisolate/__init__.py | Added TensorKeeper flush and orphan-SHM purge APIs (flush_tensor_keeper, purge_orphan_sender_shm_files), atexit/signal cleanup handlers, borrowed-SHM handling, and exported new APIs via package init. |
| **Model & RPC Serialization**: pyisolate/_internal/model_serialization.py, pyisolate/_internal/rpc_transports.py | Deferred RemoteObjectHandle forwarding until after adapter serializer lookup; added explicit RemoteObjectHandle JSON serialization/deserialization and serializer-registry dispatch with security-warning logs for generic dict fallbacks. |
| **Environment & Dependency Installation**: pyisolate/_internal/environment.py | Added pip-availability handling in requirement filtering, passed --seed to uv venv, and robust handling/validation for editable (-e) install targets. |
| **UDS Client Resilience**: pyisolate/_internal/uds_client.py | Changed extension load error handling to log warnings and keep the RPC connection alive instead of re-raising, allowing host to skip faulty extensions. |
| **Tests: sandbox & bwrap**: tests/test_bwrap_command.py, tests/test_sandbox_detect.py, tests/test_memory_leaks.py | Added platform skip markers for non-Linux, added degraded-bwrap test coverage, and minor test formatting refactor. |


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 14

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
.github/workflows/docs.yml (1)

26-26: ⚠️ Potential issue | 🟡 Minor

actions/setup-python@v4 is stale — all other workflows use @v5.

🔧 Proposed fix

```diff
-      uses: actions/setup-python@v4
+      uses: actions/setup-python@v5
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/docs.yml at line 26, Update the GitHub Actions step that
currently references the outdated action string "uses: actions/setup-python@v4"
to the newer pinned version "uses: actions/setup-python@v5" so it matches other
workflows; locate the step containing the literal "uses:
actions/setup-python@v4" and change the version suffix to `@v5` while leaving any
inputs (like python-version) unchanged.
.github/workflows/ci.yml (1)

91-99: 🛠️ Refactor suggestion | 🟠 Major

Dead branch: the extras == "dev" condition can never be true after removing Debian 11 / Rocky Linux 9.

All remaining matrix entries carry extras: "dev,test", so the if [ "${{ matrix.extras }}" = "dev" ] guard at line 94 will always evaluate to false. The torch-exclusion filter (-k "not torch and not TestShareTorchConfiguration") at line 96 is now permanently unreachable, and every distro job will unconditionally run pytest -v — including torch-related tests on containers where torch may not be available.

If all remaining containers are expected to have torch installed (via the test extra), this is fine — but the dead branch should be cleaned up. If some future matrix entry should skip torch tests, a separate matrix field (e.g., skip-torch: true) would be cleaner than relying on the extras string.

🧹 Suggested cleanup

```diff
     - name: Run tests
       run: |
         . .venv/bin/activate
-        if [ "${{ matrix.extras }}" = "dev" ]; then
-          # Skip torch-related tests on platforms without torch support
-          pytest -v -k "not torch and not TestShareTorchConfiguration"
-        else
-          pytest -v
-        fi
+        pytest -v
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/ci.yml around lines 91 - 99, Remove the dead extras-branch
and simplify the test step: delete the if/else block that checks matrix.extras
and always run "pytest -v" (i.e., replace the conditional block around the
pytest invocation with a single unconditional pytest -v line). Update the Run
tests step so it sources .venv/bin/activate then runs pytest -v; if you later
need to skip torch tests, introduce a separate matrix field like skip-torch and
guard the pytest invocation against that instead of using matrix.extras.
.github/workflows/pytorch.yml (1)

17-18: ⚠️ Potential issue | 🟡 Minor

Python 3.12 is absent from the CPU test matrix.

pytorch-version includes 2.2.0 and 2.3.0, both of which support Python 3.12, yet python-version only covers ['3.10', '3.11']. This leaves an untested combination that downstream users with Python 3.12 + modern PyTorch will actually run.

➕ Suggested matrix expansion

```diff
-        python-version: ['3.10', '3.11']
+        python-version: ['3.10', '3.11', '3.12']
         pytorch-version: ['2.0.0', '2.1.0', '2.2.0', '2.3.0']
```

Note: PyTorch 2.0.0 does not officially support Python 3.12, so you may also want to add an exclude: block to skip that combination.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/pytorch.yml around lines 17 - 18, Update the CI matrix to
include Python 3.12 by adding '3.12' to the python-version list so combinations
like Python 3.12 + pytorch-version (2.2.0, 2.3.0) are tested; also add an
exclude rule in the matrix to skip the unsupported combination of
python-version: '3.12' with pytorch-version: '2.0.0' (or any other unsupported
pair) so the GitHub Actions matrix (keys: python-version, pytorch-version,
matrix.exclude) only runs valid supported combos.
pyproject.toml (1)

2-2: ⚠️ Potential issue | 🟠 Major

Upgrade setuptools minimum to support the PEP 639 license metadata format.

The code uses project.license = "MIT" (SPDX string) and project.license-files = ["LICENSE"], which are PEP 639 features that require setuptools 77.0.0 or later. The current floor of setuptools>=61.0 is insufficient and will fail to recognize these metadata fields.

Fix

```diff
 [build-system]
-requires = ["setuptools>=61.0", "wheel"]
+requires = ["setuptools>=77.0.0", "wheel"]
 build-backend = "setuptools.build_meta"
```

Also applies to: 11-12

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` at line 2, The pyproject.toml currently pins setuptools in
the "requires" array to "setuptools>=61.0", but the project uses PEP 639
metadata fields ("project.license" and "project.license-files") which require
setuptools 77.0.0 or newer; update the "requires" entry to require a newer
setuptools (e.g., "setuptools>=77.0.0") so the PEP 639 license metadata is
recognized and parsed correctly (also update any duplicate "requires" entries at
the other occurrence).
pyisolate/_internal/rpc_transports.py (1)

367-367: 🧹 Nitpick | 🔵 Trivial

Use lazy %s formatting instead of f-string in logger.warning.

f-strings in logger calls evaluate the string unconditionally, even when the log level is suppressed. This is a minor performance concern on hot paths.

Proposed fix

```diff
-                        logger.warning(f"Failed to deserialize {type_name}: {e}")
+                        logger.warning("Failed to deserialize %s: %s", type_name, e)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyisolate/_internal/rpc_transports.py` at line 367, Replace the unconditional
f-string used in the logger call so formatting is done lazily: in
pyisolate._internal.rpc_transports locate the exception handling that calls
logger.warning(f"Failed to deserialize {type_name}: {e}") and change it to use
%-style/logging parameter formatting, e.g. logger.warning("Failed to deserialize
%s: %s", type_name, e) so the string interpolation only happens when the warning
will actually be emitted; keep the same message contents and variables
(type_name, e) and preserve any surrounding exception handling context.
tests/integration_v2/test_isolation.py (1)

35-43: 🧹 Nitpick | 🔵 Trivial

Clean up thinking-out-loud comments.

Lines 35–43 and 59–62 read like developer working notes rather than intentional documentation. Consider replacing them with a concise comment explaining the test strategy, or removing them.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration_v2/test_isolation.py` around lines 35 - 43, Remove the
developer "thinking-out-loud" comments in test_isolation.py (the multi-line
notes about bwrap, /tmp, module_path, venv_path, etc.) and replace them with a
single concise comment that states the test strategy: e.g., "This test verifies
sandbox isolation by attempting to write to a path that should not be bound by
the sandbox (ensuring writes fail when outside bound paths like
module_path/venv_path/site-packages)." Ensure the replacement appears where the
existing exploratory comments are and keep it short and focused so the intent is
clear to future readers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/windows.yml:
- Line 53: The matrix entry for the CI key pytorch-version was changed to only
['2.1.0'], which removes Windows coverage for the current PyTorch release line;
restore Windows testing for PyTorch 2.3.x by updating the pytorch-version matrix
to include '2.3.0' (either replace '2.1.0' with '2.3.0' or make it ['2.3.0',
'2.1.0'] if you want both), ensuring the pytorch-version matrix value is
adjusted in the Windows workflow definition.

In `@example/host.py`:
- Around line 180-182: The example currently uses inspect.isawaitable around
extension.stop(), which hides the intended API; change the example to use the
canonical async pattern by calling await extension.stop() directly (and update
any example extension implementations to make stop an async def) so the example
consistently demonstrates the async API for extension.stop(); alternatively, if
the intended API is synchronous, remove the await and call extension.stop()
directly—pick one canonical contract and make extension.stop() implementations
and the example match it.
- Line 13: The example imports the internal function detect_sandbox_capability
from pyisolate._internal.sandbox_detect; update the example to use the public
API instead by either (A) exporting detect_sandbox_capability from the package
public surface (add it to pyisolate/__init__.py and include it in __all__) so
example/host.py can import it as from pyisolate import
detect_sandbox_capability, or (B) modify example/host.py to detect or handle
sandbox mode using an existing public function/setting rather than importing
from _internal; reference the symbol detect_sandbox_capability and ensure the
example only imports from pyisolate (not pyisolate._internal).

In `@pyisolate/_internal/environment.py`:
- Around line 360-375: The parser currently accepts editable targets that begin
with "-" which later produce confusing installer errors; update the '-e'
handling in the block that processes safe_deps so that after computing
editable_target (both in the dep_stripped == "-e" branch where editable_target =
safe_deps[i + 1].strip() and in the dep_stripped.startswith("-e ") branch where
editable_target = dep_stripped[3:].strip()) you immediately validate that
editable_target is non-empty and does not start with "-" and raise a ValueError
(with a clear message like "Editable dependency '-e' must include a valid path
or URL, not an option") if it does; keep existing logic to extend
install_targets and advance i unchanged otherwise.
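The validation described above can be sketched as a standalone helper (name and message illustrative, not pyisolate's actual code):

```python
def validate_editable_target(target: str) -> str:
    """Reject empty or option-like '-e' targets before they reach the installer."""
    target = target.strip()
    if not target or target.startswith("-"):
        raise ValueError(
            "Editable dependency '-e' must include a valid path or URL, not an option"
        )
    return target
```

Failing at parse time yields a clear message instead of a confusing installer error further down the pipeline.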

In `@pyisolate/_internal/rpc_transports.py`:
- Around line 347-351: The deserialization branch that reconstructs
RemoteObjectHandle bypasses the SerializerRegistry; add a short inline comment
above the hardcoded check in rpc_transports.py (the block that tests
dct.get("__type__") == "RemoteObjectHandle" and returns
RemoteObjectHandle(dct["object_id"], dct["type_name"])) stating that this branch
intentionally takes precedence over any SerializerRegistry entries for
"RemoteObjectHandle" to ensure deterministic handle reconstruction; keep the
comment concise and mention RemoteObjectHandle and SerializerRegistry by name so
future readers understand the deliberate bypass.
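A self-contained sketch of that dispatch order (class and hook simplified from the description; not the actual pyisolate implementation):

```python
import json
from dataclasses import dataclass

@dataclass
class RemoteObjectHandle:
    object_id: str
    type_name: str

def object_hook(dct):
    # RemoteObjectHandle reconstruction intentionally takes precedence over
    # any SerializerRegistry entry so handles round-trip deterministically.
    if dct.get("__type__") == "RemoteObjectHandle":
        return RemoteObjectHandle(dct["object_id"], dct["type_name"])
    return dct  # everything else falls through to registry/dict handling

payload = '{"__type__": "RemoteObjectHandle", "object_id": "7", "type_name": "Model"}'
handle = json.loads(payload, object_hook=object_hook)
```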

In `@pyisolate/_internal/sandbox_detect.py`:
- Around line 177-215: Both _test_bwrap_degraded and _test_bwrap contain almost
identical bwrap argument lists and subprocess call; extract the shared argument
list and subprocess invocation into a helper (e.g., _run_bwrap_test or
_build_bwrap_args) so both functions call the helper with a flag to include or
exclude "--unshare-user-try" and reuse the same timeout/capture/error handling,
preserving current return values and exception behavior from _test_bwrap and
_test_bwrap_degraded.

In `@pyisolate/_internal/sandbox.py`:
- Line 32: The sandbox currently mounts the entire "/opt" directory which
overexposes host data; change the mount entry for "/opt" to a much narrower path
such as "/opt/hostedtoolcache" (or a small explicit whitelist of toolcache
subpaths) instead of the root "/opt", or make the list configurable so only
known CI toolcache locations are mounted read-only; update the mounts list entry
that currently contains the string "/opt" accordingly.

In `@pyisolate/_internal/tensor_serializer.py`:
- Around line 157-160: The current cleanup swallows all exceptions: change the
broad "except Exception:" around require_torch("flush_tensor_keeper") to only
catch the expected import/availability error (e.g., ImportError or a custom
error from require_torch) and return released in that specific case; for any
other exception, log the exception with context (e.g., include
"flush_tensor_keeper" and released) and re-raise so real failures aren't hidden.
Likewise, replace the bare "except: pass" block (the cleanup loop around tensor
release at lines referenced) with explicit handling for expected errors and
otherwise log-and-re-raise unexpected exceptions instead of silencing them.
Ensure you use the module logger or processLogger consistently and reference
require_torch and the flush_tensor_keeper cleanup loop when making the changes.
- Around line 168-173: The public purge function (purge_sender_shm) currently
returns 0 when the purge feature is disabled via PYISOLATE_PURGE_SENDER_SHM,
which silently masks misuse; change this to fail loudly by raising a clear
exception (e.g., RuntimeError or ValueError) that tells the caller the purge is
disabled and how to enable it (set PYISOLATE_PURGE_SENDER_SHM="1" or pass
force=True). Update the early-exit branch that checks os.environ and the force
parameter so it raises the explanatory exception instead of returning 0,
preserving the rest of purge_sender_shm's behavior.
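Sketched against the names used in the prompt (the real purge body is elided and replaced with a placeholder):

```python
import os

def purge_sender_shm(force: bool = False) -> int:
    """Fail loudly instead of silently returning 0 when purging is disabled."""
    if not force and os.environ.get("PYISOLATE_PURGE_SENDER_SHM") != "1":
        raise RuntimeError(
            "SHM purge is disabled; set PYISOLATE_PURGE_SENDER_SHM=1 "
            "or call purge_sender_shm(force=True)"
        )
    purged = 0
    # ... actual orphan-SHM scan would go here ...
    return purged
```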
- Around line 283-288: The code currently calls private PyTorch internals
(storage._shared_decref and torch._C._new_shared_filename_cpu) directly; wrap
these usages in capability checks and safe fallbacks: before calling
storage._shared_decref verify hasattr(storage, "_shared_decref") and call it in
a try/except, and if missing or it raises, fall back to not using the borrowed
strategy (set strategy = "file_system") and avoid adjusting refcounts; likewise
check for torch._C and hasattr(torch._C, "_new_shared_filename_cpu") before
using it and fall back to the safe non-borrowed path if unavailable or errors
occur; ensure the branch that sets strategy to "file_system_borrowed" only
executes when both capabilities exist and succeed so code using strategy and
storage filename handling remains robust across torch versions.
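The guard structure can be sketched without importing torch at all, using hasattr checks on whatever objects are passed in (a simplification of the suggestion; the real code touches torch._C directly):

```python
def pick_shm_strategy(storage, torch_c) -> str:
    """Use the borrowed-SHM path only when both private hooks are present."""
    if hasattr(storage, "_shared_decref") and hasattr(torch_c, "_new_shared_filename_cpu"):
        try:
            storage._shared_decref()  # drop the extra cross-process reference
            return "file_system_borrowed"
        except Exception:
            pass  # fall through to the safe default if the private API misbehaves
    return "file_system"  # safe default on torch versions without the hooks
```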

In `@pyisolate/_internal/uds_client.py`:
- Around line 241-245: The except block that logs module load failures should
log the full traceback and use an error level: replace the logger.warning call
in the except Exception as exc handler (the one that references module_path)
with logger.error and pass exc_info=True so the exception traceback is included;
keep the await rpc.run_until_stopped() behavior unchanged to preserve RPC
lifetime.
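A standalone illustration of log-and-continue with a full traceback (the loader callable is a stand-in; pyisolate's UDS client is not reproduced here):

```python
import logging

logger = logging.getLogger("demo.uds")

def try_load_extension(module_path, loader) -> bool:
    """Log the full traceback at ERROR level, but do not re-raise,
    so the surrounding RPC loop can keep serving other extensions."""
    try:
        loader(module_path)
        return True
    except Exception:
        logger.error("Failed to load extension %s", module_path, exc_info=True)
        return False

def broken_loader(path):
    raise RuntimeError(f"cannot import {path}")

print(try_load_extension("ext.py", broken_loader))  # → False
```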

In `@tests/harness/host.py`:
- Around line 56-59: Replace the bare "except Exception: pass" with specific
exception handling: allow ImportError to be silently ignored (except
ImportError: pass) but catch other exceptions (except Exception as e) and log
the failure at debug with the exception details before re-raising; apply this
around the block that calls registry.register("torch.Tensor", serialize_tensor,
deserialize_tensor) so any non-import errors in register or
register_tensor_serializer are logged (use the module logger via
logging.getLogger(__name__) or the existing logger) and then re-raise the
exception.
- Around line 98-100: Set sandbox_available default to False and only set it
True after checking detect_sandbox_capability() on Linux: change the
initialization of self.sandbox_available and the conditional that calls
detect_sandbox_capability() so non-Linux platforms do not leave it True. Also
ensure any later logic that sets sandbox_mode (e.g., code referencing
SandboxMode.REQUIRED) uses this updated self.sandbox_available to avoid
requiring sandboxing on unsupported platforms; update references around where
sandbox_mode is assigned to consult self.sandbox_available.

In `@tests/test_sandbox_detect.py`:
- Around line 190-197: Add a test for the timeout/failure path of
_test_bwrap_degraded by patching subprocess.run to raise
subprocess.TimeoutExpired (to simulate a timeout) and assert the returned tuple
is (False, "bwrap degraded test timed out") or that error contains that exact
message; also consider a variant where subprocess.run returns a mock with a
non-zero returncode and assert success is False and error is non-empty. Locate
tests in tests/test_sandbox_detect.py near the existing
test_bwrap_degraded_test_success and use the same mocking pattern
(patch("subprocess.run", ...)) to implement these negative-case assertions for
_test_bwrap_degraded.

---

Outside diff comments:
In @.github/workflows/ci.yml:
- Around line 91-99: Remove the dead extras-branch and simplify the test step:
delete the if/else block that checks matrix.extras and always run "pytest -v"
(i.e., replace the conditional block around the pytest invocation with a single
unconditional pytest -v line). Update the Run tests step so it sources
.venv/bin/activate then runs pytest -v; if you later need to skip torch tests,
introduce a separate matrix field like skip-torch and guard the pytest
invocation against that instead of using matrix.extras.

In @.github/workflows/docs.yml:
- Line 26: Update the GitHub Actions step that currently references the outdated
action string "uses: actions/setup-python@v4" to the newer pinned version "uses:
actions/setup-python@v5" so it matches other workflows; locate the step
containing the literal "uses: actions/setup-python@v4" and change the version
suffix to `@v5` while leaving any inputs (like python-version) unchanged.

In @.github/workflows/pytorch.yml:
- Around line 17-18: Update the CI matrix to include Python 3.12 by adding
'3.12' to the python-version list so combinations like Python 3.12 +
pytorch-version (2.2.0, 2.3.0) are tested; also add an exclude rule in the
matrix to skip the unsupported combination of python-version: '3.12' with
pytorch-version: '2.0.0' (or any other unsupported pair) so the GitHub Actions
matrix (keys: python-version, pytorch-version, matrix.exclude) only runs valid
supported combos.
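For illustration, a matrix with such an exclude rule could look like this (the version lists are examples; adjust them to the workflow's actual set):

```yaml
strategy:
  fail-fast: false
  matrix:
    python-version: ['3.10', '3.11', '3.12']
    pytorch-version: ['2.0.0', '2.2.0', '2.3.0']
    exclude:
      # PyTorch 2.0.0 ships no wheels for Python 3.12, so skip that pair.
      - python-version: '3.12'
        pytorch-version: '2.0.0'
```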

In `@pyisolate/_internal/rpc_transports.py`:
- Line 367: Replace the unconditional f-string used in the logger call so
formatting is done lazily: in pyisolate._internal.rpc_transports locate the
exception handling that calls logger.warning(f"Failed to deserialize
{type_name}: {e}") and change it to use %-style/logging parameter formatting,
e.g. logger.warning("Failed to deserialize %s: %s", type_name, e) so the string
interpolation only happens when the warning will actually be emitted; keep the
same message contents and variables (type_name, e) and preserve any surrounding
exception handling context.
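The difference can be sketched as follows (logger name illustrative):

```python
import logging

logger = logging.getLogger("pyisolate.rpc_transports")
type_name, e = "torch.Tensor", ValueError("bad payload")

# Eager: the f-string is built even if the record is filtered out.
logger.warning(f"Failed to deserialize {type_name}: {e}")

# Lazy: %-interpolation happens only when the record is actually emitted.
logger.warning("Failed to deserialize %s: %s", type_name, e)
```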

In `@pyproject.toml`:
- Line 2: The pyproject.toml currently pins setuptools in the "requires" array
to "setuptools>=61.0", but the project uses PEP 639 metadata fields
("project.license" and "project.license-files") which require setuptools 77.0.0
or newer; update the "requires" entry to require a newer setuptools (e.g.,
"setuptools>=77.0.0") so the PEP 639 license metadata is recognized and parsed
correctly (also update any duplicate "requires" entries at the other
occurrence).
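For reference, a PEP 639-style configuration that needs the newer setuptools might look like this (the license value and glob are illustrative):

```toml
[build-system]
requires = ["setuptools>=77.0.0"]  # PEP 639 fields below need setuptools >= 77
build-backend = "setuptools.build_meta"

[project]
name = "pyisolate"
license = "MIT"                 # SPDX expression (PEP 639)
license-files = ["LICENSE*"]    # replaces the deprecated license classifiers
```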

In `@tests/integration_v2/test_isolation.py`:
- Around line 35-43: Remove the developer "thinking-out-loud" comments in
test_isolation.py (the multi-line notes about bwrap, /tmp, module_path,
venv_path, etc.) and replace them with a single concise comment that states the
test strategy: e.g., "This test verifies sandbox isolation by attempting to
write to a path that should not be bound by the sandbox (ensuring writes fail
when outside bound paths like module_path/venv_path/site-packages)." Ensure the
replacement appears where the existing exploratory comments are and keep it
short and focused so the intent is clear to future readers.

ℹ️ Review info

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between efdbc61 and e187670.

📒 Files selected for processing (22)
  • .coderabbit.yaml
  • .github/workflows/ci.yml
  • .github/workflows/docs.yml
  • .github/workflows/pytorch.yml
  • .github/workflows/windows.yml
  • MANIFEST.in
  • docs/conf.py
  • example/host.py
  • pyisolate/__init__.py
  • pyisolate/_internal/environment.py
  • pyisolate/_internal/model_serialization.py
  • pyisolate/_internal/rpc_transports.py
  • pyisolate/_internal/sandbox.py
  • pyisolate/_internal/sandbox_detect.py
  • pyisolate/_internal/tensor_serializer.py
  • pyisolate/_internal/uds_client.py
  • pyproject.toml
  • tests/harness/host.py
  • tests/integration_v2/test_isolation.py
  • tests/test_bwrap_command.py
  • tests/test_memory_leaks.py
  • tests/test_sandbox_detect.py
💤 Files with no reviewable changes (1)
  • .coderabbit.yaml

       fail-fast: false
       matrix:
-        pytorch-version: ['2.1.0', '2.3.0']
+        pytorch-version: ['2.1.0']

⚠️ Potential issue | 🟡 Minor

PyTorch 2.3.0 is no longer tested on Windows.

Reducing the matrix to just ['2.1.0'] means Windows-specific regressions for PyTorch 2.3.x (the current release line at the time of this PR) go undetected. If CI time is a concern, consider keeping 2.3.0 and dropping 2.1.0 instead, since 2.1.x is now EOL.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/windows.yml at line 53, The matrix entry for the CI key
pytorch-version was changed to only ['2.1.0'], which removes Windows coverage
for the current PyTorch release line; restore Windows testing for PyTorch 2.3.x
by updating the pytorch-version matrix to include '2.3.0' (either replace
'2.1.0' with '2.3.0' or make it ['2.3.0', '2.1.0'] if you want both), ensuring
the pytorch-version matrix value is adjusted in the Windows workflow definition.

from shared import DatabaseSingleton, ExampleExtensionBase

import pyisolate
from pyisolate._internal.sandbox_detect import detect_sandbox_capability

⚠️ Potential issue | 🟠 Major

Example imports from internal _internal module — use public API instead.

detect_sandbox_capability is imported from pyisolate._internal.sandbox_detect, but examples serve as user-facing documentation. As per coding guidelines: "Documentation should NEVER include references to internal implementation details." Either expose detect_sandbox_capability in pyisolate/__init__ and __all__, or handle sandbox mode differently in the example.

Option: export it publicly

In pyisolate/__init__.py:

 from ._internal.tensor_serializer import flush_tensor_keeper, purge_orphan_sender_shm_files
+from ._internal.sandbox_detect import detect_sandbox_capability
 from .config import ExtensionConfig, ExtensionManagerConfig, SandboxMode
 __all__ = [
     ...
+    "detect_sandbox_capability",
     "register_adapter",

Then in example/host.py:

-from pyisolate._internal.sandbox_detect import detect_sandbox_capability
+from pyisolate import detect_sandbox_capability
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@example/host.py` at line 13, The example imports the internal function
detect_sandbox_capability from pyisolate._internal.sandbox_detect; update the
example to use the public API instead by either (A) exporting
detect_sandbox_capability from the package public surface (add it to
pyisolate/__init__.py and include it in __all__) so example/host.py can import
it as from pyisolate import detect_sandbox_capability, or (B) modify
example/host.py to detect or handle sandbox mode using an existing public
function/setting rather than importing from _internal; reference the symbol
detect_sandbox_capability and ensure the example only imports from pyisolate
(not pyisolate._internal).

Comment on lines +180 to +182
    stop_result = extension.stop()
    if inspect.isawaitable(stop_result):
        await stop_result

🧹 Nitpick | 🔵 Trivial

inspect.isawaitable check suggests API uncertainty — example should demonstrate the canonical call pattern.

If extension.stop() is synchronous, just call it. If it's async, await it. The isawaitable guard obscures the intended API usage for readers of the example. Examples are "tested in CI" and should "demonstrate real use cases," per guidelines.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@example/host.py` around lines 180 - 182, The example currently uses
inspect.isawaitable around extension.stop(), which hides the intended API;
change the example to use the canonical async pattern by calling await
extension.stop() directly (and update any example extension implementations to
make stop an async def) so the example consistently demonstrates the async API
for extension.stop(); alternatively, if the intended API is synchronous, remove
the await and call extension.stop() directly—pick one canonical contract and
make extension.stop() implementations and the example match it.
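If the async contract is chosen, the example collapses to a direct await; this sketch stubs the extension class for illustration:

```python
import asyncio

class Extension:
    def __init__(self) -> None:
        self.stopped = False

    async def stop(self) -> None:
        # Canonical contract: stop() is a coroutine, so callers always await it.
        self.stopped = True

async def shutdown(ext: Extension) -> None:
    await ext.stop()  # no inspect.isawaitable() guard needed

ext = Extension()
asyncio.run(shutdown(ext))
```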

Comment on lines +360 to +375
        if dep_stripped == "-e":
            if i + 1 >= len(safe_deps):
                raise ValueError("Editable dependency '-e' must include a path or URL")
            editable_target = safe_deps[i + 1].strip()
            if not editable_target:
                raise ValueError("Editable dependency '-e' must include a path or URL")
            install_targets.extend(["-e", editable_target])
            i += 2
            continue

        if dep_stripped.startswith("-e "):
            editable_target = dep_stripped[3:].strip()
            if not editable_target:
                raise ValueError("Editable dependency must include a path or URL after '-e'")
            install_targets.extend(["-e", editable_target])
        else:

⚠️ Potential issue | 🟡 Minor

Reject option-like editable targets in split/combined -e parsing.

A target that still starts with - should be rejected early; otherwise malformed inputs fail later with less clear install errors.

🛠️ Suggested validation hardening
         if dep_stripped == "-e":
             if i + 1 >= len(safe_deps):
                 raise ValueError("Editable dependency '-e' must include a path or URL")
             editable_target = safe_deps[i + 1].strip()
             if not editable_target:
                 raise ValueError("Editable dependency '-e' must include a path or URL")
+            if editable_target.startswith("-"):
+                raise ValueError(
+                    "Editable dependency target after '-e' cannot start with '-'"
+                )
             install_targets.extend(["-e", editable_target])
             i += 2
             continue

         if dep_stripped.startswith("-e "):
             editable_target = dep_stripped[3:].strip()
             if not editable_target:
                 raise ValueError("Editable dependency must include a path or URL after '-e'")
+            if editable_target.startswith("-"):
+                raise ValueError(
+                    "Editable dependency target after '-e' cannot start with '-'"
+                )
             install_targets.extend(["-e", editable_target])
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyisolate/_internal/environment.py` around lines 360 - 375, The parser
currently accepts editable targets that begin with "-" which later produce
confusing installer errors; update the '-e' handling in the block that processes
safe_deps so that after computing editable_target (both in the dep_stripped ==
"-e" branch where editable_target = safe_deps[i + 1].strip() and in the
dep_stripped.startswith("-e ") branch where editable_target =
dep_stripped[3:].strip()) you immediately validate that editable_target is
non-empty and does not start with "-" and raise a ValueError (with a clear
message like "Editable dependency '-e' must include a valid path or URL, not an
option") if it does; keep existing logic to extend install_targets and advance i
unchanged otherwise.

Comment on lines +347 to +351
        # Reconstruct remote object handles.
        if dct.get("__type__") == "RemoteObjectHandle":
            from .remote_handle import RemoteObjectHandle

            return RemoteObjectHandle(dct["object_id"], dct["type_name"])

🧹 Nitpick | 🔵 Trivial

RemoteObjectHandle deserialization is correctly prioritized before the generic __type__ registry lookup.

This ensures handle reconstruction is deterministic regardless of what's in the registry.

One note: any "RemoteObjectHandle" deserializer registered in SerializerRegistry will be silently bypassed by this hardcoded path. If that's intentional, a brief inline comment would help.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyisolate/_internal/rpc_transports.py` around lines 347 - 351, The
deserialization branch that reconstructs RemoteObjectHandle bypasses the
SerializerRegistry; add a short inline comment above the hardcoded check in
rpc_transports.py (the block that tests dct.get("__type__") ==
"RemoteObjectHandle" and returns RemoteObjectHandle(dct["object_id"],
dct["type_name"])) stating that this branch intentionally takes precedence over
any SerializerRegistry entries for "RemoteObjectHandle" to ensure deterministic
handle reconstruction; keep the comment concise and mention RemoteObjectHandle
and SerializerRegistry by name so future readers understand the deliberate
bypass.

Comment on lines +283 to +288
        use_borrowed = os.environ.get("PYISOLATE_CPU_BORROWED_SHM", "1") == "1"
        strategy = "file_system_borrowed" if use_borrowed else "file_system"
        if use_borrowed:
            # reduce_storage() increments sender-side refcount for transfer.
            # Undo transit incref immediately and use borrowed strategy payload.
            storage._shared_decref()

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== File structure and context ==="
ls -la pyisolate/_internal/tensor_serializer.py 2>/dev/null || echo "File not found"

echo
echo "=== Lines 283-288 (first mentioned section) ==="
sed -n '280,295p' pyisolate/_internal/tensor_serializer.py 2>/dev/null || echo "Unable to read file"

echo
echo "=== Lines 401-406 (second mentioned section) ==="
sed -n '398,415p' pyisolate/_internal/tensor_serializer.py 2>/dev/null || echo "Unable to read file"

echo
echo "=== Search for private PyTorch API usage ==="
rg -nP '_shared_decref|_new_shared_filename_cpu' --type=py pyisolate/

echo
echo "=== Torch dependency constraints ==="
fd -HI 'pyproject.toml|setup.py|setup.cfg|requirements.*\.txt|tox.ini|poetry.lock|uv.lock' | while read -r f; do
  if [[ -f "$f" ]]; then
    echo "--- $f ---"
    rg -i 'torch|pytorch' "$f" || echo "(no torch references)"
  fi
done

echo
echo "=== Check for logger usage and imports in the file ==="
head -50 pyisolate/_internal/tensor_serializer.py | rg -n 'import|logger'

Repository: Comfy-Org/pyisolate

Length of output: 2911


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Check for any existing handling of torch API variations ==="
rg -nB 3 -A 3 'getattr.*torch|hasattr.*torch|try:.*torch' --type=py pyisolate/

echo
echo "=== Check for reductions module usage ==="
head -30 pyisolate/_internal/tensor_serializer.py | rg -n 'reductions|import'

echo
echo "=== Full context around line 404-410 to verify fallback ==="
sed -n '398,415p' pyisolate/_internal/tensor_serializer.py

echo
echo "=== Check if _shared_decref has documentation or usage patterns elsewhere ==="
rg -n 'shared_decref|shared_incref|borrowed' --type=py pyisolate/ | head -20

Repository: Comfy-Org/pyisolate

Length of output: 3535


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Verify reductions module is accessible from require_torch ==="
sed -n '1,50p' pyisolate/_internal/tensor_serializer.py | grep -A 20 'require_torch'

echo
echo "=== Check torch_gate module to understand torch/reductions import ==="
sed -n '1,100p' pyisolate/_internal/torch_gate.py | grep -A 10 'require_torch\|reductions'

echo
echo "=== Search for any torch version compatibility comments or issues ==="
rg -i 'version|compat|torch.*2\.[0-9]|breaking|deprecat' --type=py pyisolate/_internal/tensor_serializer.py | head -15

echo
echo "=== Verify the _shared_decref impact - where is storage coming from ==="
sed -n '270,295p' pyisolate/_internal/tensor_serializer.py

Repository: Comfy-Org/pyisolate

Length of output: 3109


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Verify the complete context for line 288 decision path ==="
sed -n '280,310p' pyisolate/_internal/tensor_serializer.py

echo
echo "=== Verify torch version stability concerns - check PyTorch release notes context ==="
echo "Checking if there are any integration tests that verify these APIs"
fd -HI 'test.*tensor|tensor.*test' --type=py | head -10

echo
echo "=== Look for any existing fallback patterns in codebase ==="
rg -n 'getattr.*callable|hasattr.*_' --type=py pyisolate/ | head -10

echo
echo "=== Confirm reductions.rebuild_storage_filename exists and what it does ==="
python3 << 'EOF'
import importlib
try:
    reductions = importlib.import_module("torch.multiprocessing.reductions")
    if hasattr(reductions, "rebuild_storage_filename"):
        print("✓ reductions.rebuild_storage_filename exists")
        import inspect
        sig = inspect.signature(reductions.rebuild_storage_filename)
        print(f"  Signature: {sig}")
except Exception as e:
    print(f"Could not verify: {e}")
EOF

echo
echo "=== Check if there are any comments about torch API fragility ==="
rg -n 'UntypedStorage|multiprocessing|shared.*mem' --type=py pyisolate/_internal/ -B 1 -A 1 | head -30

Repository: Comfy-Org/pyisolate

Length of output: 1834


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Look for any existing fallback patterns in codebase ==="
rg -n 'getattr.*callable|hasattr' --type=py pyisolate/ | head -10

echo
echo "=== Confirm reductions.rebuild_storage_filename exists ==="
python3 << 'EOF'
import sys
try:
    import torch
    print(f"✓ torch version: {torch.__version__}")
    from torch.multiprocessing import reductions
    
    # Check for _new_shared_filename_cpu
    if hasattr(torch.UntypedStorage, "_new_shared_filename_cpu"):
        print("✓ torch.UntypedStorage._new_shared_filename_cpu exists")
    else:
        print("✗ torch.UntypedStorage._new_shared_filename_cpu NOT found")
    
    # Check for _shared_decref
    storage = torch.UntypedStorage(10)
    if hasattr(storage, "_shared_decref"):
        print("✓ storage._shared_decref exists")
    else:
        print("✗ storage._shared_decref NOT found")
    
    # Check fallback
    if hasattr(reductions, "rebuild_storage_filename"):
        print("✓ reductions.rebuild_storage_filename exists")
        import inspect
        sig = inspect.signature(reductions.rebuild_storage_filename)
        print(f"  Signature: {sig}")
    else:
        print("✗ reductions.rebuild_storage_filename NOT found")
        
except Exception as e:
    print(f"Error during verification: {e}", file=sys.stderr)
    sys.exit(1)
EOF

echo
echo "=== Search for PyTorch version requirements and any API stability notes ==="
rg -i 'torch.*2\.[0-9]|api.*change|deprecat|breaking' --type=py pyisolate/

Repository: Comfy-Org/pyisolate

Length of output: 1121


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Check PyTorch documentation/comments about UntypedStorage internals ==="
rg -n 'UntypedStorage|shared_decref|_new_shared_filename' pyisolate/ -B 2 -A 2

echo
echo "=== Verify the proposed fallback for line 288 makes sense ==="
echo "Looking at what happens if _shared_decref is not available..."
sed -n '280,320p' pyisolate/_internal/tensor_serializer.py

echo
echo "=== Verify the architectural pattern - is graceful degradation used elsewhere? ==="
rg -n 'logger.debug.*unsupported|logger.warning.*fallback' --type=py pyisolate/

echo
echo "=== Check if there are test cases for borrowed SHM strategy ==="
find . -name '*.py' -type f -exec grep -l 'file_system_borrowed\|PYISOLATE_CPU_BORROWED_SHM' {} \;

echo
echo "=== Verify torch.multiprocessing.reductions availability across versions ==="
rg -n 'torch.multiprocessing.reductions\|rebuild_storage_filename' --type=py pyisolate/ -B 1 -A 1

Repository: Comfy-Org/pyisolate

Length of output: 3704


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Analyze the complete deserialization flow for borrowed strategy ==="
sed -n '385,420p' pyisolate/_internal/tensor_serializer.py

echo
echo "=== Check if _shared_decref is essential or just an optimization ==="
rg -n 'refcount|incref|decref' pyisolate/_internal/tensor_serializer.py -B 2 -A 2

echo
echo "=== Verify if this is marked as a known issue or TODO ==="
rg -n 'TODO|FIXME|XXX|HACK|BUG' pyisolate/_internal/tensor_serializer.py

echo
echo "=== Review the coding guidelines to check if private API usage is flagged ==="
echo "(Already have guidelines in context)"

echo
echo "=== Check if there are any conditionals based on torch version ==="
rg -n 'torch.__version__|torch.version|TORCH_VERSION' --type=py pyisolate/

echo
echo "=== Verify whether fallback to non-borrowed strategy is acceptable behavior ==="
echo "Checking if file_system vs file_system_borrowed is just an optimization..."
sed -n '1,30p' pyisolate/_internal/tensor_serializer.py | rg -n 'borrowed|strategy'

Repository: Comfy-Org/pyisolate

Length of output: 2377


Guard private PyTorch storage APIs with capability checks and fallback.

Line 288 and Line 404-406 rely on underscored PyTorch internals (_shared_decref, _new_shared_filename_cpu). These are implementation details that can change across torch versions. With torch>=2.0.0 (no upper bound), the code must remain compatible with future releases where these private APIs may be removed or redesigned.

Proposed hardening
         use_borrowed = os.environ.get("PYISOLATE_CPU_BORROWED_SHM", "1") == "1"
         strategy = "file_system_borrowed" if use_borrowed else "file_system"
         if use_borrowed:
             # reduce_storage() increments sender-side refcount for transfer.
             # Undo transit incref immediately and use borrowed strategy payload.
-            storage._shared_decref()
+            decref = getattr(storage, "_shared_decref", None)
+            if callable(decref):
+                decref()
+            else:
+                logger.debug("Borrowed SHM unsupported: missing storage._shared_decref; falling back to file_system")
+                strategy = "file_system"
@@
         if strategy == "file_system_borrowed":
             # Rebuild directly without caching the borrowed storage handle in shared_cache.
             # Keeping borrowed refs in the global cache can pin SHM entries longer than needed.
-            rebuilt_storage = torch.UntypedStorage._new_shared_filename_cpu(
-                manager_path, storage_key, storage_size
-            )
+            new_shared = getattr(torch.UntypedStorage, "_new_shared_filename_cpu", None)
+            if callable(new_shared):
+                rebuilt_storage = new_shared(manager_path, storage_key, storage_size)
+            else:
+                logger.debug("Borrowed SHM unsupported in this torch build; falling back to rebuild_storage_filename")
+                rebuilt_storage = reductions.rebuild_storage_filename(
+                    torch.UntypedStorage, manager_path, storage_key, storage_size
+                )

Also applies to: 401-406

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyisolate/_internal/tensor_serializer.py` around lines 283 - 288, The code
currently calls private PyTorch internals (storage._shared_decref and
torch.UntypedStorage._new_shared_filename_cpu) directly; wrap these usages in
capability checks and safe fallbacks: before calling storage._shared_decref
verify hasattr(storage, "_shared_decref") and call it in a try/except, and if
missing or it raises, fall back to not using the borrowed strategy (set strategy
= "file_system") and avoid adjusting refcounts; likewise check
hasattr(torch.UntypedStorage, "_new_shared_filename_cpu") before using it and
fall back to the safe non-borrowed path if unavailable or errors occur; ensure
the branch that sets strategy to "file_system_borrowed" only executes when both
capabilities exist and succeed so code using strategy and storage filename
handling remains robust across torch versions.

Comment on lines 241 to +245
         except Exception as exc:
-            logger.error(
-                "Extension module loading/execution failed for %s: %s", module_path, exc, exc_info=True
-            )
-            raise
+            # Keep RPC alive so the host can gracefully skip broken extensions
+            # instead of seeing a connection-reset hard failure.
+            logger.warning("Extension module loading/execution failed for %s: %s", module_path, exc)
+            await rpc.run_until_stopped()

⚠️ Potential issue | 🟡 Minor

Log the traceback and consider upgrading to logger.error.

The original exception traceback is lost because exc_info=True is not passed (compare line 165 which does pass it). Also, a module that fails to load is arguably an error, not a warning.

Proposed fix
-            logger.warning("Extension module loading/execution failed for %s: %s", module_path, exc)
+            logger.error("Extension module loading/execution failed for %s: %s", module_path, exc, exc_info=True)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyisolate/_internal/uds_client.py` around lines 241 - 245, The except block
that logs module load failures should log the full traceback and use an error
level: replace the logger.warning call in the except Exception as exc handler
(the one that references module_path) with logger.error and pass exc_info=True
so the exception traceback is included; keep the await rpc.run_until_stopped()
behavior unchanged to preserve RPC lifetime.

Comment on lines 56 to 59

         registry.register("torch.Tensor", serialize_tensor, deserialize_tensor)
-    except ImportError:
+    except Exception:
         pass

⚠️ Potential issue | 🟡 Minor

Bare except Exception: pass silently swallows non-import errors.

If torch imports successfully but register_tensor_serializer or registry.register fails for a non-import reason (e.g., a bug), the failure is silently hidden. At minimum, log at debug level. As per coding guidelines: "Never use empty except: or except Exception: without re-raising or logging."

Proposed fix
-        except Exception:
-            pass
+        except Exception:
+            logger.debug("Torch serializer registration skipped", exc_info=True)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/harness/host.py` around lines 56 - 59, Replace the bare "except
Exception: pass" with specific exception handling: allow ImportError to be
silently ignored (except ImportError: pass) but catch other exceptions (except
Exception as e) and log the failure at debug with the exception details before
re-raising; apply this around the block that calls
registry.register("torch.Tensor", serialize_tensor, deserialize_tensor) so any
non-import errors in register or register_tensor_serializer are logged (use the
module logger via logging.getLogger(__name__) or the existing logger) and then
re-raise the exception.
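One way the suggested handling could look, with the registry stubbed to force a non-import failure (class and function names are illustrative, not pyisolate's actual API):

```python
import logging

logger = logging.getLogger(__name__)

class BrokenRegistry:
    def register(self, name, serialize, deserialize):
        raise RuntimeError("registration bug")  # simulates a non-import failure

def install_tensor_serializer(registry) -> None:
    try:
        registry.register("torch.Tensor", object, object)
    except ImportError:
        pass  # torch genuinely absent: safe to skip silently
    except Exception:
        # Log the details, then re-raise so real bugs are not swallowed.
        logger.debug("torch.Tensor serializer registration failed", exc_info=True)
        raise
```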

Comment on lines +98 to +100
        self.sandbox_available = True
        if sys.platform == "linux":
            self.sandbox_available = detect_sandbox_capability().available

⚠️ Potential issue | 🟠 Major

sandbox_available defaults to True on non-Linux, causing SandboxMode.REQUIRED on unsupported platforms.

On macOS/Windows, the if sys.platform == "linux" guard is skipped, so sandbox_available stays True. Line 170 then sets sandbox_mode=SandboxMode.REQUIRED, which will fail since bwrap is Linux-only.

Proposed fix — default to False
-        self.sandbox_available = True
-        if sys.platform == "linux":
-            self.sandbox_available = detect_sandbox_capability().available
+        self.sandbox_available = False
+        if sys.platform == "linux":
+            self.sandbox_available = detect_sandbox_capability().available
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        self.sandbox_available = True
-        if sys.platform == "linux":
-            self.sandbox_available = detect_sandbox_capability().available
+        self.sandbox_available = False
+        if sys.platform == "linux":
+            self.sandbox_available = detect_sandbox_capability().available
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/harness/host.py` around lines 98 - 100, Set sandbox_available default
to False and only set it True after checking detect_sandbox_capability() on
Linux: change the initialization of self.sandbox_available and the conditional
that calls detect_sandbox_capability() so non-Linux platforms do not leave it
True. Also ensure any later logic that sets sandbox_mode (e.g., code referencing
SandboxMode.REQUIRED) uses this updated self.sandbox_available to avoid
requiring sandboxing on unsupported platforms; update references around where
sandbox_mode is assigned to consult self.sandbox_available.

Comment on lines +190 to +197
    def test_bwrap_degraded_test_success(self) -> None:
        """Test successful degraded bwrap invocation."""
        mock_result = MagicMock()
        mock_result.returncode = 0
        with patch("subprocess.run", return_value=mock_result):
            success, error = _test_bwrap_degraded("/usr/bin/bwrap")
            assert success is True
            assert error == ""

🧹 Nitpick | 🔵 Trivial

Degraded bwrap test covers the success path — consider adding failure/timeout cases.

Only the happy path is tested. Since _test_bwrap_degraded has distinct timeout messaging ("bwrap degraded test timed out"), a timeout test would ensure that string is correct. Though given the bodies are nearly identical to _test_bwrap (which is well-tested), this is low priority.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_sandbox_detect.py` around lines 190 - 197, Add a test for the
timeout/failure path of _test_bwrap_degraded by patching subprocess.run to raise
subprocess.TimeoutExpired (to simulate a timeout) and assert the returned tuple
is (False, "bwrap degraded test timed out") or that error contains that exact
message; also consider a variant where subprocess.run returns a mock with a
non-zero returncode and assert success is False and error is non-empty. Locate
tests in tests/test_sandbox_detect.py near the existing
test_bwrap_degraded_test_success and use the same mocking pattern
(patch("subprocess.run", ...)) to implement these negative-case assertions for
_test_bwrap_degraded.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pyproject.toml`:
- Around line 66-68: Update the Sphinx dependency constraint to match MyST's
minimum requirement: replace the current "sphinx>=5.0" constraint with
"sphinx>=6" (or "sphinx>=6.0") so that the declared Sphinx version is compatible
with "myst-parser>=2.0"; locate the Sphinx entry in the dependency list where
"sphinx>=5.0" appears and change it to the new constraint.

ℹ️ Review info

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e187670 and 495c7f0.

📒 Files selected for processing (1)
  • pyproject.toml

Comment on lines 66 to +68
"sphinx>=5.0",
"sphinx-rtd-theme>=1.0",
"myst-parser>=2.0",

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Current docs constraints in pyproject.toml ==="
rg -nP '"sphinx>=|"myst-parser>=' pyproject.toml -A 1 -B 1 | head -20

echo
echo "=== myst-parser 2.0.0 Sphinx requirement(s) ==="
curl -s https://pypi.org/pypi/myst-parser/2.0.0/json \
  | jq -r '.info.requires_dist[] | select(test("^sphinx")) | select(. != null)' 2>/dev/null || echo "No Sphinx requirement found"

echo
echo "=== myst-parser latest version and Sphinx requirement ==="
latest="$(curl -s https://pypi.org/pypi/myst-parser/json | jq -r '.info.version')"
echo "Latest myst-parser: ${latest}"
curl -s "https://pypi.org/pypi/myst-parser/${latest}/json" \
  | jq -r '.info.requires_dist[] | select(test("^sphinx")) | select(. != null)' 2>/dev/null || echo "No Sphinx requirement found"

Repository: Comfy-Org/pyisolate

Length of output: 1233


Align direct Sphinx constraint with MyST's minimum requirement.

Line 66 currently specifies sphinx>=5.0, but myst-parser>=2.0 on line 68 requires sphinx>=6 (per PyPI metadata). The declared floor therefore understates the real minimum: a resolver that first picks Sphinx 5.x must backtrack once myst-parser's metadata is fetched. Update the Sphinx constraint to match explicitly:

♻️ Proposed change
 docs = [
-    "sphinx>=5.0",
+    "sphinx>=6.0",
     "sphinx-rtd-theme>=1.0",
     "myst-parser>=2.0",
     "sphinx-markdown-builder>=0.5.4",  # Optional: for markdown output if needed
 ]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` around lines 66 - 68, Update the Sphinx dependency constraint
to match MyST's minimum requirement: replace the current "sphinx>=5.0"
constraint with "sphinx>=6" (or "sphinx>=6.0") so that the declared Sphinx
version is compatible with "myst-parser>=2.0"; locate the Sphinx entry in the
dependency list where "sphinx>=5.0" appears and change it to the new constraint.
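The mismatch the script surfaces can be reproduced with a minimal version check. Plain tuple comparison stands in for pip's real specifier matching here, which is fine because all the version strings involved are purely numeric:

```python
def satisfies_min(version: str, minimum: str) -> bool:
    """Naive 'version >= minimum' check on dotted numeric version strings."""
    def parse(s: str) -> tuple:
        return tuple(int(part) for part in s.split("."))
    return parse(version) >= parse(minimum)

# Sphinx 5.3.0 satisfies the project's own floor (sphinx>=5.0)...
assert satisfies_min("5.3.0", "5.0")
# ...but not myst-parser 2.0's requirement (sphinx>=6), so a resolver
# that picked a 5.x Sphinx first would have to discard it and backtrack.
assert not satisfies_min("5.3.0", "6")
# Raising the project floor to sphinx>=6.0 removes the mismatch.
assert satisfies_min("6.2.1", "6.0") and satisfies_min("6.2.1", "6")
```

With the floor at `>=6.0`, every Sphinx version the project itself accepts also satisfies myst-parser, so the two constraints can no longer disagree.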

@pollockjj pollockjj merged commit 357171e into Comfy-Org:main Feb 25, 2026
44 checks passed
