Skip to content

Fix memory leak in optional_import traceback handling#8782

Open
aymuos15 wants to merge 3 commits intoProject-MONAI:devfrom
aymuos15:fix/optional-import-traceback-leak
Open

Fix memory leak in optional_import traceback handling#8782
aymuos15 wants to merge 3 commits intoProject-MONAI:devfrom
aymuos15:fix/optional-import-traceback-leak

Conversation

@aymuos15
Copy link
Copy Markdown
Contributor

Summary

  • Fix memory leak in optional_import where stored traceback objects retain references to entire call stacks, preventing garbage collection
  • Format traceback to string immediately and clear import_exception.__traceback__ = None
  • Hoist 3 optional_import calls for cucim to module level in monai/transforms/utils.py

Fixes #7480, #7727

Details

monai/utils/module.py:

  • Replace storing live __traceback__ object with traceback.format_exception() string
  • Clear import_exception.__traceback__ = None to break reference chain
  • Embed formatted traceback in error message under "Original traceback:" section

monai/transforms/utils.py:

  • Move cucim optional_import calls to module level (consistent with existing skimage/scipy/cupy pattern)
  • Fixes incidental name shadowing in distance_transform_edt

Test plan

  • test_no_traceback_leak — weakref regression test for the leak
  • test_failed_import_shows_traceback_string — verifies traceback in error message
  • All 12 test_optional_import.py tests pass
  • pre-commit, black, ruff, pytype, mypy — all clean

…I#7480, Project-MONAI#7727)

When an import fails, `optional_import` captured the live traceback
object and stored it in a `_LazyRaise` closure. The traceback held
references to stack frames containing CUDA tensors, preventing garbage
collection and causing GPU memory leaks.

Replace the live traceback with a string-formatted copy via
`traceback.format_exception()` and clear the original with
`import_exception.__traceback__ = None`. The formatted traceback is
appended to the error message so debugging information is preserved.

Also move three hot-path `optional_import` calls in
`monai/transforms/utils.py` (cucim.skimage, cucim morphology EDT)
from function bodies to module level, eliminating repeated leaked
tracebacks on every invocation.

Fixes Project-MONAI#7480, fixes Project-MONAI#7727

Signed-off-by: Soumya Snigdha Kundu <soumya_snigdha.kundu@kcl.ac.uk>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 18, 2026

📝 Walkthrough

Walkthrough

Adds a module-level note in monai/transforms/utils.py clarifying that CuCIM-related imports must remain inside functions (no module-level CuCIM import). Updates monai/utils/module.py to capture formatted tracebacks as strings on optional-import failures (introducing had_exception and tb_str) and include that text in the lazy OptionalImportError message instead of attaching traceback objects. Adds two tests in tests/utils/test_optional_import.py: one asserting no traceback/frame leakage via GC/weakref, and one asserting the raised OptionalImportError message contains the captured traceback string and ModuleNotFoundError.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 70.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed Title clearly and concisely describes the main change: fixing a memory leak in optional_import traceback handling.
Description check ✅ Passed Description covers scope, technical details, and test plan; matches template with all key sections present despite some optional items unchecked.
Linked Issues check ✅ Passed Changes address the core objective from #7480: eliminating traceback references that prevent garbage collection and cause GPU memory leaks.
Out of Scope Changes check ✅ Passed All changes align with stated objectives: traceback handling fix, cucim import reorganization, and test coverage for the leak. No extraneous modifications detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/utils/test_optional_import.py (1)

80-97: Prefix unused variable with underscore.

Line 91: mod is never used. Rename to _mod to indicate intentional discard.

🔧 Suggested fix
-            mod, flag = optional_import("nonexistent_module_for_leak_test")
+            _mod, flag = optional_import("nonexistent_module_for_leak_test")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/utils/test_optional_import.py` around lines 80 - 97, In
test_no_traceback_leak, the unused variable `mod` returned from optional_import
should be renamed to `_mod` to signal intentional discard; update the tuple
unpacking in the _do_import function from `mod, flag =
optional_import("nonexistent_module_for_leak_test")` to `_mod, flag =
optional_import("nonexistent_module_for_leak_test")` (keeping the rest of
test_no_traceback_leak, _Marker, and gc check unchanged).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@tests/utils/test_optional_import.py`:
- Around line 80-97: In test_no_traceback_leak, the unused variable `mod`
returned from optional_import should be renamed to `_mod` to signal intentional
discard; update the tuple unpacking in the _do_import function from `mod, flag =
optional_import("nonexistent_module_for_leak_test")` to `_mod, flag =
optional_import("nonexistent_module_for_leak_test")` (keeping the rest of
test_no_traceback_leak, _Marker, and gc check unchanged).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7b9386de-fac6-49d8-a44b-50a0d17f850d

📥 Commits

Reviewing files that changed from the base of the PR and between daaedaa and 5395848.

📒 Files selected for processing (3)
  • monai/transforms/utils.py
  • monai/utils/module.py
  • tests/utils/test_optional_import.py

Copy link
Copy Markdown
Member

@ericspod ericspod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @aymuos15 thanks for these fixes. I commented on the cucim change which mostly should be reverted, the changes for the traceback leaking look fine to me so I hope this solves the problems the issues have raised.

cp, has_cp = optional_import("cupy")
cp_ndarray, _ = optional_import("cupy", name="ndarray")
exposure, has_skimage = optional_import("skimage.exposure")
_cucim_skimage, _has_cucim_skimage = optional_import("cucim.skimage")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we had strange import issues if we had cucim at the module level, such as very slow import or some other buggy behaviour. Our solution was to put importation into functions so that this happened only when the library was needed, which was very rarely. Please undo this change but replace it with a note that cucim is deliberately not imported at the module level, there should have been one already we neglected to add in the past.

@aymuos15
Copy link
Copy Markdown
Contributor Author

Thank you very much for the review. Will get to this soon!

Restore cucim imports inside functions to avoid slow import times and
buggy behaviour. Add comment explaining why cucim is not at module level.
Also rename unused `mod` to `_mod` in test per CodeRabbit nitpick.

Signed-off-by: Soumya Snigdha Kundu <soumya_snigdha.kundu@kcl.ac.uk>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/utils/test_optional_import.py (1)

83-93: Add docstrings for new local definitions.

_Marker and _do_import are new definitions but have no docstrings.

Suggested patch
         class _Marker:
+            """Marker object used to validate that frame references are released."""
             pass
 
         def _do_import():
+            """Run a failing optional import and return a weak reference.
+
+            Returns:
+                weakref.ReferenceType: Weak reference to a local marker instance.
+            """
             marker = _Marker()
             ref = weakref.ref(marker)

As per coding guidelines, "Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/utils/test_optional_import.py` around lines 83 - 93, Add Google-style
docstrings for the new local definitions _Marker (briefly stating it’s a
sentinel class used to detect GC/frame leaks) and for _do_import (documenting
its purpose, that it creates a _Marker instance, returns a weakref.ref to that
marker, and that it calls optional_import for a nonexistent module expecting
flag==False). Ensure the _do_import docstring includes a "Returns" section
describing the weakref.ref return value and any expected behavior (no exceptions
expected), and keep the wording concise and consistent with other test
docstrings.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/utils/test_optional_import.py`:
- Around line 103-104: The test currently uses a bare attribute access
`mod.something` inside the with self.assertRaises(OptionalImportError) block
which triggers Ruff B018; replace that expression with an explicit getattr call
(e.g., getattr(mod, "something")) so __getattr__ is invoked without a useless
expression statement; update the test in tests/utils/test_optional_import.py
where the context manager asserts OptionalImportError to use getattr(mod,
"something") instead of the bare `mod.something`.

---

Nitpick comments:
In `@tests/utils/test_optional_import.py`:
- Around line 83-93: Add Google-style docstrings for the new local definitions
_Marker (briefly stating it’s a sentinel class used to detect GC/frame leaks)
and for _do_import (documenting its purpose, that it creates a _Marker instance,
returns a weakref.ref to that marker, and that it calls optional_import for a
nonexistent module expecting flag==False). Ensure the _do_import docstring
includes a "Returns" section describing the weakref.ref return value and any
expected behavior (no exceptions expected), and keep the wording concise and
consistent with other test docstrings.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7c66820b-ac88-48da-9477-d9421a7fc627

📥 Commits

Reviewing files that changed from the base of the PR and between 5395848 and ed37db5.

📒 Files selected for processing (2)
  • monai/transforms/utils.py
  • tests/utils/test_optional_import.py
✅ Files skipped from review due to trivial changes (1)
  • monai/transforms/utils.py

Comment on lines +103 to +104
with self.assertRaises(OptionalImportError) as ctx:
mod.something
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix Ruff B018 on Line 104 (useless expression).

Use getattr to trigger __getattr__ without a bare expression statement.

Suggested patch
         with self.assertRaises(OptionalImportError) as ctx:
-            mod.something
+            getattr(mod, "something")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
with self.assertRaises(OptionalImportError) as ctx:
mod.something
with self.assertRaises(OptionalImportError) as ctx:
getattr(mod, "something")
🧰 Tools
🪛 Ruff (0.15.9)

[warning] 104-104: Found useless expression. Either assign it to a variable or remove it.

(B018)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/utils/test_optional_import.py` around lines 103 - 104, The test
currently uses a bare attribute access `mod.something` inside the with
self.assertRaises(OptionalImportError) block which triggers Ruff B018; replace
that expression with an explicit getattr call (e.g., getattr(mod, "something"))
so __getattr__ is invoked without a useless expression statement; update the
test in tests/utils/test_optional_import.py where the context manager asserts
OptionalImportError to use getattr(mod, "something") instead of the bare
`mod.something`.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

HausdorffDTLoss leads to GPU memory leak.

2 participants