Fix nnUNet test directory leakage into current working directory#8887
Fix nnUNet test directory leakage into current working directory#8887aymuos15 wants to merge 8 commits into
Conversation
Signed-off-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>
Signed-off-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>
Signed-off-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>
Signed-off-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>
Signed-off-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>
Signed-off-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>
) The TestnnUNetV2RunnerSecurity YAMLs used cwd-relative paths (./work, ./nnUNet_raw, ./nnUNet_preprocessed, ./nnUNet_results). nnUNetV2Runner.__init__ creates the raw/preprocessed/results dirs eagerly (before dataset-name validation), so instantiating the runner in these tests wrote into the repo cwd and tearDown could not clean it up. Root all generated paths in self.test_dir so the temp dir owns them, including the injection payload's redirect target. Signed-off-by: Soumya Snigdha Kundu <soumya_snigdha.kundu@kcl.ac.uk>
📝 WalkthroughWalkthroughThe PR hardens dataset name handling in nnU-Net runner by introducing regex-based validation. During initialization, Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 3❌ Failed checks (2 warnings, 1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
monai/apps/nnunet/nnunetv2_runner.py (1)
209-218:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFix dataset_name resolution for
DatasetXXXinputs in nnUNetV2Runner
DATASET_ID_FORMATallowsDataset[0-9]{3}, but init casts viaint(self.dataset_name_or_id)at line 212; for inputs like"Dataset003"this throws, gets swallowed, andself.dataset_namestaysNone.- With
run_convert_dataset=False,train_parallel_cmd/train_parallel/predict_ensemble_postprocessingwill then raiseValueErrorbecauseself.dataset_name is None.convert_datasetline 230 also doesint(self.dataset_name_or_id), so"DatasetXXX"inputs would still break even when conversion is enabled; either tighten the regex to numeric IDs only or make theint(...)logic conditional.🔧 Proposed fix
- self.dataset_name = maybe_convert_to_dataset_name(int(self.dataset_name_or_id)) + self.dataset_name = maybe_convert_to_dataset_name(self.dataset_name_or_id)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@monai/apps/nnunet/nnunetv2_runner.py` around lines 209 - 218, The current int(self.dataset_name_or_id) cast in nnUNetV2Runner init prevents valid DatasetXXX strings (e.g., "Dataset003") from resolving because maybe_convert_to_dataset_name is only called with an int and the exception swallows the failure leaving self.dataset_name None; update the logic in the __init__/dataset resolution and in convert_dataset so you first detect whether self.dataset_name_or_id matches a pure-numeric ID (using DATASET_ID_FORMAT or a digits-only check) and only cast to int in that case, otherwise pass the raw string into maybe_convert_to_dataset_name (or call maybe_convert_to_dataset_name without int conversion); ensure convert_dataset likewise uses the same conditional casting so run_convert_dataset=False and conversion paths both resolve DatasetXXX names correctly and do not leave self.dataset_name as None.
🧹 Nitpick comments (3)
tests/integration/test_integration_nnunetv2_runner.py (3)
33-33: ⚡ Quick winScope the logger override to the test and restore it.
Setting the runner logger level at import time leaks global state across the whole test session and can hide warnings from unrelated tests. Patch it inside the relevant test/class setup and restore the previous level in cleanup.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integration/test_integration_nnunetv2_runner.py` at line 33, The module-level call monai.apps.nnunet.nnunetv2_runner.logger.setLevel(logging.ERROR) is changing global logging state at import time; move this into the specific test's setup (or a pytest fixture) and restore the original level in teardown (or after fixture yield). Concretely, capture the previous level via prev = monai.apps.nnunet.nnunetv2_runner.logger.level, set to logging.ERROR for the duration of the test (or fixture), and then reset monai.apps.nnunet.nnunetv2_runner.logger.setLevel(prev) in a finally/teardown block so the override does not leak to other tests.
158-163: ⚡ Quick winAssert that the injection payload produced no side effect.
assertRaises(ValueError)still passes if the constructor createstest.txtbefore failing. Add a negative assertion so this stays a real security regression test.Suggested patch
def test_nnunetv2runner_bad_dataset_name(self) -> None: """ Test the dataset name given must conform to the nnUNet requirement of being an int or "Dataset###". """ with self.assertRaises(ValueError): nnUNetV2Runner(input_config=self.inject_yml, trainer_class_name="nnUNetTrainer") + self.assertFalse(os.path.exists(os.path.join(self.test_dir.name, "test.txt")))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integration/test_integration_nnunetv2_runner.py` around lines 158 - 163, The test test_nnunetv2runner_bad_dataset_name currently only checks that nnUNetV2Runner(input_config=self.inject_yml, trainer_class_name="nnUNetTrainer") raises ValueError but doesn't assert there were no side‑effects; update the test to, after the assertRaises block, explicitly assert that the injection artifact (e.g. the file created by the payload such as "test.txt" or any files under the same temp dir used by self.inject_yml) does NOT exist. Locate the test method test_nnunetv2runner_bad_dataset_name and add a negative assertion referencing the specific artifact path (or verify the directory referenced by self.inject_yml is unchanged/empty) to ensure the constructor failed without creating files. Ensure the extra assertion runs after the context manager that expects ValueError so the test still fails if the artifact was created.
101-166: ⚡ Quick winAdd Google-style docstrings to the new definitions.
The new class,
setUp, andtearDownare undocumented, and the two test-method docstrings are not Google-style docstrings with return/exception sections.As per coding guidelines, "Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integration/test_integration_nnunetv2_runner.py` around lines 101 - 166, Add Google-style docstrings to the TestnnUNetV2RunnerSecurity class and its methods: setUp, tearDown, test_nnunetv2runner_good_dataset_name, and test_nnunetv2runner_bad_dataset_name. For each definition include a short description, Args sections for any parameters (e.g., self), Returns if applicable (use None for setUp/tearDown/test methods), and Raises for expected exceptions (e.g., ValueError in test_nnunetv2runner_bad_dataset_name). Update the docstrings in the class TestnnUNetV2RunnerSecurity and the methods setUp, tearDown, test_nnunetv2runner_good_dataset_name, and test_nnunetv2runner_bad_dataset_name to follow Google-style formatting.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@monai/apps/nnunet/nnunetv2_runner.py`:
- Around line 585-605: The argv builder in the kwargs loop currently always
appends str(_value), causing store_true options to appear as "--flag True"
instead of a bare "--flag" and including false/empty pretrained_weights; update
the loop in nnunetv2_runner.py where cmd is assembled so that for boolean flags
(e.g., keys "c", "val", "use_compressed", "disable_checkpointing" and any other
store_true options) you only append the flag name (prefix + _key) when _value is
True and skip it when False, and for "pretrained_weights" (or "-p") only append
the option and its value when _value is truthy (a non-empty path); keep the
existing prefix selection logic for short vs long options and continue
converting items to strings later.
---
Outside diff comments:
In `@monai/apps/nnunet/nnunetv2_runner.py`:
- Around line 209-218: The current int(self.dataset_name_or_id) cast in
nnUNetV2Runner init prevents valid DatasetXXX strings (e.g., "Dataset003") from
resolving because maybe_convert_to_dataset_name is only called with an int and
the exception swallows the failure leaving self.dataset_name None; update the
logic in the __init__/dataset resolution and in convert_dataset so you first
detect whether self.dataset_name_or_id matches a pure-numeric ID (using
DATASET_ID_FORMAT or a digits-only check) and only cast to int in that case,
otherwise pass the raw string into maybe_convert_to_dataset_name (or call
maybe_convert_to_dataset_name without int conversion); ensure convert_dataset
likewise uses the same conditional casting so run_convert_dataset=False and
conversion paths both resolve DatasetXXX names correctly and do not leave
self.dataset_name as None.
---
Nitpick comments:
In `@tests/integration/test_integration_nnunetv2_runner.py`:
- Line 33: The module-level call
monai.apps.nnunet.nnunetv2_runner.logger.setLevel(logging.ERROR) is changing
global logging state at import time; move this into the specific test's setup
(or a pytest fixture) and restore the original level in teardown (or after
fixture yield). Concretely, capture the previous level via prev =
monai.apps.nnunet.nnunetv2_runner.logger.level, set to logging.ERROR for the
duration of the test (or fixture), and then reset
monai.apps.nnunet.nnunetv2_runner.logger.setLevel(prev) in a finally/teardown
block so the override does not leak to other tests.
- Around line 158-163: The test test_nnunetv2runner_bad_dataset_name currently
only checks that nnUNetV2Runner(input_config=self.inject_yml,
trainer_class_name="nnUNetTrainer") raises ValueError but doesn't assert there
were no side‑effects; update the test to, after the assertRaises block,
explicitly assert that the injection artifact (e.g. the file created by the
payload such as "test.txt" or any files under the same temp dir used by
self.inject_yml) does NOT exist. Locate the test method
test_nnunetv2runner_bad_dataset_name and add a negative assertion referencing
the specific artifact path (or verify the directory referenced by
self.inject_yml is unchanged/empty) to ensure the constructor failed without
creating files. Ensure the extra assertion runs after the context manager that
expects ValueError so the test still fails if the artifact was created.
- Around line 101-166: Add Google-style docstrings to the
TestnnUNetV2RunnerSecurity class and its methods: setUp, tearDown,
test_nnunetv2runner_good_dataset_name, and test_nnunetv2runner_bad_dataset_name.
For each definition include a short description, Args sections for any
parameters (e.g., self), Returns if applicable (use None for setUp/tearDown/test
methods), and Raises for expected exceptions (e.g., ValueError in
test_nnunetv2runner_bad_dataset_name). Update the docstrings in the class
TestnnUNetV2RunnerSecurity and the methods setUp, tearDown,
test_nnunetv2runner_good_dataset_name, and test_nnunetv2runner_bad_dataset_name
to follow Google-style formatting.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 631f3e0e-2273-4606-a386-197be2c23c11
📒 Files selected for processing (2)
monai/apps/nnunet/nnunetv2_runner.pytests/integration/test_integration_nnunetv2_runner.py
| cmd = [ | ||
| "nnUNetv2_train", | ||
| f"{self.dataset_name_or_id}", | ||
| f"{config}", | ||
| f"{fold}", | ||
| self.dataset_name_or_id, | ||
| config, | ||
| fold, | ||
| "-tr", | ||
| f"{self.trainer_class_name}", | ||
| self.trainer_class_name, | ||
| "-num_gpus", | ||
| f"{num_gpus}", | ||
| num_gpus, | ||
| ] | ||
|
|
||
| if self.export_validation_probabilities: | ||
| cmd.append("--npz") | ||
|
|
||
| for _key, _value in kwargs.items(): | ||
| if _key == "p" or _key == "pretrained_weights": | ||
| cmd.extend([f"-{_key}", f"{_value}"]) | ||
| else: | ||
| cmd.extend([f"--{_key}", f"{_value}"]) | ||
| return cmd, env | ||
| prefix = "-" if _key in {"p", "pretrained_weights"} else "--" | ||
| cmd += [f"{prefix}{_key}", str(_value)] | ||
|
|
||
| cmd_str: list[str] = [str(c) for c in cmd] | ||
|
|
||
| return cmd_str, env |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
nnUNetv2_train argparse flags store_true --c --val --use_compressed --disable_checkpointing
💡 Result:
In nnU-Net v2, the command line utility nnUNetv2_train utilizes several argparse flags to control the training and validation process. These flags are implemented as store_true options, meaning they do not require an associated value and serve as toggles [1][2]. The requested flags perform the following functions: --c (Continue training): This flag instructs the trainer to resume training from the most recent checkpoint. It is essential if a training run was interrupted or if you wish to extend a previously stopped training session [2]. --val (Validation only): This flag sets the trainer to skip the training process and instead run only the final validation on the specified fold [2][3]. This is often used after a training run has completed or to re-run validation without re-training [4][3]. --use_compressed: This flag directs the system to use compressed data during training. By default, nnU-Net may decompress data; enabling this flag prevents decompression. It is noted that reading compressed data is significantly more CPU and potentially RAM intensive, and it should only be used when necessary [2]. --disable_checkpointing: This flag disables the saving of model checkpoints during training. This is useful in scenarios where you are testing a configuration or running quick experiments and do not want to consume disk space with multiple large model checkpoint files [2]. In addition to these, the --npz flag is frequently used alongside these commands. Although not explicitly in your list, it is critical to note that --npz is required if you intend to use nnUNetv2_find_best_configuration later, as it forces the model to save the softmax probabilities from the final validation, which are necessary for ensemble creation and model selection [4][3].
Citations:
- 1: nnUNetv2 parser plus MIC-DKFZ/nnUNet#1543
- 2: https://hackmd.io/n2NJtLqVT1iITYIBpzCLXA
- 3: https://github.com/MIC-DKFZ/nnUNet/blob/master/documentation/how-to/train-models.md
- 4: https://github.com/MIC-DKFZ/nnUNet/blob/master/documentation/how_to_use_nnunet.md
Fix argv construction for store_true kwargs (e.g., --c, --val) in monai/apps/nnunet/nnunetv2_runner.py (lines 585-605): the kwargs loop always appends str(_value), so documented boolean flags (c, val, use_compressed, disable_checkpointing) become --c True / --val True instead of being emitted as bare flags (the same pattern as the special-case --npz). Update the builder to emit --<flag> only when the value is True and omit the flag when False (and similarly only include pretrained_weights when it’s a real path, not False).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@monai/apps/nnunet/nnunetv2_runner.py` around lines 585 - 605, The argv
builder in the kwargs loop currently always appends str(_value), causing
store_true options to appear as "--flag True" instead of a bare "--flag" and
including false/empty pretrained_weights; update the loop in nnunetv2_runner.py
where cmd is assembled so that for boolean flags (e.g., keys "c", "val",
"use_compressed", "disable_checkpointing" and any other store_true options) you
only append the flag name (prefix + _key) when _value is True and skip it when
False, and for "pretrained_weights" (or "-p") only append the option and its
value when _value is truthy (a non-empty path); keep the existing prefix
selection logic for short vs long options and continue converting items to
strings later.
Fixes #8886
Types of changes