Add SwinTransformer support for torch_onnx quantization workflow by ajrasane · Pull Request #1235 · NVIDIA/Model-Optimizer

ajrasane · 2026-04-10T22:18:28Z

Summary

Enable end-to-end quantize → ONNX export → TRT engine pipeline for SwinTransformer models (v1 and v2) across FP8, INT8, MXFP8, NVFP4, and auto precision modes
Add Conv2d quantization overrides for TRT compatibility (TRT only supports FP8/INT8 for convolutions)
Fix FP8 LayerNorm type mismatch in TRT stronglyTyped mode by adding LayerNormalization to change_casts_to_fp16
Fix cast_initializer_to_dtype crash when node has no initializer inputs
Add vision model support matrix to README (ViT, Swin, SwinV2)
Rewrite tests: parametrize over (ViT, Swin) × (fp8, int8, mxfp8, nvfp4, auto) with TRT engine build verification

Test plan

python -m pytest tests/examples/torch_onnx/test_torch_quant_to_onnx.py -v — 10 tests (2 models × 5 modes), all pass
Verified Swin accuracy on ImageNet-1k across all precisions (FP8: 81.29%, INT8: 81.12%, MXFP8: 81.32%, NVFP4: 80.79%, Auto: 80.84% TRT top-1 vs 81.37% base)
INT4_AWQ deferred (TODO in test file) — requires INT4 exporter changes for non-MatMul/Gemm consumer patterns

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- ONNX export now supports arbitrary timm vision models (auto device selection) and new CLI options for model selection and kwargs.
- Configurable quantization overrides for Conv2d across quant modes; option to skip pretrained weights.
Bug Fixes
- Broader FP16 cast handling for more ONNX ops to improve low-bit export fidelity.
- Disabled inplace ReLU before auto-quantization to avoid incorrect transforms.
Documentation
- Updated docs with supported models table, quantization mappings, and example CLI usage.
Tests
- Expanded tests to cover multiple architectures, quant modes, and TensorRT build verification.

coderabbitai · 2026-04-10T22:18:41Z

📝 Walkthrough

Walkthrough

Adds timm-model support to ONNX export and quantization flows: new CLI option to export arbitrary timm vision models, model-specific input-shape resolution, Conv2d quantization overrides, expanded tests that build TensorRT engines for exported ONNX files.

Changes

Cohort / File(s)	Summary
ONNX Export CLI `examples/onnx_ptq/download_example_onnx.py`	Added `--timm_model_name` option, device selection, timm model instantiation, resolved input_shape via `timm.data.resolve_model_data_config`, ONNX path selection, and fp16/fp32 weights handling; branch runs alongside existing ViT export.
Quantize & Export Logic `examples/torch_onnx/torch_quant_to_onnx.py`	Added `get_quant_config(quantize_mode)` and `QUANT_CONFIG_DICT` typing; introduced Conv2d override lists for FP8/INT8, apply overrides with warnings, added `_disable_inplace_relu`, tightened `filter_func`, added `--no_pretrained` and `--model_kwargs`, and route standard quantization via `get_quant_config`.
Documentation `examples/torch_onnx/README.md`	Rewrote Vision Models docs and examples to reference `--timm_model_name`, added supported-models table and Conv2d quantization override descriptions; updated example CLI usage.
ONNX utils `modelopt/torch/_deploy/utils/torch_onnx.py`	Extended `change_casts_to_fp16` target consumer ops to also include `LayerNormalization`, `Clip`, `Mul`, and `Exp` alongside existing ops.
Tests & Test Helpers `tests/_test_utils/torch/vision_models.py`, `tests/examples/torch_onnx/test_torch_quant_to_onnx.py`	Dummy input creation now uses `timm.data.resolve_model_data_config(...)[\"input_size\"]`; added `swin_tiny_patch4_window7_224` to benchmarks; reworked parametrized test to cover multiple timm models and quant modes, added `_verify_trt_engine_build()` and TensorRT build assertions; tests pass `--no_pretrained`.

Sequence Diagram(s)

sequenceDiagram
    participant CLI as Export CLI
    participant Timm as Timm
    participant Model as Model (timm)
    participant Exporter as Export/Quantize Logic
    participant ONNX as ONNX File

    CLI->>Timm: create_model(timm_model_name, pretrained=...)
    Timm-->>CLI: model
    CLI->>Timm: resolve_model_data_config(model)
    Timm-->>CLI: input_size
    CLI->>Exporter: export_to_onnx(model, input_shape, weights_dtype...)
    Exporter->>Exporter: apply quant config / Conv2d overrides
    Exporter->>ONNX: write ONNX file
    ONNX-->>CLI: onnx_save_path

sequenceDiagram
    participant Test as Test Suite
    participant Exporter as Export Script
    participant ONNX as ONNX File
    participant TRT as trtexec
    participant Result as Build Result

    Test->>Exporter: run export (model_key, quantize_mode, --no_pretrained, ...)
    Exporter->>ONNX: save quantized ONNX
    ONNX-->>Test: onnx_save_path
    Test->>TRT: trtexec(onnx_save_path, --builderOptimizationLevel=...)
    TRT->>TRT: build engine
    TRT-->>Test: return code / success
    Test->>Result: assert build success

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 1 warning)

Check name	Status	Explanation	Resolution
Security Anti-Patterns	❌ Error	json.loads(args.model_kwargs) in torch_quant_to_onnx.py lacks error handling and input validation, allowing unsanitized command-line arguments to be parsed without JSONDecodeError protection or dictionary key validation.	Add try-except around json.loads to handle JSONDecodeError, validate parsed dictionary keys against a whitelist of acceptable timm model kwargs, and document expected JSON format in argument help text.
Docstring Coverage	⚠️ Warning	Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly and accurately describes the main change: adding SwinTransformer support to the torch_onnx quantization workflow, which is the primary objective of this PR.

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch ajrasane/pytorch_quantization

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-04-10T22:22:37Z

PR Preview Action v1.8.1
🚀 View preview at https://NVIDIA.github.io/Model-Optimizer/pr-preview/pr-1235/
Built to branch `gh-pages` at 2026-04-10 23:33 UTC. Preview will be ready when the GitHub Pages deployment is complete.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/onnx_ptq/download_example_onnx.py`:
- Around line 53-58: The flag --timm_model_name is set with a non-None default
so the check if args.timm_model_name always succeeds; change the
parser.add_argument for "timm_model_name" to use default=None and then make the
model selection checks mutually exclusive (replace the separate if for
args.timm_model_name with an elif in the same chain that handles args.vit,
args.swin, etc.) so only one export path runs; update references to
args.timm_model_name and ensure help text still documents the expected default
behavior.

In `@tests/examples/torch_onnx/test_torch_quant_to_onnx.py`:
- Line 52: Remove the prohibited "# nosec" bypass on the subprocess call in
tests/examples/torch_onnx/test_torch_quant_to_onnx.py: locate the
subprocess.run(...) invocation (the call using variables cmd,
capture_output=True, text=True, timeout=600) and simply delete the "# nosec"
comment; ensure the call remains a list-based invocation (no shell=True and no
external user input) so Bandit flags are not bypassed—if Bandit still flags it,
escalate to `@NVIDIA/modelopt-setup-codeowners` or adjust the project Bandit
config instead of adding "# nosec".

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7a34ea4a-81c7-4f70-b2c8-b9d65a485dd1

📥 Commits

Reviewing files that changed from the base of the PR and between da0e8ff and 6455295.

📒 Files selected for processing (7)

examples/onnx_ptq/download_example_onnx.py
examples/torch_onnx/README.md
examples/torch_onnx/torch_quant_to_onnx.py
modelopt/onnx/quantization/qdq_utils.py
modelopt/torch/_deploy/utils/torch_onnx.py
tests/_test_utils/torch/vision_models.py
tests/examples/torch_onnx/test_torch_quant_to_onnx.py

coderabbitai · 2026-04-10T22:22:38Z

examples/onnx_ptq/download_example_onnx.py

+    parser.add_argument(
+        "--timm_model_name",
+        type=str,
+        default="vit_base_patch16_224",
+        help="Export any timm model to ONNX (e.g., swin_tiny_patch4_window7_224).",
+    )


⚠️ Potential issue | 🟠 Major

Logic issue: --timm_model_name always evaluates to truthy due to default value.

Since --timm_model_name has default="vit_base_patch16_224", the condition if args.timm_model_name: on line 99 is always True. This causes unintended behavior:

Running python download_example_onnx.py --vit exports the ViT model twice (once via --vit block, once via --timm_model_name block).

Running with no model flags still triggers the --timm_model_name block.

🐛 Proposed fix: Change default to None and use elif

parser.add_argument( "--timm_model_name", type=str, - default="vit_base_patch16_224", + default=None, help="Export any timm model to ONNX (e.g., swin_tiny_patch4_window7_224).", )

- if args.timm_model_name: + elif args.timm_model_name: device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

Also applies to: 99-116

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@examples/onnx_ptq/download_example_onnx.py` around lines 53 - 58, The flag --timm_model_name is set with a non-None default so the check if args.timm_model_name always succeeds; change the parser.add_argument for "timm_model_name" to use default=None and then make the model selection checks mutually exclusive (replace the separate if for args.timm_model_name with an elif in the same chain that handles args.vit, args.swin, etc.) so only one export path runs; update references to args.timm_model_name and ensure help text still documents the expected default behavior.

coderabbitai · 2026-04-10T22:22:38Z

tests/examples/torch_onnx/test_torch_quant_to_onnx.py

+        f"--builderOptimizationLevel={opt_level}",
+    ]
+
+    result = subprocess.run(cmd, capture_output=True, text=True, timeout=600)  # nosec


⚠️ Potential issue | 🔴 Critical

CRITICAL: # nosec comment is prohibited by coding guidelines.

The # nosec comment to bypass Bandit security checks is explicitly prohibited. Per coding guidelines:

"Any use of '# nosec' comments to bypass Bandit security checks is not allowed. If a security-sensitive pattern is genuinely necessary, the PR must be reviewed and approved by @NVIDIA/modelopt-setup-codeowners with an explicit justification in the PR description."

The subprocess.run() call here appears safe (no shell=True, arguments passed as a list, no user-supplied input), but the bypass mechanism itself is not allowed.

🔒 Proposed fix: Remove the nosec comment

The subprocess call is safe as-is since:

Arguments are passed as a list (not shell string)

shell=True is not used

All arguments are controlled by the test, not external input

Simply remove the # nosec comment:

- result = subprocess.run(cmd, capture_output=True, text=True, timeout=600) # nosec + result = subprocess.run(cmd, capture_output=True, text=True, timeout=600)

If Bandit still flags this, consider using a more targeted exclusion in the Bandit config file or requesting formal review from @NVIDIA/modelopt-setup-codeowners.

As per coding guidelines: "Prohibit the use of '# nosec' comments to bypass Bandit security checks."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

result = subprocess.run(cmd, capture_output=True, text=True, timeout=600) # nosec

result = subprocess.run(cmd, capture_output=True, text=True, timeout=600)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/examples/torch_onnx/test_torch_quant_to_onnx.py` at line 52, Remove the prohibited "# nosec" bypass on the subprocess call in tests/examples/torch_onnx/test_torch_quant_to_onnx.py: locate the subprocess.run(...) invocation (the call using variables cmd, capture_output=True, text=True, timeout=600) and simply delete the "# nosec" comment; ensure the call remains a list-based invocation (no shell=True and no external user input) so Bandit flags are not bypassed—if Bandit still flags it, escalate to `@NVIDIA/modelopt-setup-codeowners` or adjust the project Bandit config instead of adding "# nosec".

codecov · 2026-04-10T22:31:38Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.43%. Comparing base (da0e8ff) to head (15f3809).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1235      +/-   ##
==========================================
+ Coverage   76.04%   77.43%   +1.39%     
==========================================
  Files         350      350              
  Lines       40478    40478              
==========================================
+ Hits        30781    31344     +563     
+ Misses       9697     9134     -563

Flag	Coverage Δ
examples	`43.76% <100.00%> (+2.39%)`	⬆️
gpu	`57.42% <0.00%> (-0.10%)`	⬇️
unit	`55.53% <0.00%> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

examples/torch_onnx/torch_quant_to_onnx.py (1)

351-357: ⚠️ Potential issue | 🟠 Major

--no_pretrained / --model_kwargs are not propagated into calibration-data model setup

Line 351 and Line 373 call load_calibration_data(...), but that helper still builds its own model with pretrained=True and fixed kwargs (see examples/torch_onnx/torch_quant_to_onnx.py Line 129). This bypasses the new CLI behavior and can force unexpected weight downloads or mismatched data config for custom model kwargs.

💡 Suggested fix

-def load_calibration_data(model_name, data_size, batch_size, device, with_labels=False):
+def load_calibration_data(model, data_size, batch_size, device, with_labels=False):
@@
-    model = timm.create_model(model_name, pretrained=True, num_classes=1000)
     data_config = timm.data.resolve_model_data_config(model)
@@
-        data_loader = load_calibration_data(
-            args.timm_model_name,
+        data_loader = load_calibration_data(
+            model,
             args.calibration_data_size,
             args.batch_size,
             device,
             with_labels=True,
         )
@@
-            data_loader = load_calibration_data(
-                args.timm_model_name,
+            data_loader = load_calibration_data(
+                model,
                 args.calibration_data_size,
                 args.batch_size,
                 device,
                 with_labels=False,
             )

Also applies to: 373-379

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@examples/torch_onnx/torch_quant_to_onnx.py` around lines 351 - 357, Calls to
load_calibration_data are building their own model with hardcoded
pretrained=True and fixed kwargs; update the two callers to pass through the CLI
options (e.g., args.no_pretrained and args.model_kwargs) and change
load_calibration_data's signature to accept these parameters (e.g.,
pretrained_override or no_pretrained and model_kwargs) and use them when
constructing the timm model (set pretrained = not no_pretrained and pass
model_kwargs into the model creation path instead of fixed kwargs). Ensure both
places that call load_calibration_data (the lines around data_loader =
load_calibration_data(...) at start and the later call) are updated to forward
the flags so calibration uses the same model configuration as the rest of the
script.

♻️ Duplicate comments (1)

tests/examples/torch_onnx/test_torch_quant_to_onnx.py (1)
57-57: ⚠️ Potential issue | 🔴 Critical

Remove prohibited # nosec suppression

The subprocess invocation itself is fine (list args, no shell=True), but # nosec is not allowed in this repo and must be removed.
🔧 Minimal fix
-    result = subprocess.run(cmd, capture_output=True, text=True, timeout=600)  # nosec
+    result = subprocess.run(cmd, capture_output=True, text=True, timeout=600)
As per coding guidelines, "Any use of '# nosec' comments to bypass Bandit security checks is not allowed."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/examples/torch_onnx/test_torch_quant_to_onnx.py` at line 57, Remove the
prohibited "# nosec" suppression from the subprocess.run call in the test (the
line assigning to result via subprocess.run(cmd, capture_output=True, text=True,
timeout=600)) — leave the invocation intact (list args, no shell=True) but
delete the trailing "  # nosec" comment so Bandit checks will run normally;
verify tests still pass after removal.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/torch_onnx/README.md`:
- Around line 305-309: Update the support matrix row for the model identifier
"swinv2_tiny_window8_256" in README.md to remove the Auto ✅ (change to ❌) or add
a footnote explaining it's currently unsupported; reference the test that skips
this combo (test_torch_quant_to_onnx.py) as the reason for the change so readers
know the limitation is intentional.

---

Outside diff comments:
In `@examples/torch_onnx/torch_quant_to_onnx.py`:
- Around line 351-357: Calls to load_calibration_data are building their own
model with hardcoded pretrained=True and fixed kwargs; update the two callers to
pass through the CLI options (e.g., args.no_pretrained and args.model_kwargs)
and change load_calibration_data's signature to accept these parameters (e.g.,
pretrained_override or no_pretrained and model_kwargs) and use them when
constructing the timm model (set pretrained = not no_pretrained and pass
model_kwargs into the model creation path instead of fixed kwargs). Ensure both
places that call load_calibration_data (the lines around data_loader =
load_calibration_data(...) at start and the later call) are updated to forward
the flags so calibration uses the same model configuration as the rest of the
script.

---

Duplicate comments:
In `@tests/examples/torch_onnx/test_torch_quant_to_onnx.py`:
- Line 57: Remove the prohibited "# nosec" suppression from the subprocess.run
call in the test (the line assigning to result via subprocess.run(cmd,
capture_output=True, text=True, timeout=600)) — leave the invocation intact
(list args, no shell=True) but delete the trailing "  # nosec" comment so Bandit
checks will run normally; verify tests still pass after removal.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 95b4fbb8-da54-4e92-8ebf-74879aa8e76f

📥 Commits

Reviewing files that changed from the base of the PR and between 6455295 and 9793444.

📒 Files selected for processing (6)

examples/onnx_ptq/download_example_onnx.py
examples/torch_onnx/README.md
examples/torch_onnx/torch_quant_to_onnx.py
modelopt/torch/_deploy/utils/torch_onnx.py
tests/_test_utils/torch/vision_models.py
tests/examples/torch_onnx/test_torch_quant_to_onnx.py

🚧 Files skipped from review as they are similar to previous changes (2)

tests/_test_utils/torch/vision_models.py
modelopt/torch/_deploy/utils/torch_onnx.py

coderabbitai · 2026-04-10T23:25:37Z

examples/torch_onnx/README.md

+| Model | FP8 | INT8 | MXFP8 | NVFP4 | INT4_AWQ | Auto |
+| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
+| [vit_base_patch16_224](https://huggingface.co/timm/vit_base_patch16_224.augreg_in21k_ft_in1k) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [swin_tiny_patch4_window7_224](https://huggingface.co/timm/swin_tiny_patch4_window7_224.ms_in1k) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [swinv2_tiny_window8_256](https://huggingface.co/timm/swinv2_tiny_window8_256.ms_in1k) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |


⚠️ Potential issue | 🟠 Major

Support matrix overstates swinv2_tiny Auto support

Line 309 marks Auto as ✅ for swinv2_tiny_window8_256, but tests explicitly skip that combo (tests/examples/torch_onnx/test_torch_quant_to_onnx.py Line 35). Please mark it unsupported (or add a footnote with the current limitation).

📝 Suggested docs correction

-| [swinv2_tiny_window8_256](https://huggingface.co/timm/swinv2_tiny_window8_256.ms_in1k) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | +| [swinv2_tiny_window8_256](https://huggingface.co/timm/swinv2_tiny_window8_256.ms_in1k) | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

| Model | FP8 | INT8 | MXFP8 | NVFP4 | INT4_AWQ | Auto |

| :---: | :---: | :---: | :---: | :---: | :---: | :---: |

| [vit_base_patch16_224](https://huggingface.co/timm/vit_base_patch16_224.augreg_in21k_ft_in1k) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |

| [swin_tiny_patch4_window7_224](https://huggingface.co/timm/swin_tiny_patch4_window7_224.ms_in1k) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |

| [swinv2_tiny_window8_256](https://huggingface.co/timm/swinv2_tiny_window8_256.ms_in1k) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |

| Model | FP8 | INT8 | MXFP8 | NVFP4 | INT4_AWQ | Auto |

| :---: | :---: | :---: | :---: | :---: | :---: | :---: |

| [vit_base_patch16_224](https://huggingface.co/timm/vit_base_patch16_224.augreg_in21k_ft_in1k) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |

| [swin_tiny_patch4_window7_224](https://huggingface.co/timm/swin_tiny_patch4_window7_224.ms_in1k) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |

| [swinv2_tiny_window8_256](https://huggingface.co/timm/swinv2_tiny_window8_256.ms_in1k) | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@examples/torch_onnx/README.md` around lines 305 - 309, Update the support matrix row for the model identifier "swinv2_tiny_window8_256" in README.md to remove the Auto ✅ (change to ❌) or add a footnote explaining it's currently unsupported; reference the test that skips this combo (test_torch_quant_to_onnx.py) as the reason for the change so readers know the limitation is intentional.

Enable end-to-end quantize-export-TRT pipeline for SwinTransformer models (v1 and v2) across FP8, INT8, MXFP8, NVFP4, and auto precision modes. Core fixes: - Add LayerNormalization, Clip, Mul, Exp to change_casts_to_fp16 for FP8 stronglyTyped compatibility (fixes type mismatches in Swin/SwinV2 TRT builds) Example/test changes: - Add Conv2d quantization overrides for TRT compatibility (MXFP8/NVFP4->FP8, INT4_AWQ->INT8) since TRT only supports FP8/INT8 for convolutions - Add cpb_mlp and downsample to quantization filter exclusion list - Add --no_pretrained and --model_kwargs CLI args for testing with tiny models - Add --timm_model_name to download_example_onnx.py (default: ViT) - Add SwinTransformer to vision_models.py with dynamic input size resolution - Rewrite tests: parametrize over (ViT, Swin, SwinV2) x (fp8, int8, mxfp8, nvfp4, auto) with TRT engine build verification using --stronglyTyped - Update README with vision model support matrix and Conv2d override docs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>

coderabbitai

♻️ Duplicate comments (2)

tests/examples/torch_onnx/test_torch_quant_to_onnx.py (1)

53-53: ⚠️ Potential issue | 🔴 Critical

Remove the # nosec bypass.

This subprocess.run(...) call is already using a list of args and no shell=True, so the bypass is unnecessary and violates repo policy.
Suggested fix
-    result = subprocess.run(cmd, capture_output=True, text=True, timeout=600)  # nosec
+    result = subprocess.run(cmd, capture_output=True, text=True, timeout=600)
As per coding guidelines, "Any use of '# nosec' comments to bypass Bandit security checks is not allowed."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/examples/torch_onnx/test_torch_quant_to_onnx.py` at line 53, The
subprocess.run call assigning to result (result = subprocess.run(cmd,
capture_output=True, text=True, timeout=600)) includes an unnecessary "# nosec"
suppression; remove the "# nosec" comment and leave the call as-is (ensure cmd
remains a list and no shell=True is used) so Bandit checks are not bypassed
while preserving capture_output, text, and timeout parameters.

examples/onnx_ptq/download_example_onnx.py (1)

53-58: ⚠️ Potential issue | 🟠 Major

Make --timm_model_name opt-in and mutually exclusive.

With default="vit_base_patch16_224", Line 99 is always truthy. That means --vit exports twice, and --llama or even “no flags” still export a timm model.

Suggested fix

     parser.add_argument(
         "--timm_model_name",
         type=str,
-        default="vit_base_patch16_224",
+        default=None,
         help="Export any timm model to ONNX (e.g., swin_tiny_patch4_window7_224).",
     )

-    if args.vit:
+    if args.vit:
         device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
         model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=1000).to(
             device
         )
         ...
-    if args.timm_model_name:
+    elif args.timm_model_name:
         device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
         model = timm.create_model(args.timm_model_name, pretrained=True, num_classes=1000).to(
             device
         )
         ...
-    if args.llama:
+    elif args.llama:
         ...

Also applies to: 99-116

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@examples/onnx_ptq/download_example_onnx.py` around lines 53 - 58, The parser
currently always exports a timm model because
parser.add_argument("--timm_model_name", default="vit_base_patch16_224") is
truthy; change this to be opt-in by removing the default (use default=None or
action that requires the flag) and validate presence before exporting in the
export flow (e.g., check timm_model_name is not None). Also create an argparse
mutually exclusive group (parser.add_mutually_exclusive_group) and add the timm
option alongside the existing flags like --vit and --llama so only one export
path is allowed; update any checks that assume timm_model_name is always set
(export logic around the timm export function) to rely on the new exclusivity
and None-check.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@examples/onnx_ptq/download_example_onnx.py`:
- Around line 53-58: The parser currently always exports a timm model because
parser.add_argument("--timm_model_name", default="vit_base_patch16_224") is
truthy; change this to be opt-in by removing the default (use default=None or
action that requires the flag) and validate presence before exporting in the
export flow (e.g., check timm_model_name is not None). Also create an argparse
mutually exclusive group (parser.add_mutually_exclusive_group) and add the timm
option alongside the existing flags like --vit and --llama so only one export
path is allowed; update any checks that assume timm_model_name is always set
(export logic around the timm export function) to rely on the new exclusivity
and None-check.

In `@tests/examples/torch_onnx/test_torch_quant_to_onnx.py`:
- Line 53: The subprocess.run call assigning to result (result =
subprocess.run(cmd, capture_output=True, text=True, timeout=600)) includes an
unnecessary "# nosec" suppression; remove the "# nosec" comment and leave the
call as-is (ensure cmd remains a list and no shell=True is used) so Bandit
checks are not bypassed while preserving capture_output, text, and timeout
parameters.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: cd3f4d08-7ded-4c8d-b84e-dec1aa75f848

📥 Commits

Reviewing files that changed from the base of the PR and between 9793444 and 15f3809.

📒 Files selected for processing (6)

examples/onnx_ptq/download_example_onnx.py
examples/torch_onnx/README.md
examples/torch_onnx/torch_quant_to_onnx.py
modelopt/torch/_deploy/utils/torch_onnx.py
tests/_test_utils/torch/vision_models.py
tests/examples/torch_onnx/test_torch_quant_to_onnx.py

✅ Files skipped from review due to trivial changes (1)

examples/torch_onnx/README.md

🚧 Files skipped from review as they are similar to previous changes (3)

tests/_test_utils/torch/vision_models.py
modelopt/torch/_deploy/utils/torch_onnx.py
examples/torch_onnx/torch_quant_to_onnx.py

ajrasane requested review from a team as code owners April 10, 2026 22:18

ajrasane requested review from cjluo-nv and galagam April 10, 2026 22:18

coderabbitai bot reviewed Apr 10, 2026

View reviewed changes

ajrasane force-pushed the ajrasane/pytorch_quantization branch from 6455295 to 9793444 Compare April 10, 2026 23:20

coderabbitai bot reviewed Apr 10, 2026

View reviewed changes

ajrasane force-pushed the ajrasane/pytorch_quantization branch from 9793444 to 15f3809 Compare April 10, 2026 23:29

coderabbitai bot reviewed Apr 10, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add SwinTransformer support for torch_onnx quantization workflow#1235

Add SwinTransformer support for torch_onnx quantization workflow#1235
ajrasane wants to merge 1 commit intomainfrom
ajrasane/pytorch_quantization

ajrasane commented Apr 10, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Apr 10, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Pre-merge checks failed

Uh oh!

github-actions bot commented Apr 10, 2026 •

edited

Loading

Built to branch `gh-pages` at 2026-04-10 23:33 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Apr 10, 2026

Uh oh!

coderabbitai bot Apr 10, 2026

Uh oh!

codecov bot commented Apr 10, 2026 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Apr 10, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	result = subprocess.run(cmd, capture_output=True, text=True, timeout=600) # nosec
	result = subprocess.run(cmd, capture_output=True, text=True, timeout=600)

Conversation

ajrasane commented Apr 10, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Pre-merge checks failed

❌ Failed checks (1 error, 1 warning)

Uh oh!

github-actions bot commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Built to branch gh-pages at 2026-04-10 23:33 UTC. Preview will be ready when the GitHub Pages deployment is complete.

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ajrasane commented Apr 10, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Apr 10, 2026 •

edited

Loading

github-actions bot commented Apr 10, 2026 •

edited

Loading

Built to branch `gh-pages` at 2026-04-10 23:33 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

codecov bot commented Apr 10, 2026 •

edited

Loading