
[OMNIML-4922] Four over Six PTQ & Updating Nemotron Ultra Example #1684

Open
jenchen13 wants to merge 4 commits into main from jennifchen/omniml-4922-four-over-six


@jenchen13 jenchen13 commented Jun 11, 2026

Contributor

What does this PR do?

Implements Four Over Six (4/6) PTQ for weight-only quantization. Four Over Six was used to produce the Nemotron 3 Ultra NVFP4 checkpoint. This PR also updates the Ultra PTQ example in the launcher to use the new 4/6 config, huggingface/nvidia/Nemotron-3-Ultra-550B-A55B/ptq/ultra-nvfp4-46-max.

Usage

uv run launch.py --yaml examples/nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16/megatron_lm_ptq.yaml --yes

Testing

  • Unit tests pass
  • [ ] Run launcher example

Before your PR is "Ready for review"

Make sure you read and follow the Contributor guidelines and that your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).

  • Is this change backward compatible?: ✅ / ❌ / N/A
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: ✅ / ❌ / N/A
  • Did you write any new necessary tests?: ✅ / ❌ / N/A
  • Did you update Changelog?: ✅ / ❌ / N/A
  • Did you get Claude approval on this PR?: ✅ / ❌ / N/A

Additional Information

Summary by CodeRabbit

  • New Features

    • Added NVFP4 Four‑Over‑Six (4/6) adaptive per‑block weight scaling and a configurable FP8 normalization option for FP4/FP8 quantization.
  • Documentation

    • Added PTQ recipes/presets and updated config docs to document FP8 max variants and the Four‑Over‑Six option.
  • Tests

    • Added unit and GPU tests validating 4/6 selection, normalization threading, scaling behavior, and reconstruction error checks.
  • Chores

    • Updated a PTQ pipeline example to use the NVFP4‑46‑max recipe and bumped a container image; minor project config tweak.

@jenchen13 jenchen13 requested review from a team as code owners June 11, 2026 16:48
@copy-pr-bot

copy-pr-bot Bot commented Jun 11, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds Four-Over-Six (4/6) adaptive per-block NVFP4 weight scaling: parameterizes FP8 normalization, implements M=4 vs M=6 per-block candidate generation and MSE-based selection, threads flags through quantizers and backend, provides presets/recipes, and adds unit tests.
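The per-block selection described in the walkthrough can be sketched in plain Python. This is a minimal illustration and not the PR's implementation: it assumes each candidate scale is simply `amax / M`, skips the FP8 cast of the per-block scale, and uses hypothetical helper names (`fake_quant_e2m1`, `select_four_over_six`).

```python
import math

# Magnitudes representable in FP4 E2M1 (the sign is handled separately).
E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def fake_quant_e2m1(block, scale):
    """Round each value to the nearest E2M1 grid point under one shared scale, then dequantize."""
    if scale == 0.0:
        return [0.0] * len(block)
    out = []
    for x in block:
        target = abs(x) / scale
        mag = min(E2M1_GRID, key=lambda g: abs(target - g))
        out.append(math.copysign(mag * scale, x))
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def select_four_over_six(block):
    """Return (M, scale, error) for whichever of M=6 or M=4 reconstructs the block better."""
    amax = max(abs(x) for x in block)
    best = None
    for m in (6.0, 4.0):  # M=6 is tried first, so it wins ties (an assumption of this sketch)
        scale = amax / m  # assumed candidate: map the block maximum onto the grid value M
        err = mse(block, fake_quant_e2m1(block, scale))
        if best is None or err < best[2]:
            best = (m, scale, err)
    return best


if __name__ == "__main__":
    print(select_four_over_six([4.0, 3.0, 2.0, 1.0]))
    print(select_four_over_six([6.0, 3.0, 1.5, 0.5]))
```

In this sketch a block such as `[4.0, 3.0, 2.0, 1.0]` is reconstructed exactly with M=4 but not with M=6, which is the kind of case the adaptive choice is meant to catch.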

Changes

NVFP4 Four-Over-Six Adaptive Per-Block Scaling

| Layer | File(s) | Summary |
| --- | --- | --- |
| FP8 kernel constants and parameter plumbing | modelopt/torch/kernels/quantization/gemm/fp4_kernel.py, modelopt/torch/kernels/quantization/gemm/fp4_kernel_hopper.py, modelopt/torch/quantization/tensor_quant.py | Add FP8_E4M3_MAX = 448.0, add fp8_max_for_normalization parameter to FP4 kernels and the StaticBlockwiseFP4FakeQuantFunction, document it, and thread it into per-block/global normalization computations. |
| NVFP4QTensor constants and adaptive scale selection | modelopt/torch/quantization/qtensor/nvfp4_tensor.py | Add FP4/FP8 normalization constants and 4/6 variants; add _is_four_over_six; compute weights_scaling_factor_2 with mode-specific FP8 max; implement M=4/M=6 candidate generation, FP8→E2M1 fake-quant round-trip, per-block MSE scoring, and select best per-block scales. |
| Quantizer integration & NVFP4-GEMM backend wiring | modelopt/torch/quantization/nn/modules/tensor_quantizer.py, modelopt/torch/quantization/backends/nvfp4_gemm.py, modelopt/torch/export/layer_utils.py | Thread four_over_six and computed weights_scaling_factor_2 into NVFP4QTensor.quantize; compute/passthrough fp8_max_for_normalization in static fake-quant; select weight_fp8_max (256 vs 448) in backend and restrict NVFP4 GEMM availability when 4/6 is enabled for non-prequantized weights; update docstring wording. |
| Configuration validation, presets and recipes | modelopt/torch/quantization/config.py, modelopt_recipes/configs/numerics/nvfp4_four_over_six.yaml, modelopt_recipes/configs/ptq/presets/model/nvfp4_four_over_six.yaml, modelopt_recipes/configs/ptq/units/w4a4_nvfp4_nvfp4_four_over_six.yaml, modelopt_recipes/huggingface/.../nvfp4-46-max.yaml, tools/launcher/examples/.../megatron_lm_ptq.yaml, pyproject.toml | Allow "four_over_six" key in block_sizes validation; register NVFP4_FOUR_OVER_SIX_CFG; add numerics/unit/preset YAMLs and a Hugging Face recipe for nvfp4-46-max; update example launcher references and add MyPy Python version setting. |
| Unit tests for Four-Over-Six | tests/unit/torch/quantization/*, tests/gpu/torch/quantization/test_nvfp4_fp8_sweep_kernel.py | Add CPU and GPU tests validating FP8/FP4 constants, _is_four_over_six behavior, get_weights_scaling_factor_2 denominator change (448 vs 256), _select_four_over_six_scale correctness and MSE checks, per-block scale sanity, threading of fp8 normalization value, and preset wiring. |
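The 448 vs 256 denominator change mentioned in the summaries can be illustrated with a short arithmetic sketch. It assumes the global scale has the form `global_amax / (6 * fp8_max)` and that the per-block scale is expressed in units of that global scale. The constant name `FP8_MAX_FOUR_OVER_SIX` and both function names are hypothetical, and the headroom argument is one plausible reading rather than the PR's stated rationale.

```python
E2M1_MAX = 6.0                 # largest FP4 E2M1 magnitude
FP8_E4M3_MAX = 448.0           # value named in the summary above
FP8_MAX_FOUR_OVER_SIX = 256.0  # hypothetical name for the 4/6 normalization max


def global_scale(global_amax, four_over_six=False):
    """Assumed form of the global (second-level) scale: amax / (6 * fp8_max)."""
    fp8_max = FP8_MAX_FOUR_OVER_SIX if four_over_six else FP8_E4M3_MAX
    return global_amax / (E2M1_MAX * fp8_max)


def normalized_block_scale(block_amax, m, gscale):
    """Per-block scale (block_amax / M) expressed in units of the global scale."""
    return block_amax / m / gscale


if __name__ == "__main__":
    amax = 2688.0  # worst case: the block holding the global maximum
    default = global_scale(amax)
    four_six = global_scale(amax, four_over_six=True)
    # With the default normalization, an M=4 candidate would exceed the E4M3 range.
    print(normalized_block_scale(amax, 6.0, default))
    print(normalized_block_scale(amax, 4.0, default))
    # With the 256 normalization, both candidates stay within 448.
    print(normalized_block_scale(amax, 6.0, four_six))
    print(normalized_block_scale(amax, 4.0, four_six))
```

Under these assumptions, the smaller normalization max leaves room for the M=4 candidate, whose per-block scale is 1.5 times the M=6 one.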

Sequence Diagram

sequenceDiagram
  participant Quantizer
  participant NVFP4QTensor
  participant _select_four_over_six_scale
  participant triton_kernel
  Quantizer->>NVFP4QTensor: quantize(four_over_six flag)
  NVFP4QTensor->>_select_four_over_six_scale: per-block amax, generate M4/M6 candidates
  _select_four_over_six_scale->>triton_kernel: fake-quant candidates (FP8→E2M1) round-trip
  triton_kernel-->>_select_four_over_six_scale: quantized reconstructions
  _select_four_over_six_scale-->>NVFP4QTensor: chosen per-block scales (lower MSE)
  NVFP4QTensor->>triton_kernel: static_blockwise_fp4_fake_quant(fp8_max_for_normalization)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • NVIDIA/Model-Optimizer#849: Related refactor introducing NVFP4StaticQuantizer and MSE calibration logic that this PR builds upon.

Suggested labels

cherry-pick-0.45.0

Suggested reviewers

  • meenchen
  • Edwardf0t1
  • kevalmorabia97
  • realAsma
🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 55.93% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly describes the main changes: implementing Four-over-Six (4/6) PTQ method and updating the Nemotron Ultra example to use the new config. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Security Anti-Patterns | ✅ Passed | Scanned all PR-changed Python files for torch.load(weights_only=False), numpy.load(allow_pickle=True), trust_remote_code=True, eval/exec, and # nosec—none found; pyproject change (db5497e) only add... |

@jenchen13 jenchen13 requested a review from realAsma June 11, 2026 16:50
@github-actions

github-actions Bot commented Jun 11, 2026

Contributor
PR Preview Action v1.8.1

🚀 View preview at
https://NVIDIA.github.io/Model-Optimizer/pr-preview/pr-1684/

Built to branch gh-pages at 2026-06-12 19:47 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

@coderabbitai coderabbitai Bot left a comment

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@modelopt_recipes/configs/ptq/units/w4a4_nvfp4_nvfp4_four_over_six.yaml`:
- Around line 16-18: The comment wrongly states that both weight and input
quantizers are dynamic NVFP4; update it to reflect that the imported weight
config uses the static four_over_six path (i.e. static weights with
Four-Over-Six scaling) while only the input/activation quantizers remain dynamic
NVFP4. Locate the QuantizerCfgList and the reference to the four_over_six weight
config (the w4a4_nvfp4_nvfp4_four_over_six unit) and change the wording to
“static weights (four_over_six) + dynamic NVFP4 inputs/activations” or
equivalent concise phrasing.

In `@tests/unit/torch/quantization/test_nvfp4_four_over_six.py`:
- Around line 168-169: The test currently imports choices inside the test body;
move the import statement "from modelopt.torch.quantization.config import
choices" to the module top-level so import errors surface at collection time and
follow project import guidelines, or if there is a true
circular/optional/heavy-import reason, keep it inline but add a concise comment
justifying the in-function import and mark the test appropriately (e.g., a skip
or note) so reviewers understand the exception.
- Around line 116-121: The selected-scale re-cast uses the default FP8
normalization max; change the second call so sel_scale is cast with the
four-over-six normalization max (256) to match the validation path. Update the
call to NVFP4QTensor._fake_quant_to_e2m1(...,
_cast_per_block_scale_to_fp8(sel_scale, 256).float(), ...) (or pass the project
constant for the 4/6 max) so sel_scale uses the 4/6 normalization like m6_scale
and BLOCK_SIZE remains unchanged.

In
`@tools/launcher/examples/nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16/megatron_lm_ptq.yaml`:
- Line 32: Update the QUANT_CFG and export tag values to the new 4/6 preset
names: replace any occurrence of QUANT_CFG value
"huggingface/nvidia/Nemotron-3-Ultra-550B-A55B/ptq/ultra-nvfp4-max-calib" with
"huggingface/nvidia/Nemotron-3-Ultra-550B-A55B/ptq/ultra-nvfp4-46-max", and
replace export tag usages of "super-nvfp4-max-calib" with "super-nvfp4-46-max"
so task_2 artifact lookup path derived from QUANT_CFG matches; search for the
QUANT_CFG key and the literal export tag "super-nvfp4-max-calib" in the YAML
(lines around the existing QUANT_CFG, lines ~55, ~76-77, ~83) and update them
consistently.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 922a3994-8d5b-4c0d-81b4-8d59a2509a7c

📥 Commits

Reviewing files that changed from the base of the PR and between c88b62b and cdc160a.

📒 Files selected for processing (14)
  • modelopt/torch/export/layer_utils.py
  • modelopt/torch/kernels/quantization/gemm/fp4_kernel.py
  • modelopt/torch/kernels/quantization/gemm/fp4_kernel_hopper.py
  • modelopt/torch/quantization/backends/nvfp4_gemm.py
  • modelopt/torch/quantization/config.py
  • modelopt/torch/quantization/nn/modules/tensor_quantizer.py
  • modelopt/torch/quantization/qtensor/nvfp4_tensor.py
  • modelopt/torch/quantization/tensor_quant.py
  • modelopt_recipes/configs/numerics/nvfp4_four_over_six.yaml
  • modelopt_recipes/configs/ptq/presets/model/nvfp4_four_over_six.yaml
  • modelopt_recipes/configs/ptq/units/w4a4_nvfp4_nvfp4_four_over_six.yaml
  • modelopt_recipes/huggingface/nvidia/Nemotron-3-Ultra-550B-A55B/ptq/ultra-nvfp4-46-max.yaml
  • tests/unit/torch/quantization/test_nvfp4_four_over_six.py
  • tools/launcher/examples/nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16/megatron_lm_ptq.yaml

Comment thread modelopt_recipes/configs/ptq/units/w4a4_nvfp4_nvfp4_four_over_six.yaml Outdated
Comment thread tests/unit/torch/quantization/test_nvfp4_four_over_six.py
Comment thread tests/unit/torch/quantization/test_nvfp4_four_over_six.py Outdated
@codecov

codecov Bot commented Jun 11, 2026

Codecov Report

❌ Patch coverage is 88.40580% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.04%. Comparing base (46eddab) to head (86ac97b).
⚠️ Report is 12 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| modelopt/torch/quantization/backends/nvfp4_gemm.py | 14.28% | 6 Missing ⚠️ |
| ...odelopt/torch/quantization/qtensor/nvfp4_tensor.py | 98.14% | 1 Missing ⚠️ |
| modelopt/torch/quantization/tensor_quant.py | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1684      +/-   ##
==========================================
+ Coverage   67.73%   76.04%   +8.31%     
==========================================
  Files         511      511              
  Lines       56169    56636     +467     
==========================================
+ Hits        38044    43068    +5024     
+ Misses      18125    13568    -4557     
Flag Coverage Δ
examples 41.82% <36.23%> (+0.51%) ⬆️
gpu 57.41% <55.07%> (+25.46%) ⬆️
regression 14.68% <21.73%> (+0.03%) ⬆️
unit 54.38% <78.26%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tools/launcher/examples/nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16/megatron_lm_ptq.yaml (1)

21-21: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the pipeline note to match the active quant config tag.

Line 21 still says super-nvfp4-max-calib while tasks now use nvfp4-46-max. This creates doc/config drift for operators.

As per coding guidelines, “Don't repeat yourself; keep a single source of truth.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@tools/launcher/examples/nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16/megatron_lm_ptq.yaml`
at line 21, Update the pipeline note string to reflect the active quant config
tag: replace "super-nvfp4-max-calib" with "nvfp4-46-max" in the note field (the
quoted value on the line containing note: in the megatron_lm_ptq.yaml) so the
documentation/config matches the actual tasks using nvfp4-46-max.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In
`@tools/launcher/examples/nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16/megatron_lm_ptq.yaml`:
- Line 21: Update the pipeline note string to reflect the active quant config
tag: replace "super-nvfp4-max-calib" with "nvfp4-46-max" in the note field (the
quoted value on the line containing note: in the megatron_lm_ptq.yaml) so the
documentation/config matches the actual tasks using nvfp4-46-max.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 8936915e-0ac5-43c0-a688-b25075b3ccf9

📥 Commits

Reviewing files that changed from the base of the PR and between cdc160a and 97d1402.

📒 Files selected for processing (6)
  • modelopt/torch/quantization/qtensor/nvfp4_tensor.py
  • modelopt_recipes/configs/numerics/nvfp4_four_over_six.yaml
  • modelopt_recipes/configs/ptq/presets/model/nvfp4_four_over_six.yaml
  • modelopt_recipes/configs/ptq/units/w4a4_nvfp4_nvfp4_four_over_six.yaml
  • modelopt_recipes/huggingface/models/nvidia/Nemotron-3-Ultra-550B-A55B/ptq/nvfp4-46-max.yaml
  • tools/launcher/examples/nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16/megatron_lm_ptq.yaml
💤 Files with no reviewable changes (1)
  • modelopt_recipes/huggingface/models/nvidia/Nemotron-3-Ultra-550B-A55B/ptq/nvfp4-46-max.yaml
🚧 Files skipped from review as they are similar to previous changes (4)
  • modelopt_recipes/configs/numerics/nvfp4_four_over_six.yaml
  • modelopt_recipes/configs/ptq/presets/model/nvfp4_four_over_six.yaml
  • modelopt_recipes/configs/ptq/units/w4a4_nvfp4_nvfp4_four_over_six.yaml
  • modelopt/torch/quantization/qtensor/nvfp4_tensor.py

@jenchen13

Contributor Author

/claude review

jenchen13 and others added 2 commits June 12, 2026 07:59
Add the NVFP4_FOUR_OVER_SIX_CFG preset (scoped to max calibration) and
implement 4/6 scale selection in NVFP4QTensor, normalizing the selected
per-block scale with F8_E4M3_MAX_46. Wire the supporting changes through
the fp4 kernels, nvfp4_gemm backend, tensor_quant, tensor_quantizer,
config, and layer_utils.

Add recipe/preset YAMLs and the Megatron PTQ launcher example, plus unit
tests covering 4/6 quantization and config registration.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Jennifer Chen <jennifchen@nvidia.com>

mypy defaulted to the running interpreter's version and failed to parse
3.10 match/case syntax (e.g. precisionconverter.py). Pin python_version
to 3.10 so mypy parses modern syntax.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Jennifer Chen <jennifchen@nvidia.com>
@jenchen13 jenchen13 force-pushed the jennifchen/omniml-4922-four-over-six branch from c3f69af to db5497e Compare June 12, 2026 15:01
@jenchen13 jenchen13 requested a review from a team as a code owner June 12, 2026 15:01
@jenchen13

Contributor Author

/claude review

Signed-off-by: Jennifer Chen <jennifchen@nvidia.com>

@coderabbitai coderabbitai Bot left a comment

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/unit/torch/quantization/test_mse_calibrator.py`:
- Line 805: Move the function-scoped import "import
modelopt.torch.quantization.nn.modules.tensor_quantizer as tqm" out of the test
helper and place it at the top of the module imports in
tests/unit/torch/quantization/test_mse_calibrator.py; update any references
inside the helper to use the top-level symbol `tqm`, and if you believe the
import must remain local due to a genuine circular or optional dependency, add a
short comment above it documenting that justification—otherwise keep it as a
standard module-level import to ensure import errors surface at test collection
time.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 930d2b15-6a15-40ad-8113-ec2383f013f5

📥 Commits

Reviewing files that changed from the base of the PR and between db5497e and 90b1c76.

📒 Files selected for processing (5)
  • tests/gpu/torch/quantization/test_nvfp4_fp8_sweep_kernel.py
  • tests/unit/torch/quantization/test_config_validation.py
  • tests/unit/torch/quantization/test_mse_calibrator.py
  • tests/unit/torch/quantization/test_nvfp4_four_over_six.py
  • tests/unit/torch/quantization/test_nvfp4_static_export_cpu.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unit/torch/quantization/test_nvfp4_four_over_six.py

        return q

    def _captured_fp8_max(self, monkeypatch, four_over_six: bool) -> float:
        import modelopt.torch.quantization.nn.modules.tensor_quantizer as tqm

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Move the function-scoped import to module top.

Line 805 introduces an in-function import in a test helper without an explicit circular/optional dependency justification. This violates the test import convention and can hide import failures until runtime.

As per coding guidelines, imports in tests/**/*.py should be at module top unless there is a concrete, documented exception.

Suggested patch
@@
 import torch
 
 import modelopt.torch.quantization as mtq
+import modelopt.torch.quantization.nn.modules.tensor_quantizer as tqm
 from modelopt.torch.quantization import calib
@@
     def _captured_fp8_max(self, monkeypatch, four_over_six: bool) -> float:
-        import modelopt.torch.quantization.nn.modules.tensor_quantizer as tqm
-
         captured = {}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/torch/quantization/test_mse_calibrator.py` at line 805, Move the
function-scoped import "import
modelopt.torch.quantization.nn.modules.tensor_quantizer as tqm" out of the test
helper and place it at the top of the module imports in
tests/unit/torch/quantization/test_mse_calibrator.py; update any references
inside the helper to use the top-level symbol `tqm`, and if you believe the
import must remain local due to a genuine circular or optional dependency, add a
short comment above it documenting that justification—otherwise keep it as a
standard module-level import to ensure import errors surface at test collection
time.

Source: Coding guidelines

Signed-off-by: Jennifer Chen <jennifchen@nvidia.com>