Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings. Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 Walkthrough

Adds Qwen3.5 support: docs updated, model-name mapping extended, SV-D quant fusion mapping extended, quantizer-exclusion rules added for narrow hybrid-attention projections, test utilities for a tiny Qwen3.5, and a new unit test validating PTQ behavior for Qwen3.5 hybrid-attention models.

Changes
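The quantizer-exclusion rule mentioned in the walkthrough (skipping narrow hybrid-attention projections) can be illustrated with a generic name-pattern filter. This is a hypothetical sketch, not ModelOpt's actual exclusion API; the pattern strings below are assumptions chosen for illustration:

```python
import fnmatch

# Hypothetical exclusion patterns for narrow hybrid-attention projections
# (illustrative names, not taken from the actual PR).
EXCLUDE_PATTERNS = ["*linear_attn.dt_proj*", "*linear_attn.conv1d*"]


def is_quantizable(module_name: str) -> bool:
    """Return False for module names matching any exclusion pattern."""
    return not any(fnmatch.fnmatch(module_name, p) for p in EXCLUDE_PATTERNS)


print(is_quantizable("model.layers.0.self_attn.q_proj"))     # True
print(is_quantizable("model.layers.0.linear_attn.dt_proj"))  # False
```

A filter like this would typically run while iterating `model.named_modules()` to decide which linear layers receive weight quantizers.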
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches: 🧪 Generate unit tests (beta)
Comment
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/unit/torch/quantization/plugins/test_huggingface.py`:
- Around line 286-293: The test currently sets has_gdn_quantized /
has_attn_quantized based only on module name and presence of
module.weight_quantizer, which can give false positives when quantization is
disabled; update the loop over model.named_modules() to also check
module.weight_quantizer.is_enabled (or truthiness of that property) before
setting the flags and ensure the final assertions verify that the found modules
have weight_quantizer.is_enabled true (i.e., assert the quantizer is enabled for
"linear_attn.in_proj_qkv" and "self_attn.q_proj" modules).
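The gating suggested above can be sketched with minimal stand-in objects. `TinyQuantizer` and `TinyModule` are hypothetical mocks for the real quantized modules, used only to show why checking `is_enabled` avoids false positives:

```python
class TinyQuantizer:
    """Stand-in for a weight quantizer with an enable flag."""

    def __init__(self, is_enabled: bool):
        self.is_enabled = is_enabled


class TinyModule:
    """Stand-in for a quantized linear module."""

    def __init__(self, enabled: bool):
        self.weight = object()  # stand-in for a weight tensor
        self.weight_quantizer = TinyQuantizer(enabled)


# name -> module, mimicking model.named_modules()
named_modules = {
    "layers.0.linear_attn.in_proj_qkv": TinyModule(enabled=True),
    "layers.0.self_attn.q_proj": TinyModule(enabled=False),  # disabled quantizer
}

has_gdn_quantized = has_attn_quantized = False
for name, module in named_modules.items():
    if hasattr(module, "weight_quantizer") and hasattr(module, "weight"):
        # Gate on is_enabled so a present-but-disabled quantizer
        # does not count as "quantized".
        if "linear_attn.in_proj_qkv" in name and module.weight_quantizer.is_enabled:
            has_gdn_quantized = True
        if "self_attn.q_proj" in name and module.weight_quantizer.is_enabled:
            has_attn_quantized = True

print(has_gdn_quantized)   # True
print(has_attn_quantized)  # False: name matched, but the quantizer is disabled
```

Without the `is_enabled` check, both flags would end up `True` here, which is exactly the false positive the review flags.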
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b1ad8b23-7859-407c-8231-f295b28c6ba4
📒 Files selected for processing (7)
- examples/llm_ptq/README.md
- examples/llm_ptq/example_utils.py
- examples/vlm_ptq/README.md
- modelopt/torch/export/model_utils.py
- modelopt/torch/export/quant_utils.py
- tests/_test_utils/torch/transformers_models.py
- tests/unit/torch/quantization/plugins/test_huggingface.py
♻️ Duplicate comments (1)
tests/unit/torch/quantization/plugins/test_huggingface.py (1)
286-293: ⚠️ Potential issue | 🟡 Minor — Strengthen positive quantization assertions to avoid false positives.

`has_gdn_quantized`/`has_attn_quantized` are currently set by name match + quantizer presence, even if quantization is disabled. Gate these checks on `module.weight_quantizer.is_enabled`.

Proposed fix:
```diff
 for name, module in model.named_modules():
     if hasattr(module, "weight_quantizer") and hasattr(module, "weight"):
-        if "linear_attn.in_proj_qkv" in name:
+        if "linear_attn.in_proj_qkv" in name and module.weight_quantizer.is_enabled:
             has_gdn_quantized = True
-        if "self_attn.q_proj" in name:
+        if "self_attn.q_proj" in name and module.weight_quantizer.is_enabled:
             has_attn_quantized = True
```

```shell
#!/bin/bash
# Verify positive assertions currently don't require enabled quantizers.
rg -n -C2 'linear_attn\.in_proj_qkv|self_attn\.q_proj|is_enabled' tests/unit/torch/quantization/plugins/test_huggingface.py
```

Expected: the positive checks for `linear_attn.in_proj_qkv` and `self_attn.q_proj` should include `is_enabled` in the same condition.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/torch/quantization/plugins/test_huggingface.py` around lines 286-293: the current positive assertions set has_gdn_quantized/has_attn_quantized based only on the name and the presence of module.weight_quantizer, which can produce false positives when quantization is disabled. Update the loop over model.named_modules() so that when you detect the target names ("linear_attn.in_proj_qkv" and "self_attn.q_proj") you also check module.weight_quantizer.is_enabled before setting has_gdn_quantized or has_attn_quantized, i.e., require hasattr(module, "weight_quantizer") and hasattr(module, "weight") and module.weight_quantizer.is_enabled when assigning those flags, then keep the existing assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@tests/unit/torch/quantization/plugins/test_huggingface.py`:
- Around line 286-293: The current positive assertions set
has_gdn_quantized/has_attn_quantized based only on name and presence of
module.weight_quantizer, which can produce false positives when quantization is
disabled; update the loop over model.named_modules() so that when you detect the
target names ("linear_attn.in_proj_qkv" and "self_attn.q_proj") you also check
module.weight_quantizer.is_enabled before setting has_gdn_quantized or
has_attn_quantized, i.e., require hasattr(module, "weight_quantizer") and
hasattr(module, "weight") and module.weight_quantizer.is_enabled when assigning
those flags, then keep the existing assertions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 99e37517-c133-410c-86b9-b5afc49ed662
📒 Files selected for processing (7)
- examples/llm_ptq/README.md
- examples/llm_ptq/example_utils.py
- examples/vlm_ptq/README.md
- modelopt/torch/export/model_utils.py
- modelopt/torch/export/quant_utils.py
- tests/_test_utils/torch/transformers_models.py
- tests/unit/torch/quantization/plugins/test_huggingface.py
✅ Files skipped from review due to trivial changes (1)
- modelopt/torch/export/model_utils.py
🚧 Files skipped from review as they are similar to previous changes (3)
- examples/vlm_ptq/README.md
- modelopt/torch/export/quant_utils.py
- examples/llm_ptq/example_utils.py
What does this PR do?
Adds a feature for Qwen3.5 quantization.
Usage
Testing
Before your PR is "Ready for review"
- Make sure you read and follow Contributor guidelines and your commits are signed (`git commit -s -S`).
- Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).
- CONTRIBUTING.md: ✅ / ❌ / N/A

Additional Information
Summary by CodeRabbit
New Features
Documentation
Tests