[None][fix] Fix CuteDslFusedMoE.load_weights signature to accept allow_partial_loading#12690

Open
tianyuz-nv wants to merge 2 commits into NVIDIA:main from wanqian-nv:fix/cutedsl-load-weights-signature

Conversation

@tianyuz-nv
Collaborator

@tianyuz-nv tianyuz-nv commented Apr 2, 2026

Problem

PR #12136 added a CuteDslFusedMoE.load_weights() override with a signature
that did not match the MoE base class interface — it was missing the
allow_partial_loading parameter.

This causes a TypeError when _load_weights_impl in modeling_utils.py
calls module.load_weights(weights=..., allow_partial_loading=...) via the
params_map code path (e.g., Qwen3 + CuteDSL backend):

TypeError: CuteDslFusedMoE.load_weights() got an unexpected keyword argument 'allow_partial_loading'
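The failure mode is easy to reproduce with a minimal stand-in for the MoE hierarchy (the class names below are illustrative, not the real tensorrt_llm classes):

```python
class MoEBase:
    """Stand-in for the MoE base class interface."""

    def load_weights(self, weights, allow_partial_loading=False):
        pass


class CuteDslFusedMoEBroken(MoEBase):
    """Override missing allow_partial_loading, as in the pre-fix code."""

    def load_weights(self, weights):
        super().load_weights(weights)


# Mimics the params_map code path in _load_weights_impl, which passes
# allow_partial_loading as a keyword argument.
try:
    CuteDslFusedMoEBroken().load_weights(weights=[], allow_partial_loading=False)
except TypeError as e:
    print(e)
```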

Fix

  • Align CuteDslFusedMoE.load_weights signature with the base class
    MoE.load_weights(weights: List[Dict], allow_partial_loading: bool = False)
  • Pass allow_partial_loading through to super().load_weights()
  • Add a regression test that checks the method signature to prevent future breakage
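A minimal sketch of the aligned override, reconstructed from the description above (the real implementation in fused_moe_cute_dsl.py may do additional work before delegating):

```python
from typing import Dict, List


class MoE:
    """Stand-in for the MoE base class."""

    def load_weights(self, weights: List[Dict], allow_partial_loading: bool = False):
        pass


class CuteDslFusedMoE(MoE):
    def load_weights(self, weights: List[Dict], allow_partial_loading: bool = False):
        # Forward the flag so keyword calls from _load_weights_impl succeed.
        super().load_weights(weights, allow_partial_loading=allow_partial_loading)
```

With the parameter present and defaulted to False, both the old positional callers and the new keyword-passing code path work unchanged.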

Test

Added test_cutedsl_load_weights_signature_matches_base in
tests/unittest/_torch/thop/parallel/test_cute_dsl_moe.py to verify
CuteDslFusedMoE.load_weights accepts both weights and
allow_partial_loading parameters.
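A signature check like this can run without constructing a real MoE module; one plausible shape of the test (the actual test body is not shown on this page, and the stub class below stands in for the real import):

```python
import inspect


class CuteDslFusedMoE:
    """Stub standing in for the real CuteDslFusedMoE class."""

    def load_weights(self, weights, allow_partial_loading=False):
        pass


def test_cutedsl_load_weights_signature_matches_base():
    params = inspect.signature(CuteDslFusedMoE.load_weights).parameters
    assert "weights" in params
    assert "allow_partial_loading" in params
    # The default must stay False so existing callers are unaffected.
    assert params["allow_partial_loading"].default is False


test_cutedsl_load_weights_signature_matches_base()
```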

Summary by CodeRabbit

  • New Features

    • Added support for partial weight loading capability in the Mixture of Experts module.
  • Bug Fixes

    • Updated weight loading mechanism to support flexible parameter formats.
  • Tests

    • Added regression test to ensure weight loading API compatibility and prevent parameter mismatches.

@tianyuz-nv tianyuz-nv changed the title [None][fix] Fix CuteDslFusedMoE.load_weights signature to accept allo… [None][fix] Fix CuteDslFusedMoE.load_weights signature to accept allow_partial_loading Apr 2, 2026

Signed-off-by: tianyuz-nv <tianyuz@nvidia.com>
@tianyuz-nv tianyuz-nv force-pushed the fix/cutedsl-load-weights-signature branch from e836874 to 26653d5 on April 2, 2026 11:41
@tianyuz-nv tianyuz-nv marked this pull request as ready for review April 2, 2026 11:43
@tianyuz-nv tianyuz-nv requested a review from a team as a code owner April 2, 2026 11:43
@tianyuz-nv tianyuz-nv requested a review from xxi-nv April 2, 2026 11:43
@tianyuz-nv
Collaborator Author

/bot run --disable-fail-fast

@coderabbitai
Contributor

coderabbitai bot commented Apr 2, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5c718b00-8807-4b94-bf58-771bf7e892ff

📥 Commits

Reviewing files that changed from the base of the PR and between 2b4f54c and 26653d5.

📒 Files selected for processing (2)
  • tensorrt_llm/_torch/modules/fused_moe/fused_moe_cute_dsl.py
  • tests/unittest/_torch/thop/parallel/test_cute_dsl_moe.py

📝 Walkthrough

This PR updates the load_weights method signature in CuteDslFusedMoE to accept weights as List[Dict] instead of Dict[str, torch.Tensor], adds an allow_partial_loading parameter with default value False, and includes a regression test verifying signature compatibility.

Changes

  • tensorrt_llm/_torch/modules/fused_moe/fused_moe_cute_dsl.py (Fused MoE Module Signature Update): changed the weights parameter type of load_weights from Dict[str, torch.Tensor] to List[Dict], added allow_partial_loading: bool = False, and forwarded it to the parent class.
  • tests/unittest/_torch/thop/parallel/test_cute_dsl_moe.py (Regression Test): added test_cutedsl_load_weights_signature_matches_base to verify that the CuteDslFusedMoE.load_weights signature contains both the weights and allow_partial_loading parameters.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 50.00%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Title check ✅ Passed: the title clearly and specifically describes the main fix: updating the CuteDslFusedMoE.load_weights signature to accept the allow_partial_loading parameter, matching the base class interface.
  • Description check ✅ Passed: the description provides a clear problem statement, explains the fix, and documents the test coverage. However, the PR Checklist section from the template is not completed.


@tensorrt-cicd
Collaborator

PR_Github #41426 [ run ] triggered by Bot. Commit: 26653d5 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #41426 [ run ] completed with state SUCCESS. Commit: 26653d5
/LLM/main/L0_MergeRequest_PR pipeline #32358 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@tianyuz-nv
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #41472 [ run ] triggered by Bot. Commit: 26653d5 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #41472 [ run ] completed with state SUCCESS. Commit: 26653d5
/LLM/main/L0_MergeRequest_PR pipeline #32398 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@tianyuz-nv
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #41539 [ run ] triggered by Bot. Commit: 26653d5 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #41539 [ run ] completed with state SUCCESS. Commit: 26653d5
/LLM/main/L0_MergeRequest_PR pipeline #32453 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@tianyuz-nv
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #41569 [ run ] triggered by Bot. Commit: 26653d5 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #41569 [ run ] completed with state SUCCESS. Commit: 26653d5
/LLM/main/L0_MergeRequest_PR pipeline #32480 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@tianyuz-nv
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #41590 [ run ] triggered by Bot. Commit: ac758f8 Link to invocation
