Add logic for block-scaled tensors with GEMM swizzled scales #2486
Merged
53 commits
0563c1a
Add general C API for setting tensor params
timmoon10 5c9b1be
Implement general accessors for NVTETensor
timmoon10 219ddc1
Merge branch 'main' into tmoon/pre-swizzled-scales
timmoon10 1c49646
Refactor tex swizzling to skip if scales are already swizzled
timmoon10 5f60184
Add checks for non-swizzled scales in MXFP8 and NVFP4 kernels
timmoon10 21ec928
Support pre-swizzled scales in MXFP8Tensor
timmoon10 fa7e7c0
Add tex function to swizzle MXFP8 scales
timmoon10 b796c96
Fix bug in inplace swizzle function
timmoon10 52ce3a4
Tweak comments to use "compact/swizzled format"
timmoon10 5c7c1d9
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] dfb4b94
MXFP8 quantize kernel with pre-swizzled scales
timmoon10 1a8b551
Expose pre-swizzled scales in modules
timmoon10 cb1254a
Fix bug in multi-swizzle
timmoon10 8b10300
Support MXFP8 gated activations with swizzled scales
timmoon10 1de4b5e
Merge branch 'main' into tmoon/pre-swizzled-scales
timmoon10 8c6ea61
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] a0184bc
Add PyTorch infrastructure for pre-swizzled NVFP4 tensors
timmoon10 2365821
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] bf12da9
Deprecate DSv3-specific quantization logic in C API
timmoon10 a89c006
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] b7eced8
Remove support for DSv3 compact data from quantizer
timmoon10 1da2c19
Remove DSv3 compact data format from core lib
timmoon10 9ed62cb
Fix bug in FP8 all-gather
timmoon10 43c8132
Fix linter warnings
timmoon10 f37036e
Update JAX to use new swizzled scale API
timmoon10 c549e90
Merge branch 'main' into tmoon/pre-swizzled-scales
timmoon10 4b06462
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 6c11bb5
Review suggestion from @greptile-apps
timmoon10 736a971
Review suggestions from @greptile-apps
timmoon10 8b5e43d
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 78b572c
Update C++ swizzle test with swizzled scales API
timmoon10 d13760c
Return default tensor params when querying params for invalid NVTETensor
timmoon10 9cc7fe4
Debug DSv3 FP8 test failures
timmoon10 41c8d51
Debug Userbuffers test failures
timmoon10 7b3e231
Make sure gated activations populate FP8 transpose if needed
timmoon10 732425c
Merge branch 'main' into tmoon/pre-swizzled-scales
timmoon10 7b55b9d
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] dc235e9
Review suggestions from @greptile-apps
timmoon10 5aec484
Disable pre-swizzling with debug quantizer
timmoon10 f05fd06
Merge branch 'main' into tmoon/pre-swizzled-scales
timmoon10 c6f12e1
Review suggestion from @greptile-apps
timmoon10 583e948
Merge branch 'main' into tmoon/pre-swizzled-scales
timmoon10 28d08a7
Fix merge conflicts and review suggestions
timmoon10 1a184ab
Use explicitly sized types in config accessors
timmoon10 ed3fb0a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] eb6c5e7
Merge branch 'main' into tmoon/pre-swizzled-scales
timmoon10 0358355
Make util header for function that compute swizzled scale index
timmoon10 47c93ea
Merge branch 'main' into tmoon/pre-swizzled-scales
timmoon10 75c299f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 63b88d2
Apply suggestions from @greptile-apps
timmoon10 0f970ce
Merge branch 'main' into tmoon/pre-swizzled-scales
timmoon10 fa37297
Update expected error message in FP8 block-scaling test
timmoon10 2b02fe5
Review suggestion from @yaox12
timmoon10
Why don't we need this test anymore?
FP8 block-scaling doesn't require a compact format anymore. Now it's always GEMM-ready.
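For readers skimming the thread, here is a minimal sketch of the "already GEMM-ready" idea that this PR's commits describe (skip swizzling when the scales are already swizzled). It is not Transformer Engine code: `ScaledTensor`, `with_gemm_swizzled_scales`, and `swizzle_for_gemm` are hypothetical names, and the permutation is a placeholder rather than the real GEMM scale layout.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class ScaledTensor:
    """Hypothetical stand-in for a block-scaled tensor (not a TE class)."""

    scales: List[float]
    with_gemm_swizzled_scales: bool = False  # assumed flag name


def _placeholder_swizzle(scales: List[float]) -> List[float]:
    # Placeholder permutation only; the real GEMM layout is hardware-specific.
    return scales[::-1]


def swizzle_for_gemm(t: ScaledTensor) -> ScaledTensor:
    """Return a tensor whose scales are GEMM-ready, skipping redundant work."""
    if t.with_gemm_swizzled_scales:
        return t  # already swizzled: nothing to do
    return replace(
        t,
        scales=_placeholder_swizzle(t.scales),
        with_gemm_swizzled_scales=True,
    )


compact = ScaledTensor(scales=[1.0, 2.0, 4.0])
ready = swizzle_for_gemm(compact)
assert ready.with_gemm_swizzled_scales
assert swizzle_for_gemm(ready) is ready  # second call is a no-op
```

Under this reading, a format that is always produced GEMM-ready would simply be constructed with the flag set, so no separate compact-format path (or test for it) is needed.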