for more information, see https://pre-commit.ci
Signed-off-by: root <pgadzinski@nvidia.com>
Contributor
Greptile Summary
This PR introduces
Key changes:
Confidence Score: 4/5
Important Files Changed
Sequence Diagram
sequenceDiagram
    participant Caller
    participant debug_api
    participant DumpTensors
    participant TensorLogger
    Caller->>debug_api: inspect_tensor(layer_name, tensor_name, iteration, tensor, rowwise_quantized_tensor, ...)
    debug_api->>DumpTensors: inspect_tensor(config, ...)
    DumpTensors->>DumpTensors: validate rowwise/columnwise identity
    DumpTensors->>DumpTensors: resolve quantized_tensor (rowwise preferred)
    DumpTensors->>TensorLogger: ensure_initialized(root_log_dir)
    TensorLogger-->>DumpTensors: (root_dir ready)
    DumpTensors->>DumpTensors: build dump_dict {high_precision?, quantized?}
    alt dump_dict non-empty
        DumpTensors->>TensorLogger: save_tensor(dump_dict, layer_name, tensor_name, iteration)
        TensorLogger->>TensorLogger: sanitize names, build filepath
        TensorLogger->>disk: torch.save(dump_dict, filepath)
        TensorLogger-->>DumpTensors: done
        DumpTensors->>debug_api: log_message("Dumped ...")
    else dump_dict empty
        DumpTensors->>debug_api: log_message("No tensors available ...")
    end
Last reviewed commit: 677ad51
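The flow in the diagram can be approximated in plain Python. This is only a sketch under stated assumptions, not the code from the PR: `build_dump_dict` and `dump_if_available` are hypothetical names, the `save` and `log` callables stand in for `TensorLogger.save_tensor` and `debug_api.log_message`, and "validate rowwise/columnwise identity" is read here as "when both quantized arguments are passed, they must be the same object". The real feature may also gate each entry on its config.

```python
from typing import Any, Callable, Dict, Optional


def build_dump_dict(
    tensor: Optional[Any],
    rowwise_quantized_tensor: Optional[Any] = None,
    columnwise_quantized_tensor: Optional[Any] = None,
) -> Dict[str, Any]:
    """Collect whichever tensors are available into one dictionary."""
    # Assumption: if both quantized views are given, they must be one object.
    if (
        rowwise_quantized_tensor is not None
        and columnwise_quantized_tensor is not None
        and rowwise_quantized_tensor is not columnwise_quantized_tensor
    ):
        raise ValueError("rowwise and columnwise quantized tensors differ")

    # Rowwise is preferred; fall back to columnwise when it is missing.
    quantized = (
        rowwise_quantized_tensor
        if rowwise_quantized_tensor is not None
        else columnwise_quantized_tensor
    )

    dump_dict: Dict[str, Any] = {}
    if tensor is not None:
        dump_dict["high_precision"] = tensor
    if quantized is not None:
        dump_dict["quantized"] = quantized
    return dump_dict


def dump_if_available(
    dump_dict: Dict[str, Any],
    save: Callable[[Dict[str, Any]], None],
    log: Callable[[str], None],
) -> bool:
    """Save a non-empty dictionary; otherwise only log that nothing was written."""
    if dump_dict:
        save(dump_dict)
        log("Dumped tensors")
        return True
    log("No tensors available to dump")
    return False
```

Passing `save` and `log` as callables keeps the sketch independent of any particular logger or file format.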
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: Paweł Gadziński <62263673+pggPL@users.noreply.github.com>
/te-ci pytorch
Drop the dump_quantized_internals config option, the _get_quantized_internals method, and all helper functions for extracting scales/raw data from Float8Tensor, Float8BlockwiseQTensor, MXFP8Tensor, and NVFP4Tensor.
Remove the corresponding tests: test_dump_tensors_nvfp4_unpacked_codes and NVFP4_DUMP_TENSORS_CONFIG, and the scale/data assertions from test_dump_tensors_sanity.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
- Add dot ('.') to _sanitize_name to handle common PyTorch dotted layer
names like 'encoder.layer.0.attention'
- Add docstring note about pickle dependency for the 'quantized' key
- Add comment explaining weights_only=False in test
- Remove redundant local RecipeState import in test_nvfp4_numeric
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
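For illustration, name sanitization and path building of the kind this commit describes might look like the following. Everything here is an assumption rather than the PR's code: the sketch assumes dots are among the characters replaced with underscores, and the allowed character set, the `.pt` suffix, and the filename layout are hypothetical.

```python
import re
from pathlib import Path

# Hypothetical rule: keep letters, digits, '_' and '-'; replace everything
# else (including '.', '/', and spaces) with an underscore.
_UNSAFE_CHARS = re.compile(r"[^A-Za-z0-9_-]")


def sanitize_name(name: str) -> str:
    """Make a layer or tensor name safe to embed in a filename."""
    return _UNSAFE_CHARS.sub("_", name)


def build_filepath(root_dir: str, layer_name: str, tensor_name: str, iteration: int) -> Path:
    """Build a dump path; the naming convention is illustrative only."""
    filename = (
        f"{sanitize_name(layer_name)}_{sanitize_name(tensor_name)}"
        f"_iter_{iteration:06d}.pt"
    )
    return Path(root_dir) / filename
```

With this rule a dotted PyTorch module name such as `encoder.layer.0.attention` becomes `encoder_layer_0_attention`, so the only dot left in the filename is the one before the extension.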
Avoids relying on stale self.rank when ensure_initialized is called before initialize() has set the rank. Consistent with how the nvdlfw_inspect logger resolves rank.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
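The idea of resolving the rank at call time instead of reading a cached attribute can be sketched as below. This is a guess at the pattern, not the PR's implementation: `resolve_rank`, `TensorLoggerSketch`, and the `tensor_dumps/rank_<n>` directory layout are hypothetical, and how nvdlfw_inspect actually resolves the rank is not shown in this conversation.

```python
from pathlib import Path
from typing import Optional


def resolve_rank() -> int:
    """Ask torch.distributed for the rank when available; otherwise use 0."""
    try:
        import torch.distributed as dist
    except ImportError:
        return 0
    if dist.is_available() and dist.is_initialized():
        return dist.get_rank()
    return 0


class TensorLoggerSketch:
    """Hypothetical logger that creates its per-rank directory lazily."""

    def __init__(self) -> None:
        self.root_dir: Optional[Path] = None

    def ensure_initialized(self, root_log_dir: str) -> Path:
        if self.root_dir is None:
            # Query the rank now rather than trusting an attribute that
            # may not have been set yet.
            rank = resolve_rank()
            self.root_dir = Path(root_log_dir) / "tensor_dumps" / f"rank_{rank}"
            self.root_dir.mkdir(parents=True, exist_ok=True)
        return self.root_dir
```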
Detach both high_precision and quantized tensors before saving to avoid serializing the autograd graph. For QuantizedTensor this is a zero-copy view (make_like), so no extra GPU allocation.
Add filename format assertion to test_dump_tensors_sanity to catch regressions in _sanitize_name or the naming convention.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
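A small sketch with a plain `torch.Tensor` shows what detaching before saving does. It uses only standard PyTorch calls and an in-memory buffer instead of a real dump file; QuantizedTensor behaviour is not demonstrated here. `weights_only=False` is passed only because the commits above note that the 'quantized' entry depends on pickle; a plain tensor would not need it.

```python
import io

import torch

x = torch.randn(4, 4, requires_grad=True)
y = x * 2  # non-leaf tensor that carries a grad_fn

# detach() returns a tensor without autograd history that shares y's storage.
detached = y.detach()
assert detached.grad_fn is None
assert detached.data_ptr() == y.data_ptr()

# Round-trip through torch.save / torch.load using an in-memory buffer.
buffer = io.BytesIO()
torch.save({"high_precision": detached}, buffer)
buffer.seek(0)
loaded = torch.load(buffer, weights_only=False)
```

Since `weights_only=False` allows arbitrary unpickling, it should only be used on dump files from a trusted source.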
Log a message when no tensors are available to dump so the user has an explicit signal that no file was written.
Assert that the quantized key round-trips as a QuantizedTensor to catch regressions in detach() or the serialisation path.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
Description
This PR introduces a new debug feature focused on offline analysis of tensors.
The motivation is to make it easier to inspect and analyze intermediate tensors outside of runtime, especially during quantization debugging.
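The new `DumpTensors` feature allows saving the high-precision tensor and its quantized counterpart to disk (the `high_precision` and `quantized` entries of the dumped dictionary).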