Rename compress to puzzletron #776
Conversation
Important: Review skipped. Auto incremental reviews are disabled on this repository; check the settings in the CodeRabbit UI or the repository's `.coderabbit.yaml` configuration.

📝 Walkthrough: Large-scale refactoring migrating the modelopt.torch module structure from `modelopt.torch._compress` to `modelopt.torch.puzzletron`.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 warning)

✅ Passed checks (2 passed)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
modelopt/torch/puzzletron/utils/checkpoint_manager.py (1)
136-139: Method name mismatch: `load_state` should be `load_state_dict`. Line 138 calls `hook.load_state(...)`, but the `ForwardHook` abstract base class in `modelopt/torch/nas/plugins/megatron_hooks/base_hooks.py` declares `load_state_dict(...)` as the abstract method. This will cause an `AttributeError` at runtime since no `load_state` method exists on hook implementations.

🐛 Proposed fix
```diff
         if module_name in hook_states:
-            hook.load_state(hook_states[module_name])
+            hook.load_state_dict(hook_states[module_name])
             loaded_count += 1
```
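For context, a minimal standalone sketch of why this fails (the classes below are simplified stand-ins, not the actual `ForwardHook` hierarchy):

```python
from abc import ABC, abstractmethod


class ForwardHookSketch(ABC):
    """Simplified stand-in for the abstract hook base class."""

    @abstractmethod
    def load_state_dict(self, state: dict) -> None:
        """Restore hook state from a previously saved dict."""


class NoOpHook(ForwardHookSketch):
    def load_state_dict(self, state: dict) -> None:
        self.state = state


hook = NoOpHook()
hook.load_state_dict({"scores": [0.1, 0.2]})  # works: the method is defined
# hook.load_state({"scores": [0.1, 0.2]})     # AttributeError: no such method exists
```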
modelopt/torch/puzzletron/replacement_library/build_replacement_library.py (1)

127-135: Bug: String literal `{PUZZLE_DIR}` is not interpolated. The error message uses `'{PUZZLE_DIR}/ckpts'` inside a regular string, not an f-string, so the placeholder will be displayed literally instead of being substituted with the actual path.

🐛 Proposed fix
```diff
     if not teacher_checkpoint_dir.exists():
         raise ValueError(
-            "You must either provide the --teacher_checkpoint_dir argument, or create a link to the "
-            "teacher dir under '{PUZZLE_DIR}/ckpts'."
+            f"You must either provide the --teacher_checkpoint_dir argument, or create a link to the "
+            f"teacher dir under '{master_puzzle_dir}/ckpts'."
         )
```
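A quick illustration of the difference between a plain string and an f-string; the value assigned below is only for demonstration:

```python
master_puzzle_dir = "/workspace/puzzle"  # example value, for illustration only

print("teacher dir under '{master_puzzle_dir}/ckpts'.")
# -> teacher dir under '{master_puzzle_dir}/ckpts'.   (placeholder printed literally)

print(f"teacher dir under '{master_puzzle_dir}/ckpts'.")
# -> teacher dir under '/workspace/puzzle/ckpts'.     (value interpolated)
```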
modelopt/torch/puzzletron/puzzletron.py (1)

58-75: Renumber steps to be sequential - Step 3 is missing from the sequence. The step comments skip from "Step 2" (line 61) directly to "Step 4" (line 66). Renumber the subsequent steps to "Step 3", "Step 4", and "Step 5" for clarity and consistency.
```python
# Step 2: pruning_ckpts (single process)
if dist.is_master():
    pruning_ckpts.launch_prune_ckpt(hydra_cfg)
dist.barrier()

# Step 3: build_library_and_stats (single process)
if dist.is_master():
    build_library_and_stats.launch_build_library_and_stats(hydra_cfg)
dist.barrier()

# Step 4: calc_one_block_scores (distributed processing)
scoring.launch_scoring(hydra_cfg)

# Step 5: mip_and_realize_models (distributed processing)
mip_and_realize_models.launch_mip_and_realize_model(hydra_cfg)
```
modelopt/torch/puzzletron/nas/plugins/puzzletron_nas_plugin.py (1)

49-89: Typo in comment: "prunining" → "pruning". Line 84 has a typo in the comment.
✏️ Proposed fix
```diff
-    # Dataset path to use for scoring in prunining and NAS search
+    # Dataset path to use for scoring in pruning and NAS search
```
modelopt/torch/puzzletron/tools/bypassed_training/init_child_from_parent.py (1)

165-169: Bug: Incorrect condition will always evaluate to `False`. The condition `isinstance(mlp_init_config_yaml, str) is None` is incorrect. `isinstance()` returns a boolean, so comparing it to `None` with `is` will always be `False`. This means `yaml.safe_load()` will never be called, and `mlp_init_config` will always be set to `mlp_init_config_yaml` regardless of its type.

The intent appears to be: parse YAML if the input is a string, otherwise use it as-is.
🐛 Proposed fix
```diff
     mlp_init_config = (
         yaml.safe_load(mlp_init_config_yaml)
-        if isinstance(mlp_init_config_yaml, str) is None
+        if mlp_init_config_yaml is not None and isinstance(mlp_init_config_yaml, str)
         else mlp_init_config_yaml
     )
```

Alternatively, if the intent is only to parse when it's a non-None string:
```python
mlp_init_config = (
    yaml.safe_load(mlp_init_config_yaml)
    if isinstance(mlp_init_config_yaml, str)
    else mlp_init_config_yaml
)
```
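To see why the original condition can never be true, a quick standalone check (the values below are hypothetical, independent of the real config object):

```python
value = "hidden_act: silu"            # any example string value

check = isinstance(value, str)        # -> True (always a bool, never None)
print(check is None)                  # -> False
print(isinstance(None, str) is None)  # -> False as well; the YAML-parsing branch is unreachable
```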
🤖 Fix all issues with AI agents
In @modelopt/torch/puzzletron/subblock_stats/calc_subblock_stats.py:
- Line 457: Remove the erroneous duplicate TypeVar declaration "T_DataClass:
TypeVar = type[dataclasses.dataclass]" (which shadows the correct one)—delete
that line so the original proper declaration T_DataClass =
TypeVar("T_DataClass") (near the top) remains and _dataclass_from_dict and other
code use the correct TypeVar.
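For reference, a minimal sketch of the intended pattern (the `_dataclass_from_dict` body here is an illustrative assumption, not the actual implementation in `calc_subblock_stats.py`):

```python
import dataclasses
from typing import Any, TypeVar

# Single, correct declaration -- a duplicate "T_DataClass: TypeVar = ..." line would shadow it.
T_DataClass = TypeVar("T_DataClass")


def _dataclass_from_dict(cls: type[T_DataClass], data: dict[str, Any]) -> T_DataClass:
    """Illustrative helper: build a dataclass instance from a plain dict."""
    field_names = {f.name for f in dataclasses.fields(cls)}
    return cls(**{k: v for k, v in data.items() if k in field_names})
```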
🧹 Nitpick comments (10)
modelopt/torch/puzzletron/utils/utils.py (1)
95-97: Pre-existing bug: Missing f-string prefix. The string literal at line 96 won't interpolate `{file_path}` - it's missing the `f` prefix. This is a pre-existing issue, but worth fixing while you're in this file.

Suggested fix
```diff
     if not os.path.exists(file_path):
-        print("file does not exist {file_path}")
+        print(f"file does not exist {file_path}")
         return None
```
examples/pruning/README.md (1)

10-10: Minor grammar fix: hyphenate compound adjective. Per static analysis, "MIP based" should be "MIP-based" when used as a compound adjective before a noun.
Suggested fix
```diff
-1. [Puzzletron](../puzzletron/README.md): An advanced pruning method by NVIDIA Research using Mixed Integer Programming (MIP) based NAS search algorithm.
+1. [Puzzletron](../puzzletron/README.md): An advanced pruning method by NVIDIA Research using Mixed Integer Programming (MIP)-based NAS search algorithm.
```
modelopt/torch/puzzletron/decilm/converters/convert_llama3_to_decilm.py (1)

16-18: Misplaced shebang line has no effect. The shebang `#!/usr/bin/env python3` on line 18 appears after the license header and docstring. For a shebang to be effective, it must be the very first line of the file. In its current position, it's just a comment with no functional impact. Consider either moving it to line 1 (before the license) or removing it entirely since this module is typically imported rather than executed directly.
tests/gpu/torch/puzzletron/nas/plugins/test_nas_convert.py (2)

72-78: Variable shadowing: `rank` is redefined inside the conditional block. The `rank` parameter on line 39 is shadowed by the redefinition on line 74 (`rank = int(os.environ["RANK"])`). This works because the code is inside an `if rank == 0` block, but it's confusing and could cause maintenance issues.

♻️ Suggested improvement
```diff
     if rank == 0:
         # assertions for the score_pruning_activations step
-        rank = int(os.environ["RANK"])
+        env_rank = int(os.environ["RANK"])
         rank_filepath = (
-            f"pruning/pruning_scores/ffn_iterative/100samples_diverse_mini/rank_{rank}.pth"
+            f"pruning/pruning_scores/ffn_iterative/100samples_diverse_mini/rank_{env_rank}.pth"
         )
```
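A tiny illustration of why the shadowing is confusing (names and values here are generic, not the test's actual code):

```python
def check_outputs(rank: int) -> None:
    if rank == 0:
        rank = 7  # shadows the parameter; later reads no longer mean "the caller's rank"
        print(f"inspecting files for rank_{rank}")  # prints 7, not the rank passed in
    print(f"caller rank was {rank}")  # 7 if the branch ran, otherwise the original value


check_outputs(0)
```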
130-137: Same variable shadowing issue exists here. Line 132 has the same `rank` redefinition pattern as the FFN pruning test.

♻️ Suggested improvement
```diff
     if rank == 0:
         # assertions for the score_pruning_activations step
-        rank = int(os.environ["RANK"])
+        env_rank = int(os.environ["RANK"])
         rank_filepath = (
             f"pruning/pruning_scores/attn_independent_kv_head_contribution/"
-            f"100samples_diverse_mini/rank_{rank}.pth"
+            f"100samples_diverse_mini/rank_{env_rank}.pth"
         )
```
tests/gpu/torch/puzzletron/test_puzzletron.py (1)

30-33: Comment still references "compress" terminology. The comment on line 30 refers to "compress a model", which is inconsistent with the rename to Puzzletron. Consider updating for consistency.
Suggested update
```diff
-# The e2e test to compress a model based on Local Neural Architecture Search (Mixed Integer Programing NAS search)
+# The e2e test to run Puzzletron based on Local Neural Architecture Search (Mixed Integer Programming NAS search)
 # using a one-click command.
```
examples/puzzletron/main.py (2)

66-72: Docstring parameter name mismatch. The docstring references `config_path`, but the actual parameter is `hydra_config_path`.

Suggested fix
```diff
 def run_full_puzzletron(hydra_config_path: str):
     """Run the full puzzletron pipeline.

     Args:
-        config_path: Path to the YAML configuration file
+        hydra_config_path: Path to the YAML configuration file
     """
```
147-148: TODO comment references the renamed function. The TODO comment now correctly references the `run_full_puzzletron()` API. Consider tracking this TODO if it represents planned work.

Would you like me to open an issue to track integrating the MIP-only path into the `mtn.search()` API?
modelopt/torch/puzzletron/tools/bypassed_training/child_init.py (2)

853-856: Consider specifying encoding when opening files. The file is opened without an explicit encoding parameter. For JSON files, it's good practice to specify `encoding="utf-8"` to ensure consistent behavior across platforms.

Suggested fix
```diff
-        with open(expert_scores_file) as f:
+        with open(expert_scores_file, encoding="utf-8") as f:
             expert_scores = json.load(f)
```
1382-1385: Consider specifying encoding when opening files. Similar to line 855, the file is opened without an explicit encoding parameter.
Suggested fix
```diff
-        with open(channel_importance_path) as f:
+        with open(channel_importance_path, encoding="utf-8") as f:
             channel_ranking = json.load(f)["channel_importance_ranking"]
```
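A small standalone illustration of why the explicit encoding matters (the file name and contents below are hypothetical):

```python
import json
import locale

# The implicit default comes from the active locale and differs across platforms.
print(locale.getpreferredencoding(False))  # e.g. "UTF-8" on Linux, "cp1252" on some Windows setups

# Writing and reading with encoding="utf-8" keeps JSON with non-ASCII content portable.
with open("channel_importance.json", "w", encoding="utf-8") as f:
    json.dump({"channel_importance_ranking": [3, 1, 2], "note": "ütf-8 säfe"}, f, ensure_ascii=False)

with open("channel_importance.json", encoding="utf-8") as f:
    print(json.load(f)["channel_importance_ranking"])
```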
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (105)
.github/CODEOWNERS.pre-commit-config.yamlexamples/pruning/README.mdexamples/puzzletron/README.mdexamples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/Llama-3_1-8B.yamlexamples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/llama-3_1-8B_pruneffn_memory.yamlexamples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/pruning/attn_pruning.yamlexamples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/pruning/ffn_pruning.yamlexamples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/pruning/hidden_dim_pruning.yamlexamples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/pruning/pruning_defaults.yamlexamples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/validate_model_defaults.yamlexamples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/validate_solutions_defaults.yamlexamples/puzzletron/main.pymodelopt/torch/nas/plugins/megatron_hooks/base_hooks.pymodelopt/torch/puzzletron/README.mdmodelopt/torch/puzzletron/__init__.pymodelopt/torch/puzzletron/activation_scoring/activation_hooks/__init__.pymodelopt/torch/puzzletron/activation_scoring/activation_hooks/utils.pymodelopt/torch/puzzletron/activation_scoring/score_pruning_activations.pymodelopt/torch/puzzletron/build_library_and_stats.pymodelopt/torch/puzzletron/dataset/__init__.pymodelopt/torch/puzzletron/dataset/prepare_dataset.pymodelopt/torch/puzzletron/decilm/conversion_utils.pymodelopt/torch/puzzletron/decilm/converters/convert_llama3_to_decilm.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/__init__.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/block_config.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/configuration_decilm.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/megatron_lm__mamba_mixer.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/megatron_lm__megatron_tokenizer.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/megatron_lm__tokenizer.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/modeling_decilm.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/tokenization_decilm.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/transformers_4_44_2__activations.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/transformers_4_44_2__cache_utils.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/transformers_4_44_2__configuration_llama.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/transformers_4_44_2__modeling_attn_mask_utils.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/transformers_4_44_2__modeling_flash_attention_utils_backward_compat.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/transformers_4_44_2__modeling_outputs.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/transformers_4_44_2__modeling_rope_utils.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/transformers_4_44_2__pytorch_utils.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/transformers_4_51_3__cache_utils.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/transformers_4_51_3__configuration_llama4.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/transformers_4_51_3__modeling_llama4_attention.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/variable_cache.pymodelopt/torch/puzzletron/decilm/deci_lm_hf_code/vllm_yarn_utils.pymodelopt/torch/puzzletron/mip/mip_and_realize_models.pymodelopt/torch/puzzletron/mip/mip_with_multi_layer_replacements.pymodelopt/torch/puzzletron/mip/run_puzzle.pymodelopt/torch/puzzletron/mip/utils.pymodelopt/torch/puzzletron/nas/plugins/puzzletron_nas_plugin.pymodelopt/torch/puzzletron/pruning/pruning_ckpts.pymodelopt/torch/puzzletron/puzzletron.pymodelopt/torch/puzzletron/replacement_library/build_replacement_library
.pymodelopt/torch/puzzletron/replacement_library/replacement_library.pymodelopt/torch/puzzletron/replacement_library/replacement_utils.pymodelopt/torch/puzzletron/scoring/scoring.pymodelopt/torch/puzzletron/sewing_kit/__init__.pymodelopt/torch/puzzletron/sewing_kit/core.pymodelopt/torch/puzzletron/sewing_kit/passage/__init__.pymodelopt/torch/puzzletron/sewing_kit/passage/core.pymodelopt/torch/puzzletron/sewing_kit/utils.pymodelopt/torch/puzzletron/subblock_stats/calc_subblock_params_and_memory.pymodelopt/torch/puzzletron/subblock_stats/calc_subblock_stats.pymodelopt/torch/puzzletron/tools/__init__.pymodelopt/torch/puzzletron/tools/bypassed_training/child_init.pymodelopt/torch/puzzletron/tools/bypassed_training/init_child_from_parent.pymodelopt/torch/puzzletron/tools/checkpoint_utils.pymodelopt/torch/puzzletron/tools/checkpoint_utils_hf.pymodelopt/torch/puzzletron/tools/common.pymodelopt/torch/puzzletron/tools/hydra_utils.pymodelopt/torch/puzzletron/tools/kd_model.pymodelopt/torch/puzzletron/tools/logger.pymodelopt/torch/puzzletron/tools/post_init_sparse.pymodelopt/torch/puzzletron/tools/robust_json.pymodelopt/torch/puzzletron/tools/sharded_checkpoint_utils.pymodelopt/torch/puzzletron/tools/validate_model.pymodelopt/torch/puzzletron/tools/validate_puzzle_with_multi_replacements.pymodelopt/torch/puzzletron/tools/validation_utils.pymodelopt/torch/puzzletron/utils/checkpoint_manager.pymodelopt/torch/puzzletron/utils/data/dataloaders.pymodelopt/torch/puzzletron/utils/data/dataset.pymodelopt/torch/puzzletron/utils/parsing.pymodelopt/torch/puzzletron/utils/utils.pymodelopt/torch/puzzletron/utils/validate_runtime_pipeline.pymodelopt/torch/puzzletron/utils/validation.pypyproject.tomlsetup.pytests/_test_utils/torch/puzzletron/resources/configs/Llama-3_1-8B-attn-pruning.yamltests/_test_utils/torch/puzzletron/resources/configs/Llama-3_1-8B-ffn-pruning.yamltests/_test_utils/torch/puzzletron/resources/configs/pruning/attn_pruning.yamltests/_test_utils/torch/puzzletron/resources/configs/pruning/ffn_pruning.yamltests/_test_utils/torch/puzzletron/resources/configs/pruning/hidden_dim_pruning.yamltests/_test_utils/torch/puzzletron/resources/configs/pruning/pruning_defaults.yamltests/_test_utils/torch/puzzletron/resources/configs/validate_model_defaults.yamltests/_test_utils/torch/puzzletron/resources/configs/validate_solutions_defaults.yamltests/_test_utils/torch/puzzletron/resources/tokenizer/special_tokens_map.jsontests/_test_utils/torch/puzzletron/resources/tokenizer/tokenizer.jsontests/_test_utils/torch/puzzletron/resources/tokenizer/tokenizer_config.jsontests/_test_utils/torch/puzzletron/resources/tokenizer/truncate_tokenizer.pytests/_test_utils/torch/puzzletron/utils.pytests/gpu/torch/puzzletron/conftest.pytests/gpu/torch/puzzletron/decilm/converters/test_convert_llama3_config_to_decilm_config.pytests/gpu/torch/puzzletron/nas/plugins/test_nas_convert.pytests/gpu/torch/puzzletron/nas/plugins/test_nas_search.pytests/gpu/torch/puzzletron/test_puzzletron.py
🧰 Additional context used
🧬 Code graph analysis (27)
modelopt/torch/puzzletron/activation_scoring/score_pruning_activations.py (2)
modelopt/torch/puzzletron/tools/logger.py (1)
mprint(155-159)modelopt/torch/puzzletron/tools/validate_model.py (1)
validate_model(67-216)
tests/_test_utils/torch/puzzletron/utils.py (2)
modelopt/torch/puzzletron/puzzletron.py (1)
puzzletron(32-77)modelopt/torch/puzzletron/tools/hydra_utils.py (1)
register_hydra_resolvers(39-48)
modelopt/torch/puzzletron/tools/checkpoint_utils.py (2)
modelopt/torch/puzzletron/tools/checkpoint_utils_hf.py (1)
load_model_config(102-121)modelopt/torch/puzzletron/tools/common.py (1)
infer_weights_dtype(19-22)
modelopt/torch/puzzletron/utils/data/dataloaders.py (2)
modelopt/torch/puzzletron/tools/logger.py (1)
mprint(155-159)modelopt/torch/puzzletron/utils/data/dataset.py (1)
ConstantLengthDataset(30-220)
modelopt/torch/puzzletron/tools/validate_puzzle_with_multi_replacements.py (3)
modelopt/torch/puzzletron/replacement_library/replacement_utils.py (1)
parse_layer_replacement(29-43)modelopt/torch/puzzletron/tools/checkpoint_utils.py (1)
copy_tokenizer(160-192)modelopt/torch/puzzletron/tools/checkpoint_utils_hf.py (2)
copy_deci_lm_hf_code(443-449)save_safetensors_index(322-363)
tests/gpu/torch/puzzletron/test_puzzletron.py (3)
modelopt/torch/puzzletron/puzzletron.py (1)
puzzletron(32-77)tests/_test_utils/torch/puzzletron/utils.py (1)
setup_test_model_and_data(28-69)modelopt/torch/puzzletron/decilm/converters/convert_llama3_to_decilm.py (1)
convert_llama3_to_decilm(144-149)
modelopt/torch/puzzletron/mip/mip_and_realize_models.py (2)
modelopt/torch/puzzletron/mip/run_puzzle.py (1)
run_puzzle(430-477)modelopt/torch/puzzletron/tools/logger.py (1)
mprint(155-159)
modelopt/torch/nas/plugins/megatron_hooks/base_hooks.py (3)
modelopt/torch/puzzletron/puzzletron.py (1)
puzzletron(32-77)modelopt/torch/puzzletron/tools/logger.py (1)
aprint(141-145)modelopt/torch/puzzletron/tools/robust_json.py (1)
json_dump(62-66)
modelopt/torch/puzzletron/tools/sharded_checkpoint_utils.py (4)
modelopt/torch/puzzletron/decilm/deci_lm_hf_code/modeling_decilm.py (2)
DeciLMDecoderLayer(1011-1548)DeciLMForCausalLM(2435-2630)modelopt/torch/puzzletron/tools/checkpoint_utils_hf.py (1)
load_model_config(102-121)modelopt/torch/puzzletron/tools/logger.py (1)
mprint(155-159)modelopt/torch/puzzletron/utils/utils.py (1)
EmptyInitOnDevice(217-258)
modelopt/torch/puzzletron/tools/bypassed_training/init_child_from_parent.py (5)
modelopt/torch/puzzletron/decilm/deci_lm_hf_code/modeling_decilm.py (1)
DeciLMForCausalLM(2435-2630)modelopt/torch/puzzletron/tools/bypassed_training/child_init.py (5)
GQAInitMode(45-52)HiddenSizeInitMode(70-74)LinearInitMode(65-67)MlpInitMode(55-62)create_child_state_dict(326-512)modelopt/torch/puzzletron/tools/checkpoint_utils.py (1)
copy_tokenizer(160-192)modelopt/torch/puzzletron/tools/checkpoint_utils_hf.py (3)
load_model_config(102-121)_save_checkpoint(128-190)copy_deci_lm_hf_code(443-449)modelopt/torch/puzzletron/tools/logger.py (1)
mprint(155-159)
modelopt/torch/puzzletron/mip/run_puzzle.py (3)
modelopt/torch/puzzletron/tools/logger.py (1)
mprint(155-159)modelopt/torch/puzzletron/tools/robust_json.py (1)
json_dump(62-66)modelopt/torch/puzzletron/utils/parsing.py (2)
parse_json(63-66)parse_path(69-72)
modelopt/torch/puzzletron/build_library_and_stats.py (4)
modelopt/torch/puzzletron/puzzletron.py (1)
puzzletron(32-77)modelopt/torch/puzzletron/replacement_library/build_replacement_library.py (2)
build_replacement_library(69-104)launch_build_replacement_library(107-120)modelopt/torch/puzzletron/subblock_stats/calc_subblock_stats.py (1)
launch_calc_subblock_stats(217-241)modelopt/torch/puzzletron/tools/logger.py (1)
mprint(155-159)
modelopt/torch/puzzletron/tools/validation_utils.py (3)
modelopt/torch/puzzletron/tools/validate_model.py (1)
validate_model(67-216)modelopt/torch/puzzletron/tools/robust_json.py (1)
json_dump(62-66)modelopt/torch/puzzletron/utils/validation.py (1)
LowMemorySparseTensor(68-85)
modelopt/torch/puzzletron/tools/post_init_sparse.py (1)
modelopt/torch/puzzletron/decilm/deci_lm_hf_code/modeling_decilm.py (1)
DeciLMForCausalLM(2435-2630)
modelopt/torch/puzzletron/pruning/pruning_ckpts.py (3)
modelopt/torch/puzzletron/tools/bypassed_training/child_init.py (1)
GQAInitMode(45-52)modelopt/torch/puzzletron/tools/checkpoint_utils_hf.py (1)
load_model_config(102-121)modelopt/torch/puzzletron/tools/logger.py (1)
mprint(155-159)
modelopt/torch/puzzletron/scoring/scoring.py (3)
modelopt/torch/puzzletron/puzzletron.py (1)
puzzletron(32-77)modelopt/torch/puzzletron/tools/hydra_utils.py (1)
register_hydra_resolvers(39-48)modelopt/torch/puzzletron/tools/logger.py (1)
mprint(155-159)
tests/gpu/torch/puzzletron/decilm/converters/test_convert_llama3_config_to_decilm_config.py (3)
modelopt/torch/puzzletron/puzzletron.py (1)
puzzletron(32-77)tests/_test_utils/torch/puzzletron/utils.py (2)
create_and_save_small_llama_model(72-107)create_tokenizer(110-116)modelopt/torch/puzzletron/decilm/converters/convert_llama3_to_decilm.py (1)
convert_llama3_to_decilm(144-149)
modelopt/torch/puzzletron/dataset/prepare_dataset.py (1)
modelopt/torch/puzzletron/tools/logger.py (1)
mprint(155-159)
modelopt/torch/puzzletron/utils/validate_runtime_pipeline.py (8)
modelopt/torch/puzzletron/puzzletron.py (1)
puzzletron(32-77)modelopt/torch/puzzletron/decilm/deci_lm_hf_code/modeling_decilm.py (2)
DeciLMForCausalLM(2435-2630)LMHead(2429-2432)modelopt/torch/puzzletron/sewing_kit/core.py (6)
ExternalTarget(199-202)ModuleTarget(226-240)Needle(791-883)RemoteTarget(244-260)StitchedModule(342-764)InputReducer(71-89)modelopt/torch/puzzletron/sewing_kit/passage/core.py (1)
InputArgs(38-79)modelopt/torch/puzzletron/sewing_kit/utils.py (2)
distributed_recv_obj(406-422)distributed_send_obj(395-403)modelopt/torch/puzzletron/tools/checkpoint_utils.py (1)
init_module_with_state_dict(99-108)modelopt/torch/puzzletron/tools/sharded_checkpoint_utils.py (1)
DummyBlock(62-73)modelopt/torch/puzzletron/utils/validation.py (1)
_organize_outputs(210-222)
modelopt/torch/puzzletron/tools/checkpoint_utils_hf.py (4)
modelopt/torch/puzzletron/tools/common.py (1)
infer_weights_dtype(19-22)modelopt/torch/puzzletron/tools/logger.py (1)
mprint(155-159)modelopt/torch/puzzletron/tools/post_init_sparse.py (1)
SparsityMethod(28-95)modelopt/torch/puzzletron/tools/robust_json.py (1)
json_dumps(58-59)
modelopt/torch/puzzletron/tools/bypassed_training/child_init.py (5)
modelopt/torch/puzzletron/decilm/deci_lm_hf_code/block_config.py (3)
BlockConfig(297-308)_get_dataclass_type(96-109)_is_dataclass_type(85-93)modelopt/torch/puzzletron/decilm/deci_lm_hf_code/configuration_decilm.py (1)
DeciLMConfig(39-202)modelopt/torch/opt/config.py (1)
keys(132-134)modelopt/torch/quantization/qtensor/base_qtensor.py (1)
dim(110-112)modelopt/onnx/quantization/graph_utils.py (1)
layer_idx(808-810)
modelopt/torch/puzzletron/puzzletron.py (1)
modelopt/torch/puzzletron/tools/hydra_utils.py (1)
initialize_hydra_config_for_dir(51-71)
modelopt/torch/puzzletron/subblock_stats/calc_subblock_stats.py (5)
modelopt/torch/puzzletron/replacement_library/replacement_utils.py (1)
parse_layer_replacement(29-43)modelopt/torch/puzzletron/tools/checkpoint_utils_hf.py (1)
load_model_config(102-121)modelopt/torch/puzzletron/tools/logger.py (1)
mprint(155-159)modelopt/torch/puzzletron/tools/robust_json.py (1)
json_dump(62-66)modelopt/torch/puzzletron/utils/parsing.py (1)
format_global_config(202-300)
modelopt/torch/puzzletron/replacement_library/replacement_utils.py (4)
modelopt/torch/puzzletron/puzzletron.py (1)
puzzletron(32-77)modelopt/torch/puzzletron/decilm/deci_lm_hf_code/block_config.py (1)
BlockConfig(297-308)modelopt/torch/puzzletron/decilm/deci_lm_hf_code/configuration_decilm.py (1)
DeciLMConfig(39-202)modelopt/torch/puzzletron/mip/utils.py (1)
sort_replacements(27-36)
modelopt/torch/puzzletron/subblock_stats/calc_subblock_params_and_memory.py (4)
modelopt/torch/puzzletron/puzzletron.py (1)
puzzletron(32-77)modelopt/torch/puzzletron/decilm/deci_lm_hf_code/block_config.py (2)
FFNConfig(257-287)MambaConfig(161-165)modelopt/torch/puzzletron/decilm/deci_lm_hf_code/configuration_decilm.py (1)
DeciLMConfig(39-202)modelopt/torch/puzzletron/decilm/deci_lm_hf_code/modeling_decilm.py (1)
DeciLMMoe(2270-2368)
modelopt/torch/puzzletron/utils/checkpoint_manager.py (1)
modelopt/torch/puzzletron/tools/logger.py (2)
aprint(141-145)mprint(155-159)
modelopt/torch/puzzletron/activation_scoring/activation_hooks/utils.py (1)
modelopt/torch/puzzletron/decilm/deci_lm_hf_code/modeling_decilm.py (1)
DeciLMForCausalLM(2435-2630)
🪛 LanguageTool
examples/pruning/README.md
[grammar] ~10-~10: Use a hyphen to join words.
Context: ...ch using Mixed Integer Programming (MIP) based NAS search algorithm. 1. FastNAS: ...
(QB_NEW_EN_HYPHEN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: code-quality
- GitHub Check: build-docs
🔇 Additional comments (113)
.github/CODEOWNERS (1)
27-27: LGTM!The new CODEOWNERS entry for
modelopt/torch/puzzletronfollows the established naming convention and is correctly placed in alphabetical order among the othermodelopt/torchpaths.Ensure the GitHub team
@NVIDIA/modelopt-torch-puzzletron-codeownershas been created before merging this PR, otherwise code ownership rules won't apply..pre-commit-config.yaml (2)
27-38: LGTM!The exclusion paths for
ruff-checkandruff-formatare consistently updated from_compresstopuzzletron. The comment reference totest_puzzletron.pyaligns with the broader rename.
110-110: LGTM!The license header exclusion path is correctly updated to match the new
puzzletronnamespace.modelopt/torch/puzzletron/tools/post_init_sparse.py (3)
20-20: LGTM!Import path correctly updated to align with the package rename from
_compresstopuzzletron.
29-30: LGTM!Docstring normalization looks good. Note that this abstract-like method relies on subclasses (e.g.,
SparsityMethod2o4) to provide actual implementations.
99-100: LGTM!Docstring normalization consistent with the parent class method.
modelopt/torch/puzzletron/utils/utils.py (2)
217-258: LGTM!The
EmptyInitOnDeviceclass implementation is clean. The docstring reformatting maintains clarity.
24-28: Import path update is correct.The import path has been updated from
modelopt.torch._compresstomodelopt.torch.puzzletronto reflect the package rename, and all three imported classes (AttentionConfig,BlockConfig,FFNConfig) are properly defined in the target module.modelopt/torch/puzzletron/pruning/pruning_ckpts.py (4)
29-39: LGTM!Import paths correctly updated to the new
puzzletronnamespace.
153-163: LGTM!Plain strings are appropriate here since no variable interpolation is needed. Good practice to avoid unnecessary f-string overhead.
330-331: LGTM!Raw string literal is the correct choice for regex patterns containing escape sequences.
43-43: No action required: Python version is already 3.10+. The project explicitly targets Python 3.10+ via the `target-version = "py310"` configuration in `pyproject.toml`. The `int | None` union syntax (PEP 604) is natively supported at this version, so no compatibility issue exists and `from __future__ import annotations` is unnecessary.

Likely an incorrect or invalid review comment.
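As a quick illustration of the PEP 604 syntax in question (a standalone sketch, not code from this PR):

```python
def get_max_steps(config: dict) -> int | None:
    """`int | None` works natively on Python 3.10+ without `from __future__ import annotations`."""
    value = config.get("max_steps")
    return int(value) if value is not None else None


print(get_max_steps({"max_steps": "100"}))  # 100
print(get_max_steps({}))                    # None
```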
pyproject.toml (1)
83-102: LGTM!The lint ignore path correctly updated from
_compresstopuzzletronnamespace. The TODO comment appropriately reflects the ongoing migration.examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/llama-3_1-8B_pruneffn_memory.yaml (1)
11-21: LGTM!The comment updates correctly reflect the rename to puzzletron. Configuration values are unchanged.
modelopt/torch/puzzletron/utils/validation.py (2)
36-36: LGTM!The import path correctly updated from
_compresstopuzzletronnamespace.
16-20: Docstring formatting looks good.The module docstring correctly describes the validation utilities.
modelopt/torch/puzzletron/activation_scoring/activation_hooks/utils.py (1)
30-30: LGTM on import path update!The import path has been correctly updated to the new
puzzletronnamespace.modelopt/torch/puzzletron/subblock_stats/calc_subblock_params_and_memory.py (3)
31-42: Import paths correctly updated to puzzletron namespace.The imports have been properly migrated from
modelopt.torch._compress.*tomodelopt.torch.puzzletron.*, aligning with the PR objective.
119-122: Removal of explicit file mode is acceptable.The explicit
"r"mode was removed fromopen(stats_file). This is safe since"r"(read text mode) is the default foropen().
181-184: Docstring reformatting approved.Minor docstring formatting changes without semantic impact.
modelopt/torch/puzzletron/scoring/scoring.py (1)
29-33: Import paths correctly migrated to puzzletron namespace.The imports have been properly updated from
modelopt.torch._compress.tools.*tomodelopt.torch.puzzletron.tools.*. The imported utilities (register_hydra_resolvers,mprint,validate_puzzle_solutions) are available at the new locations as confirmed by the relevant code snippets.modelopt/torch/puzzletron/decilm/converters/convert_llama3_to_decilm.py (1)
25-28: Import paths correctly migrated to puzzletron namespace.The imports have been properly updated from the old
_compressnamespace tomodelopt.torch.puzzletron.*.modelopt/torch/nas/plugins/megatron_hooks/base_hooks.py (1)
29-30: Import paths correctly migrated to puzzletron namespace.The imports for
aprintandjson_dumphave been properly updated frommodelopt.torch._compress.tools.*tomodelopt.torch.puzzletron.tools.*. The cross-package dependency (nas → puzzletron) is consistent with the broader refactoring in this PR.modelopt/torch/puzzletron/utils/checkpoint_manager.py (3)
21-24: Type annotation modernization and import path updates look good.The removal of
Optionalfrom typing imports and the migration tomodelopt.torch.puzzletron.tools.loggerare consistent with the PR objectives and Python 3.10+ type annotation style.
63-64: Return type annotation correctly modernized.The return type
dict[str, Any] | Noneuses the modern union syntax, consistent with the codebase's type annotation style.
75-76: Removal of explicit file mode is acceptable.The explicit
"r"mode was removed fromopen(self.progress_file). This is safe since read text mode is the default foropen().modelopt/torch/puzzletron/tools/validate_model.py (2)
38-50: LGTM!Import paths correctly updated to the new
puzzletronnamespace. The imports align with the module's new location undermodelopt.torch.puzzletron.
158-158: LGTM!Lazy import of
ScoringCheckpointManagercorrectly updated to the puzzletron namespace.tests/_test_utils/torch/puzzletron/utils.py (3)
25-25: LGTM!Import path correctly updated to the new
puzzletronnamespace for hydra utilities.
32-32: LGTM!Docstring correctly updated to reflect the new "puzzletron" terminology.
114-114: Tokenizer resource directory is properly configured. The directorytests/_test_utils/torch/puzzletron/resources/tokenizerexists and contains all required tokenizer files (special_tokens_map.json, tokenizer.json, tokenizer_config.json, and truncate_tokenizer.py).tests/gpu/torch/puzzletron/decilm/converters/test_convert_llama3_config_to_decilm_config.py (1)
19-23: LGTM!Import paths correctly updated:
- Test utilities now imported from
_test_utils.torch.puzzletron.utils- Converter import updated to
modelopt.torch.puzzletron.decilm.convertersBoth align with the namespace refactor.
modelopt/torch/puzzletron/utils/validate_runtime_pipeline.py (2)
31-51: LGTM!All import paths correctly updated to the
puzzletronnamespace:
- DeciLM model classes from
puzzletron.decilm- Sewing kit components from
puzzletron.sewing_kit- Utility modules from
puzzletron.toolsandpuzzletron.utilsThe imports are consistent with the module structure shown in the relevant code snippets.
72-72: LGTM!Minor docstring formatting change moving content to the same line as the opening quotes.
modelopt/torch/puzzletron/activation_scoring/score_pruning_activations.py (1)
22-23: LGTM!Import paths correctly updated to the
puzzletronnamespace:
mprintfrompuzzletron.tools.loggervalidate_modelfrompuzzletron.tools.validate_modelBoth imports align with the module locations confirmed in the relevant code snippets.
setup.py (1)
105-115: LGTM! Dependency group renamed and expanded consistently.The rename from
compresstopuzzletronaligns with the broader namespace refactoring. The comment correctly references the new subpackage pathmodelopt.torch.puzzletron.modelopt/torch/puzzletron/mip/mip_with_multi_layer_replacements.py (4)
22-33: LGTM! Import modernization and namespace update.The move of
HashableandIterabletocollections.abcfollows Python 3.9+ best practices. The import path update from_compresstopuzzletronis consistent with the PR's renaming objective.
45-45: Type annotation modernization looks good.The
float | Nonesyntax is valid given the project's Python 3.10+ requirement specified insetup.py.
63-77: LGTM! Idiomatic dict iteration.Iterating directly over the dict (
for constraint_key in constraints) is equivalent to iterating over.keys()and is the preferred Pythonic style.
110-120: LGTM! Consistent dict initialization and iteration.Using
dict.fromkeys(constraints.keys(), 0)is an appropriate pattern for initializing a dict with a uniform default value. The direct iteration overconstraintsin the loop is consistent with earlier changes.modelopt/torch/puzzletron/utils/data/dataset.py (3)
17-17: LGTM! Import modernization.Moving
Sequencefromtypingtocollections.abcfollows Python 3.9+ conventions.
61-64: LGTM! Type annotation modernization.The union syntax
Sequence[str] | Noneandint | Noneis cleaner thanOptional[...]and valid for Python 3.10+.
290-291: Good simplification using `@functools.cache`. `@functools.cache` (Python 3.9+) is equivalent to `@functools.lru_cache(maxsize=None)` and is more concise. Note that this caches based on the tokenizer object's identity/hash, which is the expected behavior here.
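For illustration, a standalone sketch of the caching behavior (the function and arguments below are invented, not the dataset code):

```python
import functools


@functools.cache  # equivalent to @functools.lru_cache(maxsize=None)
def expensive_lookup(key: str) -> int:
    print(f"computing for {key!r}")  # printed only on the first call per distinct argument
    return len(key) * 1000


expensive_lookup("tokenizer-a")  # computes
expensive_lookup("tokenizer-a")  # served from the cache, nothing printed
```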
modelopt/torch/puzzletron/tools/validation_utils.py (3)

24-38: LGTM! Namespace migration and import cleanup. The import paths correctly reference the new `puzzletron` namespace. The `TYPE_CHECKING` import for `StitchedModule` properly avoids circular imports at runtime.
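For context, a minimal sketch of the `TYPE_CHECKING` pattern (the module and class names below are placeholders, not the actual puzzletron modules):

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by static type checkers, never at runtime,
    # so it cannot participate in a circular-import cycle.
    from my_package.stitching import StitchedModuleSketch  # placeholder module


def validate(module: "StitchedModuleSketch") -> bool:
    # The quoted annotation keeps the name unresolved at runtime.
    return module is not None
```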
47-50: LGTM! Function signature with modernized type hints.The
dict[str, Any] | Nonesyntax is cleaner and consistent with the codebase-wide type annotation updates.
80-84: LGTM! Consistent type annotation style.Matches the pattern used in
validate_model_and_extract_hidden_states.modelopt/torch/puzzletron/mip/run_puzzle.py (4)
23-49: LGTM! Import reorganization and namespace migration.The imports have been correctly migrated to:
collections.abcforHashableandIterable(Python 3.9+ best practice)- New
puzzletronnamespace for all internal module imports
422-430: LGTM! Modernized generic type hints.Using
list[dict]andlist[str]instead ofList[dict]andList[str]is the preferred style for Python 3.9+.
582-584: LGTM! Clean default value initialization with override. The pattern `{**dict.fromkeys(all_metric_names, 0.0), **_extract_average_metrics(raw_metrics)}` elegantly sets default values for all metrics while allowing real values to override them.
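A standalone sketch of the pattern (the metric names and values below are invented for illustration):

```python
all_metric_names = ["accuracy", "latency_ms", "kl_div"]
raw_averages = {"accuracy": 0.71}  # stands in for _extract_average_metrics(raw_metrics)

# Defaults first, measured values last, so real metrics override the 0.0 placeholders.
metrics = {**dict.fromkeys(all_metric_names, 0.0), **raw_averages}
print(metrics)  # {'accuracy': 0.71, 'latency_ms': 0.0, 'kl_div': 0.0}
```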
599-604: LGTM! Direct dict iteration.Iterating directly over
raw_metricsis equivalent toraw_metrics.keys()and is the idiomatic Python style.modelopt/torch/puzzletron/puzzletron.py (2)
16-29: Import paths and module docstring updated correctly for the namespace refactor.The imports have been properly migrated from
_compresstopuzzletronnamespace, and the module docstring is appropriately updated.
32-47: Function renamed and documentation updated appropriately.The public API rename from
compresstopuzzletronis consistent with the PR objectives. The TODO comment format change from@TODOtoTODOis a minor improvement.modelopt/torch/puzzletron/replacement_library/build_replacement_library.py (2)
15-63: Import paths and module docstring correctly updated for the namespace refactor.The imports have been properly migrated to the
puzzletronnamespace, and the usage documentation in the docstring is updated accordingly.
378-388: Type annotation modernized correctly.The return type annotation uses Python 3.10+ style
type[AttentionConfig] | type[FFNConfig]instead ofType[...]fromtyping, which is consistent with other type annotation updates in this PR.modelopt/torch/puzzletron/replacement_library/replacement_library.py (4)
15-58: Import paths and module docstring correctly updated for the namespace refactor.All imports have been properly migrated from
_compresstopuzzletronnamespace.
62-66: Type annotation modernized correctly.The parameter type annotation
dict | Noneis consistent with Python 3.10+ style used throughout this refactor.
221-225: Minor iteration simplification.Iterating directly over
state_dictinstead ofstate_dict.keys()is semantically equivalent and slightly more idiomatic.
319-322: Simplified string literal.Using a plain string
"non_block.pth"instead of an f-string is appropriate since no interpolation is needed.modelopt/torch/puzzletron/replacement_library/replacement_utils.py (2)
15-26: Import paths correctly updated for the namespace refactor.The imports have been properly migrated to the
puzzletronnamespace, including the relocatedsort_replacementsfunction frommip.utils.
46-46: Comment updated to reflect relocated function.The comment correctly documents that
sort_replacementswas moved tomodelopt.torch.puzzletron.mip.utils.modelopt/torch/puzzletron/subblock_stats/calc_subblock_stats.py (5)
22-54: Import paths correctly updated for the namespace refactor.All imports have been properly migrated to the
puzzletronnamespace. TheIterableimport fromcollections.abcis the modern approach (Python 3.9+).
56-61: Type variable and usage documentation updated.The
T_DataClassTypeVar declaration and the usage example path are updated consistently.
69-92: Type annotation modernized correctly.The
benchmark_iterations: int | Noneparameter uses the modern union syntax consistently with other files in this PR.
217-223: Docstring condensed appropriately.The single-line docstring is sufficient for this launcher function.
520-526: Minor iteration change in list comprehension.The iteration
for key in corresponding_bf16_argsis semantically equivalent to the previous form. The change is neutral.modelopt/torch/puzzletron/build_library_and_stats.py (2)
17-37: LGTM on namespace migration.The module docstring and import paths have been correctly updated to the
puzzletronnamespace. The usage example in the docstring (line 25) properly references the new module path.
40-85: LGTM on the unified workflow function.The function correctly orchestrates both steps with proper error handling and informative logging. The defensive
hasattrcheck on line 84 for the optionalmoe_stats_filenameis appropriate.modelopt/torch/puzzletron/nas/plugins/puzzletron_nas_plugin.py (4)
26-46: LGTM on import updates.All imports correctly migrated to the
puzzletronnamespace, maintaining consistency with the broader refactor.
92-138: LGTM on the convert function.The function is correctly renamed and the progress messages are consistently updated to use "Puzzletron Progress". The TODO on line 116 regarding making the converter generic is noted.
141-183: LGTM on restore function and descriptor.The
restore_puzzletron_modelfunction andPuzzletronDescriptorclass are correctly renamed with the mode name updated to"puzzletron". All property returns reference the correct Puzzletron-prefixed types.
186-219: LGTM on the searcher implementation.The
PuzzletronSearcherclass is correctly renamed with updated docstrings and progress messages. The search workflow steps (5-7/8) are properly orchestrated.tests/gpu/torch/puzzletron/nas/plugins/test_nas_convert.py (2)
23-27: LGTM on import updates.Test utility and model imports correctly updated to the
puzzletronnamespace.
46-67: LGTM on test configuration updates.The hydra config directory path and mode string are correctly updated to
puzzletron. Test assertions remain unchanged, ensuring backward compatibility of the functionality.modelopt/torch/puzzletron/utils/data/dataloaders.py (1)
16-32: LGTM on namespace migration.The docstring is appropriately concise, and the import paths are correctly updated to the
puzzletronnamespace. No functional changes to the dataloader utilities.modelopt/torch/puzzletron/dataset/prepare_dataset.py (1)
22-22: LGTM on import update.The
mprintimport is correctly updated to thepuzzletronnamespace. No functional changes to the dataset preparation logic.tests/_test_utils/torch/puzzletron/resources/configs/validate_model_defaults.yaml (1)
17-17: LGTM!The
load_dataset_fnpath is correctly updated to reference the newpuzzletronnamespace, consistent with the broader refactor.modelopt/torch/puzzletron/tools/checkpoint_utils.py (2)
33-34: LGTM!Import paths are correctly updated to the new
puzzletronnamespace, aligning with the refactor.
58-60: LGTM!The local import path is correctly updated to the
puzzletronnamespace while preserving the circular import avoidance pattern.examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/validate_model_defaults.yaml (1)
17-17: LGTM!The
load_dataset_fnpath is correctly updated to reference the newpuzzletronnamespace, consistent with other config files in this refactor.modelopt/torch/puzzletron/tools/validate_puzzle_with_multi_replacements.py (2)
34-52: LGTM!All import paths are correctly updated to the
puzzletronnamespace, maintaining proper module dependencies.
272-276: LGTM!The type annotation modernization from
Optional[str]tostr | Noneis appropriate and consistent with Python 3.10+ conventions used elsewhere in this refactor.modelopt/torch/puzzletron/mip/mip_and_realize_models.py (2)
25-29: LGTM!Import paths are correctly updated to the
puzzletronnamespace, properly importingrun_puzzle,mprint, andvalidate_puzzle_solutionsfrom their new locations.
32-34: LGTM!The return type annotation modernization from
List[str]tolist[str]uses Python 3.9+ built-in generics, which is consistent with the type annotation style used across this refactor.modelopt/torch/puzzletron/tools/checkpoint_utils_hf.py (3)
36-42: LGTM! Import paths correctly updated to puzzletron namespace.The imports are consistently updated from the old
_compressnamespace to the newpuzzletronnamespace, aligning with the PR objectives.
72-77: LGTM! Docstring and import correctly updated.The import path is properly updated to reference the new
puzzletron.tools.checkpoint_utilsmodule.
193-196: LGTM! Consistent import path update.Same import path update applied consistently within
split_checkpoint_to_subblocks.modelopt/torch/puzzletron/tools/sharded_checkpoint_utils.py (2)
38-46: LGTM! Import paths correctly migrated to puzzletron namespace.All imports are consistently updated:
configuration_decilmandmodeling_decilmfrom puzzletron.decilmcheckpoint_utilsandloggerfrom puzzletron.toolsEmptyInitOnDevicefrom puzzletron.utils.utils
243-246: LGTM! Sewing kit import path updated.The import path for distributed utilities is correctly updated to the new puzzletron namespace.
tests/gpu/torch/puzzletron/nas/plugins/test_nas_search.py (2)
22-26: LGTM! Imports correctly updated for puzzletron namespace.The test utilities and model class imports are properly migrated from
_compresstopuzzletron.
45-66: LGTM! Test configuration correctly updated.The hydra config directory path and mode string are properly updated to align with the puzzletron namespace migration. The
PuzzletronModelis correctly instantiated, the mode is set to"puzzletron", and the referenced config fileLlama-3_1-8B-ffn-pruning.yamlexists at the specified path.examples/puzzletron/README.md (4)
1-3: LGTM! Documentation title and description correctly updated.The README properly reflects the rename from "compress" to "puzzletron" while maintaining the reference to the Puzzle paper.
19-19: LGTM! Installation dependency correctly updated.The pip install command correctly references the new
[puzzletron]extra instead of[compress].
37-37: LGTM! Dataset preparation command updated.The module path is correctly updated to
modelopt.torch.puzzletron.dataset.prepare_dataset.
57-70: LGTM! CLI commands and progress labels consistently updated.All example commands and progress log output correctly reference the new
puzzletronnamespace and entry points.modelopt/torch/puzzletron/tools/bypassed_training/init_child_from_parent.py (4)
25-43: LGTM! Import paths correctly migrated to puzzletron namespace.All imports are consistently updated to reference the new
puzzletronmodule paths.
88-88: LGTM! CLI usage example correctly updated.The module invocation path in the usage docstring is properly updated to
modelopt.torch.puzzletron.tools.bypassed_training.init_child_from_parent.
103-108: LGTM! Type annotations modernized to PEP 604 union syntax.The nullable type annotations are correctly updated from
Optional[T]toT | Nonesyntax.
210-219: LGTM! Profiling summary output formatting is clean.The profiling summary formatting changes are minor and improve readability.
tests/gpu/torch/puzzletron/test_puzzletron.py (3)
22-28: LGTM!Import paths correctly updated to the new
puzzletronnamespace. The imports align with the module structure changes in the PR.
36-46: LGTM!Function renames from
test_compresstotest_puzzletronand_test_compress_multiprocess_jobto_test_puzzletron_multiprocess_jobare consistent with the PR objectives.
52-66: LGTM!Config path correctly updated to
tests/_test_utils/torch/puzzletron/resources/configsand the API call updated topuzzletron.puzzletron(...). The changes are consistent with the module rename.examples/puzzletron/main.py (3)
36-43: LGTM!Import paths correctly updated to the
puzzletronnamespace. The imports are consistent with the module restructuring.
91-106: LGTM!Model class updated to
PuzzletronModeland mode string correctly changed from"compress"to"puzzletron". The conversion flow aligns with the renamed NAS plugin.
155-162: LGTM!The
main()function correctly callsrun_full_puzzletronin the non-MIP path.modelopt/torch/puzzletron/tools/bypassed_training/child_init.py (9)
25-30: LGTM on import modernization.Using
Callablefromcollections.abcinstead oftypingis the modern approach (recommended since Python 3.9). The simplified typing import is cleaner.
35-42: LGTM!Import paths correctly updated to the
puzzletronnamespace, consistent with the module restructuring.
122-128: LGTM on idiomatic dict membership checks.Removing
.keys()frominchecks is idiomatic Python.key in dictis equivalent tokey in dict.keys()but more concise.
183-185: LGTM on dict iteration simplification.Iterating directly over the dict (
for new_key in new_state_dict) instead of.keys()is the preferred Python idiom.
334-339: Type hints consistently modernized.The function signature uses modern union syntax throughout. This is consistent with the rest of the file.
1113-1116: LGTM on dict iteration simplification.Iterating directly over
bias_sdinstead ofbias_sd.keys()is the preferred Python idiom.
1366-1371: LGTM on type hints and docstring update.The modernized type hints and updated docstring are consistent with the rest of the file's changes.
1573-1580: LGTM!The function signature and docstring are consistent with the type modernization throughout the file.
96-103: No action needed; union syntax is compliant with project requirements.The project explicitly targets Python 3.10+ (as specified in
pyproject.tomlwithtarget-version = "py310"and confirmed by CI testing on Python 3.10). TheT | Noneunion syntax is fully supported at this version and is appropriately used throughout the codebase.Likely an incorrect or invalid review comment.
861c9fe to c95e2f2
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```diff
@@          Coverage Diff           @@
##    feature/compress   #776   +/- ##
=======================================
  Coverage      74.62%   74.62%
=======================================
  Files            192      192
  Lines          18989    18989
=======================================
  Hits           14171    14171
  Misses          4818     4818
```

☔ View full report in Codecov by Sentry.
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
c95e2f2 to 8fe2496
danielkorzekwa left a comment
a few minor comments
```diff
@@ -312,7 +312,7 @@ def _process_single_layer(
 ]:
     key_possibly_missing_in_student = f".{layer_idx}.{key_possibly_missing_in_student}"
     is_key_missing_from_student = (
         len([k for k in new_state_dict.keys() if key_possibly_missing_in_student in k]) == 0
```
Note: be careful when removing `.keys()` (by Cursor AI); one time Cursor AI removed `.keys()` for me where it should not have.
This is not by Cursor but by ruff. It only applies changes it is sure of; unsafe fixes are disabled in ruff.
I actually meant that ruff removed some `.keys()`, assuming that a variable was a dictionary when it was not; that broke the puzzletron e2e test for me, so I then carefully reviewed all the changes ruff was making.
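To illustrate the failure mode described here (a generic sketch, not the actual code ruff changed): dropping `.keys()` is only safe when the object really is a dict; for objects whose membership check is defined differently, `k in obj` and `k in obj.keys()` can disagree.

```python
class ScoreStore:
    """Not a dict: membership checks values, while .keys() lists names."""

    def __init__(self, data: dict):
        self._data = data

    def keys(self):
        return self._data.keys()

    def __contains__(self, item):
        return item in self._data.values()


store = ScoreStore({"ffn": 0.9})
print("ffn" in store.keys())  # True
print("ffn" in store)         # False -- dropping .keys() silently changes the result
```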
```python
@dataclass
class ExternalTarget(TargetWithNamedInputs, TargetWithNamedOutputs, metaclass=Singleton):
    """External target for stitched modules."""
```
Meaningless comment, improve it.
Needed to add something, else doc building was failing. Please suggest what I should change this to.
let's keep as is, I would need to learn more about sewingkit
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
What does this PR do?
Renames `modelopt.torch._compress` to `modelopt.torch.puzzletron` (along with the corresponding `_compress` test and config paths).

Summary by CodeRabbit
New Features
Bug Fixes & Improvements
Documentation
Chores