Skip to content

feat(pt_expt): add dp freeze support and dp test tests for .pte models#5302

Open
wanghan-iapcm wants to merge 1 commit intodeepmodeling:masterfrom
wanghan-iapcm:feat-pt-expt-dpfrztest
Open

feat(pt_expt): add dp freeze support and dp test tests for .pte models#5302
wanghan-iapcm wants to merge 1 commit intodeepmodeling:masterfrom
wanghan-iapcm:feat-pt-expt-dpfrztest

Conversation

@wanghan-iapcm
Copy link
Collaborator

@wanghan-iapcm wanghan-iapcm commented Mar 10, 2026

Summary

  • Add dp freeze support for the pt_expt backend, enabling checkpoint .pt → exported .pte conversion
  • Add end-to-end tests for both dp freeze and dp test with .pte models

Background

The pt_expt backend can export models to .pte via deserialize_to_file(), and dp test can already load .pte models through the registered DeepEval. However, dp freeze was not
wired up — calling dp freeze -b pt-expt hit RuntimeError: Unsupported command 'freeze'.

Changes

deepmd/pt_expt/entrypoints/main.py

  • Add freeze() function: loads .pt checkpoint → reconstructs model via get_model + ModelWrapper → serializes → exports to .pte via deserialize_to_file
  • Wire freeze command in main() dispatcher with checkpoint directory resolution and .pte default suffix

source/tests/pt_expt/test_dp_freeze.py (new)

  • test_freeze_pte — verify .pte file is created from checkpoint
  • test_freeze_main_dispatcher — test main() CLI dispatcher with freeze command
  • test_freeze_default_suffix — verify non-.pte output suffix is corrected to .pte

source/tests/pt_expt/test_dp_test.py (new)

  • test_dp_test_system — test dp test with -s system path, verify .e.out, .f.out, .v.out outputs
  • test_dp_test_input_json — test dp test with --valid-data JSON input

Test plan

  • python -m pytest source/tests/pt_expt/test_dp_freeze.py -v (3 passed)
  • python -m pytest source/tests/pt_expt/test_dp_test.py -v (2 passed)

Summary by CodeRabbit

  • New Features

    • Added a "freeze" CLI command to serialize PyTorch models to portable .pte format files.
    • Automatic output suffix handling with default filename generation.
  • Tests

    • Added comprehensive test coverage for model freezing functionality.
    • Added tests for the dp_test command integration with frozen models.

Add freeze() function to pt_expt backend that loads a .pt checkpoint,
reconstructs the model, serializes it, and exports to .pte via
deserialize_to_file. Wire the freeze command in the main() CLI dispatcher.

Add separate test files for dp freeze (test_dp_freeze.py) and dp test
(test_dp_test.py) verifying the full freeze-then-test pipeline works
end-to-end with .pte models.
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 35745c127e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +250 to +253
if Path(FLAGS.checkpoint_folder).is_dir():
checkpoint_path = Path(FLAGS.checkpoint_folder)
latest_ckpt_file = (checkpoint_path / "checkpoint").read_text()
FLAGS.model = str(checkpoint_path.joinpath(latest_ckpt_file))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Support pt_expt checkpoint directory layout in freeze

When dp freeze is run with a directory (-c default is .), this branch unconditionally reads <dir>/checkpoint, but pt_expt training writes checkpoints as <save_ckpt>-<step>.pt plus a <save_ckpt>.pt symlink and does not create a checkpoint file (see deepmd/pt_expt/train/training.py::save_checkpoint). As a result, the normal post-training workflow dp --pt-expt freeze fails with FileNotFoundError before freeze() is called unless users manually pass a .pt file path.

Useful? React with 👍 / 👎.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 10, 2026

📝 Walkthrough

Walkthrough

A new "freeze" CLI command is added to the pt_expt backend for serializing PyTorch model checkpoints to .pte format. The implementation loads model checkpoints, validates multi-task compatibility, builds models, and serializes to frozen output files. Comprehensive tests validate freeze functionality and dp_test command integration with frozen models.

Changes

Cohort / File(s) Summary
pt_expt freeze implementation
deepmd/pt_expt/entrypoints/main.py
Adds freeze() function that loads PyTorch checkpoints, validates multi-task support, builds models via get_model(), wraps with ModelWrapper, and serializes to .pte files. CLI integration in main() resolves checkpoint paths and enforces .pte output suffix.
pt_expt freeze tests
source/tests/pt_expt/test_dp_freeze.py
New test module with TestDPFreezePtExpt class validating freeze functionality: .pte file generation, CLI dispatcher invocation with command="freeze", and default .pte suffix handling when output path ends with .pth.
pt_expt test command tests
source/tests/pt_expt/test_dp_test.py
New test module with TestDPTestPtExpt class testing dp_test entrypoint against frozen .pte models, validating output files (.e.out, .f.out, .v.out) and JSON configuration-based test execution.

Sequence Diagram

sequenceDiagram
    participant CLI as main()
    participant Freeze as freeze()
    participant Model as ModelWrapper
    participant File as Output File
    
    CLI->>Freeze: freeze(checkpoint_path, output_file, head)
    Freeze->>Freeze: Resolve & validate checkpoint path
    Freeze->>Freeze: Load checkpoint state_dict
    Freeze->>Model: get_model() → build model
    Model->>Model: Initialize model parameters
    Freeze->>Model: ModelWrapper(model) + load state_dict
    Model->>Model: Serialize to .pte format
    Freeze->>File: deserialize_to_file() write
    File->>File: Save frozen model
    Freeze->>CLI: Log success message
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

Python, enhancement

Suggested reviewers

  • njzjz
  • iProzd
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(pt_expt): add dp freeze support and dp test tests for .pte models' directly and concisely describes the main changes: adding dp freeze support and dp test tests for .pte models in the pt_expt backend.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
source/tests/pt_expt/test_dp_test.py (1)

27-45: Consider extracting shared test fixtures.

The model_se_e2_a configuration and checkpoint creation pattern are duplicated in test_dp_freeze.py. Consider extracting these to a shared conftest or helper module to reduce duplication.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/pt_expt/test_dp_test.py` around lines 27 - 45, The model
configuration dictionary model_se_e2_a and the checkpoint creation logic
duplicated between test_dp_test.py and test_dp_freeze.py should be extracted
into a shared pytest fixture or helper function (e.g., in conftest.py or a
test_helpers module); create a fixture named model_se_e2_a that returns the dict
and a helper fixture/function (e.g., make_checkpoint or checkpoint_fixture) that
encapsulates the checkpoint creation pattern, then update both tests to accept
those fixtures instead of redefining the dict/checkpoint code so duplication is
removed and maintenance is centralized.
deepmd/pt_expt/entrypoints/main.py (1)

256-257: Minor: .pt2 suffix check might be undocumented.

The code accepts both .pte and .pt2 suffixes, but the docstring and default only mention .pte. Consider documenting .pt2 if it's intentionally supported, or remove it if not needed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/pt_expt/entrypoints/main.py` around lines 256 - 257, The FLAGS.output
handling accepts both ".pte" and ".pt2" but only ".pte" is documented; decide
whether ".pt2" is intentional and then update code accordingly: if intended, add
".pt2" to the module/docstring and the FLAGS.output help/default text (where
FLAGS is defined) and update any docs/tests to mention ".pt2"; otherwise remove
".pt2" from the tuple in the conditional so FLAGS.output only normalizes to
".pte". Ensure changes reference FLAGS.output and the suffix check in main.py.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/pt_expt/entrypoints/main.py`:
- Around line 250-253: The code reading the checkpoint name using
(checkpoint_path / "checkpoint").read_text() assigns latest_ckpt_file with
possible trailing newline/whitespace which breaks FLAGS.model path construction;
update the read to strip whitespace (e.g., call .strip() on the result) before
using checkpoint_path.joinpath and set FLAGS.model =
str(checkpoint_path.joinpath(latest_ckpt_file.strip())), ensuring you reference
FLAGS.checkpoint_folder, checkpoint_path, latest_ckpt_file and FLAGS.model when
making the change.

---

Nitpick comments:
In `@deepmd/pt_expt/entrypoints/main.py`:
- Around line 256-257: The FLAGS.output handling accepts both ".pte" and ".pt2"
but only ".pte" is documented; decide whether ".pt2" is intentional and then
update code accordingly: if intended, add ".pt2" to the module/docstring and the
FLAGS.output help/default text (where FLAGS is defined) and update any
docs/tests to mention ".pt2"; otherwise remove ".pt2" from the tuple in the
conditional so FLAGS.output only normalizes to ".pte". Ensure changes reference
FLAGS.output and the suffix check in main.py.

In `@source/tests/pt_expt/test_dp_test.py`:
- Around line 27-45: The model configuration dictionary model_se_e2_a and the
checkpoint creation logic duplicated between test_dp_test.py and
test_dp_freeze.py should be extracted into a shared pytest fixture or helper
function (e.g., in conftest.py or a test_helpers module); create a fixture named
model_se_e2_a that returns the dict and a helper fixture/function (e.g.,
make_checkpoint or checkpoint_fixture) that encapsulates the checkpoint creation
pattern, then update both tests to accept those fixtures instead of redefining
the dict/checkpoint code so duplication is removed and maintenance is
centralized.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: e8b4749c-4b84-4f85-a10c-9f671ace384a

📥 Commits

Reviewing files that changed from the base of the PR and between 24e54bf and 35745c1.

📒 Files selected for processing (3)
  • deepmd/pt_expt/entrypoints/main.py
  • source/tests/pt_expt/test_dp_freeze.py
  • source/tests/pt_expt/test_dp_test.py

Comment on lines +250 to +253
if Path(FLAGS.checkpoint_folder).is_dir():
checkpoint_path = Path(FLAGS.checkpoint_folder)
latest_ckpt_file = (checkpoint_path / "checkpoint").read_text()
FLAGS.model = str(checkpoint_path.joinpath(latest_ckpt_file))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Strip whitespace from checkpoint filename to avoid path errors.

read_text() preserves trailing newlines. If the checkpoint file contains "model-100.pt\n", the constructed path will be invalid.

Proposed fix
     if Path(FLAGS.checkpoint_folder).is_dir():
         checkpoint_path = Path(FLAGS.checkpoint_folder)
-        latest_ckpt_file = (checkpoint_path / "checkpoint").read_text()
+        latest_ckpt_file = (checkpoint_path / "checkpoint").read_text().strip()
         FLAGS.model = str(checkpoint_path.joinpath(latest_ckpt_file))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/pt_expt/entrypoints/main.py` around lines 250 - 253, The code reading
the checkpoint name using (checkpoint_path / "checkpoint").read_text() assigns
latest_ckpt_file with possible trailing newline/whitespace which breaks
FLAGS.model path construction; update the read to strip whitespace (e.g., call
.strip() on the result) before using checkpoint_path.joinpath and set
FLAGS.model = str(checkpoint_path.joinpath(latest_ckpt_file.strip())), ensuring
you reference FLAGS.checkpoint_folder, checkpoint_path, latest_ckpt_file and
FLAGS.model when making the change.

@codecov
Copy link

codecov bot commented Mar 10, 2026

Codecov Report

❌ Patch coverage is 85.71429% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.33%. Comparing base (24e54bf) to head (35745c1).
⚠️ Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
deepmd/pt_expt/entrypoints/main.py 85.71% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5302      +/-   ##
==========================================
+ Coverage   82.32%   82.33%   +0.01%     
==========================================
  Files         768      768              
  Lines       77098    77125      +27     
  Branches     3659     3660       +1     
==========================================
+ Hits        63469    63500      +31     
+ Misses      12458    12453       -5     
- Partials     1171     1172       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant