feat(pt_expt): implement .pte inference pipeline with dynamic shapes #5284
wanghan-iapcm merged 16 commits into deepmodeling:master from
Conversation
Implement the full pt_expt inference pipeline: serialize models to .pte files via torch.export, and load them for inference via DeepPot/DeepEval.

Key changes:

- Add DeepEval backend for .pte files (deepmd/pt_expt/infer/deep_eval.py)
- Add serialize/deserialize hooks (deepmd/pt_expt/utils/serialization.py)
- Wire up backend hooks in deepmd/backend/pt_expt.py
- Add forward_common_lower_exportable using make_fx + torch.export
- Support dynamic nframes, nloc, and nall dimensions
- Fix atomic_virial_corr to use explicit loop instead of vmap

Add xp_take_first_n helper to avoid torch.export contiguity guards on `[:, :nloc]` slices. When torch.export traces `tensor[:, :nloc]` on a tensor of size nall, it records a `Ne(nall, nloc)` guard from the view's contiguity check, which fails when nall == nloc (no PBC). Using torch.index_select instead creates a new tensor, avoiding the guard.
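A minimal standalone sketch of the slicing workaround described above; the helper name and signature here are illustrative, not the exact `xp_take_first_n` from the PR:

```python
import torch


def take_first_n(t: torch.Tensor, n: int, dim: int = 1) -> torch.Tensor:
    """Select the first n entries along `dim` without creating a view.

    torch.index_select copies into a new tensor, so torch.export does not
    record the contiguity guard that a `[:, :n]` view would add.
    """
    idx = torch.arange(n, device=t.device)
    return torch.index_select(t, dim, idx)


ext = torch.zeros(2, 8, 3)   # (nframes, nall, 3)
loc = take_first_n(ext, 5)   # (nframes, nloc, 3)
same = take_first_n(ext, 8)  # also fine when nall == nloc (no PBC)
print(loc.shape, same.shape)
```

Because `index_select` materializes a fresh contiguous tensor, the traced graph carries no assumption that `nall != nloc`.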
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Adds a PyTorch-exportable inference backend and supporting utilities: implements DeepEval evaluator, NeighborStat operator, serialization/deserialization for .pte exports, torch-aware array helpers and export tracing changes; replaces direct slicing with xp_take_first_n in multiple modules and adds comprehensive unit tests for pt_expt inference and serialization.
Sequence Diagram

```mermaid
sequenceDiagram
    participant User as User
    participant DeepEval as DeepEval
    participant ExportedModule as ExportedModule
    participant NeighborList as NeighborList
    participant DataProc as DataProcessor
    User->>DeepEval: __init__(model_file, output_def)
    DeepEval->>ExportedModule: load .pte (module + metadata)
    ExportedModule-->>DeepEval: traced module + metadata
    DeepEval->>DeepEval: reconstruct ModelOutputDef
    User->>DeepEval: eval(coords, cells, atom_types, ...)
    DeepEval->>DataProc: determine nframes & natoms
    DataProc-->>DeepEval: nframes, natoms
    DeepEval->>NeighborList: build neighbor list (native or ASE)
    NeighborList-->>DeepEval: nlist, mapping
    DeepEval->>DataProc: extend coords/types, prepare tensors
    DataProc-->>DeepEval: extended_coord, extended_atype, fparam, aparam
    DeepEval->>ExportedModule: forward_common_lower(extended_coord, extended_atype, nlist, mapping, fparam, aparam)
    ExportedModule-->>DeepEval: internal output dict
    DeepEval->>DataProc: map & reshape outputs per ModelOutputDef
    DataProc-->>DeepEval: formatted numpy results
    DeepEval-->>User: return results (energy, force, virial, ...)
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 6
🧹 Nitpick comments (2)
deepmd/pt_expt/model/transform_output.py (1)
31-40: Trim autograd graph retention in the component loop. You only need to retain the graph for the first two gradient calls; dropping it on the last iteration reduces peak memory.
♻️ Proposed adjustment
```diff
 for i in range(3):
     grad_output = torch.zeros_like(sumce)
     grad_output[:, i] = 1.0
     result = torch.autograd.grad(
         [sumce],
         [extended_coord],
         grad_outputs=[grad_output],
         create_graph=False,
-        retain_graph=True,
+        retain_graph=i < 2,
     )[0]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/pt_expt/model/transform_output.py` around lines 31 - 40, The loop that computes gradients for each component currently sets retain_graph=True for all iterations, causing unnecessary memory retention; update the torch.autograd.grad call inside the for i in range(3) loop to set retain_graph=True only for the first two iterations (e.g., retain_graph=(i < 2)) while leaving create_graph=False, so the autograd graph is freed on the last gradient computation; refer to the variables sumce and extended_coord and the torch.autograd.grad call to locate and change the retain_graph condition.

deepmd/backend/pt_expt.py (1)
94-98: Align `NeighborStat` typing import with the runtime import path. Line 25 type-checks `NeighborStat` from `deepmd.utils.neighbor_stat`, while this property returns the class from `deepmd.dpmodel.utils.neighbor_stat`. Keeping both paths consistent will avoid typing drift.

♻️ Proposed typing alignment

```diff
 if TYPE_CHECKING:
@@
-    from deepmd.utils.neighbor_stat import (
+    from deepmd.dpmodel.utils.neighbor_stat import (
         NeighborStat,
     )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/backend/pt_expt.py` around lines 94 - 98, The typing import for NeighborStat is from deepmd.utils.neighbor_stat but at runtime pt_expt.py returns the class from deepmd.dpmodel.utils.neighbor_stat; update the type annotation import to the same module path (deepmd.dpmodel.utils.neighbor_stat) so the typing and runtime import for NeighborStat match (adjust the import statement that declares NeighborStat used in annotations to reference deepmd.dpmodel.utils.neighbor_stat).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@deepmd/pt_expt/infer/deep_eval.py`:
- Around line 107-114: The constructor for the class accepts neighbor_list but
never stores or uses it; fix by storing the passed neighbor_list (e.g.,
self._neighbor_list = neighbor_list) in __init__ and then update _eval_model
(and the other occurrence around lines 350-357) to use the stored neighbor list
when present instead of always calling build_neighbor_list(...); if
self._neighbor_list is None, fall back to the existing build_neighbor_list(...)
call so existing behavior remains unchanged.
- Around line 230-238: The code assumes fparam/aparam already have shapes
(nframes, dim_fparam) and (nframes, natoms, dim_aparam) but the docstring allows
broadcast forms (dim_fparam,), (natoms, dim_aparam) and (dim_aparam,), so before
converting/reshaping to tensors normalize inputs: detect these cases and expand
them to full shape using np.broadcast_to or repeat (for fparam: if fparam.shape
== (dim_fparam,) -> broadcast to (nframes, dim_fparam); if fparam.shape ==
(nframes, dim_fparam) keep; for aparam: if aparam.shape == (dim_aparam,) ->
broadcast to (nframes, natoms, dim_aparam); if aparam.shape == (natoms,
dim_aparam) -> broadcast to (nframes, natoms, dim_aparam); if already (nframes,
natoms, dim_aparam) keep). Apply the same normalization in both places mentioned
(around the fparam/aparam reshape logic at the blocks near lines 230-238 and
370-382) before any tensor conversion so downstream reshape operations succeed.
- Around line 111-114: The signature currently includes unused parameters *args
and **kwargs which trigger Ruff ARG002; to fix, rename them to *_args and
**_kwargs (or prefix the specific unused named param with an underscore) in the
function/method signature so they remain available for interface compatibility
without lint failures (e.g., change "*args: Any" -> "*_args: Any" and "**kwargs:
Any" -> "**_kwargs: Any"); apply the same change for the second occurrence
mentioned (around the other signature at line ~210) and run ruff check . and
ruff format . before committing.
- Line 413: The tensor conversion at model_predict[odef.name].detach().numpy()
is unsafe when tensors live on CUDA; update the code that builds the output to
move the detached tensor to CPU before calling numpy (i.e., use
detach().cpu().numpy() when converting and reshaping), referencing the
model_predict[odef.name] expression and the DEVICE usage that may place tensors
on CUDA.
In `@deepmd/pt_expt/utils/serialization.py`:
- Around line 137-144: The function _build_dynamic_shapes currently declares
parameters ext_coord, ext_atype, nlist, and mapping but does not use them (Ruff
ARG001); rename those four parameters to start with an underscore (e.g.,
_ext_coord, _ext_atype, _nlist, _mapping) in the _build_dynamic_shapes signature
to mark them as intentionally unused (leave fparam and aparam as-is), update any
internal references if present, and run ruff check . and ruff format . before
committing to keep CI green.
- Around line 23-31: The serializer lambda in _numpy_to_json_serializable
currently only handles np.ndarray and will raise when encountering torch.Tensor;
update the converter used by json.dumps (the function named
_numpy_to_json_serializable) to detect torch.Tensor and convert it to a numpy
array (e.g., via tensor.detach().cpu().numpy()) before producing the same dict
structure (with "@class":"np.ndarray", "@is_variable":True, "dtype", "value") so
torch tensors serialize identically to numpy arrays; ensure you import torch
conditionally or guard if torch isn't installed to avoid import errors.
---
Nitpick comments:
In `@deepmd/backend/pt_expt.py`:
- Around line 94-98: The typing import for NeighborStat is from
deepmd.utils.neighbor_stat but at runtime pt_expt.py returns the class from
deepmd.dpmodel.utils.neighbor_stat; update the type annotation import to the
same module path (deepmd.dpmodel.utils.neighbor_stat) so the typing and runtime
import for NeighborStat match (adjust the import statement that declares
NeighborStat used in annotations to reference
deepmd.dpmodel.utils.neighbor_stat).
In `@deepmd/pt_expt/model/transform_output.py`:
- Around line 31-40: The loop that computes gradients for each component
currently sets retain_graph=True for all iterations, causing unnecessary memory
retention; update the torch.autograd.grad call inside the for i in range(3) loop
to set retain_graph=True only for the first two iterations (e.g.,
retain_graph=(i < 2)) while leaving create_graph=False, so the autograd graph is
freed on the last gradient computation; refer to the variables sumce and
extended_coord and the torch.autograd.grad call to locate and change the
retain_graph condition.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
- deepmd/backend/pt_expt.py
- deepmd/dpmodel/array_api.py
- deepmd/dpmodel/atomic_model/base_atomic_model.py
- deepmd/dpmodel/atomic_model/dp_atomic_model.py
- deepmd/dpmodel/utils/env_mat.py
- deepmd/pt_expt/infer/__init__.py
- deepmd/pt_expt/infer/deep_eval.py
- deepmd/pt_expt/model/ener_model.py
- deepmd/pt_expt/model/make_model.py
- deepmd/pt_expt/model/transform_output.py
- deepmd/pt_expt/utils/serialization.py
- source/tests/pt_expt/infer/__init__.py
- source/tests/pt_expt/infer/test_deep_eval.py
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a31fd18ca7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Codecov Report

❌ Patch coverage is
Additional details and impacted files

```
@@            Coverage Diff             @@
##           master    #5284      +/-   ##
==========================================
+ Coverage   82.25%   82.30%   +0.04%
==========================================
  Files         765      767       +2
  Lines       76631    76984     +353
  Branches     3659     3659
==========================================
+ Hits        63034    63360     +326
- Misses      12425    12453      +28
+ Partials     1172     1171       -1
```

☔ View full report in Codecov by Sentry.
Actionable comments posted: 1
♻️ Duplicate comments (3)
deepmd/pt_expt/infer/deep_eval.py (3)
623-623: ⚠️ Potential issue | 🔴 Critical

Move tensor to CPU before calling `.numpy()`. If `DEVICE` is CUDA, `model_predict[odef.name].detach().numpy()` raises at runtime.

Suggested fix

```diff
-        out = model_predict[odef.name].detach().numpy().reshape(shape)
+        out = (
+            model_predict[odef.name]
+            .detach()
+            .cpu()
+            .numpy()
+            .reshape(shape)
+        )
```

```bash
#!/bin/bash
# Verify unsafe tensor->numpy calls and CUDA-capable device usage in this file.
rg -n '\.detach\(\)\.numpy\(' deepmd/pt_expt/infer/deep_eval.py
rg -n 'DEVICE|device=' deepmd/pt_expt/infer/deep_eval.py
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/pt_expt/infer/deep_eval.py` at line 623, The call that converts model output to a NumPy array uses model_predict[odef.name].detach().numpy() which will fail on CUDA tensors; change the sequence to move the tensor to CPU before converting (e.g., detach() then cpu()/to('cpu') then numpy()) for the expression around model_predict[odef.name] in deep_eval.py so that out is created from a CPU tensor before reshape.
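The device-safe conversion the comment above asks for can be sketched in isolation; `pred` is a toy tensor standing in for `model_predict[odef.name]`:

```python
import torch


def to_numpy(t: torch.Tensor):
    # .cpu() is a no-op for tensors already on the CPU, so this same
    # call chain is safe whether the model ran on CPU or CUDA.
    return t.detach().cpu().numpy()


pred = torch.arange(6.0).reshape(2, 3)  # stand-in for model_predict[odef.name]
out = to_numpy(pred).reshape(3, 2)
print(out.shape)
```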
580-592: ⚠️ Potential issue | 🟠 Major

Implement documented `fparam`/`aparam` broadcast forms before tensor conversion. The docstring allows scalar/per-frame/per-atom forms, but current code only supports already-expanded shapes and can fail on valid inputs (for example `fparam.shape == (dim_fparam,)`, `aparam.shape == (natoms, dim_aparam)` with multiple frames).

Suggested fix

```diff
         if fparam is not None:
+            fparam = np.asarray(fparam)
+            dim_fparam = self.get_dim_fparam()
+            if fparam.ndim == 1:
+                fparam = np.broadcast_to(
+                    fparam.reshape(1, dim_fparam), (nframes, dim_fparam)
+                )
+            elif fparam.ndim == 2 and fparam.shape[0] == 1 and nframes > 1:
+                fparam = np.broadcast_to(fparam, (nframes, dim_fparam))
             fparam_t = torch.tensor(
                 fparam.reshape(nframes, self.get_dim_fparam()),
                 dtype=torch.float64,
                 device=DEVICE,
             )
         else:
             fparam_t = None
         if aparam is not None:
+            aparam = np.asarray(aparam)
+            dim_aparam = self.get_dim_aparam()
+            if aparam.ndim == 1:
+                aparam = np.broadcast_to(
+                    aparam.reshape(1, 1, dim_aparam), (nframes, natoms, dim_aparam)
+                )
+            elif aparam.ndim == 2:
+                aparam = np.broadcast_to(
+                    aparam.reshape(1, natoms, dim_aparam),
+                    (nframes, natoms, dim_aparam),
+                )
             aparam_t = torch.tensor(
                 aparam.reshape(nframes, natoms, self.get_dim_aparam()),
                 dtype=torch.float64,
                 device=DEVICE,
             )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/pt_expt/infer/deep_eval.py` around lines 580 - 592, The code that converts fparam/aparam to tensors should first accept and expand the documented broadcast forms (scalar, per-frame, per-atom, and their combinations) before calling torch.tensor; update the block around fparam/aparam handling (where fparam_t and aparam_t are created and where get_dim_fparam()/get_dim_aparam(), nframes and natoms are available) to detect shapes like () or (dim_fparam,), (nframes, dim_fparam) for fparam and () or (dim_aparam,), (natoms, dim_aparam), (nframes, natoms, dim_aparam) for aparam, and broadcast/reshape them (using numpy.reshape/np.broadcast_to or equivalent) into the full (nframes, dim_fparam) and (nframes, natoms, dim_aparam) shapes before converting to torch.tensor so all documented input forms work.
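A standalone sketch of the broadcast normalization described above; the helper names `expand_fparam`/`expand_aparam` are hypothetical, and only the shape cases named in the review are handled:

```python
import numpy as np


def expand_fparam(fparam, nframes: int, dim_fparam: int) -> np.ndarray:
    """Normalize fparam to shape (nframes, dim_fparam)."""
    fparam = np.asarray(fparam)
    if fparam.shape == (dim_fparam,):
        # per-model form: broadcast the same values to every frame
        fparam = np.broadcast_to(fparam, (nframes, dim_fparam))
    return fparam.reshape(nframes, dim_fparam)


def expand_aparam(aparam, nframes: int, natoms: int, dim_aparam: int) -> np.ndarray:
    """Normalize aparam to shape (nframes, natoms, dim_aparam)."""
    aparam = np.asarray(aparam)
    if aparam.shape == (dim_aparam,):
        aparam = np.broadcast_to(aparam, (nframes, natoms, dim_aparam))
    elif aparam.shape == (natoms, dim_aparam):
        aparam = np.broadcast_to(aparam, (nframes, natoms, dim_aparam))
    return aparam.reshape(nframes, natoms, dim_aparam)


f = expand_fparam(np.ones(2), nframes=3, dim_fparam=2)
a = expand_aparam(np.ones((4, 1)), nframes=3, natoms=4, dim_aparam=1)
print(f.shape, a.shape)
```

`np.broadcast_to` returns a read-only view; the trailing `reshape` copies it into a normal array that is safe to hand to `torch.tensor`.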
112-115: ⚠️ Potential issue | 🟠 Major

Resolve Ruff ARG002 by prefixing intentionally unused interface parameters. `*args`, `**kwargs` (constructor) and `**kwargs` (eval) are currently unused and flagged.

Suggested fix

```diff
-        *args: Any,
+        *_args: Any,
         auto_batch_size: bool | int | AutoBatchSize = True,
         neighbor_list: Optional["ase.neighborlist.NewPrimitiveNeighborList"] = None,
-        **kwargs: Any,
+        **_kwargs: Any,
@@
-        **kwargs: Any,
+        **_kwargs: Any,
```

As per coding guidelines: `**/*.py`: Always run `ruff check .` and `ruff format .` before committing changes or CI will fail.

Also applies to: 212-212
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/pt_expt/infer/deep_eval.py` around lines 112 - 115, Rename intentionally unused variadic parameters by prefixing them with an underscore to satisfy Ruff ARG002: change *args to *_args and **kwargs to **_kwargs in the class __init__ signature (the constructor that currently has "*args: Any, auto_batch_size: ... , **kwargs: Any") and do the same for the eval method's **kwargs parameter referenced elsewhere (the eval method flagged at the other occurrence). This preserves the interface while marking them as intentionally unused so Ruff no longer reports ARG002.
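A toy illustration of the underscore-prefix convention discussed above (the class here is a stub, not the real `DeepEval`):

```python
from typing import Any


class DeepEvalStub:
    # *_args/**_kwargs keep the call signature compatible with other
    # backends while signaling "intentionally unused", which matches
    # Ruff's default dummy-argument pattern and silences ARG002.
    def __init__(self, model_file: str, *_args: Any, **_kwargs: Any) -> None:
        self.model_file = model_file


ev = DeepEvalStub("model.pte", 1, 2, legacy_flag=True)  # extras are ignored
print(ev.model_file)
```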
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@source/tests/pt_expt/infer/test_deep_eval.py`:
- Line 388: The unpacked unused variables ext_atype_ase, mapping_ase (and
mapping_nat at the other occurrence) should be prefixed with an underscore to
satisfy Ruff RUF059; update the unpacking statements (e.g., the tuple assignment
that defines ext_coord_ase, ext_atype_ase, nlist_ase, mapping_ase) to use
_ext_atype_ase and _mapping_ase, and similarly rename mapping_nat to
_mapping_nat where it is unpacked, and then run ruff check/format to ensure no
other references remain to those names.
---
Duplicate comments:
In `@deepmd/pt_expt/infer/deep_eval.py`:
- Line 623: The call that converts model output to a NumPy array uses
model_predict[odef.name].detach().numpy() which will fail on CUDA tensors;
change the sequence to move the tensor to CPU before converting (e.g., detach()
then cpu()/to('cpu') then numpy()) for the expression around
model_predict[odef.name] in deep_eval.py so that out is created from a CPU
tensor before reshape.
- Around line 580-592: The code that converts fparam/aparam to tensors should
first accept and expand the documented broadcast forms (scalar, per-frame,
per-atom, and their combinations) before calling torch.tensor; update the block
around fparam/aparam handling (where fparam_t and aparam_t are created and where
get_dim_fparam()/get_dim_aparam(), nframes and natoms are available) to detect
shapes like () or (dim_fparam,), (nframes, dim_fparam) for fparam and () or
(dim_aparam,), (natoms, dim_aparam), (nframes, natoms, dim_aparam) for aparam,
and broadcast/reshape them (using numpy.reshape/np.broadcast_to or equivalent)
into the full (nframes, dim_fparam) and (nframes, natoms, dim_aparam) shapes
before converting to torch.tensor so all documented input forms work.
- Around line 112-115: Rename intentionally unused variadic parameters by
prefixing them with an underscore to satisfy Ruff ARG002: change *args to *_args
and **kwargs to **_kwargs in the class __init__ signature (the constructor that
currently has "*args: Any, auto_batch_size: ... , **kwargs: Any") and do the
same for the eval method's **kwargs parameter referenced elsewhere (the eval
method flagged at the other occurrence). This preserves the interface while
marking them as intentionally unused so Ruff no longer reports ARG002.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 361ad727-4d3c-42cf-80da-7f316c7dba14
📒 Files selected for processing (2)
- deepmd/pt_expt/infer/deep_eval.py
- source/tests/pt_expt/infer/test_deep_eval.py
♻️ Duplicate comments (2)
deepmd/pt_expt/infer/deep_eval.py (1)
108-116: ⚠️ Potential issue | 🟡 Minor

Prefix intentionally-unused interface parameters to satisfy Ruff ARG002. `*args`/`**kwargs` here are kept for interface compatibility but unused. Rename to underscored forms to keep lint/CI green.

🔧 Suggested diff

```diff
     def __init__(
         self,
         model_file: str,
         output_def: ModelOutputDef,
-        *args: Any,
+        *_args: Any,
         auto_batch_size: bool | int | AutoBatchSize = True,
         neighbor_list: Optional["ase.neighborlist.NewPrimitiveNeighborList"] = None,
-        **kwargs: Any,
+        **_kwargs: Any,
     ) -> None:
@@
     def eval(
@@
-        **kwargs: Any,
+        **_kwargs: Any,
     ) -> dict[str, np.ndarray]:
```

As per coding guidelines: `**/*.py`: Always run `ruff check .` and `ruff format .` before committing changes or CI will fail.

Also applies to: 204-213
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/pt_expt/infer/deep_eval.py` around lines 108 - 116, The __init__ signature in deep_eval.py uses unused variadic parameters *args and **kwargs which trigger Ruff ARG002; rename them to underscored forms (e.g., _args and _kwargs) in the DeepEval __init__ (and any other constructors or functions in this file with intentionally-unused variadic params, including the later signature around the same class) to signal they are intentionally unused and keep lint/CI green, then run ruff check/format before committing.

deepmd/pt_expt/utils/serialization.py (1)
19-33: ⚠️ Potential issue | 🟠 Major

Handle `torch.Tensor` in the JSON conversion path to avoid a `TypeError` at save time. `_numpy_to_json_serializable` currently converts only `np.ndarray`. If `data` contains `torch.Tensor` values, `json.dumps(...)` in the extra-files path will fail.

💡 Minimal fix

```diff
 def _numpy_to_json_serializable(model_obj: dict) -> dict:
     """Convert numpy arrays in a model dict to JSON-serializable lists."""
+    def _to_jsonable(x: object) -> object:
+        if isinstance(x, torch.Tensor):
+            x = x.detach().cpu().numpy()
+        if isinstance(x, np.ndarray):
+            return {
+                "@class": "np.ndarray",
+                "@is_variable": True,
+                "dtype": x.dtype.name,
+                "value": x.tolist(),
+            }
+        return x
+
     return traverse_model_dict(
         model_obj,
-        lambda x: (
-            {
-                "@class": "np.ndarray",
-                "@is_variable": True,
-                "dtype": x.dtype.name,
-                "value": x.tolist(),
-            }
-            if isinstance(x, np.ndarray)
-            else x
-        ),
+        _to_jsonable,
     )
```

Based on learnings: In the deepmd-kit PyTorch backend, modifier serialization uses `pickle.dumps`/`pickle.loads` because serialized dictionaries can contain `torch.Tensor` objects that are not JSON-serializable.

Also applies to: 285-290
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/pt_expt/utils/serialization.py` around lines 19 - 33, The JSON conversion currently only handles np.ndarray in _numpy_to_json_serializable called via traverse_model_dict; extend it to also detect torch.Tensor (use isinstance(x, torch.Tensor)), convert tensors to CPU numpy with x.detach().cpu().numpy(), and then serialize exactly like np.ndarray (include "@class", "@is_variable", "dtype" from the numpy array and "value" as a list) to keep the JSON path valid; ensure torch is imported and mirror the same change for the other JSON-conversion helper in this file that performs the same array-to-serializable transformation.
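The converter can be exercised standalone; this sketch mirrors the `_to_jsonable` shape from the suggested diff but is a simplified, self-contained version rather than the project's actual helper:

```python
import json

import numpy as np
import torch


def _to_jsonable(x):
    # Normalize torch tensors to numpy first, then serialize both
    # kinds of arrays with the same dict structure.
    if isinstance(x, torch.Tensor):
        x = x.detach().cpu().numpy()
    if isinstance(x, np.ndarray):
        return {
            "@class": "np.ndarray",
            "@is_variable": True,
            "dtype": x.dtype.name,
            "value": x.tolist(),
        }
    return x


entry = _to_jsonable(torch.ones(2, dtype=torch.float64))
print(json.dumps(entry))  # now valid JSON instead of a TypeError
```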
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@deepmd/pt_expt/infer/deep_eval.py`:
- Around line 108-116: The __init__ signature in deep_eval.py uses unused
variadic parameters *args and **kwargs which trigger Ruff ARG002; rename them to
underscored forms (e.g., _args and _kwargs) in the DeepEval __init__ (and any
other constructors or functions in this file with intentionally-unused variadic
params, including the later signature around the same class) to signal they are
intentionally unused and keep lint/CI green, then run ruff check/format before
committing.
In `@deepmd/pt_expt/utils/serialization.py`:
- Around line 19-33: The JSON conversion currently only handles np.ndarray in
_numpy_to_json_serializable called via traverse_model_dict; extend it to also
detect torch.Tensor (use isinstance(x, torch.Tensor)), convert tensors to CPU
numpy with x.detach().cpu().numpy(), and then serialize exactly like np.ndarray
(include "@class", "@is_variable", "dtype" from the numpy array and "value" as a
list) to keep the JSON path valid; ensure torch is imported and mirror the same
change for the other JSON-conversion helper in this file that performs the same
array-to-serializable transformation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 6de0d57e-40fe-4df7-aec7-e82b0d135e17
📒 Files selected for processing (2)
- deepmd/pt_expt/infer/deep_eval.py
- deepmd/pt_expt/utils/serialization.py
…expt.utils.neighbor_stat (ported from PR deepmodeling#5270)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@deepmd/pt_expt/utils/neighbor_stat.py`:
- Around line 82-84: The NumPy-to-torch conversions in neighbor_stat.py
(torch.from_numpy(coord), torch.from_numpy(atype), torch.from_numpy(cell)) must
set explicit dtypes to avoid accidental fp64 on GPU; convert coord and cell to
the project's float dtype by passing GLOBAL_PT_FLOAT_PRECISION (or using
.to(DEVICE, dtype=GLOBAL_PT_FLOAT_PRECISION)) and convert atype to torch.long
(dtype=torch.long) before moving to DEVICE, ensuring all three use the DEVICE
constant and follow project conventions.
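A sketch of the explicit-dtype conversions requested above; `GLOBAL_PT_FLOAT_PRECISION` and `DEVICE` are stubbed here rather than imported from the project:

```python
import numpy as np
import torch

# Stand-ins for the project's constants.
GLOBAL_PT_FLOAT_PRECISION = torch.float64
DEVICE = torch.device("cpu")

coord = np.random.rand(5, 3).astype(np.float32)  # whatever dtype arrives
atype = np.zeros(5, dtype=np.int32)

# Explicit dtypes: float data follows the project precision, and the
# integer type array becomes torch.long regardless of its numpy dtype.
coord_t = torch.from_numpy(coord).to(DEVICE, dtype=GLOBAL_PT_FLOAT_PRECISION)
atype_t = torch.from_numpy(atype).to(DEVICE, dtype=torch.long)
print(coord_t.dtype, atype_t.dtype)
```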
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 8b0b8d09-e96d-4a83-a57a-f5e2f9c647bc
📒 Files selected for processing (2)
- deepmd/backend/pt_expt.py
- deepmd/pt_expt/utils/neighbor_stat.py
🚧 Files skipped from review as they are similar to previous changes (1)
- deepmd/backend/pt_expt.py
🧹 Nitpick comments (2)
deepmd/pt_expt/infer/deep_eval.py (2)
623-626: Unnecessary `np.abs(shape)` in fallback path. Shape values should always be positive. Using `np.abs(shape)` could mask bugs if negative dimensions somehow appear. Consider removing the `np.abs()` call or adding an assertion to catch invalid shapes explicitly.

🔧 Suggested fix

```diff
             else:
                 shape = self._get_output_shape(odef, nframes, natoms)
-                results.append(
-                    np.full(np.abs(shape), np.nan, dtype=GLOBAL_NP_FLOAT_PRECISION)
-                )
+                results.append(
+                    np.full(shape, np.nan, dtype=GLOBAL_NP_FLOAT_PRECISION)
+                )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/pt_expt/infer/deep_eval.py` around lines 623 - 626, The fallback fills results with arrays using np.abs(shape) which can mask invalid negative dimensions; change this to use shape directly and add an explicit validation: after calling self._get_output_shape(odef, nframes, natoms) validate that all dimensions are non-negative (e.g., assert or raise a ValueError) and then call np.full with the validated shape (reference symbols: self._get_output_shape, shape, results, GLOBAL_NP_FLOAT_PRECISION). Ensure the assertion/error message clearly indicates which output shape and odef caused the invalid value.
343-356: Consider using torch-based neighbor list construction. The current design builds neighbor lists using numpy-based dpmodel utilities, then converts to torch tensors. This works correctly, but as previously noted, using PyTorch-native neighbor list construction could:
- Avoid the numpy→torch conversion overhead
- Keep the entire inference pipeline on the same device (GPU)
If performance is a concern, consider if there's a torch-based equivalent for `extend_coord_with_ghosts` and `build_neighbor_list`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/pt_expt/infer/deep_eval.py` around lines 343 - 356, The current code calls extend_coord_with_ghosts and build_neighbor_list (which are numpy-based) and then converts outputs to torch, causing overhead and CPU/GPU device transfers; replace these calls with PyTorch-native implementations or utilities so neighbor list construction runs on torch tensors on the target device: locate uses of extend_coord_with_ghosts, build_neighbor_list, coord_normalized, atom_types, cells, rcut and implement or call a torch-based extend_coord_with_ghosts + build_neighbor_list that accepts torch tensors (keeping distinguish_types semantics), returns torch tensors for extended_coord/extended_atype/mapping and nlist with identical shapes and indexing, and ensure subsequent code consumes those tensors without any numpy conversion so all operations remain on GPU.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@deepmd/pt_expt/infer/deep_eval.py`:
- Around line 623-626: The fallback fills results with arrays using
np.abs(shape) which can mask invalid negative dimensions; change this to use
shape directly and add an explicit validation: after calling
self._get_output_shape(odef, nframes, natoms) validate that all dimensions are
non-negative (e.g., assert or raise a ValueError) and then call np.full with the
validated shape (reference symbols: self._get_output_shape, shape, results,
GLOBAL_NP_FLOAT_PRECISION). Ensure the assertion/error message clearly indicates
which output shape and odef caused the invalid value.
- Around line 343-356: The current code calls extend_coord_with_ghosts and
build_neighbor_list (which are numpy-based) and then converts outputs to torch,
causing overhead and CPU/GPU device transfers; replace these calls with
PyTorch-native implementations or utilities so neighbor list construction runs
on torch tensors on the target device: locate uses of extend_coord_with_ghosts,
build_neighbor_list, coord_normalized, atom_types, cells, rcut and implement or
call a torch-based extend_coord_with_ghosts + build_neighbor_list that accepts
torch tensors (keeping distinguish_types semantics), returns torch tensors for
extended_coord/extended_atype/mapping and nlist with identical shapes and
indexing, and ensure subsequent code consumes those tensors without any numpy
conversion so all operations remain on GPU.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d7d1cb15-f653-48e6-957d-ef3a32be35a6
📒 Files selected for processing (1)
deepmd/pt_expt/infer/deep_eval.py
# Conflicts:
#	deepmd/pt_expt/utils/neighbor_stat.py
#	source/tests/pt_expt/conftest.py