
feat(dp, pt): add force l2 norm loss & mae loss #5294

Open
iProzd wants to merge 8 commits into deepmodeling:master from iProzd:0305_l2norm

Conversation

@iProzd
Collaborator

@iProzd iProzd commented Mar 5, 2026

Summary by CodeRabbit

  • New Features

    • Introduced loss_func ("mse" or "mae") to select MSE vs MAE for energy/force/virial/atom losses.
    • Added f_use_norm to enable vector‑norm MAE behavior when allowed.
  • Validation

    • Enforced that f_use_norm is only valid when use_huber is enabled or loss_func="mae"; invalid combos are rejected.
  • Tests

    • Extended loss tests and skipping logic to cover loss_func and f_use_norm combinations.
  • Documentation

    • Updated docs to describe loss_func and resulting metric names (rmse_* vs mae_*).
  • Chores

    • New options are persisted in serialized configurations.
  • Notes

    • Some backends currently only support "mse" (MAE not yet available everywhere).
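As a plain-Python sketch of what this switch selects (illustrative only: the option name loss_func and its two values come from this PR, but the helper below is hypothetical, not DeePMD-kit code):

```python
# Illustrative sketch of the "mse" vs "mae" selection described above.
# scalar_loss is a hypothetical helper, not DeePMD-kit's implementation.

def scalar_loss(pred, label, loss_func="mse"):
    if loss_func not in ("mse", "mae"):
        raise ValueError(f"unknown loss_func: {loss_func!r}")
    diffs = [p - t for p, t in zip(pred, label)]
    if loss_func == "mse":
        # mean squared error
        return sum(d * d for d in diffs) / len(diffs)
    # mean absolute error
    return sum(abs(d) for d in diffs) / len(diffs)

# e.g. scalar_loss([1.0, 2.0], [0.0, 0.0], "mse") gives 2.5,
# while the "mae" variant gives 1.5.
```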

Copilot AI review requested due to automatic review settings March 5, 2026 16:48
@dosubot dosubot bot added the new feature label Mar 5, 2026
@coderabbitai
Contributor

coderabbitai bot commented Mar 5, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds a configurable loss mode (loss_func: "mse" or "mae") and a force-norm option (f_use_norm) across energy loss implementations. Validates allowed combinations, branches energy/force/virial/atom-pref computations on loss_func, updates tests, argument docs, and serialization across backends.

Changes

  • Core DPModel energy loss (deepmd/dpmodel/loss/ener.py): add loss_func and f_use_norm to EnergyLoss.__init__; validate values and the constraint (f_use_norm only with Huber or MAE); add mse/mae branches for energy/force/virial/atom-pref terms; include the new fields in serialization.
  • PyTorch energy loss (deepmd/pt/loss/ener.py, deepmd/pt/loss/ener_spin.py): replace use_l1_all with loss_func, add f_use_norm; branch energy/force/virial/atom losses on loss_func (mse/mae); support optional vector-norm MAE; update display keys and serialization.
  • Paddle energy loss (deepmd/pd/loss/ener.py): introduce loss_func and f_use_norm; branch forward logic on loss_func for energy/force/atom/prefactor computations; raise NotImplementedError for unsupported combinations; serialize the new fields (PD-specific notes present).
  • TensorFlow energy loss (deepmd/tf/loss/ener.py): extend EnerStdLoss/EnerSpinLoss signatures with loss_func and f_use_norm; store them on instances; serialize the new keys; the TF backend enforces loss_func == "mse" (NotImplemented for others).
  • Argument validation & docs (deepmd/utils/argcheck.py): add loss_func and f_use_norm arguments and documentation for loss_ener and loss_ener_spin, describing allowed values and when f_use_norm applies.
  • Consistent loss tests (source/tests/consistent/loss/test_ener.py): extend parameterization to include loss_func and f_use_norm; add dynamic backend skip logic for invalid combinations; include the new fields in test data and aggregate mae_* metrics alongside rmse_*.
  • Docs (doc/model/train-energy.md, doc/model/train-energy-spin.md): document the loss_func option and the resulting metric names (rmse_* for mse, mae_* for mae); update examples and wording.
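The f_use_norm option above switches the force error from a component-wise reduction to one over per-atom error-vector norms. A minimal sketch of the MAE case (force_mae is a hypothetical helper illustrating the formula, not any backend's actual code):

```python
import math

# Hypothetical sketch of the two force-MAE reductions discussed in this PR.
# diff is a list of per-atom force-error vectors (dx, dy, dz).
def force_mae(diff, f_use_norm=False):
    if f_use_norm:
        # vector-norm MAE: mean over atoms of ||ΔF_i||
        return sum(
            math.sqrt(dx * dx + dy * dy + dz * dz) for dx, dy, dz in diff
        ) / len(diff)
    # default: MAE over all 3N force components
    comps = [c for vec in diff for c in vec]
    return sum(abs(c) for c in comps) / len(comps)
```

For a single atom with error (3, 4, 0), the norm variant gives 5.0 while the component-wise variant gives 7/3, so the two reductions genuinely differ under MAE.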

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • njzjz
  • wanghan-iapcm
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 57.14%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Description Check (✅ Passed): check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): the title accurately reflects the main changes: adding force L2-norm loss (the f_use_norm parameter) and MAE loss (the loss_func parameter) across multiple backends (dp, pt, and others).


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/pt/loss/ener.py`:
- Around line 56-60: The constructor added new state flags (use_mae_loss,
f_use_norm, use_huber, huber_delta) but serialize() does not emit them, so
deserialization resets behavior; update the class's serialize() to include these
fields (use_mae_loss, f_use_norm, use_huber, huber_delta) in the serialized
dict/object and also ensure the corresponding deserialization/load path reads
them back into the instance (same names as the constructor args) so the loss
behavior is preserved after reload; reference the serialize() method and the
constructor parameter names when making the change.

In `@source/tests/consistent/loss/test_ener.py`:
- Around line 88-95: In skip_tf and skip_pd properties replace the full tuple
unpacking that creates unused locals with wildcard placeholders so Ruff won't
flag RUF059; locate the properties skip_tf and skip_pd and change the unpacking
from "(use_huber, enable_atom_ener_coeff, use_mae_loss, f_use_norm) =
self.param" to only bind the used elements (e.g. "(_, _, use_mae_loss,
f_use_norm) = self.param" or similar) so only use_mae_loss and f_use_norm are
named.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 4125c703-fe2c-4e34-b371-8b1aa866ee9a

📥 Commits

Reviewing files that changed from the base of the PR and between 4a29836 and beebc5b.

📒 Files selected for processing (4)
  • deepmd/dpmodel/loss/ener.py
  • deepmd/pt/loss/ener.py
  • deepmd/utils/argcheck.py
  • source/tests/consistent/loss/test_ener.py

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@github-actions github-actions bot added the Python label Mar 5, 2026
@codecov

codecov bot commented Mar 6, 2026

Codecov Report

❌ Patch coverage is 77.98165% with 48 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.29%. Comparing base (24e54bf) to head (cea6899).

Files with missing lines Patch % Lines
deepmd/pt/loss/ener_spin.py 51.35% 18 Missing ⚠️
deepmd/pd/loss/ener.py 61.11% 14 Missing ⚠️
deepmd/dpmodel/loss/ener.py 90.14% 7 Missing ⚠️
deepmd/pt/loss/ener.py 88.88% 7 Missing ⚠️
deepmd/tf/loss/ener.py 75.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5294      +/-   ##
==========================================
- Coverage   82.32%   82.29%   -0.03%     
==========================================
  Files         768      768              
  Lines       77098    77222     +124     
  Branches     3659     3659              
==========================================
+ Hits        63469    63552      +83     
- Misses      12458    12498      +40     
- Partials     1171     1172       +1     


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/pd/loss/ener.py`:
- Around line 60-61: The code stores the f_use_norm flag but silently ignores it
in the force-loss computation; add a fail-fast check for f_use_norm in the
EnerLoss class (constructor or at start of forward()) so that if f_use_norm is
True you raise a clear exception (e.g. NotImplementedError or ValueError)
explaining vector-norm force loss is not implemented, instead of silently
proceeding; update all places that set or pass f_use_norm (the constructor
signature and where it is stored) to ensure the flag triggers this check, and
ensure the error message references f_use_norm and suggests implementing
vector-norm force loss or setting f_use_norm=False.

In `@deepmd/tf/loss/ener.py`:
- Around line 152-155: The TF backend currently stores use_mae_loss and
f_use_norm on self but build() never uses them, so accept-but-ignore can mislead
users; update the class constructor (where self.use_huber, self.huber_delta,
self.use_mae_loss, self.f_use_norm are set) to validate and explicitly reject
unsupported flags: if use_mae_loss or f_use_norm is True, raise a ValueError (or
TypeError) with a clear message referencing the TF backend and the unsupported
options; mention the symbols use_mae_loss, f_use_norm and the build() method in
the error/message so callers know these flags are not implemented in the TF
implementation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 62d5f40b-7bc8-4e83-846a-cc5d6e489ca0

📥 Commits

Reviewing files that changed from the base of the PR and between a4d97a1 and 945d423.

📒 Files selected for processing (2)
  • deepmd/pd/loss/ener.py
  • deepmd/tf/loss/ener.py

Comment on lines +60 to +61
f_use_norm: bool = False,
use_mae_loss: bool | None = None,

⚠️ Potential issue | 🟠 Major

Fail fast when f_use_norm=True in the PD backend.

Line 158 stores f_use_norm, but forward() still computes force loss component-wise and never switches to a vector-norm formulation. Right now f_use_norm=True is silently ignored.

Proposed fix
         self.use_l1_all = use_l1_all
         self.inference = inference
         self.use_huber = use_huber
         self.huber_delta = huber_delta
         self.f_use_norm = f_use_norm
+        if self.f_use_norm:
+            raise RuntimeError("f_use_norm is not implemented in the PD backend.")
         if self.use_huber and (
             self.has_pf or self.has_gf or self.relative_f is not None
         ):

Also applies to: 114-119, 158-158

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/pd/loss/ener.py` around lines 60 - 61, The code stores the f_use_norm
flag but silently ignores it in the force-loss computation; add a fail-fast
check for f_use_norm in the EnerLoss class (constructor or at start of
forward()) so that if f_use_norm is True you raise a clear exception (e.g.
NotImplementedError or ValueError) explaining vector-norm force loss is not
implemented, instead of silently proceeding; update all places that set or pass
f_use_norm (the constructor signature and where it is stored) to ensure the flag
triggers this check, and ensure the error message references f_use_norm and
suggests implementing vector-norm force loss or setting f_use_norm=False.

Comment on lines 152 to +155
self.use_huber = use_huber
self.huber_delta = huber_delta
self.use_mae_loss = use_mae_loss
self.f_use_norm = f_use_norm

⚠️ Potential issue | 🟠 Major

Reject unsupported MAE / force-norm flags in the TF backend.

Lines 154-155 persist use_mae_loss and f_use_norm, but build() never reads either flag. Passing either option will silently fall back to the existing L2/Huber behavior, which makes the config lie about the loss being optimized.

Proposed fix
         self.use_huber = use_huber
         self.huber_delta = huber_delta
         self.use_mae_loss = use_mae_loss
         self.f_use_norm = f_use_norm
+        if self.use_mae_loss or self.f_use_norm:
+            raise RuntimeError(
+                "use_mae_loss and f_use_norm are not implemented in the TF backend."
+            )
         if self.use_huber and (
             self.has_pf or self.has_gf or self.relative_f is not None
         ):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/tf/loss/ener.py` around lines 152 - 155, The TF backend currently
stores use_mae_loss and f_use_norm on self but build() never uses them, so
accept-but-ignore can mislead users; update the class constructor (where
self.use_huber, self.huber_delta, self.use_mae_loss, self.f_use_norm are set) to
validate and explicitly reject unsupported flags: if use_mae_loss or f_use_norm
is True, raise a ValueError (or TypeError) with a clear message referencing the
TF backend and the unsupported options; mention the symbols use_mae_loss,
f_use_norm and the build() method in the error/message so callers know these
flags are not implemented in the TF implementation.

self.use_huber = use_huber
self.f_use_norm = f_use_norm
if self.f_use_norm and not (self.use_huber or self.use_mae_loss):
raise RuntimeError(
Collaborator

I think this is not necessary. When use_huber and use_mae_loss are both false, the value of f_use_norm does not change anything. As a result, f_use_norm can be true in any cases.

Collaborator Author

When using MSE loss, f_use_norm = True is equivalent to False:
mean_atoms(||ΔF_i||²) = mean_atoms(Δfx_i² + Δfy_i² + Δfz_i²) = 3 * mean_components(Δf²).
Allowing the flag there would only confuse users.
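This identity is straightforward to check numerically (standalone sketch, unrelated to the project code):

```python
import random

# Check mean_atoms(||ΔF_i||^2) == 3 * mean_components(Δf^2) on random data,
# the identity cited above for why f_use_norm is a no-op under plain MSE.
random.seed(0)
diff = [
    (random.uniform(-1, 1), random.uniform(-1, 1), random.uniform(-1, 1))
    for _ in range(100)
]

mean_atoms = sum(dx * dx + dy * dy + dz * dz for dx, dy, dz in diff) / len(diff)
comps = [c for vec in diff for c in vec]
mean_components = sum(c * c for c in comps) / len(comps)

assert abs(mean_atoms - 3.0 * mean_components) < 1e-9
```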

Collaborator

ok

@github-actions github-actions bot added the Docs label Mar 9, 2026
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
deepmd/pt/loss/ener_spin.py (1)

23-41: ⚠️ Potential issue | 🟠 Major

Missing serialize() and deserialize() methods — loss_func will not persist.

EnergySpinLoss now accepts loss_func but lacks serialization methods. When a model is saved and reloaded, loss_func will revert to the default "mse", silently changing behavior if the user trained with "mae".

The sibling class EnergyStdLoss in deepmd/pt/loss/ener.py (lines 581-631) already includes these methods with loss_func and f_use_norm.

Would you like me to generate the serialize() and deserialize() methods for EnergySpinLoss?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/pt/loss/ener_spin.py` around lines 23 - 41, EnergySpinLoss is missing
serialize() and deserialize(), so its loss_func will not persist; add
serialize(self) to include loss_func (and any other ctor flags like
enable_atom_ener_coeff) into the saved dict and add deserialize(cls, d) to
restore loss_func (and the same flags) when reloading; follow the pattern used
by EnergyStdLoss (see methods serialize/deserialize in class EnergyStdLoss) and
ensure keys match exactly the constructor parameter names (e.g., "loss_func") so
reloaded objects retain the exact configuration.
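Such a pair might look roughly like the following (hypothetical minimal TinyLoss class for illustration; the real pattern to follow is EnergyStdLoss.serialize/deserialize in deepmd/pt/loss/ener.py):

```python
# Hypothetical minimal sketch of a serialize()/deserialize() pair whose keys
# mirror the constructor arguments, so a reload restores the configuration.
class TinyLoss:
    def __init__(self, loss_func="mse", f_use_norm=False):
        self.loss_func = loss_func
        self.f_use_norm = f_use_norm

    def serialize(self):
        return {"loss_func": self.loss_func, "f_use_norm": self.f_use_norm}

    @classmethod
    def deserialize(cls, data):
        return cls(loss_func=data["loss_func"], f_use_norm=data["f_use_norm"])

# round-trip preserves the training configuration
restored = TinyLoss.deserialize(TinyLoss("mae", True).serialize())
```

Without such methods, a reloaded instance falls back to the constructor defaults, which is exactly the silent "mae"-to-"mse" reversion this comment warns about.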
deepmd/pd/loss/ener.py (1)

394-416: ⚠️ Potential issue | 🟠 Major

Virial loss does not branch on loss_func — inconsistent with other loss terms.

The virial loss computation always uses MSE (L2), even when loss_func="mae". Other loss terms (energy, force, atom_ener, pref_force) correctly branch on loss_func. This inconsistency will confuse users expecting all terms to follow the configured loss function.

Proposed fix — add MAE branch for virial
         if self.has_v and "virial" in model_pred and "virial" in label:
             find_virial = label.get("find_virial", 0.0)
             pref_v = pref_v * find_virial
             diff_v = label["virial"] - model_pred["virial"].reshape([-1, 9])
-            l2_virial_loss = paddle.mean(paddle.square(diff_v))
-            if not self.inference:
-                more_loss["l2_virial_loss"] = self.display_if_exist(
-                    l2_virial_loss.detach(), find_virial
-                )
-            if not self.use_huber:
-                loss += atom_norm * (pref_v * l2_virial_loss)
-            else:
-                l_huber_loss = custom_huber_loss(
-                    atom_norm * model_pred["virial"].reshape([-1]),
-                    atom_norm * label["virial"].reshape([-1]),
-                    delta=self.huber_delta,
-                )
-                loss += pref_v * l_huber_loss
-            rmse_v = l2_virial_loss.sqrt() * atom_norm
-            more_loss["rmse_v"] = self.display_if_exist(rmse_v.detach(), find_virial)
-            if mae:
-                mae_v = paddle.mean(paddle.abs(diff_v)) * atom_norm
-                more_loss["mae_v"] = self.display_if_exist(mae_v.detach(), find_virial)
+            if self.loss_func == "mse":
+                l2_virial_loss = paddle.mean(paddle.square(diff_v))
+                if not self.inference:
+                    more_loss["l2_virial_loss"] = self.display_if_exist(
+                        l2_virial_loss.detach(), find_virial
+                    )
+                if not self.use_huber:
+                    loss += atom_norm * (pref_v * l2_virial_loss)
+                else:
+                    l_huber_loss = custom_huber_loss(
+                        atom_norm * model_pred["virial"].reshape([-1]),
+                        atom_norm * label["virial"].reshape([-1]),
+                        delta=self.huber_delta,
+                    )
+                    loss += pref_v * l_huber_loss
+                rmse_v = l2_virial_loss.sqrt() * atom_norm
+                more_loss["rmse_v"] = self.display_if_exist(rmse_v.detach(), find_virial)
+                if mae:
+                    mae_v = paddle.mean(paddle.abs(diff_v)) * atom_norm
+                    more_loss["mae_v"] = self.display_if_exist(mae_v.detach(), find_virial)
+            elif self.loss_func == "mae":
+                l1_virial_loss = F.l1_loss(
+                    label["virial"].reshape([-1]),
+                    model_pred["virial"].reshape([-1]),
+                    reduction="mean",
+                )
+                loss += atom_norm * (pref_v * l1_virial_loss)
+                more_loss["mae_v"] = self.display_if_exist(
+                    l1_virial_loss.detach() * atom_norm, find_virial
+                )
+            else:
+                raise NotImplementedError(
+                    f"Loss type {self.loss_func} is not implemented for virial loss."
+                )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/pd/loss/ener.py` around lines 394 - 416, The virial term always uses
L2 (l2_virial_loss) regardless of loss_func; update the block that checks
self.has_v to branch on self.loss_func (or loss_func variable) so when loss_func
== "mae" you compute an L1/MAE virial loss (e.g., mean absolute diff_v *
atom_norm and set more_loss["l1_virial_loss"] or reuse keys consistently), keep
the existing huber branch (custom_huber_loss) when self.use_huber is True, and
ensure rmse_v/mae_v are populated only when appropriate using display_if_exist
(use diff_v, l2_virial_loss, and the atom_norm/pref_v scalings consistently as
done for energy/force). Reference symbols: self.has_v, model_pred["virial"],
label["virial"], pref_v, diff_v, l2_virial_loss, custom_huber_loss, use_huber,
more_loss, display_if_exist.
♻️ Duplicate comments (4)
source/tests/consistent/loss/test_ener.py (2)

86-96: ⚠️ Potential issue | 🟡 Minor

Prefix unused unpacked variables with underscore to satisfy Ruff.

Ruff flags use_huber and enable_atom_ener_coeff as unused in skip_tf and skip_pd properties. Per coding guidelines, run ruff check . before committing.

Proposed fix
     `@property`
     def skip_tf(self) -> bool:
-        (use_huber, enable_atom_ener_coeff, loss_func, f_use_norm) = self.param
+        (_, _, loss_func, f_use_norm) = self.param
         # Skip TF for MAE loss tests (not implemented in TF backend)
         return CommonTest.skip_tf or loss_func == "mae" or f_use_norm

     `@property`
     def skip_pd(self) -> bool:
-        (use_huber, enable_atom_ener_coeff, loss_func, f_use_norm) = self.param
+        (_, _, loss_func, f_use_norm) = self.param
         # Skip Paddle for MAE loss tests (not implemented in Paddle backend)
         return not INSTALLED_PD or loss_func == "mae" or f_use_norm

As per coding guidelines **/*.py: Always run ruff check . and ruff format . before committing changes or CI will fail.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/consistent/loss/test_ener.py` around lines 86 - 96, The
unpacking in the properties skip_tf and skip_pd introduces unused variables
use_huber and enable_atom_ener_coeff which Ruff flags; fix by prefixing those
two variables with an underscore when unpacking self.param (e.g. _use_huber,
_enable_atom_ener_coeff) in both the skip_tf and skip_pd methods so the unused
bindings satisfy linting while leaving logic for loss_func and f_use_norm
unchanged.

111-111: ⚠️ Potential issue | 🟡 Minor

Also prefix unused enable_atom_ener_coeff in setUp.

Same Ruff warning applies here.

Proposed fix
     def setUp(self) -> None:
-        (use_huber, enable_atom_ener_coeff, loss_func, f_use_norm) = self.param
+        (use_huber, _, loss_func, f_use_norm) = self.param
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/consistent/loss/test_ener.py` at line 111, The tuple unpacking
in setUp assigns enable_atom_ener_coeff but that variable is unused; rename it
to have a leading underscore to satisfy the linter (e.g., unpack as
_enable_atom_ener_coeff) so Ruff no longer flags it as unused; update the
destructuring in the setUp method where (use_huber, enable_atom_ener_coeff,
loss_func, f_use_norm) = self.param is defined and ensure no other references
rely on enable_atom_ener_coeff.
deepmd/tf/loss/ener.py (1)

122-127: ⚠️ Potential issue | 🟠 Major

Reject unsupported loss_func and f_use_norm flags in the TF backend.

The TF backend stores loss_func and f_use_norm but build() never branches on them—it always computes MSE. Users passing loss_func="mae" or f_use_norm=True will silently get MSE behavior, which misrepresents what the model is actually optimizing.

Proposed fix — fail fast in constructor
         self.loss_func = loss_func
         self.f_use_norm = f_use_norm
+        if self.loss_func != "mse":
+            raise NotImplementedError(
+                f"loss_func='{self.loss_func}' is not implemented in the TF backend. "
+                "Only 'mse' is supported."
+            )
+        if self.f_use_norm:
+            raise NotImplementedError(
+                "f_use_norm is not implemented in the TF backend."
+            )
 
         self.starter_learning_rate = starter_learning_rate
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/tf/loss/ener.py` around lines 122 - 127, The TF backend currently
ignores loss_func and f_use_norm in build(), so update the constructor (the
__init__ that accepts loss_func and f_use_norm in deepmd/tf/loss/ener.py) to
validate inputs and fail fast: if loss_func is not "mse" or if f_use_norm is
True, raise a ValueError with a clear message naming the offending parameter(s)
and stating that only loss_func="mse" and f_use_norm=False are supported by the
TF implementation; this ensures callers are immediately informed instead of
silently getting MSE behavior from build().
deepmd/pd/loss/ener.py (1)

114-116: ⚠️ Potential issue | 🟠 Major

f_use_norm is stored but silently ignored in force loss computation.

The docstring acknowledges this is "not implemented in PD backend", but the code silently proceeds instead of failing fast. Users may configure f_use_norm=True expecting vector-norm force loss but get component-wise loss instead.

Proposed fix — fail fast in constructor
         self.loss_func = loss_func
         self.f_use_norm = f_use_norm
+        if self.f_use_norm:
+            raise NotImplementedError(
+                "f_use_norm is not implemented in the PD backend."
+            )
 
         self.starter_learning_rate = starter_learning_rate

Also applies to: 293-322

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/pd/loss/ener.py` around lines 114 - 116, The parameter f_use_norm is
accepted and stored but ignored; change the constructors that accept f_use_norm
(look for the __init__ methods that take f_use_norm in this file) to perform a
fail-fast check: if f_use_norm is True, raise a clear exception (ValueError or
NotImplementedError) indicating "f_use_norm is not implemented in PD backend".
Apply the same check to the other constructor/initializer blocks referenced
around lines 293-322 so any attempt to enable f_use_norm fails immediately
instead of silently using component-wise force loss.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/tf/loss/ener.py`:
- Around line 126-127: Remove the duplicate assignment of self.f_use_norm in the
constructor of the class in deepmd/tf/loss/ener.py: keep a single assignment
where f_use_norm is first set (the line setting self.f_use_norm after
self.loss_func) and remove the later redundant assignment (the second
self.f_use_norm assignment around where other attributes are initialized),
ensuring only one initialization of self.f_use_norm remains.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 7c9f99e6-b2ad-49c1-8515-19f4702ee840

📥 Commits

Reviewing files that changed from the base of the PR and between 945d423 and 3341fd1.

📒 Files selected for processing (9)
  • deepmd/dpmodel/loss/ener.py
  • deepmd/pd/loss/ener.py
  • deepmd/pt/loss/ener.py
  • deepmd/pt/loss/ener_spin.py
  • deepmd/tf/loss/ener.py
  • deepmd/utils/argcheck.py
  • doc/model/train-energy-spin.md
  • doc/model/train-energy.md
  • source/tests/consistent/loss/test_ener.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • deepmd/utils/argcheck.py

Comment on lines +122 to +123
self.loss_func = loss_func
self.f_use_norm = f_use_norm


⚠️ Potential issue | 🟡 Minor

Duplicate assignment of self.f_use_norm.

self.f_use_norm is assigned on line 123, then again on line 155.

Proposed fix
         self.inference = inference
         self.use_huber = use_huber
         self.huber_delta = huber_delta
-        self.f_use_norm = f_use_norm
         if self.use_huber and (

Also applies to: 155-155

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/pd/loss/ener.py` around lines 122 - 123, The constructor contains a
duplicate assignment to the attribute self.f_use_norm (first set alongside
self.loss_func, then set again later); remove the redundant assignment so
self.f_use_norm is only assigned once (keep the initial assignment in the
__init__ / EnerLoss constructor and delete the later duplicate), ensuring no
other attribute intended to be set at that second location is accidentally
dropped.
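Putting the single-assignment fix together with the combination rule from the PR summary (f_use_norm is only valid when use_huber is enabled or loss_func="mae"), a minimal constructor might look like this; the parameter defaults are assumptions, not the actual pd implementation:

```python
class PDEnerLossSketch:
    """Hypothetical minimal version of the Paddle energy loss constructor."""

    def __init__(
        self,
        inference: bool = False,
        use_huber: bool = False,
        huber_delta: float = 0.01,
        loss_func: str = "mse",
        f_use_norm: bool = False,
    ) -> None:
        self.inference = inference
        self.use_huber = use_huber
        self.huber_delta = huber_delta
        # Each attribute is assigned exactly once; the duplicate
        # assignment later in the original constructor is dropped.
        self.loss_func = loss_func
        self.f_use_norm = f_use_norm
        # Combination rule from the PR summary: f_use_norm is only
        # valid with use_huber=True or loss_func="mae".
        if self.f_use_norm and not (self.use_huber or self.loss_func == "mae"):
            raise ValueError(
                "f_use_norm requires use_huber=True or loss_func='mae'."
            )
```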

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (1)
deepmd/tf/loss/ener.py (1)

126-131: ⚠️ Potential issue | 🟡 Minor

Duplicate assignment of self.f_use_norm and missing validation for f_use_norm=True.

  1. self.f_use_norm is assigned on line 131, then again on line 161. Remove the duplicate.

  2. When f_use_norm=True, the code silently accepts the flag but build() never uses it—the TF backend always computes element-wise MSE loss. This can mislead users who expect force-norm behavior (as implemented in PyTorch/dpmodel backends per context snippets).

Proposed fix
         if loss_func != "mse":
             raise NotImplementedError(
                 f"TensorFlow backend only supports loss_func='mse', got '{loss_func}'."
             )
+        if f_use_norm:
+            raise NotImplementedError(
+                "f_use_norm=True is not implemented in the TensorFlow backend."
+            )
         self.loss_func = loss_func
         self.f_use_norm = f_use_norm

And remove the duplicate assignment at line 161:

         self.use_huber = use_huber
         self.huber_delta = huber_delta
-        self.f_use_norm = f_use_norm
         if self.use_huber and (

Also applies to: 161-161

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/tf/loss/ener.py` around lines 126 - 131, Remove the duplicate
assignment of self.f_use_norm (the second assignment near line 161) and add
validation in the constructor for f_use_norm: if f_use_norm is True, raise
NotImplementedError (or ValueError) explaining that the TensorFlow backend
(class Ener / its build() loss computation) only implements element-wise MSE and
does not support force-norm behavior; ensure build() remains consistent with
this validation so callers are not misled into thinking force-norm is
implemented.
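For contrast with the element-wise MSE that the TF backend computes, one plausible reading of the force-norm behavior in the PyTorch/dpmodel backends is an MAE over per-atom L2 norms of the force difference. This is a sketch under that assumption, not the PR's actual formula:

```python
import math


def force_norm_mae(f_pred, f_label):
    """MAE over per-atom L2 norms of the force difference (assumed form).

    f_pred and f_label are sequences of (fx, fy, fz) triples, one per atom.
    """
    norms = [
        math.dist(p, l)  # Euclidean distance = L2 norm of (p - l)
        for p, l in zip(f_pred, f_label)
    ]
    return sum(norms) / len(norms)
```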
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@deepmd/tf/loss/ener.py`:
- Around line 126-131: Remove the duplicate assignment of self.f_use_norm (the
second assignment near line 161) and add validation in the constructor for
f_use_norm: if f_use_norm is True, raise NotImplementedError (or ValueError)
explaining that the TensorFlow backend (class Ener / its build() loss
computation) only implements element-wise MSE and does not support force-norm
behavior; ensure build() remains consistent with this validation so callers are
not misled into thinking force-norm is implemented.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f775d764-90db-4919-b9a7-252e79a51896

📥 Commits

Reviewing files that changed from the base of the PR and between 3341fd1 and ab95efe.

📒 Files selected for processing (1)
  • deepmd/tf/loss/ener.py

@iProzd iProzd requested review from OutisLi and njzjz March 10, 2026 07:51
