-
Notifications
You must be signed in to change notification settings - Fork 18
WIP ETACE Calculator Interface and Training Assembly #313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: co/etback
Are you sure you want to change the base?
Conversation
fdadcb9 to
60212bb
Compare
Implements Phase 1 of the ETACE calculator interface plan: - ETACEPotential struct wrapping ETACE models - AtomsCalculators interface (energy, forces, virial) - Combined energy_forces_virial evaluation - Tests comparing against original ACE model - CPU and GPU performance benchmarks Key implementation details: - Forces computed via site_grads() + forces_from_edge_grads() - Force sign: forces_from_edge_grads returns +∇E, negated for F=-∇E - Virial: V = -∑ ∂E/∂𝐫ij ⊗ 𝐫ij Performance results (8-atom Si/O cell, order=3, maxl=6): - Energy: ETACE ~15% slower (graph construction overhead) - Forces: ETACE ~6.5x faster (vectorized gradients) - EFV: ETACE ~5x faster GPU benchmarks use auto-detection from EquivariantTensors utils. GPU gradients skipped due to Polynomials4ML GPU compat issues. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements a composable calculator architecture: - SiteEnergyModel interface: site_energies(), site_energy_grads(), cutoff_radius() - E0Model: One-body reference energies (constant per species, zero forces) - WrappedETACE: Wraps ETACE model with SiteEnergyModel interface - WrappedSiteCalculator: Converts site quantities to global (energy, forces, virial) - StackedCalculator: Combines multiple AtomsCalculators by summing contributions Architecture allows non-site-based calculators (e.g., Coulomb, dispersion) to be added directly to StackedCalculator without requiring site energy decomposition. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Move StackedCalculator to src/et_models/stackedcalc.jl for better separation of concerns - it's a generic utility for combining calculators, independent of ETACE-specific code. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements linear least squares training support: - length_basis(): Returns number of linear parameters (nbasis * nspecies) - energy_forces_virial_basis(): Compute basis values for E/F/V - potential_energy_basis(): Faster energy-only basis computation - get_linear_parameters(): Extract readout weights as flat vector - set_linear_parameters!(): Set readout weights from flat vector The basis functions allow linear fitting via: E = dot(E_basis, θ) F = F_basis * θ V = sum(θ .* V_basis) Tests verify that linear combination of basis with current parameters reproduces the direct energy/forces/virial evaluation. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Use compile-time loop unrolling via @generated functions for efficient summation over calculators. The N type parameter allows generating specialized code like: E_1 = potential_energy(sys, calc.calcs[1]) E_2 = potential_energy(sys, calc.calcs[2]) return E_1 + E_2 instead of runtime loops. This enables better inlining and type inference when the number of calculators is small and known at compile time. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- benchmark_comparison.jl: Energy benchmarks (CPU + GPU) - benchmark_forces.jl: Forces benchmarks (CPU only) Results show: - Energy: ETACE CPU 1.7-2.2x faster, ETACE GPU up to 87x faster - Forces: ETACE CPU 7.7-11.4x faster than ACE 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Phase 1, 2, 5 complete - Phase 3 (E0/PairModel) assigned to maintainer - Added benchmark results: GPU up to 87x faster, forces 8-11x faster - Documented all new files and test coverage 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add ACEfit.basis_size dispatch for ETACEPotential - Add ACEfit to test/Project.toml - Test training assembly on multiple structures (5 random) - Test multi-species parameter ordering (pure Si, pure O, mixed) - Verify species-specific basis contributions are correctly separated - Fix soft scope warnings with local declarations 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add ACEpotentials.ETModels to autodocs in all_exported.md - Add comprehensive integration test for ETACE calculators based on test_silicon workflow - Tests verify energy/forces/virial consistency with original ACE - Tests verify training basis assembly and StackedCalculator composition 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Pre-allocate gradient buffer (∇Ei_buf) outside loop, reuse across iterations - Eliminate W_unit matrix allocation by directly copying ∂𝔹[:, i, k] - Pre-compute zero gradient element for species masking - Pre-extract edge vectors for virial computation - Use zero() instead of zeros() for SMatrix virial accumulator Performance improvement (64-atom system): - Time: 1597ms → 422ms (3.8x faster) - Memory: 3.4 GiB → 412 MiB (8.4x reduction) Also fix variable scoping in test_et_silicon.jl for Julia 1.10+ (added `global` keyword for loop variable updates). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
ETACE only implements the many-body basis, not the pair potential. The test was incorrectly comparing full ACE (with pair) against ETACE. Changes: - Create model_nopair with Wpair=0 for fair comparison - Compare ETACE against ACE many-body contribution only - Fix E0Model constructor: use Symbol key (:Si) not Int (14) - Skip isolated atoms in all tests (ETACE requires >= 2 atoms) - Update test comments and summary to clarify scope 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Extract core helpers (_core_site_energies, _core_site_grads) for shared evaluation logic between ETACEPotential and WrappedETACE - Refactor WrappedETACE and ETACEPotential to use core helpers - Simplify stackedcalc.jl: replace manual AST building (_gen_sum, _gen_broadcast_sum) with idiomatic @nexprs/@ntuple from Base.Cartesian - Net reduction of ~50 lines while maintaining identical behavior 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Removes duplicate AtomsCalculators interface and evaluation logic by:
- Making WrappedETACE mutable with co_ps field for training
- Defining ETACEPotential as const alias for WrappedSiteCalculator{WrappedETACE}
- Removing duplicate _evaluate_* functions and AtomsCalculators methods
- Adding accessor helpers (_etace, _ps, _st) for training functions
The evaluation now flows through WrappedSiteCalculator's generic methods
which call site_energies/site_energy_grads on the WrappedETACE model.
This reduces ~66 lines of duplicated code.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove duplicate E0Model in favor of upstream ETOneBody - Unify WrappedSiteCalculator to work with all ETACE-pattern models directly - Document that ETACE, ETPairModel, ETOneBody share identical interface - Plan Phase 6 refactoring to eliminate WrappedETACE indirection - Update architecture diagrams showing target unified structure 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
c51a644 to
7d03f9d
Compare
- Refactor WrappedSiteCalculator to store ps, st, rcut directly
- Remove E0Model (use upstream ETOneBody instead)
- Remove WrappedETACE (functionality merged into WrappedSiteCalculator)
- Remove old SiteEnergyModel interface (site_energies, site_energy_grads)
- Update ETACEPotential to be type alias for WrappedSiteCalculator{ETACE}
- Update training assembly accessors for new flat structure
All ETACE-pattern models (ETACE, ETPairModel, ETOneBody) now work
directly with WrappedSiteCalculator via their common interface:
- model(G, ps, st) -> (site_energies, st)
- site_grads(model, G, ps, st) -> edge gradients
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add convert2et_full() to convert complete ACE model to StackedCalculator - Combines ETOneBody (E0), ETPairModel (pair), and ETACE (many-body) - Returns StackedCalculator compatible with AtomsCalculators - Add _copy_ace_params!() for many-body parameter copying - Copies radial basis Wnlq parameters - Copies readout WB parameters - Add _copy_pair_params!() for pair potential parameter copying - Based on mapping from test/etmodels/test_etpair.jl - Copies pair radial basis and readout parameters 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Replace E0Model tests with ETOneBody (upstream) tests - Remove WrappedETACE tests (no longer exists) - Update WrappedSiteCalculator tests for new (model, ps, st, rcut) signature - Update ETACEPotential construction test for direct model access - Update silicon integration test to use ETOneBody and unified wrapper Tests now use upstream models directly: - ETOneBody instead of E0Model - WrappedSiteCalculator(model, ps, st, rcut) instead of nested wrappers 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Return NamedTuple with empty edge_data matching ETACE/ETPairModel interface - Remove unnecessary Zygote import (hand-coded since gradient is trivially zero) - Update test to check isempty(∂G.edge_data) instead of zero norms The calling code in et_calculators.jl checks isempty(∂G.edge_data) and returns zero forces/virial when true, which is the correct behavior for ETOneBody (energy depends only on atom types, not positions). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Comment out Pkg.activate in test_committee.jl that was switching away from the test project environment - Update test_etonebody.jl gradient test to check for NamedTuple return type with .edge_data field (matching the updated ETOneBody interface that returns consistent structure with ETACE/ETPairModel) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
ET ACE (site_basis_jacobian): - Remove ps.basis and st.basis from _jacobian_X call - The upstream ET._jacobian_X for SparseACEbasis only takes 5 args: (basis, Rnl, Ylm, dRnl, dYlm) ET Pair (site_grads): - Implement hand-coded gradient using evaluate_ed instead of Zygote - Avoids Zygote InplaceableThunk issue with upstream EdgeEmbed rrule - Matches the pattern used in site_basis_jacobian Also inline _apply_etpairmodel to avoid calling site_basis (cleaner). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix z.number → z.atomic_number in E0_dict creation - Fix _copy_ace_params! path: rembed.basis.linl.W → rembed.post.W - Fix _copy_pair_params! path: rembed.basis.rbasis.linl.W → rembed.rbasis.post.W - Add benchmark comparing ACE vs ETACE StackedCalculator for full model 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Address moderator concern about commit 50ed668: - Avoid forming O(nnodes * nbasis) dense intermediate matrix - Compute edge gradients directly using loops - Same numerical results, better memory characteristics 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Bump EquivariantTensors compat to 0.4.2 in main and test Project.toml - Simplify site_basis_jacobian to use 5-arg _jacobian_X API (requires ET >= 0.4.2) - Improve ETPairModel site_grads memory efficiency: - Avoid O(nnodes * nbasis) intermediate matrix allocation - Compute edge gradients directly using loops 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Now that test project uses ET 0.4.2 (which fixed InplaceableThunk bug in EdgeEmbed rrule), we can use the simpler Zygote-based gradient computation for ETPairModel. Also fix _jacobian_X call in ETACE to use 7-arg API (requires ps, st). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add benchmark/gpu_benchmark.jl for GPU energy/forces benchmarks - Test both many-body only (ETACE) and full model (E0 + Pair + Many-Body) - Add LuxCUDA to test/Project.toml for GPU testing support - GPU forces now work with Polynomials4ML v0.5.8+ (bug fix Dec 29, 2024) Results show significant GPU speedups: - Many-body energy: 6x-48x speedup (64-800 atoms) - Many-body forces: 3x-18x speedup (64-800 atoms) - Full model energy: 3x-36x speedup (64-800 atoms) - Full model forces: 1x-14x speedup (64-800 atoms) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Mark all core phases as complete - Add GPU benchmark results (energy and forces) - Document outstanding work: pair training assembly, ACEfit integration - Note basis index design discussion needed with maintainer - Update dependencies: ET 0.4.2, P4ML 0.5.8+ 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add ETPairPotential and ETOneBodyPotential type aliases - Implement length_basis, energy_forces_virial_basis, potential_energy_basis for ETPairPotential, ETOneBodyPotential, and StackedCalculator - Add get/set_linear_parameters for all calculator types - Add ACEfit.basis_size dispatch for all calculator types - Import and extend length_basis, energy_forces_virial_basis from Models - ACEfit.assemble now works with full ETACE StackedCalculator 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…neBodyPotential, StackedCalculator Tests cover: - ETOneBodyPotential returns empty arrays (0 learnable parameters) - ETPairPotential training assembly with learnable pair basis - StackedCalculator concatenation of basis from all components - Linear combinations reproduce energy/forces/virial - get/set_linear_parameters round-trip - ACEfit.basis_size dispatch 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
@jameskermode - please move ET compat to 4.3 when you have a moment. |
| # Gradient w.r.t. positions is always zero. | ||
| # Return NamedTuple matching Zygote gradient structure with empty edge_data. | ||
| # The calling code checks isempty(∂G.edge_data) and returns zero forces/virial. | ||
| function site_grads(l::ETOneBody, X::ET.ETGraph, ps, st) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this will lead to type stable code. If you look at the gradients that are produced by the other model components it will be of quite a different form: there will be additional fields in the returned NamedTuple and the edge_data field will contain VState or NamedTuple and not PState variables. I think this may need a bit of iteration.
That said, it might be a bit much hoping for perfect type stability here. I'll follow up on Slack on how to proceed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the correct code might be
return (; edge_data = fill(VState(), length(X.edge_data)))The reason this should work is that
VState(r = SA[1.0, 2.0, 3.0]) + VState() == VState(r = SA[1.0, 2.0, 3.0])i.e. an empty VState() just acts like zero when added on any other VState.
|
|
||
| @generated function _stacked_energy(sys::AbstractSystem, calc::StackedCalculator{N}) where {N} | ||
| quote | ||
| @nexprs $N i -> E_i = AtomsCalculators.potential_energy(sys, calc.calcs[i]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This refers to my other comment about the OneBody gradient. This is exactly the correct way to ensure type stability because you have N expressions for N calculators rather than a for loop. So I don't think there will be a type stability problem anywhere.
Summary
Adds EquivariantTensors-based calculators for the ETACE backend, enabling:
New Components
Core Calculators:
ET-Native Models:
Conversion Functions:
convert2et(model): Convert many-body ACE to ETACEconvert2et_full(model, ps, st): Convert full model (E0 + pair + many-body) to StackedCalculatorTraining Assembly
length_basis(calc)- Total number of linear parametersget_linear_parameters(calc)/set_linear_parameters!(calc, θ)- Parameter managementpotential_energy_basis(sys, calc)- Energy design matrix rowenergy_forces_virial_basis(sys, calc)- Full EFV design matrix rowBenchmark Results
GPU Many-Body Only (ETACE) - Energy:
GPU Many-Body Only (ETACE) - Forces:
GPU Full Model (E0 + Pair + Many-Body) - Energy:
GPU Full Model (E0 + Pair + Many-Body) - Forces:
CPU Full Model Forces (ETACE vs ACE):
Notes:
Design Note: Basis Index Handling
The current implementation handles species-separated basis indices at the calculator level (in
energy_forces_virial_basis). Each species gets separate parameter indices viap = (s - 1) * nbasis + k. This design choice may need discussion for future GPU training assembly use cases.Files Added/Modified
New files:
src/et_models/et_ace.jl- ETACE model implementationsrc/et_models/et_pair.jl- ETPairModel implementationsrc/et_models/et_onebody.jl- ETOneBody implementationsrc/et_models/et_calculators.jl- Calculator wrappers and conversion functionssrc/et_models/stackedcalc.jl- StackedCalculator with @generatedsrc/et_models/convert.jl- Model conversion utilitiestest/etmodels/test_etbackend.jl- ETACE teststest/etmodels/test_etpair.jl- ETPairModel teststest/etmodels/test_etonebody.jl- ETOneBody testsbenchmark/gpu_benchmark.jl- GPU benchmark scriptModified files:
src/et_models/et_models.jl- Updated includes and exportsProject.toml- EquivariantTensors compat bumped to 0.4.2test/Project.toml- EquivariantTensors compat bumped to 0.4.2, added LuxCUDATest plan
🤖 Generated with Claude Code