Skip to content

Conversation

@calvinp0
Copy link
Member

@calvinp0 calvinp0 commented Dec 31, 2025

When ORCA is used for optimisation, we attempt to get the scaling factor, however the parser was broken. It caused the scaling factor to return 0.002 at wb97x-d3. This PR should rectify it now to be:

FREQ: A PROGRAM FOR OPTIMIZING SCALE FACTORS (Version 1)
                 written by                 
Haoyu S. Yu, Lucas J. Fiedler, I.M. Alecu, and Donald G. Truhlar
Department of Chemistry and Supercomputing Institute
University of Minnesota, Minnesota 55455-0431
CITATIONS:
1. I.M., Alecu, J. Zheng, Y. Zhao, D.G. Truhlar, J. Chem. Theory Comput. 2010, 6, 9, 2872-2887,
   DOI: 10.1021/ct100326h
2. H.S. Yu, L.J. Fiedler, I.M. Alecu,, D.G. Truhlar, Computer Physics Communications 2017, 210, 132-138,
   DOI: 10.1016/j.cpc.2016.09.004



Level of theory: wb97xd3/def2tzvp
Scale Factor for Zero-Point Energies     = 0.973
Scale Factor for Harmonic Frequencies    = 0.987
Scale Factor for Fundamental Frequencies = 0.948
(execution time: 00:11:35)



You may copy-paste the computed harmonic frequency scaling factor(s) to ARC (under the `freq_dict` in ARC/data/freq_scale_factors.yml):
  'wb97xd3/def2tzvp': 0.987,  # [4]

Also, adjusted the optimisation keywords as they appear to be out of date.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes the ORCA parser to correctly extract scaling factors from optimization output and updates ORCA optimization keywords to use current syntax.

Key changes:

  • Fixed ZPE parser to properly handle the 'Eh' unit token in ORCA output
  • Added formatting functions to normalize method and basis set names for ORCA compatibility
  • Updated DFT grid keywords from deprecated Grid5/Grid6 to defgrid2/defgrid3

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
arc/parser/adapters/orca.py Fixed ZPE correction parser to correctly locate the energy value before the 'Eh' unit token
arc/job/adapters/orca.py Added method/basis formatting helpers, updated grid keywords to current ORCA syntax, and integrated formatting into input file generation

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 41 to 77
def _format_orca_method(method: str) -> str:
"""
Convert ARC method names to ORCA-friendly labels when needed.
"""
if not method:
return method
if method.lower() == 'wb97xd':
logger.warning('ORCA does not support wb97xd; use wb97x or wb97x-d3. '
'wb97xd3 will be normalized to wb97x-d3.')
return ORCA_METHOD_ALIASES.get(method.lower(), method)


def _format_orca_basis_token(token: str) -> str:
"""
Convert def2 basis tokens to ORCA formatting (e.g., def2tzvp -> def2-TZVP).
"""
if not token:
return token
parts = token.split('/')
base = parts[0]
if base.lower().startswith('def2'):
base_rest = base[4:]
if base_rest.startswith('-'):
base_rest = base_rest[1:]
if base_rest:
base = f"def2-{base_rest.lower()}"
if len(parts) > 1:
parts = [base] + [part.lower() for part in parts[1:]]
return '/'.join(parts)
return base


def _format_orca_basis(basis: str) -> str:
"""
Convert basis strings to ORCA-friendly labels where applicable.
"""
if not basis:
return basis
return ' '.join(_format_orca_basis_token(token) for token in basis.split())
Copy link

Copilot AI Dec 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new helper functions '_format_orca_method', '_format_orca_basis_token', and '_format_orca_basis' lack test coverage. Since there are existing test files for the ORCA adapter (arc/job/adapters/orca_test.py), tests should be added to verify these formatting functions work correctly with various inputs like 'wb97xd3', 'def2tzvp', 'def2-TZVP', 'def2tzvp/c', etc.

Copilot uses AI. Check for mistakes.
Normalizes method strings to ORCA-friendly labels and addresses the wb97xd deprecation by suggesting alternatives.

Formats basis set strings to ORCA formatting (e.g., def2tzvp -> def2-TZVP).

Fixed the opt and fine opt keywords to be more inline with 5.0.4 - 6.0.0
Adds unit tests for the ORCA method, basis set token, and basis set formatting helper functions.
These tests ensure the correct conversion of basis set names and methods
to the format expected by ORCA.
calvinp0 and others added 2 commits December 31, 2025 22:48
Addresses an issue where the zero-point energy (ZPE) extraction fails when the "Eh" unit is explicitly present in the line.

Improves the parsing logic to correctly identify and extract the ZPE value, regardless of whether the "Eh" unit is present or not.
Copy link
Member

@JintaoWu98 JintaoWu98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Copy link
Collaborator

@kfir4444 kfir4444 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! 😄

@calvinp0 calvinp0 merged commit 6fcd667 into main Jan 4, 2026
6 checks passed
@calvinp0 calvinp0 deleted the orca_scaling branch January 4, 2026 10:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants