Separate 1,3 sigmatropic rearrangement family #550

kspieks · 2021-11-27T22:13:43Z

This PR brings several improvements

Improves rate estimates by separating the 1,3 sigmatropic rearrangement reaction template into the traditional ketoenol tautomerization and the 1,3 sigmatropic rearrangement. This essentially reverts PR Rename ketoenol to 1,3 sigmatropic rearrangement (Part 3) #532. The Ea for the simple H transfer with a traditional ketoenol reaction is much lower than the Ea for the heavy atom transfers with 1,3 sigmatropic rearrangement. Combining these trees messed up the rate estimates for ketoenol reactions since higher level nodes were averaging wildly different Arrhenius parameters. Now, the average error for ketoenol rate estimation (last line from running the ATG notebook) is much lower than it was before, and even slightly lower when compared to PR Refit Ketoenol Rate Rules #524 which had not changed the template yet.
Adds a thermo library for all species that had been added as new training reactions for these families. Previously, GAV was used to estimate the thermo for the reverse reaction when fitting the rate tree. The new thermo library will improve the reverse rate estimate when fitting the rate tree. All species thermo was calculated at CCSD(T)-F12a/cc-pVDZ-F12//ωB97X-D3/def2-TZVP with AECs and BACs that been added in PRs AEC BAC for ωB97X-D3/def2-TZVP and B97-D3/def2-mSVP #459 and Update CCSD(T)-F12 AECs and BACs #508.
Performs a thorough conformer search for all species. No new training reactions were added in this PR. The existing training reactions calculated at CCSD(T)-F12a/cc-pVDZ-F12//ωB97X-D3/def2-TZVP from PRs Refit Ketoenol Rate Rules #524, Refit Ketoenol Rate Rules (Part 2) #527, and Rename ketoenol to 1,3 sigmatropic rearrangement (Part 3) #532 were separated into the new reaction templates. However, a thorough conformer search was done with ACS for all reactants, transition states, and products. The previous PRs that had added these reactions did not do a conformer search because the RDKit function CalcNumRotatableBonds incorrectly classified these species as rigid. I should've tested it more thoroughly. Upon closer inspection, the documentation's recommended settings do not give the expected result for all molecules, and it is actually better to use RotatableBondSmarts instead.

The zip file contains the Arkane outputs for thermo calculations and Arrhenius fitting. It also contains the notebooks used to fit the rate trees.
notebooks.zip

kspieks · 2021-11-30T15:37:26Z

Tagging @davidfarinajr since I noticed he is working on a PR for 1,3 sigmatropic rearrangement

davidfarinajr

Thanks for this PR @kspieks ! I've had issues with some of the rates for 1,3 sigmatropic rearrangement family recently, and I think this PR will go a long way towards fixing those issues. I had a wide variety of comments on this review 😂 , some are requested changes (squashing a commit for example) and some are just thinking out loud so don't feel like you need to address them all.

davidfarinajr · 2021-12-06T21:17:43Z

input/kinetics/families/1,3_sigmatropic_rearrangement/groups.py

 2 *3 R!H u0 {1,S} {4,S}
 3 *1 R!H u0 {1,D}
-4 *4 R   u0 {2,S}
+4 *4 R!H u0 {2,S}


By not allowing *4 to be H, we will no longer generate reactions with *4H-*3C or *4H-*3N (since *3 is only allowed to be O and S for ketoenol family). Do we want to disallow these types of reactions? It seems the barrier is very high for *4H-*3C (~450 kJ/mol for a rxn I calculated here 0f01765) so maybe they will never be relevant and we shouldn't generate them as estimating them to be faster than they are might cause problems

Yea that's a good observation. Technically this PR reduces coverage relative to PR #532. However, the 1,3 sigmatropic template in this PR still expands the coverage of possible reactions that RMG could explore relative to anything before PR #532, so we're not removing legacy reaction templates. Separately, as you point out, these reactions have incredibly high barriers, which almost certainly preclude them from being added to any RMG mechanism. Although we could add training reactions that teach RMG to estimate these ~450 kJ/mol barriers (I had added some in my previous PR), my understanding is that the quality of the estimate (or whether our templates even allow these reactions in the first place) won't have any practical impact since RMG would never include these high barrier reactions.

davidfarinajr · 2021-12-06T21:22:10Z

input/thermo/libraries/Spiekermann_refining_elementary_reactions.py

@@ -1,8 +1,8 @@
 #!/usr/bin/env python
 # encoding: utf-8



Could you please squash this commit 1b42f5f into first commit 0a37dea that adds this thermo library?

Yea absolutely :)

davidfarinajr · 2021-12-06T21:30:07Z

input/kinetics/families/1,3_sigmatropic_rearrangement/rules.py

-    kinetics = ArrheniusBM(A=(3.50759e-62,'s^-1'), n=22.1275, w0=(750402,'J/mol'), E0=(-20429.3,'J/mol'), Tmin=(300,'K'), Tmax=(1500,'K'), uncertainty=RateUncertainty(mu=-0.8800459367924431, var=31.676078103554833, Tref=1000.0, N=56, data_mean=0.0, correlation='Root_N-1R!H-inRing',), comment="""BM rule fitted to 56 training reactions at node Root_N-1R!H-inRing
-    Total Standard Deviation in ln(k): 13.494121433083981"""),
+    label = "Root_1R!H-inRing",
+    kinetics = ArrheniusBM(A=(9.20285e+105,'s^-1'), n=-26.8654, w0=(654062,'J/mol'), E0=(462603,'J/mol'), Tmin=(300,'K'), Tmax=(1500,'K'), uncertainty=RateUncertainty(mu=1.596114397653623, var=262.29968162840566, Tref=1000.0, N=8, data_mean=0.0, correlation='Root_1R!H-inRing',), comment="""BM rule fitted to 8 training reactions at node Root_1R!H-inRing


An A of 9e105 and n of -26 concerns me a bit, especially because temperature range is 300-1500K and someone might run RMG hotter than that (I think large negative n is better than large positive when extrapolating to T>1500, but still a bit concerning). Do we know why there is such a big spread here? Which rxns are fit to this node and if there are any outliers?

A larger spread is expected higher up the tree, so perhaps we should put constraints on the fitted params (-10<n<10 for example) so we don't get large negative or positive values that might fit better over the T range, but extrapolate more poorly? Or we could expand the T range for fitting, but maybe your training reaction kinetics are not valid over that range, or use the collision limit at high T as the high T rate, not sure.

Thanks for bringing this up. To summarize the discussion you, Matt, and I had this morning at RMG developer hours, we ultimately care about the fit quality so it's okay if some of the fitted parameters are odd (very very odd haha) if they give a good fit. Regarding the temperature range, I will override the hardcoded values (link) from the fit_to_reactions method in the ArrheniusBM class to include 2000K. For ketoenol, 14 of the 17 reactions are valid from 300K to 2000K so I'd argue that it makes sense to ask the other 3 training reactions to extrapolate to 2000K rather than restrict all reactions to 1500 K. Similarly, 11 of the 12 training reactions for 1,3 sigmatropic were also fit to 3 parameter Arrhenius from 300K to 2000K. For now, I left these as separate commits so you can see that changes. Including 2000K actually made the fitted parameters more reasonable. I can squash these before you merge

davidfarinajr · 2021-12-06T21:33:53Z

input/kinetics/families/Ketoenol/groups.py

+    ['FORM_BOND', '*4', 1, '*1'],
+])
+
+boundaryAtoms = ["*2", "*3"]


Do we need boundary atoms? I'm not sure how these are currently used in RMG. Based on the Root, it looks like this would only match a 1,3 rearrangement.

Thanks for asking about this at RMG developer hours today. To summarize your discussion, boundary atoms are deprecated. Since ATG, doesn't use them, I will remove this line.

Many reactants in this dataset are not unique. For these cases, only the first instance is included.

kspieks · 2021-12-08T01:21:31Z

Thanks for reviewing @davidfarinajr ! I appreciate your comments and have left replies. For now, I pushed fixup commits that I can squash once this is ready to be merged

davidfarinajr · 2021-12-08T16:37:06Z

Thanks for addressing my comments @kspieks ! This PR looks good to me! By including 2000K into the fit, it looks like the rate changes by an order of magnitude over 1500-2000 K (for this particular example anyways),

So perhaps we should include 2000K point when fitting BM rules for default Trange settings.

I think this is ready to merge, if you want to squash those commits, I'll merge it in. Thanks!

Reinstate the training reactions that were previously in this family (originally added in PR #273). Since this reaction family will exclusively handle ketoenol tautomerization reactions, training reactions from PR #216 will remain in the 1,3 sigmatropic rearrangement reaction family.

R!H ensures this family does not overlap with the new ketoenol template Also remove the previous ketoenol training reactions that were previously merged into this template

kspieks · 2021-12-09T03:40:15Z

Thanks @davidfarinajr! I squashed the commits
I can make a mini PR tomorrow for adding 2000K to the ArrheniusBM :)

davidfarinajr · 2021-12-09T16:12:43Z

I almost forgot about adding Ketoenol family back to recommended families 🤦 . Just did that here c47c48b. I'll merge when tests pass this time ✅

kspieks · 2021-12-09T20:10:59Z

Thanks for reviewing and merging!

kspieks added Topic: Kinetics Topic: Thermo Topic: Tree Gen labels Nov 27, 2021

kspieks self-assigned this Nov 27, 2021

kspieks requested a review from hwpang November 27, 2021 22:15

kspieks force-pushed the separate_1_3_sigmatropic branch from 8cbc102 to 742f5d6 Compare November 27, 2021 22:44

kspieks mentioned this pull request Dec 2, 2021

Invalid k(E) at high T for BM rate rule estimate ReactionMechanismGenerator/RMG-Py#2238

Closed

davidfarinajr reviewed Dec 6, 2021

View reviewed changes

Add thermo values for ketoenol and 1,3 sigmatropic rearrangement

d237d99

Many reactants in this dataset are not unique. For these cases, only the first instance is included.

kspieks force-pushed the separate_1_3_sigmatropic branch from 742f5d6 to 5d504aa Compare December 8, 2021 01:21

davidfarinajr self-requested a review December 8, 2021 16:37

davidfarinajr approved these changes Dec 8, 2021

View reviewed changes

kspieks added 6 commits December 8, 2021 20:43

Ketoenol: add training reactions and refit rate tree

717b6c0

Update reaction family pdf with ketoenol image

8fcb504

Update reaction family for 1,3 sigmatropic rearrangement

00df5ea

R!H ensures this family does not overlap with the new ketoenol template Also remove the previous ketoenol training reactions that were previously merged into this template

1,3 sigmatropic rearrangement: add training reactions + refit rate tree

d3a74ee

Minor: remove .DS_Store file

694a7dc

kspieks force-pushed the separate_1_3_sigmatropic branch from 5d504aa to 694a7dc Compare December 9, 2021 01:43

add Ketoenol family to recommended families

c47c48b

davidfarinajr merged commit b6cfa54 into main Dec 9, 2021

davidfarinajr deleted the separate_1_3_sigmatropic branch December 9, 2021 19:13

kspieks mentioned this pull request Dec 9, 2021

Allow ArrheniusBM Fitting up to 2000K for RMG-database Rate Trees ReactionMechanismGenerator/RMG-Py#2243

Merged

kspieks mentioned this pull request Feb 15, 2022

Rotor scans #565

Merged

Separate 1,3 sigmatropic rearrangement family #550

Separate 1,3 sigmatropic rearrangement family #550

Uh oh!

Conversation

kspieks commented Nov 27, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kspieks commented Nov 30, 2021

Uh oh!

davidfarinajr left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kspieks commented Dec 8, 2021

Uh oh!

davidfarinajr commented Dec 8, 2021

Uh oh!

kspieks commented Dec 9, 2021

Uh oh!

davidfarinajr commented Dec 9, 2021

Uh oh!

kspieks commented Dec 9, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

kspieks commented Nov 27, 2021 •

edited

Loading