Skip to content

Conversation

@mliu49
Copy link
Contributor

@mliu49 mliu49 commented Aug 30, 2017

Adds two new species constraint options:

  • maxSingletCarbenes: limits the number of carbon atoms with lone pairs in a molecule
  • maxCarbeneRadicals: limits the number of unpaired electrons a molecule with a carbene can have

This eliminates the need for the global forbidden groups added in ReactionMechanismGenerator/RMG-database@2e18d62, and addresses the speed concerns raised in #1079. Since the intention of the forbidden groups was to prevent molecules with 2 carbenes or 1 carbene and 1 radical from being formed, using species constraints provides a more robust filter than the forbidden groups.

Note: I set the default for maxSingletCarbenes to 1 and the default for maxCarbeneRadicals to 0, which makes them the only species constraints with a default value. The reasoning is that this accomplishes the same thing as having the forbidden groups without any extra input from the user. Any thoughts on whether or not this is a good idea would be appreciated.

The following table compares results of eg1 for different RMG-database versions on current master of RMG-Py (f3550f4), with the last column being on this branch of RMG-Py.

-- Before Aromatics 1 PR After Aromatics 1 PR Revert forbidden groups Add species constraints
Total time (s) 586 1386 765 705
isMoleculeForbidden (s) 61 704 69 67
Core species 32 37 37 37
Core rxns 164 206 206 206
Edge species 2055 2449 2646 2325
Edge rxns 6138 7487 8049 7216

Conclusions which can be drawn from these results:

  • The aromatics PR resulted in an increase in the number of core species and reactions
  • The added forbidden groups contributed over 10 min to total runtime
  • The forbidden groups successfully forbid 197 edge species and 562 edge reactions
  • Using species constraints removed 321 edge species and 833 edge reactions
  • The extra species and reactions removed were molecules which were too large to be caught by the forbidden groups
  • The time difference between reverting forbidden groups and adding species constraints is probably not statistically significant, but could simply be a result of having a smaller edge

@mliu49 mliu49 added the Status: Ready for Review PR is complete and ready to be reviewed label Aug 30, 2017
@codecov
Copy link

codecov bot commented Aug 30, 2017

Codecov Report

Merging #1187 into master will increase coverage by 0.07%.
The diff coverage is 46.66%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1187      +/-   ##
==========================================
+ Coverage   46.44%   46.52%   +0.07%     
==========================================
  Files         152      152              
  Lines       27538    27553      +15     
  Branches     5410     5416       +6     
==========================================
+ Hits        12789    12818      +29     
+ Misses      13868    13860       -8     
+ Partials      881      875       -6
Impacted Files Coverage Δ
rmgpy/data/kinetics/family.py 57.58% <ø> (-0.42%) ⬇️
rmgpy/rmg/input.py 43.11% <ø> (ø) ⬆️
rmgpy/molecule/molecule.py 0% <0%> (ø) ⬆️
rmgpy/constraints.py 95.16% <87.5%> (+48.86%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 81396e4...09ef13c. Read the comment docs.

@mliu49 mliu49 force-pushed the carbene_constraint branch from 7bb2cf9 to 4bfb0ad Compare August 30, 2017 21:03
@mliu49 mliu49 requested a review from KEHANG September 1, 2017 20:27
@mliu49
Copy link
Contributor Author

mliu49 commented Sep 1, 2017

ReactionMechanismGenerator/RMG-database@2e18d62 should be reverted along with this pull request.

@KEHANG KEHANG self-assigned this Sep 4, 2017
@KEHANG
Copy link
Member

KEHANG commented Sep 4, 2017

Side note (not necessarily need to be done in this PR): isMoleculeForbidden() is currently done by looping over all the forbidden groups one by one, which obviously is slow when having large core/edge and lots of forbidden structures. Parallelism would be suitable in this case. But if we could come up with a tree structure to organize forbidden groups, would be another option.

self.assertEqual(mol.getRadicalCount(), 2)
self.assertEqual(mol.getSingletCarbeneCount(), 0)

def testSingletCarbon(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the test examples need a bit more complicated. Ideally test example should come from ReactionMechanismGenerator/RMG-database@2e18d62 since you are trying to replace it. Please pick some representatives from there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by more complicated? A molecule with 2 carbenes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are trying to replace the 18 forbidden groups with this PR. So I think the sample molecules from the 18 groups should be tested and more representative.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's necessary. First of all, these tests are just for getSingletCarbeneCount(), which counts the number of carbene carbons in a molecule. I don't think the test molecules here need to be more complicated to test that RMG can count properly.

I also wrote tests for the actual species constraint module, to make sure that it catches what we expect it to catch. There I tested sample molecules with carbenes/radicals on adjacent carbons. Since the 18 forbidden groups were just trying to ban carbenes/radicals on the same molecule, I think this should be sufficient. Adding more carbons won't change the effect of the test.

# maximum number of singlet carbenes (lone pair on a carbon atom) in a molecule
maximumSingletCarbenes=1,
# maximum number of radicals on a molecule with a singlet carbene
# should be lower than maximumRadicalElectrons in order to have an effect
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change "lower" to "lower or equal"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's equal, then it still won't have any effect, since it will enforce the same limitation as maximumRadicalElectrons, which returns first.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what I'm trying to say is that it's confusing to have comment written as "lower than" but you have example with "equal"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll change the value to 0 then.

@KEHANG
Copy link
Member

KEHANG commented Sep 4, 2017

Review comments on things to be done before merging:

@mliu49
Copy link
Contributor Author

mliu49 commented Sep 4, 2017

A pull request has been made for reverting the forbidden group additions to RMG-database. The RMG-tests build for that branch is up, but still fails to complete the second example, presumably because the benchmark version (with the forbidden groups) is taking to long to finish. The only job to finish (eg1) took about half as long after removing the forbidden groups, as expected. It also had many additional species and reactions, which this PR should remove.

@mliu49 mliu49 force-pushed the carbene_constraint branch from 89cf472 to bd4a9fd Compare September 4, 2017 20:26
@KEHANG
Copy link
Member

KEHANG commented Sep 7, 2017

The code looks good now. Can you run a local RMG-tests with testing version being carbene_constraint -rmgpy and revert_forbidden -db, and provide a table of speed-up comparison?

@mliu49
Copy link
Contributor Author

mliu49 commented Sep 7, 2017

Here are the RMG-tests results.

-- Benchmark Time Tested Time Benchmark Mem Tested Mem
eg1 3:08 1:36 449.96 468.78
eg3 12:13 1:51 399.68 399.90
eg5 1:44 1:04 357.26 351.57
eg6 0:52 0:48 344.66 345.05
eg7 0:36 0:32 335.33 339.53

There were fewer speices and reactions for eg1 in the testing version, mostly due to additional carbenes which were caught by the species constraints.

The test suite stopped at eg7 due to [CH] from GRI-Mech3.0 being forbidden by the species constraints. Because of the way the maxCarbeneRadicals constraint is implemented, a single carbon atom with a radical electron and a lone pair would be caught by the default settings.

Benchmark vs. Testing Version Summary
=========================================
benchmark version of RMG
c5868701bb437c8e4c6a3cb0c520eb33e15ed03f
Fri Sep 1 23:22:39 2017 -0400

testing version of RMG
bb2e2c9b843da3f1520873812426709fe300e0e2
Tue Sep 5 11:28:36 2017 -0400

benchmark version of RMG-database
05619c48fe836ddd97651add611546447008e9fe
Fri Sep 1 09:38:24 2017 -0400

testing version of RMG-database
8b97642d5d26b212c5e4728f2d230856f955551d
Mon Sep 4 12:19:15 2017 -0400
=========================================

@mliu49
Copy link
Contributor Author

mliu49 commented Sep 7, 2017

After talking with @zjburas, I think we should just add the allowed=['reaction libraries] parameter to the problematic examples. We still want to forbid RMG from generating these species since they will probably have very bad thermo estimates.

@KEHANG
Copy link
Member

KEHANG commented Sep 7, 2017

Thanks for the update on RMG-tests. Will fix them soon. In terms of next step for this PR, please git rebase and squash accordingly.

mliu49 and others added 9 commits September 7, 2017 14:40
Counts the number of carbon atoms with a lone pair
Limit the number of radicals in a molecule which has a carbene
Remove maximumHydrogenAtoms since it's not actually implemented.
Change the values for the species constraints in the example
code to more reasonable values in case users use it directly.
@mliu49 mliu49 force-pushed the carbene_constraint branch from bb2e2c9 to 09ef13c Compare September 7, 2017 18:41
@KEHANG
Copy link
Member

KEHANG commented Sep 7, 2017

Looks good to me now. I'm going to merge ReactionMechanismGenerator/RMG-database#211 and this one soon.

@KEHANG KEHANG merged commit f030b20 into master Sep 7, 2017
@KEHANG KEHANG deleted the carbene_constraint branch September 7, 2017 20:23
@amarkpayne amarkpayne removed the Status: Ready for Review PR is complete and ready to be reviewed label Sep 8, 2017
rwest referenced this pull request Jul 7, 2025
Note added by rwest:
This is for the benefit of halocarbon / PFAS chemistry.
Perhaps we should just set the carbene radical count in those input files
and leave the default here at 0. (but with better documentation.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants