This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
We have collected computed barrier heights and reaction energies (and associated model structures) for five enzymes from studies published by Himo and co-workers. Using these data, obtained at the B3LYP/6-311+G(2d,2p)[LANL2DZ]//B3LYP/6-31G(d,p) level of theory, we then benchmark PM6, PM7, PM7-TS, and DFTB3 and discuss the influence of system size, bulk solvation, and geometry re-optimization on the error. The mean absolute differences (MADs) observed for these five enzyme model systems are similar to those observed for PM6 and PM7 for smaller systems (10-15 kcal/mol), while DFTB3 results in a MAD that is significantly lower (6 kcal/mol). The MADs for PMx and DFTB3 are each dominated by large errors for a single system, and if that system is disregarded the MADs fall to 4-5 kcal/mol. Overall, results for the condensed phase are neither more nor less accurate relative to B3LYP than those in the gas phase. With the exception of PM7-TS, the MADs for small and large structural models are very similar, with a maximum deviation of 3 kcal/mol for PM6. Geometry optimization with PM6 shows that for one system this method predicts a different mechanism compared to B3LYP/6-31G(d,p). For the remaining systems, geometry optimization of the large structural model increases the MAD relative to single points, by 2.5 and 1.8 kcal/mol for barriers and reaction energies, respectively. For the small structural model the corresponding MADs decrease by 0.4 and 1.2 kcal/mol, respectively. However, despite these small changes in the MADs, significant changes in the structures are observed for some systems, such as proton transfer and hydrogen bonding rearrangements. The paper represents the first step in the process of creating a benchmark set of barriers computed for systems that are relatively large and representative of enzymatic reactions, a considerable challenge for any one research group but possible through a concerted effort by the community.
We end by outlining steps needed to expand and improve the data set and how other researchers can contribute to the process.
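As a reading aid, the MAD statistic used throughout the abstract, and the effect of disregarding a single dominant system, can be sketched as below. This is a minimal illustration only: the function names are hypothetical and the energies are made-up placeholders, not values from the paper.

```python
from statistics import fmean


def mean_absolute_difference(method, reference):
    """MAD between two equal-length lists of energies (kcal/mol)."""
    if len(method) != len(reference):
        raise ValueError("method and reference must have the same length")
    return fmean([abs(m - r) for m, r in zip(method, reference)])


def mad_without(method, reference, skip):
    """MAD with the system at index `skip` disregarded."""
    kept_method = [m for i, m in enumerate(method) if i != skip]
    kept_reference = [r for i, r in enumerate(reference) if i != skip]
    return mean_absolute_difference(kept_method, kept_reference)


# Placeholder barriers for five systems (kcal/mol); NOT data from the paper.
reference = [10.0, 15.0, 20.0, 12.0, 18.0]  # stands in for B3LYP
method = [12.0, 11.0, 45.0, 15.0, 16.0]     # stands in for a semi-empirical method

mad_all = mean_absolute_difference(method, reference)
mad_trimmed = mad_without(method, reference, skip=2)  # drop the outlier system
print(f"MAD (all systems): {mad_all:.2f} kcal/mol")
print(f"MAD (outlier disregarded): {mad_trimmed:.2f} kcal/mol")
```

With one large outlier among five systems, the trimmed MAD drops sharply, which is the qualitative behaviour the abstract describes for PMx and DFTB3.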
This is a nice paper, even though the results are frustrating for those who expect that semi-empirical methods will soon become a useful alternative to DFT in real-life applications.
A few comments:
The ball-and-stick models in Fig. 3 are stunningly beautiful, but are unfortunately very hard to interpret, because the phenol moiety partially blocks the substrate structure. Would a "thin-stick-only model" be easier to look at?
In Fig. 2, would it be possible to highlight the differences between the PM6 and the DFT model?
Table 2 is not completely clear at first sight. Wouldn't it be better just to show the actual computed barriers, instead of the change in going from one model to the next?
"the results are frustrating for those who expect that semi-empirical methods will soon become a useful alternative to DFT in real-life applications."
Yes, there is still some way to go, it seems.
"Would a "thin-stick-only model" be easier to look at?"
I tried it, but it didn't seem to help. The only thing that really works is to rotate them side by side. I am working on the Supplementary Material, so the optimised coordinates will be available soon.
"In Fig. 2, would it be possible to highlight the differences between the PM6 and the DFT model?"
I tried to overlay the two structures, but that made matters even worse. Again, once you can rotate and zoom, it works OK. The Supplementary Material is coming up.
"Wouldn't it be better just to show the actual computed barriers, instead of the change in going from one model to the next?"
The main reason to study the effect of system size was to see if one could get a better estimate of the barrier heights with an ONIOM-like approach, so we were interested in how well the SE methods handled the change.
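To make the ONIOM-like idea concrete, here is a minimal sketch assuming the standard two-layer subtractive form: the high-level (B3LYP) barrier of the small model plus the low-level (SE) change on going from the small to the large model. The function names are hypothetical and the numbers are placeholders, not values from Table 2.

```python
def oniom_like_barrier(high_small, low_small, low_large):
    """Two-layer subtractive estimate of the large-model barrier (kcal/mol).

    high_small: high-level barrier for the small model
    low_small:  low-level barrier for the small model
    low_large:  low-level barrier for the large model
    """
    return high_small + (low_large - low_small)


def size_correction_error(high_small, high_large, low_small, low_large):
    """Error of the estimate relative to the true high-level large-model barrier.

    It equals the difference between the low-level and high-level
    size effects, which is why the change between models matters.
    """
    return (low_large - low_small) - (high_large - high_small)


# Placeholder barriers (kcal/mol); NOT data from the paper.
high_small, high_large = 12.0, 16.0  # stands in for B3LYP
low_small, low_large = 20.0, 23.5    # stands in for an SE method

estimate = oniom_like_barrier(high_small, low_small, low_large)
error = size_correction_error(high_small, high_large, low_small, low_large)
print(f"ONIOM-like estimate: {estimate:.1f} kcal/mol, error: {error:+.1f} kcal/mol")
```

In this form the absolute error of the SE method cancels, and only its error in the change between models remains, which is the quantity the table of changes reports.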
In Table 2, there is a sign error in the reaction energy for AspDC (model 3). The row
"B3LYP -9.5 9.8 -0.9 -1.4 9.1 3.4 4.8"
should read
"B3LYP -9.5 9.8 -0.9 +1.4 9.1 3.4 4.8"
I found it when I realized that, as originally written, the B3LYP energy for the full model would be 6.2 kcal.mol-1, instead of the 9 kcal.mol-1 shown in Table 1. I have checked it in the original paper. The updated MADs are 1.0 kcal.mol-1 (PM6), 2.6 kcal.mol-1 (PM7), and 4.1 kcal.mol-1 (DFTB3).
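The arithmetic behind this check can be sketched as follows, assuming (as the comment implies) that the full-model energy is a running sum that includes the model 3 increment. Flipping the sign of an increment x changes the sum by -2x. The function name is hypothetical; the three numbers are the ones quoted above.

```python
def total_after_sign_flip(total_as_written, increment_as_written):
    """Sum obtained when one increment x in a running sum is replaced by -x."""
    return total_as_written - 2.0 * increment_as_written


total_as_written = 6.2       # full-model energy implied by Table 2 as written
increment_as_written = -1.4  # AspDC reaction energy change, model 3, as written
table1_value = 9.0           # full-model energy quoted from Table 1 ("9 kcal.mol-1")

corrected_total = total_after_sign_flip(total_as_written, increment_as_written)
print(f"Corrected full-model energy: {corrected_total:.1f} kcal/mol")
print("Consistent with Table 1:", abs(corrected_total - table1_value) < 0.05)
```

The 2.8 kcal.mol-1 gap between 6.2 and 9 is exactly twice 1.4, which is what a single sign error in that entry would produce.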