Prediction of pKa values for drug-like molecules using semiempirical quantum chemical methods
- Published
- Accepted
- Subject Areas
- Biophysics, Computational Biology
- Keywords
- pKa prediction
- Copyright
- © 2016 Jensen
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2016. Prediction of pKa values for drug-like molecules using semiempirical quantum chemical methods. PeerJ Preprints 4:e2564v2 https://doi.org/10.7287/peerj.preprints.2564v2
Abstract
Rapid yet accurate pKa prediction for drug-like molecules is a key challenge in computational chemistry. This study uses PM6-DH+/COSMO, PM6/COSMO, PM7/COSMO, PM3/COSMO, AM1/COSMO, PM3/SMD, AM1/SMD, and DFTB3/SMD to predict the pKa values of 53 amine groups in 48 drug-like compounds. The approach uses an isodesmic reaction in which the pKa value is computed relative to a chemically related reference compound whose pKa value has been measured experimentally or estimated using a standard empirical approach. The AM1- and PM3-based methods perform best, with RMSE values of 1.4-1.6 pH units and uncertainties of ±0.2-0.3 pH units, which makes them statistically equivalent. However, for all but PM3/SMD and AM1/SMD the RMSEs are dominated by a single outlier, cefadroxil, caused by proton transfer in the zwitterionic protonation state. If this outlier is removed, the RMSE values for PM3/COSMO and AM1/COSMO drop to 1.0 ± 0.2 and 1.1 ± 0.3 pH units, while PM3/SMD and AM1/SMD remain at 1.5 ± 0.3 and 1.6 ± 0.3/0.4 pH units, making the COSMO-based predictions statistically better than the SMD-based predictions. Thus, for pKa calculations where a zwitterionic state is not involved, or where proton transfer in a zwitterionic state is not observed, PM3/COSMO or AM1/COSMO is the best pKa prediction method; otherwise PM3/SMD or AM1/SMD should be used. Fast and relatively accurate pKa prediction for hundreds to thousands of drug-like amines is therefore feasible with the current setup and relatively modest computational resources.
Author Comment
RMSE uncertainties have been recalculated and minor typos fixed.