ProCS15: A DFT-based chemical shift predictor for backbone and Cβ atoms in proteins
- Subject Areas
- Biophysics, Computational Biology
- Keywords
- protein structure, NMR, chemical shifts, quantum chemistry
- Copyright
- © 2015 Larsen et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
- Cite this article
- 2015. ProCS15: A DFT-based chemical shift predictor for backbone and Cβ atoms in proteins. PeerJ PrePrints 3:e1308v2 https://doi.org/10.7287/peerj.preprints.1308v2
Abstract
We present ProCS15, a program that computes the isotropic chemical shielding values of backbone and Cβ atoms for a given protein structure in less than a second. ProCS15 is based on approximately 2.35 million OPBE/6-31G(d,p)//PM6 calculations on tripeptides and small structural models of hydrogen bonding. The ProCS15-predicted chemical shielding values are compared to experimentally measured chemical shifts for Ubiquitin and the third IgG-binding domain of Protein G through linear regression, and yield RMSD values of up to 2.2, 0.7, and 4.8 ppm for carbon, hydrogen, and nitrogen atoms, respectively. These RMSD values are very similar to the corresponding RMSD values computed using OPBE/6-31G(d,p) for the entire structure of each protein. The maximum RMSD values can be reduced by using NMR-derived structural ensembles of Ubiquitin. For example, for the largest ensemble the largest RMSD values are 1.7, 0.5, and 3.5 ppm for carbon, hydrogen, and nitrogen, respectively. The corresponding RMSD values predicted by several empirical chemical shift predictors range over 0.7-1.1, 0.2-0.4, and 1.8-2.8 ppm for carbon, hydrogen, and nitrogen atoms, respectively.
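The comparison described in the abstract, in which predicted shieldings are related to measured shifts through linear regression and summarized as an RMSD, can be sketched as follows. This is a minimal illustration and not the ProCS15 code: the function name `regression_rmsd` is hypothetical, the input numbers are made up, and it assumes an ordinary least-squares fit per atom type with the RMSD taken over the fit residuals.

```python
import math


def regression_rmsd(shieldings, shifts):
    """Fit shift ~ slope * shielding + intercept by ordinary least squares.

    Returns (slope, intercept, rmsd), where rmsd is the root-mean-square
    deviation between the fitted and the measured shifts, in ppm.
    Assumption: one fit per atom type; this is a sketch, not ProCS15 itself.
    """
    n = len(shieldings)
    if n != len(shifts) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")

    mean_x = sum(shieldings) / n
    mean_y = sum(shifts) / n
    sxx = sum((x - mean_x) ** 2 for x in shieldings)
    if sxx == 0.0:
        raise ValueError("all shielding values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(shieldings, shifts))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # RMSD of the residuals after mapping shieldings onto the shift scale
    squared_error = sum(
        (y - (slope * x + intercept)) ** 2 for x, y in zip(shieldings, shifts)
    )
    rmsd = math.sqrt(squared_error / n)
    return slope, intercept, rmsd


if __name__ == "__main__":
    # Made-up carbon shieldings and shifts (ppm), for illustration only
    predicted_shielding = [130.0, 128.5, 125.0, 121.0]
    measured_shift = [55.2, 56.4, 60.1, 63.9]
    slope, intercept, rmsd = regression_rmsd(predicted_shielding, measured_shift)
    print(f"slope={slope:.3f} intercept={intercept:.2f} rmsd={rmsd:.3f} ppm")
```

Because shielding and shift run in opposite directions, the fitted slope is expected to be negative. The regression absorbs the reference shielding into the intercept, so no explicit reference compound is needed in this sketch.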
Author Comment
Relatively minor changes based on feedback. Protonation states are now explicitly discussed, and links to data and code are now included.
Supplemental Information
Table S1
Overview table. Column 0 is the central residue type in the tripeptide. Column 1 is the grid spacing in the data file. Column 2 is the size of the data files for a single atom type after data compression. Column 3 is the number of initially generated samples. Column 4 is the number of chemical shift data points after the geometry optimization and NMR calculations. Column 5 is the interpolation method used to fill in the missing data points. Column 6 is the number of side-chain angles for the amino acid in ProCS15.