Calculating site-specific evolutionary rates at the amino-acid or codon level yields similar rate estimates

Department of Integrative Biology, The University of Texas at Austin, Austin, TX, USA
DOI
10.7287/peerj.preprints.2739v1
Subject Areas
Bioinformatics, Computational Biology, Evolutionary Studies, Statistics
Keywords
evolutionary rate, codon evolution models, amino-acid evolution models, rate variation
Copyright
© 2017 Sydykova et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Sydykova DK, Wilke CO. 2017. Calculating site-specific evolutionary rates at the amino-acid or codon level yields similar rate estimates. PeerJ Preprints 5:e2739v1

Abstract

Site-specific evolutionary rates can be estimated from codon sequences or from amino-acid sequences. For codon sequences, the most popular methods use some variation of the dN/dS ratio. For amino-acid sequences, one widely used method, Rate4Site, assigns a relative conservation score to each site in an alignment. How site-wise dN/dS values relate to Rate4Site scores is not known. Here we elucidate the relationship between these two rate measurements. We simulate sequences with known dN/dS, using either dN/dS models or mutation-selection models for simulation. We then infer Rate4Site scores on the simulated alignments, and we compare those scores to either true or inferred dN/dS values on the same alignments. We find that Rate4Site scores generally correlate well with true dN/dS, and the correlations strengthen in alignments with greater sequence divergence and more taxa. Moreover, Rate4Site scores correlate nearly perfectly with inferred dN/dS values, even for small alignments with little divergence. Finally, we verify this relationship between Rate4Site and dN/dS in a variety of natural sequence alignments. We conclude that codon-level and amino-acid-level analysis frameworks are directly comparable and yield near-identical inferences.
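The comparison described in the abstract, site-wise amino-acid rates against site-wise dN/dS on the same alignment, can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names `normalize_by_mean` and `pearson`, the use of a Pearson correlation, the mean-normalization step, and all numeric values are assumptions made for illustration only. Real inputs would come from the output of Rate4Site and of a codon-model fit.

```python
from math import sqrt


def normalize_by_mean(rates):
    """Rescale site-wise rates so that their mean is 1 (relative rates)."""
    mean = sum(rates) / len(rates)
    if mean == 0:
        raise ValueError("mean rate is zero; cannot normalize")
    return [r / mean for r in rates]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Hypothetical toy values, one entry per alignment site (not data from the paper).
site_dnds = [0.05, 0.20, 0.45, 0.10, 0.80, 0.30]
site_amino_acid_rates = [0.2, 0.7, 1.6, 0.4, 2.9, 1.1]

r = pearson(normalize_by_mean(site_dnds), normalize_by_mean(site_amino_acid_rates))
print(f"correlation between the two rate measures: {r:.3f}")
```

Dividing each set of rates by its mean puts both measures on a common relative scale, which makes them easier to plot against each other. It does not change the Pearson correlation, which is invariant to positive rescaling of either variable.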

Author Comment

This is a preprint submission to PeerJ Preprints.