The shiftability of protein coding genes: the genetic code was optimized for frameshift tolerating
- Published
- Accepted
- Subject Areas
- Computational Biology, Evolutionary Studies, Genetics, Genomics, Molecular Biology
- Keywords
- protein-coding gene, genetic code, DNA sequence, frameshift mutation, readthrough, shiftability, amino acid substitution score, gene repairing, one gene-three translations
- Copyright
- © 2015 Wang et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
- Cite this article
- 2015. The shiftability of protein coding genes: the genetic code was optimized for frameshift tolerating. PeerJ PrePrints 3:e806v1 https://doi.org/10.7287/peerj.preprints.806v1
Abstract
The genetic code defines the relationship between a protein and its coding DNA sequence. It has been presumed that most frameshifts yield non-functional, truncated, or cytotoxic products. In this study, we report that in E. coli a frameshifted β-lactamase (bla) gene remains functional if all of its internal stop codons are read through or replaced by sense codons. By analyzing a large dataset comprising all available protein-coding genes in major model organisms, we demonstrate that in any species and in any protein-coding gene, the three translational products of the three reading frames are always similar to one another, with nearly constant similarities of ~50% and coverages of ~100%, and that this similarity is predefined by the genetic code rather than by the sequences themselves. Because a coding gene can likely be translated into three isoforms, one from each of the three reading frames, we propose a new gene expression paradigm, “one transcript, three translations”, as an amendment to the traditional “one gene, one/multiple peptides” hypotheses. Finally, we conclude that the genetic code was optimized for frameshift tolerance early in evolution. This optimization endows every protein-coding gene with shiftability, an inherent and lasting ability to tolerate frameshift mutations, which serves as an innate mechanism for cells to deal with the frameshift problem.
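As an illustration of what translating one coding sequence in its three reading frames involves, the following is a minimal Python sketch, not the authors' pipeline. It assumes the standard genetic code; the function names `translate` and `three_frames`, the `stop_symbol` parameter (a crude stand-in for readthrough, where a placeholder residue replaces each stop), and the toy input sequence are all hypothetical. It does not compute the similarity or coverage values reported in the abstract, which would require a sequence alignment step.

```python
from itertools import product

# Standard genetic code, listed with bases in T, C, A, G order at each
# codon position ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0, stop_symbol="*"):
    """Translate `dna` starting at offset `frame` (0, 1, or 2).

    Stop codons are emitted as `stop_symbol` and translation continues,
    so passing e.g. stop_symbol="X" mimics reading through every
    internal stop. Trailing bases that do not fill a codon are ignored.
    This is an illustrative assumption, not the paper's procedure.
    """
    dna = dna.upper()
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


def three_frames(dna, stop_symbol="*"):
    """Return the translations of the three forward reading frames."""
    return [translate(dna, frame, stop_symbol) for frame in range(3)]


if __name__ == "__main__":
    toy = "ATGGCCTAA"  # hypothetical toy sequence, not from the study
    for frame, protein in enumerate(three_frames(toy)):
        print(frame, protein)
```

Comparing the three outputs for a real gene would additionally need a pairwise protein alignment with an amino acid substitution matrix, which is left out here to keep the sketch dependency-free.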
Author Comment
This is the first version of the preprint. Website: www.dnapluspro.com
Supplemental Information
Fig-S1
Fig S1. Alignment of HIV/SIV GP120: (A) coding sequences; (B) protein sequences.
Table-S1
Table S1. The similarities of natural and simulated proteins and their frameshift isoforms.