Independent evolution of tetraloop in enterovirus oriL replicative element and its putative binding partners in protein 3C
- Published
- Accepted
- Subject Areas
- Evolutionary Studies, Genomics, Virology
- Keywords
- Enterovirus, RNA-protein interaction, tetraloop, virus evolution
- Copyright
- © 2017 Prostova et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2017. Independent evolution of tetraloop in enterovirus oriL replicative element and its putative binding partners in protein 3C. PeerJ Preprints 5:e3153v2 https://doi.org/10.7287/peerj.preprints.3153v2
Abstract
Background. Enteroviruses are small non-enveloped viruses with a (+) ssRNA genome containing a single open reading frame. Enterovirus protein 3C (or 3CD for some species) binds the replicative element oriL to initiate replication. Enterovirus replication has low fidelity, which on the one hand allows the virus to adapt to a changing environment and on the other requires additional mechanisms to maintain genome stability. Structural disturbances in the apical region of oriL domain d can be compensated by amino acid substitutions at positions 154 or 156 of 3C (amino acid numbering corresponds to poliovirus 3C), suggesting that these interacting sequences co-evolve in nature. The aim of this work was to understand the co-evolution patterns of two interacting elements of the enterovirus replication machinery: the apical region of oriL domain d and its putative binding partners in the 3C protein.
Methods. To evaluate the variability of the domain d loop sequence, we retrieved all full enterovirus sequences (>6400 nucleotides) available in the NCBI database in February 2017 and analysed the variety and abundance of sequences in domain d of the replicative element oriL and in the protein 3C.
Results. A total of 2,842 full genome sequences were analysed. The majority of domain d apical loops were tetraloops that conformed to the consensus YNHG (Y=U/C, N=any nucleotide, H=A/C/U). The putative RNA-binding tripeptide 154-156 (Enterovirus C 3C protein numbering) was less diverse than the apical domain d loop region and, in contrast to it, was species-specific.
Discussion. Although the RNA-binding tripeptide has been suggested to interact with the apical region of domain d, the two elements evolve independently in nature. Together, our data indicate evolutionary plasticity of both partners in 3C-oriL recognition.
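The YNHG consensus given in the Results uses IUPAC ambiguity codes, so checking whether a given loop sequence conforms to it is mechanical. The following is a minimal sketch, not the authors' pipeline: the function names `consensus_to_regex` and `matches_ynhg` are hypothetical, and the conversion of T to U assumes that input may come from DNA-style database records.

```python
import re

# IUPAC codes needed for the YNHG consensus, in the RNA alphabet.
# Y = U/C, N = any nucleotide, H = A/C/U (as defined in the abstract).
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "Y": "CU", "N": "ACGU", "H": "ACU",
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a compiled regex."""
    pattern = "".join("[" + IUPAC_RNA[code] + "]" for code in consensus.upper())
    return re.compile(pattern)


YNHG = consensus_to_regex("YNHG")


def matches_ynhg(loop):
    """Return True if `loop` is a tetraloop conforming to YNHG.

    Assumption: input may be written with T instead of U, so T is
    converted to U before matching.
    """
    rna = loop.upper().replace("T", "U")
    return YNHG.fullmatch(rna) is not None


if __name__ == "__main__":
    for loop in ["CCCG", "UGUG", "CAUG", "UUGG"]:
        print(loop, matches_ynhg(loop))
```

Because `fullmatch` is used, only four-nucleotide loops can match; a loop such as UUGG is rejected by this check because G at the third position is not covered by H.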
Author Comment
Table 2 and Figure 3 contain additional details.
Supplemental Information
Multiple sequence alignment of Enterovirus A full genomes
Multiple sequence alignment of filtered set Enterovirus A full genomes
Sequences that differed from any other sequence in the dataset by less than 1% of the nucleotide sequence were omitted.
Multiple sequence alignment of Enterovirus B full genomes
Multiple sequence alignment of filtered set Enterovirus B full genomes
Sequences that differed from any other sequence in the dataset by less than 1% of the nucleotide sequence were omitted.
Multiple sequence alignment of Enterovirus C full genomes
Multiple sequence alignment of filtered set Enterovirus C full genomes
Sequences that differed from any other sequence in the dataset by less than 1% of the nucleotide sequence were omitted.
Multiple sequence alignment of Enterovirus D full genomes
Multiple sequence alignment of filtered set Enterovirus D full genomes
Sequences that differed from any other sequence in the dataset by less than 1% of the nucleotide sequence were omitted.
Multiple sequence alignment of Enterovirus E full genomes
Multiple sequence alignment of filtered set Enterovirus E full genomes
Sequences that differed from any other sequence in the dataset by less than 1% of the nucleotide sequence were omitted.
Multiple sequence alignment of Enterovirus F full genomes
Multiple sequence alignment of filtered set Enterovirus F full genomes
Sequences that differed from any other sequence in the dataset by less than 1% of the nucleotide sequence were omitted.
Multiple sequence alignment of Enterovirus G full genomes
Multiple sequence alignment of filtered set Enterovirus G full genomes
Sequences that differed from any other sequence in the dataset by less than 1% of the nucleotide sequence were omitted.
Multiple sequence alignment of Enterovirus H full genomes
Multiple sequence alignment of filtered set Enterovirus H full genomes
Sequences that differed from any other sequence in the dataset by less than 1% of the nucleotide sequence were omitted.
Multiple sequence alignment of Enterovirus J full genomes
Multiple sequence alignment of filtered set Enterovirus J full genomes
Sequences that differed from any other sequence in the dataset by less than 1% of the nucleotide sequence were omitted.
Multiple sequence alignment of Rhinovirus A full genomes
Multiple sequence alignment of filtered set Rhinovirus A full genomes
Sequences that differed from any other sequence in the dataset by less than 1% of the nucleotide sequence were omitted.
Multiple sequence alignment of Rhinovirus B full genomes
Multiple sequence alignment of filtered set Rhinovirus B full genomes
Sequences that differed from any other sequence in the dataset by less than 1% of the nucleotide sequence were omitted.
Multiple sequence alignment of Rhinovirus C full genomes
Multiple sequence alignment of filtered set Rhinovirus C full genomes
Sequences that differed from any other sequence in the dataset by less than 1% of the nucleotide sequence were omitted.
Occurrence of domain d apical sequences in filtered sets of full genomes of different enterovirus species and serotypes
Tetraloops CCCG, UGUG, CAUG and UUGG, which were unique to species Enterovirus A, B, C and D and were lost upon filtration, were added and are shown in blue. The red-to-green colour gradient is a heat map of the abundance of genomes with each domain d sequence.
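The filtered alignments listed above omit sequences that differ from another sequence by less than 1% of the nucleotide sequence. One way such a filter could be implemented is sketched below. This is an illustration under stated assumptions, not the procedure the authors used: it assumes the sequences are already aligned to equal length, measures difference as the fraction of mismatched columns while ignoring columns with a gap in either sequence, and keeps the first sequence encountered from each near-identical group. The names `p_distance` and `filter_near_identical` are hypothetical.

```python
def p_distance(a, b):
    """Fraction of differing positions between two aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        return 1.0  # no comparable columns; treat as maximally different
    return sum(x != y for x, y in pairs) / len(pairs)


def filter_near_identical(records, threshold=0.01):
    """Greedily keep sequences that differ from every kept one by >= threshold.

    `records` is a list of (name, aligned_sequence) tuples. The result
    depends on input order (an assumption of this sketch): the first
    member of each near-identical group is the one retained.
    """
    kept = []
    for name, seq in records:
        if all(p_distance(seq, kept_seq) >= threshold for _, kept_seq in kept):
            kept.append((name, seq))
    return kept


if __name__ == "__main__":
    records = [
        ("ref", "A" * 200),
        ("near", "A" * 199 + "C"),       # 0.5% different from ref
        ("far", "A" * 190 + "C" * 10),   # 5% different from ref
    ]
    print([name for name, _ in filter_near_identical(records)])
```

The pairwise comparison is quadratic in the number of sequences, which is acceptable for an illustration but may be slow on alignments of thousands of full genomes.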