Evolutionary analysis of chloroplast tRNA of Gymnosperm revealed the novel structural variation and evolutionary aspect
- Academic Editor
- Genlou Sun
- Subject Areas
- Evolutionary Studies, Genomics, Plant Science
- Keywords
- tRNA, Chloroplast, Anti-codon, Evolution, Transition and transversion, Phylogeny
- Copyright
- © 2020 Zhang et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
- Cite this article
- 2020. Evolutionary analysis of chloroplast tRNA of Gymnosperm revealed the novel structural variation and evolutionary aspect. PeerJ 8:e10312 https://doi.org/10.7717/peerj.10312
Abstract
Gymnosperms such as ginkgo, conifers, cycads, and gnetophytes are vital components of land ecosystems, and they have significant economic and ecological value, as well as important roles as forest vegetation. In this study, we investigated the structural variation and evolution of chloroplast transfer RNAs (tRNAs) in gymnosperms. Chloroplasts are important organelles in photosynthetic plants. tRNAs are key participants in translation, where they act as adapter molecules between the information level of nucleic acids and the functional level of proteins. The basic structures of gymnosperm chloroplast tRNAs were found to have family-specific conserved sequences. The tRNA Ψ-loop was observed to contain a conserved sequence, i.e., U-U-C-N-A-N2. In gymnosperms, tRNAIle was found to encode a “CAU” anticodon, which is usually encoded by tRNAMet. Phylogenetic analysis suggested that plastid tRNAs have a common polyphyletic evolutionary pattern, i.e., they are rooted in multiple common ancestors. Analyses of duplication and loss events in chloroplast tRNAs showed that gymnosperm tRNAs have experienced slightly more gene loss than gene duplication. Transition and transversion analysis showed that the tRNAs are iso-acceptor specific and that they have experienced unequal evolutionary rates. These results provide new insights into the structural variation and evolution of gymnosperm chloroplast tRNAs, which may improve our understanding of the biological characteristics of the tRNA family.
Introduction
Gymnosperms originated in the Paleozoic Devonian Period (about 385 million years ago), and they are a key group in the transition from spore reproduction to seed reproduction in higher plants (Gerrienne et al., 2004; Crisp & Cook, 2011). According to the latest phylogenetic classification, gymnosperms are divided into eight orders, 12 families, 84 genera, and more than 1,000 species (Wang & Ran, 2014). Gymnosperms include ginkgo, cycads, conifers, and gnetophytes, which grow in forests as important timber species and provide raw materials for human use, such as fiber, resin, and tannin (Christenhusz et al., 2010). In addition, gymnosperms include some important threatened plants, of which 40% are at high risk of extinction (Forest et al., 2018). Recent phylogenetic and evolutionary studies of gymnosperms have demonstrated the rapid evolution of mitochondrial (mt) genes and provided further evidence of a sister relationship between conifers and Gnetales (Ran, Gao & Wang, 2010). The high levels of genetic diversity and population differentiation among the Pinus species in gymnosperms have been studied based on plastid DNA markers (Liu et al., 2014). Other studies have indicated patterns related to the physiological ecology, phylogenetic relationships, and population genetic structure of gymnosperm species (Yu et al., 2014; Li et al., 2015; Dong et al., 2016). However, these studies mainly considered phylogeny and evolution at the whole-population level. Thus, the detailed evolutionary characteristics of gymnosperms still need to be elucidated.
Chloroplasts are the site of photosynthesis and of various essential metabolic pathways, e.g., fatty acid and amino acid biosynthesis and the assimilation of nitrogen, sulfur, and selenium (Hoober, 2006; Des Marais, 2000; Knorr & Heimann, 2001; Pilon-Smits et al., 2002; Guo et al., 2007; Kretschmer, Croll & Kronstad, 2017). It is generally recognized that chloroplasts are derived from symbiotic cyanobacteria that were internalized by proto-eukaryotic cells (Hiroki & Daisuke, 2018) and evolved into central organelles. Chloroplasts have their own genome encoding about 100 proteins and they are maternally inherited organelles in most angiosperm plants (Abdallah, Salamini & Leister, 2000; Heuertz et al., 2004; Civan et al., 2014). Among gymnosperms, paternal plastid inheritance is typical of conifers (Fauré et al., 1994; Kaundun & Matsumoto, 2011). Studies have shown that the chloroplast genome is quite conserved, with an average evolutionary rate of 0.2–1.0 × 10⁻⁹ per site per year, which is only one-fifth of that for the nuclear genome (Drouin, Daoud & Xia, 2008; Duchene & Bromham, 2013). The chloroplast genome is a covalently closed circular structure with four parts comprising the large single copy region (LSC), small single copy region (SSC), inverted repeat region A (IRa), and inverted repeat region B (IRb). The two IRs have the same sequence but in opposite orientations (Wang et al., 2008; Logacheva et al., 2009; Hereward et al., 2018). Because the chloroplast genome evolves independently, it is possible to construct a molecular phylogenetic tree from the chloroplast genome alone, without requiring any other data. Data analysis based on the conserved evolution of plastids is highly valuable for phylogenetic studies (Kim & Suh, 2013) because it can provide reliable and useful phylogenetic information. The relative completeness and independence of the chloroplast genome mean that it can provide valuable material for research purposes.
Transfer RNAs (tRNAs) undergo numerous post-transcriptional nucleotide modifications and they exhibit abundant chemical diversity, where the bases experience methylation, formylation, and other modifications (Suzuki & Suzuki, 2014). Chemical nucleotide modifications are frequent in tRNAs and they are important for structure, stability, correct folding, aminoacylation, and decoding. For example, a previous analysis of the chemically synthesized f5C34-modified anticodon loop of human mt-tRNAMet showed that f5C34 contributes to the anticodon domain structure of the mt-tRNA (Lusic et al., 2008). tRNAs comprise sequences of fewer than 100 nucleotides that fold into a cloverleaf-type secondary structure and then into an L-shaped tertiary structure (Wilusz, 2015). The secondary structure of tRNAs comprises different arms as well as loops, i.e., the D-arm, acceptor arm, anticodon arm, pseudouridine arm (Ψ-arm), D-loop, variable arm, anticodon loop, and pseudouridine loop (Ψ-loop) (Giegé, Puglisi & Florentz, 1993; Mizutani & Goto, 2000). This unique structure allows tRNAs to act as important bridges between the information level of nucleic acids and the functional level of proteins. The vital components of tRNAs comprise an anti-codon region that recognizes specific codons in the messenger RNA, a 3′-CCA tail for attaching to the cognate amino acid, the Ψ-arm, and a Ψ-loop that is associated with the ribosome machinery (Kirchner & Ignatova, 2014). Asymmetric combinations and the divided segments in tRNA genes allow us to understand the diversity of tRNA molecules. tRNA species fulfill various functions in cellular homeostasis, regulation of gene expression and epigenetics, biogenesis, and even biological disease (Ribasd & Dedon, 2014; Kanai, 2015; Schimmel, 2017).
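As a small illustration of the codon recognition role described above, the sketch below derives the codon that pairs with a given anticodon by strict Watson-Crick complementarity. It is a simplification under stated assumptions: wobble pairing and base modifications are ignored, and the function name `codon_for_anticodon` is hypothetical rather than part of any tool used in this study.

```python
# Strict Watson-Crick pairing for RNA bases (wobble pairing is ignored here).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codon_for_anticodon(anticodon):
    """Return the mRNA codon (5'->3') that pairs with an anticodon written 5'->3'."""
    # The two strands pair antiparallel, so reverse the anticodon before complementing.
    return "".join(COMPLEMENT[base] for base in reversed(anticodon.upper()))


# The CAU anticodon discussed in this article pairs with the AUG codon.
met_codon = codon_for_anticodon("CAU")
# The UGC anticodon of tRNAAla pairs with the GCA codon.
ala_codon = codon_for_anticodon("UGC")
```

Under this strict pairing rule, a CAU anticodon reads AUG, which is why a CAU-bearing tRNAIle is notable: without a modification of the anticodon it would be expected to decode the methionine codon.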
The evolutionary relationships determined between cyanobacteria and monocots show that tRNAs evolved polyphyletically and that they originated from multiple common ancestors with a high rate of gene loss (Mohanta et al., 2017; Mohanta et al., 2019). Nevertheless, the basic details of the tRNAs in plant chloroplasts still need to be elucidated, and the diverse evolutionary features of gymnosperm tRNAs remain unclear.
In this study, we assessed the chloroplast genomes of 12 gymnosperm species representing 12 families in eight orders. The main aims of this study were as follows: (1) to determine the diversification of nucleotides in the secondary structure of gymnosperm tRNAs; (2) to identify the detailed genomic features of chloroplast tRNAs; (3) to assess the evolutionary relationships among different chloroplast tRNAs; and (4) to evaluate the duplication or loss events that occurred in all of the tRNAs considered. Our findings provide important insights into the biological characteristics and evolutionary variation of the tRNA family.
Materials & Methods
Annotation and identification of chloroplast tRNA sequences in gymnosperms
We downloaded complete chloroplast genomes for 12 representative gymnosperms in eight orders from the National Center for Biotechnology Information database (NCBI, https://www.ncbi.nlm.nih.gov/). The gymnosperm species investigated were: Cycas debaoensis Y. C. Zhong & C. J. Chen (KM459003), Dioon spinulosum Dyer ex Eichler (NC_027512), Ginkgo biloba L. (NC_016986), Cedrus deodara (Roxb.) G. Don (NC_014575), Wollemia nobilis W. G. Jones, K. D. Hill & J. M. Allen (NC_027235), Retrophyllum piresii Silba C. N. (KJ017081), Sciadopitys verticillata (Thunb.) Sieb. et Zucc. (NC_029734), Cunninghamia lanceolata (Lamb.) Hook. (NC_021437), Taxus mairei (Lemee et Levl.) Cheng et L. K. Fu (KJ123824), Welwitschia mirabilis Hook.f. (EU342371), Gnetum gnemon L. (KR476377), and Ephedra equisetina Bge. (NC_011954). The gymnosperm tRNA genomes were annotated using GeSeq-Annotation of Organellar Genomes tool (Tillich et al., 2017) where the parameters were set as: circular sequence(s), chloroplast of sequence source, generate multi FASTA; BLAST protein search identity 25% for annotating plastid IR, 85% identity for BLAST rRNA, tRNA and DNA search, Embryophyta chloroplast (CDS+rRNA), third party tRNA annotator ARAGORN v1.2.38, ARWEN v1.2.3, tRNAScan-SE v2.0, and without Refseq choice.
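The text does not state how the genome records were retrieved, so the following is only a sketch of one possible route: building download links for the 12 accessions through the NCBI E-utilities `efetch` endpoint. The helper name `efetch_url` and the choice of FASTA output are assumptions made for illustration; the code only builds URLs and does not contact NCBI.

```python
from urllib.parse import urlencode

# NCBI E-utilities efetch endpoint (using it here is an assumption, not the authors' stated method).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Accessions as listed in the Materials & Methods text.
ACCESSIONS = {
    "Cycas debaoensis": "KM459003",
    "Dioon spinulosum": "NC_027512",
    "Ginkgo biloba": "NC_016986",
    "Cedrus deodara": "NC_014575",
    "Wollemia nobilis": "NC_027235",
    "Retrophyllum piresii": "KJ017081",
    "Sciadopitys verticillata": "NC_029734",
    "Cunninghamia lanceolata": "NC_021437",
    "Taxus mairei": "KJ123824",
    "Welwitschia mirabilis": "EU342371",
    "Gnetum gnemon": "KR476377",
    "Ephedra equisetina": "NC_011954",
}


def efetch_url(accession, rettype="fasta"):
    """Build an efetch URL for one nucleotide record (hypothetical helper)."""
    query = urlencode(
        {"db": "nuccore", "id": accession, "rettype": rettype, "retmode": "text"}
    )
    return f"{EFETCH}?{query}"


# One download link per species; fetching is left to the reader.
urls = {species: efetch_url(acc) for species, acc in ACCESSIONS.items()}
```

Passing `rettype="gb"` instead would request GenBank flat files, which also carry the existing feature annotations.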
Structural analysis of chloroplast tRNAs
ARAGORN (Laslett & Canback, 2004) and tRNAScan-SE software (Lowe & Eddy, 1997) were employed to analyze the sequences and the secondary structure of tRNAs in the chloroplast genomes of the involved gymnosperm plants. The default parameters were set in ARAGORN software. The parameters for tRNAScan-SE were set as: sequence source, bacterial; search mode, default; query sequences, formatted (FASTA); and genetic code for tRNA isotype prediction, universal.
Phylogenetic tree construction
A phylogenetic tree was constructed for all of the tRNAs using MEGA7.0 software (Kumar et al., 2008; Kumar, Stecher & Tamura, 2016). To study the evolutionary details of chloroplast tRNAs in gymnosperm species, an alignment file for the tRNAs was generated with CLUSTAL Omega software before the phylogenetic tree was constructed. MEGA7 software was used to convert the alignment file into MEGA format. The phylogenetic tree was constructed with the following parameters: phylogeny reconstruction of analysis, maximum likelihood model, bootstrap method in phylogeny test, 1,000 bootstrap replicates, nucleotides type, gamma distributed with invariant sites (G+I) model, five discrete gamma categories, partial deletion for gaps/missing data treatment, 95% site coverage cut-off, and very strong for branch swap filter.
Transition/transversion analysis
The sequences of the tRNA isotypes were aligned to determine the transition and transversion rates for chloroplast tRNAs in gymnosperm plants. The files covering all 20 types of tRNAs were transformed into the MEGA file format and analyzed separately using MEGA7.0 software (Kumar, Tamura & Nei, 1994). The transition and transversion rates were analyzed for tRNAs with the following parameters: substitution pattern estimation (ML) analysis, automatic (neighbor-joining tree), maximum likelihood statistical method, nucleotide substitution type, Kimura two-parameter model, gamma distributed (G) site rates, five discrete gamma categories, partial deletion of gaps/missing data treatment, 95% of site coverage cut-off, and very strong branch swap filter.
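To make the transition/transversion distinction concrete, the sketch below counts both kinds of substitution between two aligned sequences. It is not the maximum likelihood estimation performed in MEGA7; it is a plain pairwise count, and the function name and the two toy sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}


def count_ts_tv(seq_a, seq_b):
    """Count (transitions, transversions) between two aligned sequences.

    Gaps and ambiguous symbols are skipped; T is treated as U.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    pairs = zip(seq_a.upper().replace("T", "U"), seq_b.upper().replace("T", "U"))
    for x, y in pairs:
        if x == y:
            continue  # identical site, no substitution
        if x not in PURINES | PYRIMIDINES or y not in PURINES | PYRIMIDINES:
            continue  # gap or ambiguity code
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    return transitions, transversions


# Toy aligned fragments (not real tRNA sequences).
ts, tv = count_ts_tv("GCAUC-A", "ACAUGGA")
```

In this toy pair, the G/A difference at the first site is a transition, the C/G difference at the fifth site is a transversion, and the gapped site is ignored.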
Loss and duplication events analysis for tRNA genes
To investigate the duplication or loss events in tRNA genes, the NCBI taxonomy browser was used to construct the species tree for the 12 gymnosperm species considered. The phylogenetic tree constructed in the evolutionary study was employed as the gene tree. The gene tree for the tRNAs and the species tree for the gymnosperm species were submitted to Notung 2.9 software (Chen, Durand & Farach-Colton, 2000), and then reconciled to discover duplicated and lost tRNA genes in the chloroplast genomes of gymnosperms.
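The idea behind gene tree/species tree reconciliation can be illustrated with a much simpler procedure than the one Notung implements. The sketch below maps each gene-tree node to the smallest species-tree clade containing its species (the LCA mapping) and flags a node as a duplication when it maps to the same clade as one of its children. Loss inference is omitted, and the toy trees, gene names, and function names are all hypothetical.

```python
def leaves(tree):
    """Set of leaf labels under a tree given as nested 2-tuples of strings."""
    if isinstance(tree, str):
        return {tree}
    return leaves(tree[0]) | leaves(tree[1])


def lca(species_tree, targets):
    """Smallest subtree of species_tree that contains every species in targets."""
    if isinstance(species_tree, str):
        return species_tree
    for child in species_tree:
        if targets <= leaves(child):
            return lca(child, targets)
    return species_tree


def count_duplications(gene_tree, species_tree, gene_to_species):
    """Count gene-tree nodes inferred as duplications under the LCA mapping."""
    duplications = 0

    def visit(node):
        nonlocal duplications
        if isinstance(node, str):
            species = {gene_to_species[node]}
            return species, lca(species_tree, species)
        (sp_left, map_left), (sp_right, map_right) = visit(node[0]), visit(node[1])
        species = sp_left | sp_right
        mapped = lca(species_tree, species)
        # A node mapping to the same clade as a child implies a duplication.
        if mapped == map_left or mapped == map_right:
            duplications += 1
        return species, mapped

    visit(gene_tree)
    return duplications


# Toy example: three species A, B, C and four gene copies.
species_tree = (("A", "B"), "C")
gene_tree = (("a1", "a2"), ("b1", "c1"))
gene_to_species = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}
n_dup = count_duplications(gene_tree, species_tree, gene_to_species)
```

Here the pair of copies from species A is one duplication, and the root is a second one because it maps to the same clade as its right child.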
Order | Family | Subfamily | Genus | Species | NCBI Locus |
---|---|---|---|---|---|
Cycadales | Cycadaceae | | Cycas | debaoensis | KM459003 |
Cycadales | Zamiaceae | Diooideae | Dioon | spinulosum | NC_027512 |
Ginkgoales | Ginkgoaceae | | Ginkgo | biloba | NC_016986 |
Pinales | Pinaceae | Abieteae | Cedrus | deodara | NC_014575 |
Araucariales | Araucariaceae | | Wollemia | nobilis | NC_027235 |
Araucariales | Podocarpaceae | | Retrophyllum | piresii | KJ017081 |
Cupressales | Sciadopityaceae | | Sciadopitys | verticillata | NC_029734 |
Cupressales | Cupressaceae | Cunninghamia | Cunninghamia | lanceolata | NC_021437 |
Cupressales | Taxaceae | | Taxus | mairei | KJ123824 |
Welwitschiales | Welwitschiaceae | | Welwitschia | mirabilis | EU342371 |
Gnetales | Gnetaceae | | Gnetum | gnemon | KR476377 |
Ephedrales | Ephedraceae | | Ephedra | equisetina | NC_011954 |
tRNA isotypes (number of tRNAs) | C. debaoensis | D. spinulosum | G. biloba | C. deodara | W. nobilis | R. piresii | S. verticillata | C. lanceolata | T. mairei | W. mirabilis | G. gnemon | E. equisetina |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Ala | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 1 |
Gly | 1 | 1 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 2 | 2 | 1 |
Pro | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 2 | 2 | 2 | 1 |
Thr | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 1 |
Val | 2 | 2 | 1 | 1 | 1 | 2 | 2 | 2 | 0 | 2 | 2 | 1 |
Ser | 3 | 3 | 3 | 3 | 4 | 3 | 3 | 3 | 2 | 3 | 3 | 3 |
Arg | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 | 2 | 4 | 3 | 2 |
Leu | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
Phe | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Asn | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Lys | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 |
Asp | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 |
Glu | 2 | 2 | 1 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 2 | 2 |
His | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Gln | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 1 | 1 | 1 | 1 |
Ile | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 4 | 1 | 1 | 1 |
Met/fMet | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
Tyr | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Cys | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Trp | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 |
Selenocysteine | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Suppressor | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Total | 30 | 31 | 33 | 31 | 32 | 32 | 31 | 32 | 28 | 33 | 32 | 28 |
Results
Genomic features of gymnosperm chloroplast tRNAs
Sequences were analyzed to identify the genomic tRNAs in the chloroplast genomes of 12 gymnosperm species comprising C. debaoensis, D. spinulosum, G. biloba, C. deodara, W. nobilis, R. piresii, S. verticillata, C. lanceolata, T. mairei, W. mirabilis, G. gnemon, and E. equisetina, which were obtained from the NCBI database (Table 1). The results showed that the length of the chloroplast tRNAs varies from 64 nucleotides (nt) for the smallest (tRNAMet-CAU in T. mairei) to 96 nt for the largest (tRNATyr-AUA in W. nobilis, C. deodara, and G. biloba) (Data S1). We found that the chloroplast genomes of gymnosperm plants encode 28 to 33 tRNAs (Table 2): D. spinulosum, C. deodara, and S. verticillata encode 31 tRNAs; W. nobilis, R. piresii, C. lanceolata, and G. gnemon encode 32 tRNAs; and G. biloba and W. mirabilis encode 33 tRNAs. The remaining species, T. mairei, E. equisetina, and C. debaoensis, encode 28, 28, and 30 tRNAs, respectively (Table 2). tRNAAla was not found in R. piresii and T. mairei, and tRNAVal was not detected in T. mairei (Fig. S3). We also observed that none of the species encodes selenocysteine or suppressor tRNAs (Table 2). Overall, tRNASer (in W. nobilis) and tRNAArg (in W. mirabilis) are the most abundant (four types) followed by tRNALeu (three types) (Table 2).
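The per-species totals and the missing isotypes reported above can be re-derived from the counts in Table 2. The following sketch, with hypothetical variable and function names, transcribes the isotype counts, sums each species column, and lists the species in which a given isotype has a count of zero.

```python
SPECIES = [
    "C. debaoensis", "D. spinulosum", "G. biloba", "C. deodara",
    "W. nobilis", "R. piresii", "S. verticillata", "C. lanceolata",
    "T. mairei", "W. mirabilis", "G. gnemon", "E. equisetina",
]

# Isotype counts transcribed from Table 2, in the species order above.
COUNTS = {
    "Ala": [1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1],
    "Gly": [1, 1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 1],
    "Pro": [2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 1],
    "Thr": [1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1],
    "Val": [2, 2, 1, 1, 1, 2, 2, 2, 0, 2, 2, 1],
    "Ser": [3, 3, 3, 3, 4, 3, 3, 3, 2, 3, 3, 3],
    "Arg": [3, 3, 3, 3, 3, 3, 3, 2, 2, 4, 3, 2],
    "Leu": [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
    "Phe": [1] * 12,
    "Asn": [1] * 12,
    "Lys": [1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1],
    "Asp": [1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1],
    "Glu": [2, 2, 1, 2, 2, 2, 2, 2, 1, 2, 2, 2],
    "His": [1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "Gln": [1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1],
    "Ile": [1, 1, 1, 1, 1, 1, 1, 1, 4, 1, 1, 1],
    "Met/fMet": [2] * 12,
    "Tyr": [1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "Cys": [1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "Trp": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2],
}


def totals_per_species():
    """Sum the isotype counts column by column."""
    return {sp: sum(row[i] for row in COUNTS.values()) for i, sp in enumerate(SPECIES)}


def species_lacking(isotype):
    """Species in which the given isotype has a count of zero."""
    return [sp for sp, n in zip(SPECIES, COUNTS[isotype]) if n == 0]


totals = totals_per_species()
grand_total = sum(totals.values())
```

The column sums reproduce the Total row of Table 2, and the grand total across the 12 genomes is 373 tRNAs.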
Figure 1: Certain tRNAs in C. debaoensis, D. spinulosum, G. biloba, C. deodara, and R. piresii contain expanded variable stem and loops.
tRNASer, tRNALeu, and tRNATyr from C. debaoensis (A, tRNASer-GGA), D. spinulosum (B, tRNASer-GCU), G. biloba (C, tRNASer-UGA), C. deodara (D, tRNASer-GGA; E, tRNALeu-CAA), and R. piresii (F, tRNATyr-GUA) were observed to contain an expanded variable stem and variable loop (indicated by yellow box). The anti-codon loop of tRNASer (except for tRNASer-GCU of D. spinulosum) was made up of seven nucleotides with the conserved N-U-N-G-A-A-N consensus sequence.
Figure 2: Certain tRNAs in S. verticillata, C. lanceolata, T. mairei, W. mirabilis, and G. gnemon contain expanded variable stem and loops.
tRNASer and tRNALeu from S. verticillata (A, tRNASer-UGA; F, tRNALeu-UAG), C. lanceolata (B, tRNALeu-UAA), T. mairei (C, tRNASer-UGA), W. mirabilis (D, tRNASer-GGA), and G. gnemon (E, tRNALeu-CAA) were observed to contain a variable stem and variable loop (indicated by yellow box). The anti-codon loop of tRNASer was made up of seven nucleotides with the conserved N-U-N-G-A-A-N consensus sequence, and the consensus sequence was C-U-N-A-N2-A for tRNALeu.
Figure 3: An abnormal tRNA structure lacking the D-arm found in W. nobilis.
The tRNAGly with anti-codon UCC was found lacking the D-arm.
Variations in structures of chloroplast tRNAs
Some tRNAs with a loop structure in the variable region were found to be encoded in the gymnosperm chloroplast genomes (Figs. 1 and 2). A novel tRNA structure lacking the D-arm was found in tRNAGly in W. nobilis (Fig. 3). As shown in Figs. 1 and 2, tRNALeu, tRNASer, and tRNATyr contain expanded variable stem/loops. In these tRNAs (except for tRNASer-GCU of D. spinulosum), the anticodon loop of tRNASer contains the conserved consensus sequence N-U-N-G-A-A-N, and that of tRNALeu has the consensus sequence C-U-N-A-N2-A. The variable loop region is predicted to fold into stem-loop structures with apical loops of 3 to 7 nt in tRNASer and several tRNALeu variants. The stems contain up to 7 bp (Figs. 1 and 2). The expanded variable loop structures may play important roles during the protein translation process in chloroplasts.
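Consensus motifs written in the notation used here (for example N-U-N-G-A-A-N or C-U-N-A-N2-A) can be checked against loop sequences mechanically. The sketch below converts such a motif into a regular expression, reading N as any of A, C, G, or U and a trailing number as a repeat count. The function name and the example loop sequences are hypothetical and are not taken from the data set.

```python
import re


def motif_to_regex(motif):
    """Translate a dash-separated motif such as 'U-U-C-N-A-N2' into a compiled regex."""
    parts = []
    for token in motif.split("-"):
        match = re.fullmatch(r"([ACGUN])(\d*)", token)
        if match is None:
            raise ValueError(f"unrecognized motif token: {token!r}")
        # N stands for any nucleotide; other letters match themselves.
        base = "[ACGU]" if match.group(1) == "N" else match.group(1)
        repeat = int(match.group(2) or 1)
        parts.append(base if repeat == 1 else f"{base}{{{repeat}}}")
    return re.compile("".join(parts))


ser_loop = motif_to_regex("N-U-N-G-A-A-N")
leu_loop = motif_to_regex("C-U-N-A-N2-A")

# Made-up 7-nt loop sequences used only to exercise the patterns.
ser_ok = ser_loop.fullmatch("CUUGAAA") is not None
leu_ok = leu_loop.fullmatch("CUUAGAA") is not None
```

Using `fullmatch` requires the whole loop to fit the motif, which suits fixed-length loops; `search` would instead locate the motif inside a longer sequence.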
Chloroplast genomes contain 25 to 30 anticodon-specific tRNAs
The genomes of the species analyzed were found to code for at least two copies of tRNAMet-CAU/tRNAfMet-CAU. Each of the gymnosperm chloroplast genomes encodes 25 to 30 anticodon-specific tRNAs (Tables 2 and 3), where E. equisetina encodes 25 anticodons, T. mairei encodes 26 anticodons, C. debaoensis, S. verticillata, and C. lanceolata encode 28 anticodons, and D. spinulosum, C. deodara, W. mirabilis, and R. piresii encode 29 anticodons. The remaining species, W. nobilis, G. gnemon, and G. biloba, encode 30 anticodons (Table 3).
tRNAArg-CCG was present in the genomes of nine gymnosperm species but absent from C. lanceolata, T. mairei, and E. equisetina, while tRNAGly-UCC was lacking from C. debaoensis, S. verticillata, D. spinulosum, C. lanceolata, T. mairei, and E. equisetina (Table 3). The most abundant anticodons found in the chloroplast genomes were tRNAGly-GCC, tRNAPro-UGG, tRNASer-UGA, tRNASer-GCU, tRNAArg-ACG, tRNAArg-UCU, tRNALeu-UAG, tRNALeu-CAA, tRNAPhe-GAA, tRNAAsn-GUU, tRNALys-UUU, tRNAAsp-GUC, tRNAGlu-UUC, tRNAHis-GUG, tRNAGln-UUG, tRNAIle-CAU, tRNAMet-CAU, tRNATyr-GUA, tRNACys-GCA, and tRNATrp-CCA (Table 3). Two tRNATrp iso-acceptors are present in E. equisetina chloroplasts, compared with a single one in the other gymnosperm species analyzed in this study.
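The difference between the number of anticodon-specific tRNAs and the total number of tRNA genes can be illustrated with the C. debaoensis column of Table 3. In the sketch below (variable names are hypothetical), each isotype/anticodon pair with a non-zero count is one anticodon-specific tRNA, while the sum of the counts gives the total number of tRNA genes. Only one zero-count entry is kept, to show the filtering; Ile-CAU and Met-CAU are counted separately, as in the table.

```python
# Isoacceptor counts for C. debaoensis transcribed from Table 3.
# Zero-count anticodons are omitted except Gly-UCC, kept to show the filtering.
C_DEBAOENSIS = {
    ("Ala", "UGC"): 1,
    ("Gly", "GCC"): 1, ("Gly", "UCC"): 0,
    ("Pro", "GGG"): 1, ("Pro", "UGG"): 1,
    ("Thr", "UGU"): 1,
    ("Val", "GAC"): 1, ("Val", "UAC"): 1,
    ("Ser", "GGA"): 1, ("Ser", "UGA"): 1, ("Ser", "GCU"): 1,
    ("Arg", "ACG"): 1, ("Arg", "CCG"): 1, ("Arg", "UCU"): 1,
    ("Leu", "UAG"): 1, ("Leu", "CAA"): 1, ("Leu", "UAA"): 1,
    ("Phe", "GAA"): 1,
    ("Asn", "GUU"): 1,
    ("Lys", "UUU"): 1,
    ("Asp", "GUC"): 1,
    ("Glu", "UUC"): 2,
    ("His", "GUG"): 1,
    ("Gln", "UUG"): 1,
    ("Ile", "CAU"): 1,
    ("Met", "CAU"): 2,
    ("Tyr", "GUA"): 1,
    ("Cys", "GCA"): 1,
    ("Trp", "CCA"): 1,
}

# Anticodon-specific tRNAs: isotype/anticodon pairs present at least once.
anticodon_specific = sum(1 for n in C_DEBAOENSIS.values() if n > 0)

# Total tRNA genes: every copy counts.
total_genes = sum(C_DEBAOENSIS.values())

# Anticodons present in more than one copy.
multi_copy = sorted(key for key, n in C_DEBAOENSIS.items() if n > 1)
```

For this species the two duplicated entries, Glu-UUC and Met-CAU, account for the gap between the 28 anticodon-specific tRNAs in Table 3 and the 30 tRNA genes in Table 2.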
tRNA Isotypes | Isoacceptors | tRNA Isotypes | Isoacceptors | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
C. debaoensis (28) | S. verticillata (28) | |||||||||||||
Ala | AGC: 0 | GGC: 0 | CGC: 0 | UGC: 1 | Ala | AGC: 0 | GGC: 0 | CGC: 0 | UGC: 1 | |||||
Gly | ACC: 0 | GCC: 1 | CCC: 0 | UCC: 0 | Gly | ACC: 0 | GCC: 1 | CCC: 0 | UCC: 0 | |||||
Pro | AGG: 0 | GGG: 1 | CGG: 0 | UGG: 1 | Pro | AGG: 0 | GGG: 0 | CGG: 0 | UGG: 1 | |||||
Thr | AGU: 0 | GGU: 0 | CGU: 0 | UGU: 1 | Thr | AGU: 0 | GGU: 1 | CGU: 0 | UGU: 1 | |||||
Val | AAC: 0 | GAC: 1 | CAC: 0 | UAC: 1 | Val | AAC: 0 | GAC: 1 | CAC: 0 | UAC: 1 | |||||
Ser | AGA: 0 | GGA: 1 | CGA: 0 | UGA: 1 | ACU: 0 | GCU: 1 | Ser | AGA: 0 | GGA: 1 | CGA: 0 | UGA: 1 | ACU: 0 | GCU: 1 | |
Arg | ACG: 1 | GCG: 0 | CCG:1 | UCG: 0 | CCU: 0 | UCU: 1 | Arg | ACG: 1 | GCG: 0 | CCG:1 | UCG: 0 | CCU: 0 | UCU: 1 | |
Leu | AAG: 0 | GAG: 0 | CAG: 0 | UAG: 1 | CAA: 1 | UAA: 1 | Leu | AAG: 0 | GAG: 0 | CAG: 0 | UAG: 1 | CAA: 1 | UAA: 1 | |
Phe | AAA: 0 | GAA: 1 | Phe | AAA: 0 | GAA: 1 | |||||||||
Asn | AUU: 0 | GUU: 1 | Asn | AUU: 0 | GUU: 1 | |||||||||
Lys | CUU: 0 | UUU: 1 | Lys | CUU: 0 | UUU: 1 | |||||||||
Asp | AUC: 0 | GUC: 1 | Asp | AUC: 0 | GUC: 1 | |||||||||
Glu | CUC: 0 | UUC: 2 | Glu | CUC: 0 | UUC: 2 | |||||||||
His | AUG: 0 | GUG: 1 | His | AUG: 0 | GUG: 1 | |||||||||
Gln | CUG: 0 | UUG: 1 | Gln | CUG: 0 | UUG: 2 | |||||||||
Ile | AAU: 0 | GAU: 0 | CAU: 1 | UAU: 0 | Ile | AAU: 0 | GAU: 0 | CAU: 1 | UAU: 0 | |||||
Met | CAU: 2 | Met | CAU: 2 | |||||||||||
Tyr | AUA: 0 | GUA: 1 | Tyr | AUA: 0 | GUA: 1 | |||||||||
Cys | ACA: 0 | GCA: 1 | Cys | ACA: 0 | GCA: 1 | |||||||||
Trp | CCA: 1 | Trp | CCA: 1 | |||||||||||
Suppressor | CUA: 0 | UUA: 0 | UCA: 0 | Suppressor | CUA: 0 | UUA: 0 | UCA: 0 | |||||||
Sec | UCA: 0 | Sec | UCA: 0 | |||||||||||
D. spinulosum (29) | C. lanceolata (28) | |||||||||||||
Ala | AGC: 0 | GGC: 0 | CGC: 0 | UGC: 1 | Ala | AGC: 0 | GGC: 0 | CGC: 0 | UGC: 1 | |||||
Gly | ACC: 0 | GCC: 1 | CCC: 0 | UCC: 0 | Gly | ACC: 0 | GCC: 1 | CCC: 0 | UCC: 0 | |||||
Pro | AGG: 0 | GGG: 1 | CGG: 0 | UGG: 1 | Pro | AGG: 0 | GGG: 1 | CGG: 0 | UGG: 1 | |||||
Thr | AGU: 0 | GGU: 1 | CGU: 0 | UGU: 1 | Thr | AGU: 0 | GGU: 1 | CGU: 0 | UGU: 1 | |||||
Val | AAC: 0 | GAC: 1 | CAC: 0 | UAC: 1 | Val | AAC: 0 | GAC: 1 | CAC: 0 | UAC: 1 | |||||
Ser | AGA: 0 | GGA: 1 | CGA: 0 | UGA: 1 | ACU: 0 | GCU: 1 | Ser | AGA: 0 | GGA: 1 | CGA: 0 | UGA: 1 | ACU: 0 | GCU: 1 | |
Arg | ACG: 1 | GCG: 0 | CCG:1 | UCG: 0 | CCU: 0 | UCU: 1 | Arg | ACG: 1 | GCG: 0 | CCG: 0 | UCG: 0 | CCU: 0 | UCU: 1 | |
Leu | AAG: 0 | GAG: 0 | CAG: 0 | UAG: 1 | CAA: 1 | UAA: 1 | Leu | AAG: 0 | GAG: 0 | CAG: 0 | UAG: 1 | CAA: 1 | UAA: 1 | |
Phe | AAA: 0 | GAA: 1 | Phe | AAA: 0 | GAA: 1 | |||||||||
Asn | AUU: 0 | GUU: 1 | Asn | AUU: 0 | GUU: 1 | |||||||||
Lys | CUU: 0 | UUU: 1 | Lys | CUU: 1 | UUU: 1 | |||||||||
Asp | AUC: 0 | GUC: 1 | Asp | AUC: 0 | GUC: 1 | |||||||||
Glu | CUC: 0 | UUC: 2 | Glu | CUC: 0 | UUC: 2 | |||||||||
His | AUG: 0 | GUG: 2 | His | AUG: 0 | GUG: 1 | |||||||||
Gln | CUG: 0 | UUG: 1 | Gln | CUG: 0 | UUG: 2 | |||||||||
Ile | AAU: 0 | GAU: 0 | CAU: 1 | UAU: 0 | Ile | AAU: 0 | GAU: 0 | CAU: 1 | UAU: 0 | |||||
Met | CAU: 2 | Met | CAU: 2 | |||||||||||
Tyr | AUA: 0 | GUA: 1 | Tyr | AUA: 0 | GUA: 1 | |||||||||
Cys | ACA: 0 | GCA: 1 | Cys | ACA: 0 | GCA: 1 | |||||||||
Trp | CCA: 1 | Trp | CCA: 1 | |||||||||||
Suppressor | CUA: 0 | UUA: 0 | UCA: 0 | Suppressor | CUA: 0 | UUA: 0 | UCA: 0 | |||||||
Sec | UCA: 0 | Sec | UCA: 0 | |||||||||||
G. biloba (30) | T. mairei (26) | |||||||||||||
Ala | AGC: 0 | GGC: 0 | CGC: 0 | UGC: 1 | Ala | AGC: 0 | GGC: 0 | CGC: 0 | UGC: 0 | |||||
Gly | ACC: 0 | GCC: 1 | CCC: 0 | UCC: 1 | Gly | ACC: 0 | GCC: 1 | CCC: 0 | UCC: 0 | |||||
Pro | AGG: 0 | GGG: 1 | CGG: 0 | UGG: 1 | Pro | AGG: 0 | GGG: 1 | CGG: 0 | UGG: 1 | |||||
Thr | AGU: 0 | GGU: 1 | CGU: 0 | UGU: 1 | Thr | AGU: 0 | GGU: 1 | CGU: 0 | UGU: 1 | |||||
Val | AAC: 0 | GAC: 1 | CAC: 0 | UAC: 0 | Val | AAC: 0 | GAC: 0 | CAC: 0 | UAC: 0 | |||||
Ser | AGA: 0 | GGA: 1 | CGA: 0 | UGA: 1 | ACU: 0 | GCU: 1 | Ser | AGA: 0 | GGA: 0 | CGA: 0 | UGA: 1 | ACU: 0 | GCU: 1 | |
Arg | ACG: 1 | GCG: 0 | CCG:1 | UCG: 0 | CCU: 0 | UCU: 1 | Arg | ACG: 1 | GCG: 0 | CCG: 0 | UCG: 0 | CCU: 0 | UCU: 1 | |
Leu | AAG: 0 | GAG: 0 | CAG: 0 | UAG: 1 | CAA: 2 | UAA: 0 | Leu | AAG: 0 | GAG: 0 | CAG: 0 | UAG: 1 | CAA: 1 | UAA: 1 | |
Phe | AAA: 0 | GAA: 1 | Phe | AAA: 0 | GAA: 1 | |||||||||
Asn | AUU: 0 | GUU: 1 | Asn | AUU: 0 | GUU: 1 | |||||||||
Lys | CUU: 0 | UUU: 1 | Lys | CUU: 0 | UUU: 1 | |||||||||
Asp | AUC: 0 | GUC: 1 | Asp | AUC: 0 | GUC: 1 | |||||||||
Glu | CUC: 0 | UUC: 1 | Glu | CUC: 0 | UUC: 1 | |||||||||
His | AUG: 0 | GUG: 2 | His | AUG: 0 | GUG: 1 | |||||||||
Gln | CUG: 0 | UUG: 1 | Gln | CUG: 0 | UUG: 1 | |||||||||
Ile | AAU: 0 | GAU: 0 | CAU: 1 | UAU: 0 | Ile | AAU: 1 | GAU: 0 | CAU: 2 | UAU: 1 | |||||
Met | CAU: 2 | Met | CAU: 2 | |||||||||||
Tyr | AUA: 1 | GUA: 1 | Tyr | AUA: 0 | GUA: 1 | |||||||||
Cys | ACA: 1 | GCA: 1 | Cys | ACA: 0 | GCA: 1 | |||||||||
Trp | CCA: 1 | Trp | CCA: 1 | |||||||||||
Suppressor | CUA: 0 | UUA: 0 | UCA: 0 | Suppressor | CUA: 0 | UUA: 0 | UCA: 0 | |||||||
Sec | UCA: 0 | Sec | UCA: 0 | |||||||||||
C. deodara (29) | W. mirabilis (29) | |||||||||||||
Ala | AGC: 0 | GGC: 0 | CGC: 0 | UGC: 1 | Ala | AGC: 0 | GGC: 0 | CGC: 0 | UGC: 1 | |||||
Gly | ACC: 0 | GCC: 1 | CCC: 0 | UCC: 1 | Gly | ACC: 0 | GCC: 1 | CCC: 0 | UCC: 1 | |||||
Pro | AGG: 0 | GGG: 1 | CGG: 0 | UGG: 1 | Pro | AGG: 0 | GGG: 1 | CGG: 0 | UGG: 1 | |||||
Thr | AGU: 0 | GGU: 1 | CGU: 0 | UGU: 1 | Thr | AGU: 0 | GGU: 1 | CGU: 0 | UGU: 1 | |||||
Val | AAC: 0 | GAC: 1 | CAC: 0 | UAC: 0 | Val | AAC: 0 | GAC: 1 | CAC: 0 | UAC: 1 | |||||
Ser | AGA: 0 | GGA: 1 | CGA: 0 | UGA: 1 | ACU: 0 | GCU: 1 | Ser | AGA: 0 | GGA: 1 | CGA: 0 | UGA: 1 | ACU: 0 | GCU: 1 | |
Arg | ACG: 1 | GCG: 0 | CCG:1 | UCG: 0 | CCU: 0 | UCU: 1 | Arg | ACG: 1 | GCG: 0 | CCG: 2 | UCG: 0 | CCU: 0 | UCU: 1 | |
Leu | AAG: 0 | GAG: 0 | CAG: 0 | UAG: 1 | CAA: 1 | UAA: 1 | Leu | AAG: 0 | GAG: 0 | CAG: 0 | UAG: 1 | CAA: 1 | UAA: 1 | |
Phe | AAA: 0 | GAA: 1 | Phe | AAA: 0 | GAA: 1 | |||||||||
Asn | AUU: 0 | GUU: 1 | Asn | AUU: 0 | GUU: 1 | |||||||||
Lys | CUU: 0 | UUU: 1 | Lys | CUU: 0 | UUU: 1 | |||||||||
Asp | AUC: 0 | GUC: 1 | Asp | AUC: 0 | GUC: 1 | |||||||||
Glu | CUC: 0 | UUC: 2 | Glu | CUC: 0 | UUC: 2 | |||||||||
His | AUG: 0 | GUG: 1 | His | AUG: 0 | GUG: 1 | |||||||||
Gln | CUG: 0 | UUG: 1 | Gln | CUG: 0 | UUG: 1 | |||||||||
Ile | AAU: 0 | GAU: 0 | CAU: 1 | UAU: 0 | Ile | AAU: 0 | GAU: 0 | CAU: 1 | UAU: 0 | |||||
Met | CAU: 2 | Met | CAU: 2 | |||||||||||
Tyr | AUA: 0 | GUA: 1 | Tyr | AUA: 0 | GUA: 1 | |||||||||
Cys | ACA: 0 | GCA: 1 | Cys | ACA: 0 | GCA: 1 | |||||||||
Trp | CCA: 1 | Trp | CCA: 1 | |||||||||||
Suppressor | CUA: 0 | UUA: 0 | UCA: 0 | Suppressor | CUA: 0 | UUA: 0 | UCA: 0 | |||||||
Sec | UCA: 0 | Sec | UCA: 0 | |||||||||||
W. nobilis (30) | G. gnemon (30) | |||||||||||||
Ala | AGC: 0 | GGC: 0 | CGC: 0 | UGC: 1 | Ala | AGC: 0 | GGC: 0 | CGC: 0 | UGC: 1 | |||||
Gly | ACC: 0 | GCC: 1 | CCC: 0 | UCC: 1 | Gly | ACC: 0 | GCC: 1 | CCC: 0 | UCC: 1 | |||||
Pro | AGG: 0 | GGG: 1 | CGG: 0 | UGG: 1 | Pro | AGG: 0 | GGG: 1 | CGG: 0 | UGG: 1 | |||||
Thr | AGU: 0 | GGU: 1 | CGU: 0 | UGU: 1 | Thr | AGU: 0 | GGU: 1 | CGU: 0 | UGU: 1 | |||||
Val | AAC: 0 | GAC: 1 | CAC: 0 | UAC: 0 | Val | AAC: 0 | GAC: 1 | CAC: 0 | UAC: 1 | |||||
Ser | AGA: 0 | GGA: 1 | CGA: 1 | UGA: 1 | ACU: 0 | GCU: 1 | Ser | AGA: 0 | GGA: 1 | CGA: 0 | UGA: 1 | ACU: 0 | GCU: 1 | |
Arg | ACG: 1 | GCG: 0 | CCG:1 | UCG: 0 | CCU: 0 | UCU: 1 | Arg | ACG: 1 | GCG: 0 | CCG:1 | UCG: 0 | CCU: 0 | UCU: 1 | |
Leu | AAG: 0 | GAG: 0 | CAG: 0 | UAG: 1 | CAA: 1 | UAA: 1 | Leu | AAG: 0 | GAG: 0 | CAG: 0 | UAG: 1 | CAA: 1 | UAA: 1 | |
Phe | AAA: 0 | GAA: 1 | Phe | AAA: 0 | GAA: 1 | |||||||||
Asn | AUU: 0 | GUU: 1 | Asn | AUU: 0 | GUU: 1 | |||||||||
Lys | CUU: 0 | UUU: 1 | Lys | CUU: 0 | UUU: 1 | |||||||||
Asp | AUC: 0 | GUC: 1 | Asp | AUC: 0 | GUC: 1 | |||||||||
Glu | CUC: 0 | UUC: 2 | Glu | CUC: 0 | UUC: 2 | |||||||||
His | AUG: 0 | GUG: 1 | His | AUG: 0 | GUG: 1 | |||||||||
Gln | CUG: 0 | UUG: 1 | Gln | CUG: 0 | UUG: 1 | |||||||||
Ile | AAU: 0 | GAU: 0 | CAU: 1 | UAU: 0 | Ile | AAU: 0 | GAU: 0 | CAU: 1 | UAU: 0 | |||||
Met | CAU: 2 | Met | CAU: 2 | |||||||||||
Tyr | AUA: 0 | GUA: 1 | Tyr | AUA: 0 | GUA: 1 | |||||||||
Cys | ACA: 0 | GCA: 1 | Cys | ACA: 0 | GCA: 1 | |||||||||
Trp | CCA: 1 | Trp | CCA: 1 | |||||||||||
Suppressor | CUA: 0 | UUA: 0 | UCA: 0 | Suppressor | CUA: 0 | UUA: 0 | UCA: 0 | |||||||
Sec | UCA: 0 | Sec | UCA: 0 | |||||||||||
R. piresii (29) | E. equisetina (25) | |||||||||||||
Ala | AGC: 0 | GGC: 0 | CGC: 0 | UGC: 0 | Ala | AGC: 0 | GGC: 0 | CGC: 0 | UGC: 1 | |||||
Gly | ACC: 0 | GCC: 1 | CCC: 0 | UCC: 1 | Gly | ACC: 0 | GCC: 1 | CCC: 0 | UCC: 0 | |||||
Pro | AGG: 0 | GGG: 1 | CGG: 0 | UGG: 1 | Pro | AGG: 0 | GGG: 0 | CGG: 0 | UGG: 1 | |||||
Thr | AGU: 0 | GGU: 1 | CGU: 0 | UGU: 1 | Thr | AGU: 0 | GGU: 1 | CGU: 0 | UGU: 0 | |||||
Val | AAC: 0 | GAC: 1 | CAC: 0 | UAC: 1 | Val | AAC: 0 | GAC: 1 | CAC: 0 | UAC: 0 | |||||
Ser | AGA: 0 | GGA: 1 | CGA: 0 | UGA: 1 | ACU: 0 | GCU: 1 | Ser | AGA: 0 | GGA: 1 | CGA: 0 | UGA: 1 | ACU: 0 | GCU: 1 | |
Arg | ACG: 1 | GCG: 0 | CCG:1 | UCG: 0 | CCU: 0 | UCU: 1 | Arg | ACG: 1 | GCG: 0 | CCG:0 | UCG: 0 | CCU: 0 | UCU: 1 | |
Leu | AAG: 0 | GAG: 0 | CAG: 0 | UAG: 1 | CAA: 1 | UAA: 1 | Leu | AAG: 0 | GAG: 0 | CAG: 0 | UAG: 1 | CAA: 1 | UAA: 1 | |
Phe | AAA: 0 | GAA: 1 | Phe | AAA: 0 | GAA: 1 | |||||||||
Asn | AUU: 0 | GUU: 1 | Asn | AUU: 0 | GUU: 1 | |||||||||
Lys | CUU: 0 | UUU: 1 | Lys | CUU: 0 | UUU: 1 | |||||||||
Asp | AUC: 0 | GUC: 2 | Asp | AUC: 0 | GUC: 1 | |||||||||
Glu | CUC: 0 | UUC: 2 | Glu | CUC: 0 | UUC: 2 | |||||||||
His | AUG: 0 | GUG: 1 | His | AUG: 0 | GUG: 1 | |||||||||
Gln | CUG: 0 | UUG: 1 | Gln | CUG: 0 | UUG: 1 | |||||||||
Ile | AAU: 0 | GAU: 0 | CAU: 1 | UAU: 0 | Ile | AAU: 0 | GAU: 0 | CAU: 1 | UAU: 0 | |||||
Met | CAU: 2 | Met | CAU: 2 | |||||||||||
Tyr | AUA: 0 | GUA: 1 | Tyr | AUA: 0 | GUA: 1 | |||||||||
Cys | ACA: 0 | GCA: 1 | Cys | ACA: 0 | GCA: 1 | |||||||||
Trp | CCA: 1 | Trp | CCA: 2 | |||||||||||
Suppressor | CUA: 0 | UUA: 0 | UCA: 0 | Suppressor | CUA: 0 | UUA: 0 | UCA: 0 | |||||||
Sec | UCA: 0 | Sec | UCA: 0 |
Conserved gymnosperm chloroplast tRNAs
The clover leaf-like secondary structure of a tRNA is shown in Fig. 4. In this study, we found that most tRNAs contain a “G” as the first nucleotide in the D-arm, except for tRNALys, tRNAMet, tRNAPro, tRNAThr, tRNATyr, and tRNAVal. “A” is present in the first and the last position of the D-loop, apart from tRNAGly, tRNAIle, tRNALeu, tRNAMet, and tRNAGln. In addition, in the final two positions of the Ψ-arm, all of the tRNAs were found to have conserved “G-G” nucleotides, except for tRNAArg, tRNACys, tRNAPhe, and tRNAVal (Table 4). Small conserved consensus sequences were found in the Ψ region. Specifically, except for tRNASer, the Ψ-loop in tRNAs was found to contain a conserved sequence comprising U-U-C-N-A-N2 according to a multiple sequence alignment of 20 members of the tRNA gene family (Table 4).
[Table 4: conserved nucleotide consensus sequences in the arms and loops of gymnosperm chloroplast tRNAs (table image not recovered)]
Notes:
Consensus sequences are shown from 5′ to 3′. An asterisk (*) indicates the absence of a conserved nucleotide consensus sequence in the respective region of chloroplast tRNAs. 5′ AC-arm, 5′ Acceptor arm; ANC-arm, Anti-codon arm; ANC-loop, Anti-codon loop; Ψ-arm, Pseudouridine arm; Ψ-loop, Pseudouridine loop. The short lines under the bases in the anticodon loop of tRNAIle indicate possible modification.
Figure 4: Clover leaf-like structure of gymnosperm tRNAs.
The tRNA contains the Acceptor arm (6–7 bp, dark green, >96% conserved), D-arm (3–4 bp, light blue, >65% conserved), D-loop (7–11 nt, purple, >80% conserved), Anti-codon arm (5 bp, dark blue, >75% conserved), anti-codon loop (7 nt, gray, >99% conserved), variable region (3–23 nt, orange, >45% conserved), Ψ-arm (5 bp, light purple, >97% conserved), and Ψ-loop (7 nt, green, >95% conserved). “% conservation” means the conserved proportion of base identities in each stem and loop structure across the whole set of gymnosperm tRNAs. Several tRNAs carry the C-C-A tail nucleotides.
Diversification of tRNA structures
The diverse arms and loops of tRNAs allow the regulation and control of protein translation. Each arm and loop has a specific nucleotide composition. Our analysis of 373 tRNAs showed that the acceptor arm of chloroplast tRNAs contains 6 bp to 7 bp (Table S1). The D-arms generally contain 3 or 4 bp, with a stable “G” in the initial position and a “C” in the last position of the 5′ strand of the D-stem in most tRNAs (such as tRNAAla, tRNAAsn, tRNAAsp, tRNACys, tRNAGlu, tRNAHis, tRNAIle, and tRNAPhe). Most D-loops contain 7 to 11 nt, with conserved “A” nucleotides at both ends. The anticodon arms of chloroplast tRNAs mainly contain 5 bp (90.4%). We found that 367 (about 99%) tRNAs contain 7 nt in their anticodon loop, indicating that the anticodon loop is highly conserved (Table 4, Table S1). The variable loops of different tRNAs contain 3 to 23 nt, where those in tRNAAla, tRNAAsp, tRNAHis, tRNAPhe, and tRNAPro contain 5 nt (Table S1). The Ψ-arm contains 5 bp in most of the gymnosperm chloroplast tRNAs, except for tRNAAla and some of the tRNATrp, tRNAGly, tRNAThr, and tRNAArg. The Ψ-loops of most tRNAs contain 7 nt, apart from tRNAAla and several of tRNACys and tRNAThr (Table S1).
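The typical length ranges reported above can be written down as a small lookup table and used to flag tRNAs with atypical regions. The following is only an illustrative sketch: the region names, the function `atypical_regions`, and the example tRNA are hypothetical and are not taken from Table S1, and the ranges are the typical values given in the text rather than strict limits.

```python
# Typical length ranges (bp for arms, nt for loops) as reported in the text and Fig. 4.
# Region names are hypothetical labels chosen for this sketch.
REGION_RANGES = {
    "acceptor_arm": (6, 7),
    "d_arm": (3, 4),
    "d_loop": (7, 11),
    "anticodon_arm": (5, 5),
    "anticodon_loop": (7, 7),
    "variable_region": (3, 23),
    "psi_arm": (5, 5),
    "psi_loop": (7, 7),
}


def atypical_regions(lengths):
    """Return, in sorted order, the regions whose length is outside the typical range."""
    return sorted(
        region
        for region, length in lengths.items()
        if not (REGION_RANGES[region][0] <= length <= REGION_RANGES[region][1])
    )


# Hypothetical tRNA used only for illustration (not a record from Table S1).
example = {
    "acceptor_arm": 7, "d_arm": 4, "d_loop": 9, "anticodon_arm": 5,
    "anticodon_loop": 7, "variable_region": 5, "psi_arm": 4, "psi_loop": 7,
}
print(atypical_regions(example))
```

In this hypothetical case only the Ψ-arm (4 bp instead of the usual 5 bp) would be reported, which mirrors the kind of exception noted for tRNAAla and some other tRNAs.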
Gymnosperm chloroplast tRNAs derived from multiple common ancestors
The phylogenetic tree contained three major clusters covering 64 groups and all of the different tRNA types (as shown by the different strings in Fig. S1). We detected 37 groups in cluster I, five groups in cluster II, and 22 groups in cluster III. Cluster I contains tRNASer, tRNATyr, tRNAHis, tRNAGln, tRNAThr, tRNAPro, tRNAGly, tRNAMet, tRNAAsp, tRNAArg, tRNAAla, tRNACys, tRNALys, tRNAGlu, tRNAIle, tRNAAsn, tRNAVal, tRNALeu, and tRNATrp. Cluster II contains tRNAHis, tRNASer, tRNATyr, and tRNALeu. Cluster III contains tRNALeu, tRNAIle, tRNAGly, tRNAThr, tRNASer, tRNAVal, tRNAGlu, tRNALys, tRNACys, tRNAGln, tRNAHis, tRNAArg, tRNAPhe, tRNAAla, and tRNAMet (Fig. S1). tRNASer, tRNAHis, and tRNALeu are present in all three clusters, suggesting that these tRNAs evolved from multiple lineages. Most of the tRNAs formed more than one group in the phylogenetic tree. In cluster I, tRNATyr, tRNAGln, tRNAMet, tRNAAsp, tRNAAla, tRNALys, tRNAIle, and tRNATrp each formed two groups, whereas tRNASer, tRNAPro, tRNAArg, tRNAGlu, tRNAAsn, tRNAVal, and tRNALeu each formed three groups, and tRNAThr formed four groups. In cluster II, tRNASer formed two groups. In cluster III, tRNAGly and tRNAVal each formed two groups, tRNAThr formed three groups, and tRNAIle formed four groups. Some tRNAs in cluster III grouped individually: the tRNAs with the anticodons C-G-A (tRNASer), U-U-C (tRNAGlu), U-U-U (tRNALys), G-C-A (tRNACys), U-U-G (tRNAGln), G-U-G (tRNAHis), U-C-U (tRNAArg), G-A-A (tRNAPhe), U-G-C (tRNAAla), and C-A-U (tRNAMet) each grouped separately (Fig. S1). The multiple groupings of different tRNAs suggest that they evolved from multiple common ancestors.
Furthermore, the tRNAs present in cluster III, i.e., tRNAMet (CAU), tRNAThr (UGU, GGU), tRNAVal (UAC), tRNAAla (UGC), tRNAPhe (GAA), tRNAArg (UCU), tRNAHis (GUG), tRNAGln (UUG), tRNACys (GCA), tRNALys (UUU), tRNAGlu (UUC), tRNAIle (UAU), tRNAVal (GAC), tRNALeu (CAA), tRNAGly (UCC), tRNASer (CGA), tRNAGly (GCC), and tRNAIle (CAU), tended to be the most basic tRNAs, and they appear to have undergone gene duplication and diversification to generate other tRNA molecules.
C-A-U anticodon in tRNAIle
Our detailed genomic study showed that tRNAIle also encodes a C-A-U anticodon, in addition to the presence of this typical anticodon in tRNAMet. In general, the C-A-U anticodon is recognized as a typical characteristic of tRNAMet, for which there is only one iso-acceptor. In particular, we found that the tRNAIle in T. mairei encodes two C-A-U anticodons, and that the tRNAIle in C. debaoensis, S. verticillata, D. spinulosum, C. lanceolata, G. biloba, C. deodara, W. mirabilis, G. gnemon, R. piresii, E. equisetina, and W. nobilis also encodes a C-A-U anticodon (Table 3, Data S1, Fig. S3).
From/To | A | U | C | G | From/To | A | U | C | G |
---|---|---|---|---|---|---|---|---|---|
Alanine | | | | | Lysine | | | | |
A | – | 12.50 | 12.50 | 0.00 | A | – | 1.47 | 1.47 | 22.06 |
U | 12.50 | – | 0.00 | 12.50 | U | 1.47 | – | 22.06 | 1.47 |
C | 12.50 | 0.00 | – | 12.50 | C | 1.47 | 22.06 | – | 1.47 |
G | 0.00 | 12.50 | 12.50 | – | G | 22.06 | 1.47 | 1.47 | – |
Arginine | | | | | Methionine | | | | |
A | – | 2.93 | 2.93 | 19.13 | A | – | 3.86 | 3.86 | 17.27 |
U | 2.93 | – | 19.13 | 2.93 | U | 3.86 | – | 17.27 | 3.86 |
C | 2.93 | 19.13 | – | 2.93 | C | 3.86 | 17.27 | – | 3.86 |
G | 19.13 | 2.93 | 2.93 | – | G | 17.27 | 3.86 | 3.86 | – |
Asparagine | | | | | Phenylalanine | | | | |
A | – | 1.12 | 1.12 | 22.75 | A | – | 1.21 | 1.21 | 22.58 |
U | 1.12 | – | 22.75 | 1.12 | U | 1.21 | – | 22.58 | 1.21 |
C | 1.12 | 22.75 | – | 1.12 | C | 1.21 | 22.58 | – | 1.21 |
G | 22.75 | 1.12 | 1.12 | – | G | 22.58 | 1.21 | 1.21 | – |
Aspartate | | | | | Proline | | | | |
A | – | 0.00 | 0.00 | 25.00 | A | – | 1.53 | 1.53 | 21.95 |
U | 0.00 | – | 25.00 | 0.00 | U | 1.53 | – | 21.95 | 1.53 |
C | 0.00 | 25.00 | – | 0.00 | C | 1.53 | 21.95 | – | 1.53 |
G | 25.00 | 0.00 | 0.00 | – | G | 21.95 | 1.53 | 1.53 | – |
Cysteine | | | | | Serine | | | | |
A | – | 2.75 | 2.75 | 19.50 | A | – | 5.16 | 5.16 | 14.68 |
U | 2.75 | – | 19.50 | 2.75 | U | 5.16 | – | 14.68 | 5.16 |
C | 2.75 | 19.50 | – | 2.75 | C | 5.16 | 14.68 | – | 5.16 |
G | 19.50 | 2.75 | 2.75 | – | G | 14.68 | 5.16 | 5.16 | – |
Glutamine | | | | | Threonine | | | | |
A | – | 3.83 | 3.83 | 17.35 | A | – | 3.91 | 3.91 | 17.18 |
U | 3.83 | – | 17.35 | 3.83 | U | 3.91 | – | 17.18 | 3.91 |
C | 3.83 | 17.35 | – | 3.83 | C | 3.91 | 17.18 | – | 3.91 |
G | 17.35 | 3.83 | 3.83 | – | G | 17.18 | 3.91 | 3.91 | – |
Glutamate | | | | | Tryptophan | | | | |
A | – | 4.59 | 4.59 | 15.81 | A | – | 2.02 | 2.02 | 20.96 |
U | 4.59 | – | 15.81 | 4.59 | U | 2.02 | – | 20.96 | 2.02 |
C | 4.59 | 15.81 | – | 4.59 | C | 2.02 | 20.96 | – | 2.02 |
G | 15.81 | 4.59 | 4.59 | – | G | 20.96 | 2.02 | 2.02 | – |
Glycine | | | | | Tyrosine | | | | |
A | – | 1.94 | 1.94 | 21.13 | A | – | 5.34 | 5.34 | 14.33 |
U | 1.94 | – | 21.13 | 1.94 | U | 5.34 | – | 14.33 | 5.34 |
C | 1.94 | 21.13 | – | 1.94 | C | 5.34 | 14.33 | – | 5.34 |
G | 21.13 | 1.94 | 1.94 | – | G | 14.33 | 5.34 | 5.34 | – |
Histidine | | | | | Valine | | | | |
A | – | 1.22 | 1.22 | 22.56 | A | – | 1.73 | 1.73 | 21.54 |
U | 1.22 | – | 22.56 | 1.22 | U | 1.73 | – | 21.54 | 1.73 |
C | 1.22 | 22.56 | – | 1.22 | C | 1.73 | 21.54 | – | 1.73 |
G | 22.56 | 1.22 | 1.22 | – | G | 21.54 | 1.73 | 1.73 | – |
Isoleucine | | | | | Overall | | | | |
A | – | 4.72 | 4.72 | 15.56 | A | – | 4.81 | 4.81 | 15.38 |
U | 4.72 | – | 15.56 | 4.72 | U | 4.81 | – | 15.38 | 4.81 |
C | 4.72 | 15.56 | – | 4.72 | C | 4.81 | 15.38 | – | 4.81 |
G | 15.56 | 4.72 | 4.72 | – | G | 15.38 | 4.81 | 4.81 | – |
Leucine | | | | | | | | | |
A | – | 4.40 | 4.40 | 16.21 | | | | | |
U | 4.40 | – | 16.21 | 4.40 | | | | | |
C | 4.40 | 16.21 | – | 4.40 | | | | | |
G | 16.21 | 4.40 | 4.40 | – | | | | | |
Transition/transversion of tRNAs
A previous study (Mohanta et al., 2019) showed that the evolutionary rates of transition and transversion are almost equal for tRNAs, despite the low probability of either event occurring in tRNAs. In this study, we identified several intriguing substitution patterns in gymnosperm chloroplast tRNAs. Overall, our analysis of the substitution rates across the whole set of chloroplast tRNAs showed that the average transition rate (15.38) was considerably larger than the average transversion rate (4.81), with a ratio of about 3:1 (Table 5). A similar transition:transversion bias was found for tRNASer, tRNAGlu, tRNATyr, tRNAIle, tRNAMet, tRNAGln, tRNAThr, and tRNALeu. The ratio was over 6:1 for tRNACys and tRNAArg. The transition rates for tRNATrp, tRNAVal, and tRNAGly were about 10 times higher than their transversion rates. These findings suggest that tRNASer, tRNAGlu, tRNATyr, tRNAIle, tRNAMet, tRNAGln, tRNAThr, tRNALeu, tRNACys, tRNAArg, tRNATrp, tRNAVal, and tRNAGly underwent transition substitutions more readily than transversion substitutions during their evolution in gymnosperm chloroplast genomes. In addition, the transition rates in tRNALys and tRNAPro were about 15 times higher than their transversion rates, and the transition rates in tRNAAsn, tRNAPhe, and tRNAHis were about 20 times higher than their transversion rates. These results indicate that most tRNAs are much more likely to have undergone transition events than transversion events. The highest transversion rate (12.50) was found in tRNAAla and the lowest (0.00) in tRNAAsp (Table 5). Notably, tRNAAla showed no transitions (transition rate of 0.00; Table 5).
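The ratios quoted above can be recomputed directly from Table 5. The sketch below copies the per-family transition and transversion rates from Table 5 and divides one by the other, as the text does; the dictionary `RATES` and the function `ts_tv_ratio` are names introduced here for illustration and are not part of the original analysis pipeline.

```python
# (transition rate, transversion rate) per tRNA family, copied from Table 5.
RATES = {
    "Ala": (0.00, 12.50), "Arg": (19.13, 2.93), "Asn": (22.75, 1.12),
    "Asp": (25.00, 0.00), "Cys": (19.50, 2.75), "Gln": (17.35, 3.83),
    "Glu": (15.81, 4.59), "Gly": (21.13, 1.94), "His": (22.56, 1.22),
    "Ile": (15.56, 4.72), "Leu": (16.21, 4.40), "Lys": (22.06, 1.47),
    "Met": (17.27, 3.86), "Phe": (22.58, 1.21), "Pro": (21.95, 1.53),
    "Ser": (14.68, 5.16), "Thr": (17.18, 3.91), "Trp": (20.96, 2.02),
    "Tyr": (14.33, 5.34), "Val": (21.54, 1.73), "Overall": (15.38, 4.81),
}


def ts_tv_ratio(transition, transversion):
    """Transition rate divided by transversion rate; None if no transversions."""
    if transversion == 0:
        return None
    return transition / transversion


for family, (ts, tv) in RATES.items():
    ratio = ts_tv_ratio(ts, tv)
    label = "undefined (no transversions)" if ratio is None else f"{ratio:.1f}:1"
    print(f"tRNA{family}: {label}")
```

Two families need separate handling in any such calculation: tRNAAsp has a transversion rate of 0.00, so its ratio is undefined, and tRNAAla has a transition rate of 0.00, so its ratio is zero.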
tRNA duplication/loss events
In addition to transition and transversion events, gene duplication and loss events have played important roles in gene evolution. Our analysis of duplication and loss events indicated that 153 duplication events (duplication and conditional duplication) have occurred in all of the gymnosperm chloroplast tRNA genes investigated in this study (Fig. S2). In addition, 220 gymnosperm chloroplast tRNA gene loss events were detected (Table S2, Fig. S2). Thus, the loss of genes was slightly more frequent than their duplication for gymnosperm chloroplast tRNA genes.
Discussion
tRNAs are major genetic components of semi-autonomous chloroplasts, and our analysis of gymnosperm chloroplast genomes showed that they share several basic conserved genomic features. The gymnosperm chloroplast genomes investigated in the present study were found to encode 28 to 33 tRNA isotypes, indicating substantial variation in the number of tRNAs in gymnosperm chloroplast genomes. The lack of tRNAAla in R. piresii and T. mairei and the absence of tRNAVal in T. mairei are notable, and it remains to be understood how translation proceeds in chloroplasts without these crucial tRNAs. According to previous studies (Treangen & Rocha, 2011; Mohanta et al., 2019), it is likely that the deficiency of these tRNAs is compensated for by the transfer of the corresponding tRNAs from the nucleus or mitochondria. In addition to the absence of tRNAAla and tRNAVal, none of the gymnosperm plants were found to encode selenocysteine tRNA or its suppressor tRNA in their chloroplast genomes (Table 2). Selenocysteine tRNA and its suppressor tRNA were also not detected in the chloroplast of Oryza sativa (Mohanta & Bae, 2017).
In addition to the presence of the C-A-U anticodon in tRNAMet, we found that the C-A-U anticodon is also present in tRNAIle (Table 3). Similarly, the C-A-U anticodon was detected in tRNAIle in Bacillus subtilis (Ehrenberg) Cohn and spinach (Kashdan & Dudock, 1982; Köhrer et al., 2014). The mechanism that governs the amino acid specificity of this tRNA may involve modification of the wobble position in the anticodon by a tRNA-modifying enzyme. Chloroplasts originated from bacteria, so the tRNA modifications found in bacteria may also occur in chloroplast tRNAs. In bacteria, the tRNA-modifying enzyme TilS can convert the 5′-C residue in the CAU anticodon of specific tRNAIle molecules into lysidine, so that they decode 5′-AUA (Ile) codons instead of 5′-AUG (Met) codons (Soma et al., 2003). In addition, when lysidine decodes isoleucine, the tautomeric form of lysidine provides compatible hydrogen bond donor–acceptor sites that allow base pairing with “A”, which may aid recognition of the codon AUA instead of AUG (Sonawane & Tewari, 2008; Sambhare et al., 2014). The absence of tRNAIle-lysidine synthetase leads to a failure to modify C34 to lysidine in tRNAIle (LAU) (i.e., unmodified CAU-tRNAIle is synthesized), and this inactivates the translation of AUA codons (Köhrer et al., 2014).
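The decoding switch described above can be illustrated with a deliberately simplified pairing model. The sketch below assumes strict Watson–Crick pairing for unmodified bases, ignores G–U wobble and all other modifications, and models lysidine only as "position 34 pairs with A"; the function `codon_read_by` and its `wobble_partner` parameter are hypothetical names introduced for this illustration.

```python
# Watson-Crick partners for unmodified RNA bases (G-U wobble is ignored here).
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codon_read_by(anticodon, wobble_partner=None):
    """Return the codon (5'->3') paired by an anticodon written 5'->3'.

    The first anticodon base (position 34) pairs with the third codon base.
    `wobble_partner` overrides that pairing to model a modified base at
    position 34, which is an assumption of this sketch.
    """
    first, second, third = anticodon
    codon_third = wobble_partner if wobble_partner else PAIR[first]
    return PAIR[third] + PAIR[second] + codon_third


# Unmodified C34: the CAU anticodon reads the methionine codon.
print(codon_read_by("CAU"))
# C34 converted to lysidine, assumed here to pair with A: reads an isoleucine codon.
print(codon_read_by("CAU", wobble_partner="A"))
```

Under these assumptions the same CAU anticodon gene yields AUG when unmodified and AUA when position 34 is treated as lysidine, which is the change in specificity attributed to TilS in the text.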
During protein coding, a given species or gene tends to use one or more specific synonymous codons, which is referred to as codon usage bias (Comeron & Aguadé, 1998; Rota-Stabelli et al., 2012). In the present study, tRNAArg-CCG was found to be present in the genomes of nine species but absent from C. lanceolata, T. mairei, and E. equisetina. Similarly, tRNAGly-UCC was shown to be absent from the chloroplast genomes of C. debaoensis, S. verticillata, D. spinulosum, C. lanceolata, T. mairei, and E. equisetina (Table 3). These results suggest that gymnosperm chloroplast tRNA genes are characterized by codon usage bias (Wei & Jin, 2017; Li et al., 2015).
In general, the secondary structure of tRNAs is characterized as clover leaf-like, except for a few tRNAs with unusual secondary structures (Jühling et al., 2018). In our study, we identified clover leaf-like tRNAs with expanded variable loop regions (Figs. 1 and 2). Numerous tRNALeu, tRNASer, and tRNATyr were found to have specific variable loop configurations in terms of length and structure, suggesting significant structural variation among chloroplast tRNAs. It is interesting to note that stem-loop structures have also been found in the variable regions of certain tRNAs in cyanobacteria. This might indicate that similar structural variations exist in chloroplast tRNAs and cyanobacterial tRNAs (Mohanta et al., 2017). Future studies will have to determine the biological importance of these variant tRNAs. The novel tRNA structure lacking the D arm might have other significant functions in the translation process, and additional research is necessary to elucidate its exact function and mechanisms. Most tRNAs have a clover-like structure formed by complementary base pairing between small segments (Hubert et al., 1998; Florentz, 2002). Previous studies have shown that in chloroplast tRNAs the acceptor arm contains 7 bp to 9 bp, the D-arm contains 3 bp to 4 bp, the D-loop has 4 nt to 12 nt, the anticodon arm has 5 bp, the anticodon loop contains 7 nt, the variable region comprises 4 nt to 23 nt, the Ψ-arm contains 5 bp, and the Ψ-loop has 7 nt (Wilusz, 2015; Mohanta & Bae, 2017; Mohanta et al., 2019). In the present study of 373 tRNAs, we found that the acceptor arm of chloroplast tRNAs contains 6 bp to 7 bp, the D-arm has 3 bp or 4 bp, and the D-loop usually contains 7 nt to 11 nt. The anticodon loop of gymnosperm chloroplast tRNAs generally contains 7 nt, and thus the anticodon loop is typically conserved (Table 4, Table S1). The variable loop of different tRNAs contains 3 nt to 23 nt (Table S1).
The Ψ-arm of gymnosperm chloroplast tRNAs generally contains 5 bp and the Ψ-loop has 7 nt (Table S1). Our results are consistent with previous findings (Wilusz, 2015; Mohanta & Bae, 2017) and they suggest that chloroplast tRNAs are highly conserved. The consensus sequence “U-U-C-N-A-N2” was found in the Ψ region (Table 4). Previous studies also reported the existence of a similar sequence in the Ψ-loop of tRNAs in Oryza sativa and cyanobacteria (Mohanta & Bae, 2017; Mohanta et al., 2017). This suggests that the consensus “U-U-C-N-A-N2” motif of the Ψ region, identified here and in previous analyses, is a general consensus motif of canonical tRNAs.
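For readers who want to screen loop sequences against this motif, the consensus can be expressed as a regular expression. This is a minimal sketch under two stated assumptions: “N” stands for any of A, C, G, or U, and “N2” stands for two such positions, which gives the 7-nt loop length reported above. The function name and the example sequences are hypothetical and are not taken from Data S1.

```python
import re

# Assumption: N = any of A/C/G/U, and N2 = two such positions (7 nt in total).
PSI_LOOP_MOTIF = re.compile(r"UUC[ACGU]A[ACGU]{2}")


def matches_psi_consensus(loop):
    """True if a loop sequence (DNA or RNA letters) fits U-U-C-N-A-N2 exactly."""
    rna = loop.upper().replace("T", "U")
    return PSI_LOOP_MOTIF.fullmatch(rna) is not None


# Hypothetical loop sequences for illustration only.
for seq in ["UUCGAAU", "ttcaaat", "UUGGAAU"]:
    print(seq, matches_psi_consensus(seq))
```

Using `fullmatch` rather than `search` keeps the check strict about loop length, so an 8-nt loop that merely contains the motif would not be counted as conforming.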
Our phylogenetic analysis detected three clear clusters and many tRNA groups. Some tRNAs (tRNASer, tRNAHis, and tRNALeu) in cluster I and cluster II were also in cluster III, indicating that these tRNAs evolved from multiple lineages by gene duplication and gene divergence. Moreover, anticodon types comprising CGA, UUC, UUU, GCA, UUG, GUG, UCU, UGC, and CAU appeared several times in the phylogenetic tree, and thus the corresponding tRNAs evolved from multiple common ancestors. The overlapping of tRNA groups demonstrates that these tRNAs might have had diverse common ancestors in the evolutionary process (Mohanta & Bae, 2017). Phylogenetic analysis also showed that tRNAMet (CAU), tRNAThr (UGU, GGU), tRNAVal (UAC), tRNAAla (UGC), tRNAPhe (GAA), tRNAArg (UCU), tRNAHis (GUG), tRNAGln (UUG), tRNACys (GCA), tRNALys (UUU), tRNAGlu (UUC), tRNAIle (UAU), tRNAVal (GAC), tRNALeu (CAA), tRNAGly (UCC), tRNASer (CGA), tRNAGly (GCC), and tRNAIle (CAU) in cluster III tended to be the most basic tRNAs, whereas tRNAMet tended to be the most original tRNA. Overall, the results indicate that the tRNAs encoded in gymnosperm chloroplast genomes have multiple common evolutionary ancestors.
Our results also provided insights into the substitution rates in gymnosperm chloroplast tRNAs. Overall, the average transition rate for tRNAs was greater than the average transversion rate, with a ratio of about 3:1 (Table 5). The transition rate was higher than the transversion rate in every tRNA family except tRNAAla (Table 5), indicating that chloroplast tRNAs have unequal substitution rates.
In addition to the transition and transversion events in tRNAs, loss and duplication events have played significant roles in the evolution of tRNAs in gymnosperm chloroplast genomes (He & Zhang, 2006; Magadum et al., 2013). In general, gene loss events tend to occur after whole-genome duplication events. We found 153 duplication events and 220 loss events in gymnosperm chloroplast tRNAs, and thus loss events have occurred slightly more frequently than duplication events (Table S2).
Conclusions
Our basic structure analysis showed that gymnosperm chloroplast genomes encode 25 to 30 anticodon-specific tRNAs. The acceptor arm of chloroplast tRNAs contains 6 bp to 7 bp, the D-arm has 3 bp or 4 bp, the D-loop mainly contains 7 nt to 11 nt, and the anticodon loop usually contains 7 nt. In different tRNAs, the variable loop contains 3 nt to 23 nt. The Ψ-loop contains a conserved sequence comprising U-U-C-N-A-N2. tRNAAla was absent from R. piresii and T. mairei, and tRNAVal was lacking in T. mairei. Gymnosperm chloroplasts do not encode selenocysteine tRNA or its suppressor tRNA in their genomes. A CAU anticodon is encoded in tRNAMet as well as in tRNAIle. A novel tRNA structure lacking the D arm was identified for the chloroplast tRNAGly of W. nobilis. Numerous tRNALeu, tRNASer, and tRNATyr types were found to have expanded variable regions. Phylogenetic analysis showed that tRNAs might have had multiple common ancestors in the evolutionary process. Different tRNAs had their own transition/transversion rates, i.e., the rates were iso-acceptor specific, and the transition rate was generally higher than the transversion rate. Furthermore, gene loss events (220) have occurred slightly more frequently than gene duplication events (153) in gymnosperm chloroplast tRNAs. Our results provide new insights into the evolution of gymnosperm chloroplast tRNAs and their diverse roles.
Supplemental Information
tRNA sequences of the gymnosperm chloroplast genomes in this study
The phylogenetic tree constructed from 373 tRNAs
Multiple tRNAs are shown by different colors. Different groups are marked by different strings. The phylogenetic clades with low bootstrap support were collapsed with a 50% cutoff value. The phylogenetic analysis illustrates that gymnosperm chloroplast tRNAs derived from multiple common ancestors.
The loss and duplication tree
A total of 153 duplication events (duplication and conditional duplication) and 220 gene loss events were detected in the gymnosperm chloroplast tRNA genes. Blue: Duplication events; Gray: Loss events; D: Duplication node; cD: Conditional Duplication node.
tRNA gene content in the analyzed gymnosperm chloroplast genomes
The tRNA genes are shown on the left (top to bottom). Boxes in light green, dark green, and white represent one copy of a tRNA gene, two copies of a tRNA gene, and the absence of a tRNA gene, respectively.
Nucleotide composition in different parts of the clover leaf-like structure of chloroplast genome tRNAs
The acceptor arm of chloroplast tRNAs contains 3 bp to 7 bp, where 357 have 7 bp, 13 have 6 bp, and the remaining tRNAs contain no more than 5 bp. The anticodon arms of chloroplast tRNAs mainly contain 5 bp. The anticodon loop of gymnosperm chloroplast tRNAs generally contains 7 nt, and thus the sequence of the anticodon loop is typically conserved.
Loss events of chloroplast genomic tRNAs
220 loss events and 153 duplication events are detected in gymnosperm chloroplast tRNAs, and loss events have occurred slightly more frequently than duplication events.