Recent mobility of plastid encoded group II introns and twintrons in five strains of the unicellular red alga Porphyridium
A peer-reviewed article of this Preprint also exists.
Abstract
Group II introns are closely linked to eukaryote evolution because nuclear spliceosomal introns and the small RNAs associated with the spliceosome are thought to trace their ancient origins to these mobile elements. Therefore, elucidating how group II introns move and how they lose mobility can shed light on fundamental aspects of eukaryote biology. To this end, we studied five strains of the unicellular red alga Porphyridium purpureum that, surprisingly, contain 42 group II introns in their plastid genomes. We focused on a subset of these introns that encode mobility-conferring intron-encoded proteins (IEPs) and found them to be distributed among the strains in a lineage-specific manner. The reverse transcriptase and maturase domains were present in all lineages, but the DNA endonuclease domain was deleted in vertically inherited introns, demonstrating a key step in the loss of mobility. P. purpureum plastid intron RNAs had a classic group IIB secondary structure despite variability in the DIII and DVI domains. We report for the first time the presence of twintrons (introns-within-introns, derived from the same mobile element) in Rhodophyta. The P. purpureum IEPs and their mobile introns provide a valuable model for the study of mobile retroelements in eukaryotes and offer promise for biotechnological applications.
Cite this as
2014. Recent mobility of plastid encoded group II introns and twintrons in five strains of the unicellular red alga Porphyridium. PeerJ PrePrints 2:e729v1 https://doi.org/10.7287/peerj.preprints.729v1
Author comment
This is a submission to PeerJ for review. Our results provide three major insights into group II intron evolution:
1. Among P. purpureum strains, some mobile group II introns lost their IEP by complete deletion, partial degeneration, or point mutation, providing direct evidence of events that lead to mobility-impaired introns that are vertically inherited.
2. The undamaged IEPs retain mobility and are located in different genes in the plastid genomes. These include the intron encoding the mat1f IEP that has created two different twintron combinations, the first to be found in red algae. Our results suggest that after the initial invasion, outer introns that themselves encode an IEP underwent intron insertion by the same mobile group II intron at a conserved target site in domain DIV.
3. We find a sister group relationship between red algal and cryptophyte IEPs, suggesting that the latter may have arisen from red algae via plastid secondary endosymbiosis.
Supplemental Information
Figure S1. Nucleotide alignment of P. purpureum plastid introns
Boundaries used to determine homology are indicated in red (DI stem, DIV stem, DV and DVI stem, respectively). The IEP coding sequences are in yellow. Additional group II introns with degenerate IEPs that were added to the analysis (i.e., psbN-psbT, int.a rpoB, int mntA, and int.b rpoC2) are included.
Figure S2. Draft P. purpureum intron structure (intergenic region between atpB-atpE, mat1a IEP)
Only DIV, DV, and DVI were identified.
Figure S3. Alignment of P. purpureum intron-encoded protein domains
The four identified domains are separated by an artificial five amino acid gap. The unboxed 5’ sequence comprises the reverse transcriptase (RT) domain. The maturase (X) domain is boxed in black, the DNA-binding (D) domain in red and the endonuclease (En) domain in blue. The D and En domains are partial or absent in four IEPs (mat1a, mat1b, mat1c and mat1e). Asterisks are placed above the YADD domain.
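As an illustration of the layout described in this caption, the following minimal Python sketch splits a concatenated IEP sequence at the artificial five-residue gap and locates the YADD motif in the RT domain. The sequence, the function names, and the use of `-----` as the separator character are assumptions made for the example, and the sketch assumes no other runs of five gap characters occur within a domain; it is not the procedure used to build the alignment.

```python
import re

# Assumption: the artificial five-amino-acid gap is written as five dashes.
ARTIFICIAL_GAP = "-----"
# Domain order as given in the caption: RT, maturase (X), DNA-binding (D), endonuclease (En).
DOMAIN_NAMES = ["RT", "X", "D", "En"]


def split_domains(concatenated: str) -> dict[str, str]:
    """Split a concatenated IEP sequence into named domains at the artificial gap."""
    parts = concatenated.split(ARTIFICIAL_GAP)
    # zip() stops at the shorter input, so IEPs lacking D/En simply omit those keys.
    return dict(zip(DOMAIN_NAMES, parts))


def find_motif(sequence: str, motif: str = "YADD") -> list[int]:
    """Return 0-based start positions of every occurrence of the motif."""
    return [m.start() for m in re.finditer(re.escape(motif), sequence)]


# Hypothetical toy sequence, not taken from the supplemental alignment.
toy_iep = "MKLSYADDFVI-----GWRNY-----KRCH-----HHIVP"
domains = split_domains(toy_iep)
yadd_positions = find_motif(domains["RT"])
print(domains)
print(yadd_positions)
```

An IEP whose D and En domains are absent would yield a dictionary with only the `RT` and `X` keys under this scheme.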
Figure S4. P. purpureum group II intron/IEP alignment
Alignment of 14 P. purpureum intron/intergenic regions containing an IEP/IEP remnant and four Rhodomonas salina introns. Secondary structures from each domain (DI-DVI) are marked and represented by different colors. The dnaK intron (containing mat1b) does not retain a group IIB intron structure. A partial structure was determined for the atpB-atpE intergenic region (containing mat1a). All the IEPs or IEP remnants are located in domain IV, including the R. salina introns (previously described as the only case of group II intron IEPs located outside of DIV). Twintron insertion sites are indicated with asterisks.
Figure S5. Draft P. purpureum intron structure (int rpoC1, mat1g IEP)
Figure S6. Draft P. purpureum intron structure (int tsf, mat1i IEP)
Figure S7. Draft P. purpureum intron structure (int ycf46, mat1h IEP)
Figure S8. Draft P. purpureum intron structure (int.a rpoC2, mat1e IEP)
Figure S9. Draft P. purpureum intron structure (int gltB, mat1d IEP)
Figure S10. Draft P. purpureum intron structure (int atpB, mat1f IEP)
Figure S11. Draft P. purpureum intron structure (int.b atpI, ORF remnant and outer twintron)
Figure S12. Draft P. purpureum intron structure (int.b rpoC2, IEP remnant and outer twintron)
Figure S13. Draft P. purpureum intron structure (int.c infC, mat1c IEP)
Figure S14. Draft P. purpureum intron structure (intergene psbN-psbT, IEP remnant)
Figure S15. Draft P. purpureum intron structure (int.a rpoB, IEP remnant)
Figure S16. Draft P. purpureum intron structure (int mntA, IEP remnant)
Figure S17. Nucleotide alignment of the exon and intron binding sites
The P. purpureum EBS and IBS pairings are unique to each intron/IEP. The complementarity between both is generally preserved; if not, the mutation is located in the 5’ region. EBS1 and/or EBS2 were not identified for the mat1a, mat1b, and mat1c introns. “Ghost” refers to remnant IEPs.
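A complementarity check of the kind summarized in this caption can be sketched in Python as follows. This is a minimal illustration under stated assumptions: both sites are written 5' to 3', they have equal length, and G-U wobble pairs are counted as valid pairings. The function names and the example sequences are hypothetical and do not come from the supplemental alignment.

```python
# Watson-Crick pairs plus the G-U wobble pair (counting wobble is an assumption).
VALID_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def to_rna(seq: str) -> str:
    """Normalize a DNA or RNA string to upper-case RNA."""
    return seq.upper().replace("T", "U")


def mismatch_positions(ebs: str, ibs: str) -> list[int]:
    """Return 0-based EBS positions (counted from the 5' end) that fail to pair.

    Both sequences are given 5'->3'. Pairing is antiparallel, so the first
    EBS base is compared with the last IBS base, and so on.
    """
    ebs, ibs = to_rna(ebs), to_rna(ibs)
    if len(ebs) != len(ibs):
        raise ValueError("EBS and IBS must have the same length")
    return [
        i
        for i, (e, b) in enumerate(zip(ebs, reversed(ibs)))
        if (e, b) not in VALID_PAIRS
    ]


# Hypothetical sequences chosen so the outcome is easy to verify by hand.
perfect = mismatch_positions("GACUUA", "UAAGUC")  # fully complementary
wobble = mismatch_positions("GACUUA", "UGAGUC")   # one G-U wobble pair
mutated = mismatch_positions("GACUUA", "UAAGUA")  # 3'-most IBS base changed
print(perfect, wobble, mutated)
```

Whether wobble pairs should count as preserved complementarity is a design choice; removing the last two entries of `VALID_PAIRS` makes the check strictly Watson-Crick.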
Figure S18. Modified Rhodomonas salina group II intron secondary structure (groEL gene, strain CCMP 1178)
Only domains II, III, and IV were modified on the original structure published in Khan et al. (2007).
Figure S19. Modified Rhodomonas salina group II intron secondary structure (intron 1, groEL gene, strain CCMP 2045)
The domains II, III and IV were modified on the original structure designed by Khan et al. (2007).
Figure S20. Modified Rhodomonas salina group II intron secondary structure (intron 2, groEL gene, strain CCMP 2045)
The domains II, III and IV were modified on the original structure designed by Khan et al. (2007).
Figure S21. Modified Rhodomonas salina group II intron secondary structure (psbN gene, strain CCMP 1319)
The domains III and IV were modified on the original structure designed by Khan et al. (2007).
Figure S22. Modified Rhodomonas salina group II intron secondary structure (groEL gene, strain Maier)
The domains I, II, III, IV and VI were modified on the original structure designed by Maier et al. (1995).
Figure S23. Domain IV primary binding site
The binding sites of the maturases were determined by comparing sequence alignments. The stem-loop structure from a purine-rich internal loop is framed in white, whereas the start codon is framed in black.
Table S1. Group II intron sequences used in analysis
Sequences used to guide secondary structure homology search and included in phylogenetic analyses of P. purpureum group II introns.
Additional Information
Competing Interests
The authors declare they have no competing interests.
Author Contributions
Marie-Mathilde Perrineau conceived and designed the experiments, performed the experiments, analyzed the data, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.
Dana C Price performed the experiments, analyzed the data, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.
Georg Mohr performed the experiments, analyzed the data, prepared figures and/or tables.
Debashish Bhattacharya conceived and designed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.
DNA Deposition
The following information was supplied regarding the deposition of DNA sequences:
The group II intron/IEP sequences described here are accessible using GenBank accession numbers KJ826367 to KJ826395.
Funding
The work was funded by grants from the National Science Foundation (1004213) and the United States Department of Energy (DE-EE0003373/001) awarded to DB. Research by GM is supported by NIH grant GM37949 and Welch Foundation grant F-1607 to Alan M. Lambowitz. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.