Strategies and difficulties in assembling highly recombinogenic plant organelle genomes: a case study

Centro di ricerca per l'orticoltura (CREA-ORT), Consiglio per la ricerca in agricoltura e l'analisi dell'economia agraria, Pontecagnano Faiano, Italy
Istituto di Bioscienze e BioRisorse (CNR-IBBR), Consiglio Nazionale delle Ricerche, Portici, Italy
DOI
10.7287/peerj.preprints.1599v1
Subject Areas
Bioinformatics, Computational Biology
Keywords
plant mitochondrial DNA, plant organelle genomes, recombinant genome, NGS, potato mtDNA, PacBio, Illumina, de novo assembly, hybrid assembly, repetitive sequences
Copyright
© 2015 Cantarella et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Cite this article
Cantarella C, Tamburino R, Scotti N, Cardi T, D'Agostino N. 2015. Strategies and difficulties in assembling highly recombinogenic plant organelle genomes: a case study. PeerJ PrePrints 3:e1599v1

Abstract

Plant mitochondrial genomes are larger and more complex than those of other eukaryotes, owing to their widely demonstrated recombinogenic nature. The mitochondrial DNA (mtDNA) is usually represented as a single circular map, the so-called master molecule. This molecule contains repeated sequences, some of which can recombine to generate sub-genomic molecules whose relative abundance depends on the balance between their recombination and replication rates. Recent advances in DNA sequencing technology have given a strong boost to plant mitochondrial genome projects. Conventional approaches to mitochondrial genome sequencing involve extraction and enrichment of mitochondrial DNA, cloning, and sequencing. Large repeats and the dynamic organization of the mitochondrial genome complicate de novo sequence assembly from short reads. The PacBio RS long-read sequencing platform promises increased read length and unbiased genome coverage, and thus the potential to produce genome sequence data of finished quality (fewer gaps and longer contigs). However, recently published articles show that PacBio sequencing is still not sufficient to resolve the issues associated with mtDNA assembly. Here we present a preliminary hybrid assembly of a potato mtDNA based on both PacBio and Illumina reads, and we discuss the strategies and obstacles in assembling genomes whose repeated sequences are recombinationally active and serve as a constant source of rearrangements.

Author Comment

This is the abstract of a presentation given at the BBCC2015 conference.