Phylogenomics of 42 tomato chloroplasts using assembly and alignment-free method

CREG - Depto. Cs. Biológicas - Fac. Cs. Exactas, Universidad Nacional de La Plata, La Plata (1900), Provincia de Buenos Aires, Argentina
DOI
10.7287/peerj.preprints.3271v1
Subject Areas
Bioinformatics, Computational Biology
Keywords
Phylogenomics, AAF method, tomato, chloroplast, k-mers
Copyright
© 2017 Amado Cattáneo et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Amado Cattáneo R, Diambra L, McCarthy AN. 2017. Phylogenomics of 42 tomato chloroplasts using assembly and alignment-free method. PeerJ Preprints 5:e3271v1

Abstract

Phylogenetics and population genetics are central disciplines in evolutionary biology. Both are based on the comparison of single DNA sequences, or of a concatenation of several such sequences. However, with the advent of next-generation DNA sequencing technologies, approaches that consider large genomic data sets are of growing importance for elucidating evolutionary relationships among species. Among these, assembly and alignment-free methods, which allow efficient distance computation and phylogeny reconstruction, are of particular importance. However, it is not yet clear under what conditions of data quality and abundance such methods are able to infer phylogenies accurately. In the present study we assess the method originally proposed by Fan et al. for whole-genome data in the elucidation of tomato chloroplast phylogenetics using short-read sequences. We find that this assembly and alignment-free method is capable of reproducing previous results under conditions of high coverage, provided that low-frequency k-mers (i.e. error-prone data) are effectively filtered out. Finally, we present a complete chloroplast phylogeny for the best data-quality candidates among the recently published 360 tomato genomes.
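
The two ideas named in the abstract, filtering out low-frequency k-mers and computing a distance from shared k-mers, can be illustrated with a small sketch. This is not the authors' pipeline nor the software of Fan et al.: the function names, the `min_count` threshold and the toy reads are hypothetical, and the distance form D = ln(n_t / n_s) / k (with n_s the number of shared k-mers and n_t taken here as the size of the smaller k-mer set) is an assumption about how such a k-mer distance may be defined. Reverse complements are ignored for brevity.

```python
from collections import Counter
from math import log


def kmer_counts(reads, k):
    """Count every k-mer occurring in a list of short reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def filtered_kmers(reads, k, min_count=2):
    """Keep only k-mers seen at least min_count times.

    Rare k-mers are treated as likely sequencing errors; this only
    works when coverage is high enough for true k-mers to recur.
    min_count is a hypothetical parameter chosen for illustration.
    """
    return {kmer for kmer, n in kmer_counts(reads, k).items() if n >= min_count}


def kmer_distance(kmers_a, kmers_b, k):
    """Assumed distance D = ln(n_t / n_s) / k between two k-mer sets."""
    shared = len(kmers_a & kmers_b)
    total = min(len(kmers_a), len(kmers_b))
    if shared == 0:
        raise ValueError("no shared k-mers: distance is undefined")
    return log(total / shared) / k


# Toy reads (invented for illustration): sample A contains one read
# with a single-base error, which produces k-mers seen only once.
reads_a = ["ACGTACGT", "ACGTACGT", "ACGAACGT"]
reads_b = ["ACGTACGT", "ACGTACGT"]

kmers_a = filtered_kmers(reads_a, k=4)
kmers_b = filtered_kmers(reads_b, k=4)
print(kmer_distance(kmers_a, kmers_b, k=4))
```

With the threshold in place, the k-mers introduced by the erroneous read are discarded before the two samples are compared, which is the role the abstract attributes to filtering under high coverage.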

Author Comment

This preprint is the final version of the authors' research article manuscript, ready for journal submission.