Targeted NGS for species level phylogenomics: “made to measure” or “one size fits all”?
- Subject Areas
- Bioinformatics, Evolutionary Studies, Plant Science
- Keywords
- Ericaceae, hybridization enrichment, marker development, next-generation sequencing, phylogeny, targeted sequence capture, target enrichment, transcriptome
- Copyright
- © 2017 Kadlec et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2017. Targeted NGS for species level phylogenomics: “made to measure” or “one size fits all”? PeerJ Preprints 5:e2763v3 https://doi.org/10.7287/peerj.preprints.2763v3
Abstract
Targeted high-throughput sequencing using hybrid enrichment offers a promising source of data for inferring multiple, meaningfully resolved, independent gene trees suitable for addressing challenging phylogenetic problems in species complexes and rapid radiations. The targets in question can either be adopted directly from more or less universal tools, or custom-made for particular clades at considerably greater effort. We applied custom-made scripts to select sets of homologous sequence markers from transcriptome and WGS data for use in the flowering plant genus Erica (Ericaceae). We compared the resulting targets to those that would be selected both using different available tools (Hyb-Seq; MarkerMiner), and when optimising for broader clades of more distantly related taxa (Ericales; eudicots). Approaches comparing more divergent genomes (including MarkerMiner, irrespective of input data) delivered fewer and shorter potential markers than those targeted for Erica. The Erica-targeted markers may nevertheless be effective for sequence capture across the wider family Ericaceae. We tested the targets delivered by our scripts by obtaining an empirical dataset. The resulting sequence variation was lower than that of standard nuclear ribosomal markers (which in Erica fail to deliver a well-resolved gene tree), confirming the importance of maximising the lengths of individual markers. We conclude that rather than searching for “one size fits all” universal markers, we should improve and make more accessible the tools necessary for developing “made to measure” ones.
Author Comment
This is a preprint of a paper currently in review. It differs from the previous version in various revisions made in response to peer review, most notably in further documentation of the previously provided scripts and changes to the selection and presentation of phylogenetic trees in Fig. 6.
Supplemental Information
Supplementary data 1: Exon sequences corresponding to the 134 markers selected for the empirical study and the complete pools of markers selected using each of the methods compared (fasta format)
Summary table of markers
Supplementary data 3 - Table documenting markers as represented in Supplementary data 1-4.
Gene trees from empirical data
Supplementary data 4 - Gene trees inferred under RAxML