Embracing heterogeneity: Building the Tree of Life and the future of phylogenomics

Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, United States
Gothenburg Global Biodiversity Centre, Göteborg, Sweden
Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
Gothenburg Botanical Garden, Göteborg, Sweden
Department of Computer and Information Science, Linköping University, Linköping, Sweden
Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
Institut de Biologie, Université de Neuchâtel, Neuchâtel, Switzerland
Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
Centre for Ecological and Evolutionary Synthesis, University of Oslo, Oslo, Norway
Institut de Biologie, Ecole Normale Supérieure de Paris, Paris, France
Department of Computer Science, Rice University, Houston, TX, United States
Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden
Department of Biology, Lund University, Lund, Sweden
Coordenação de Biodiversidade, Programa de Coleções Científicas Biológicas, Instituto Nacional de Pesquisa da Amazônia, Manaus, AM, Brazil
Department of Computer Science, Rutgers University, Piscataway, NJ, USA
School of Life Sciences, University of KwaZulu-Natal, Pietermaritzburg, South Africa
Gothenburg Centre for Advanced Studies in Science and Technology, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden
DOI
10.7287/peerj.preprints.26449v3
Subject Areas
Biodiversity, Computational Biology, Evolutionary Studies, Genomics
Keywords
gene flow, genome, multispecies coalescent model, retroelement, speciation, transcriptome
Copyright
© 2018 Bravo et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Bravo GA, Antonelli A, Bacon CD, Bartoszek K, Blom M, Huynh S, Jones G, Knowles LL, Lamichhaney S, Marcussen T, Morlon H, Nakhleh L, Oxelman B, Pfeil B, Schliep A, Wahlberg N, Werneck F, Wiedenhoeft J, Willows-Munro S, Edwards SV. 2018. Embracing heterogeneity: Building the Tree of Life and the future of phylogenomics. PeerJ Preprints 6:e26449v3

Abstract

Building the Tree of Life (ToL) is a major challenge of modern biology, requiring advances in cyberinfrastructure, data collection, theory, and more. Here, we argue that phylogenomics stands to benefit by embracing the many heterogeneous genomic signals emerging from the first decade of large-scale phylogenetic analysis spawned by high-throughput sequencing (HTS). Such signals include those most commonly encountered in phylogenomic data sets, such as incomplete lineage sorting, but also those reticulate processes emerging with greater frequency, such as recombination and introgression. We suggest that methods of data acquisition and the types of markers used in phylogenomics will remain restricted until a posteriori methods of marker choice are made possible with routine whole-genome sequencing of taxa of interest. We discuss limitations and potential extensions of a major model supporting innovation in phylogenomics today, the multispecies coalescent model. Macroevolutionary models that use phylogenies, such as character mapping, often ignore the heterogeneity on which building phylogenies increasingly relies, and we suggest that assimilating such heterogeneity is an important goal moving forward. Finally, we argue that an integrative cyberinfrastructure linking all steps of the process of building the ToL, from specimen acquisition in the field to publication and tracking of phylogenomic data, and a culture that values contributors to each step are essential for progress.

Author Comment

This version includes an updated Supplementary Table S1 and Figure 2 following the removal of two data sets. It also contains minor edits to the references and a section on Heterozygosity and Intra-Individual Site Polymorphisms, following suggestions made by Thomas Couvreur and Tobias Andermann.

Supplemental Information

Supplementary Table S1: Information on number of species, number of loci, and data set size contained in 164 phylogenomic data sets

Each row represents a data set included in Figure 2. For further details on how this table was built, please see section "Compilation of data in Supplementary Table S1" in the Supplementary Material.

DOI: 10.7287/peerj.preprints.26449v3/supp-1

Supplementary Materials: This file describes the compilation of data contained in Supplementary Table S1 and in Figure 1

DOI: 10.7287/peerj.preprints.26449v3/supp-2