Does more sequence data improve estimates of galliform phylogeny? Analyses of a rapid radiation using a complete data matrix
- Subject Areas
- Evolutionary Studies, Taxonomy, Zoology
- Keywords
- sampling strategies, data matrix size, rapid radiation, Galliformes
- Copyright
- © 2013 Kimball et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
- Cite this article
- Kimball et al. 2013. Does more sequence data improve estimates of galliform phylogeny? Analyses of a rapid radiation using a complete data matrix. PeerJ PrePrints 1:e131v1 https://doi.org/10.7287/peerj.preprints.131v1
Abstract
The resolution of rapid evolutionary radiations, or “bushes” in the tree of life, has been one of the most difficult and interesting problems in phylogenetics. The avian order Galliformes appears to have undergone several rapid radiations that have limited the resolution of prior studies and obscured the position of taxa important both agriculturally and as model systems (chicken, turkey, Japanese quail). Here we present analyses of a multi-locus data matrix comprising over 15,000 sites, primarily from nuclear introns but also including three mitochondrial regions, from 46 galliform taxa, with all gene regions sampled for all taxa. The increased sampling of unlinked nuclear genes provided strong bootstrap support for all but a small number of relationships. Coalescent-based methods for combining individual gene trees, together with analyses of datasets independent of published data, indicated that this well-supported topology is likely to reflect the galliform species tree. Key findings include support for a second major clade within the core phasianids that includes the chicken and Japanese quail, and clarification of the phylogenetic relationships of the turkey. Jackknifed datasets suggested that there is an advantage to sampling many independent regions across the genome rather than obtaining long sequences for a small number of loci, possibly because gene trees differ among loci due to incomplete lineage sorting. Despite the novel insights we obtained using this increased sampling of gene regions, some nodes remain unresolved, likely due to periods of rapid diversification. Resolving these remaining groups will likely require sequencing a very large number of gene regions, but our analyses now appear to support a robust backbone for this order.
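The jackknife comparison described above, a few long loci versus many short ones, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`sample_few_long`, `sample_many_short`, `total_sites`), the dictionary layout and the toy sequences are all assumptions made for illustration. The two schemes retain the same total number of sites, so any difference in downstream tree estimates would reflect how the sites are distributed across loci.

```python
import random

def locus_length(alignment):
    """Number of aligned sites in one locus (all taxa share this length)."""
    return len(next(iter(alignment.values())))

def total_sites(loci):
    """Total number of aligned sites summed over all loci."""
    return sum(locus_length(aln) for aln in loci.values())

def sample_few_long(loci, n_loci, rng):
    """Hypothetical scheme A: keep n_loci whole loci, drawn without replacement."""
    chosen = rng.sample(sorted(loci), n_loci)
    return {name: loci[name] for name in chosen}

def sample_many_short(loci, sites_per_locus, rng):
    """Hypothetical scheme B: keep every locus, but only a random subset of its columns."""
    reduced = {}
    for name, aln in loci.items():
        k = min(sites_per_locus, locus_length(aln))
        # Draw columns without replacement and keep them in alignment order.
        cols = sorted(rng.sample(range(locus_length(aln)), k))
        reduced[name] = {
            taxon: "".join(seq[i] for i in cols) for taxon, seq in aln.items()
        }
    return reduced

# Toy data only (not from the study): 4 loci, 12 sites each, 3 taxa.
taxa = ["Gallus", "Meleagris", "Coturnix"]
loci = {f"locus{i}": {t: "ACGT" * 3 for t in taxa} for i in range(4)}

rng = random.Random(0)  # fixed seed so the subsamples are reproducible
few = sample_few_long(loci, n_loci=2, rng=rng)             # 2 loci x 12 sites
many = sample_many_short(loci, sites_per_locus=6, rng=rng)  # 4 loci x 6 sites
```

Each subsample would then be passed to the tree-building method of choice, and the resulting trees compared with the estimate from the full matrix. That step is outside the scope of this sketch.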