Comparative genomics and phylogenetic discordance of cultivated tomato and close wild relatives

Boyce Thompson Institute for Plant Research, Ithaca, NY, USA
Keygene Inc., Rockville, MD, USA
Department of Plant Pathology and Plant-Microbe Biology, Cornell University, Ithaca, NY, USA
DOI
10.7287/peerj.preprints.377v1
Subject Areas
Agricultural Science, Bioinformatics, Evolutionary Studies, Genomics, Plant Science
Keywords
tomato, phylogeny, Solanum, genome, incomplete lineage sorting, introgression, selection
Copyright
© 2014 Strickler et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Cite this article
Strickler SR, Bombarely A, Munkvold JD, Menda N, Martin GB, Mueller LA. 2014. Comparative genomics and phylogenetic discordance of cultivated tomato and close wild relatives. PeerJ PrePrints 2:e377v1

Abstract

Background. Studies of ancestry are difficult in tomato because it crosses with many wild relatives, and because species in the tomato clade have diverged very recently. As a result, the phylogeny of tomato in relation to its closest relatives remains uncertain. Using coding sequence from Solanum lycopersicum, S. galapagense, S. pimpinellifolium, S. corneliomuelleri, and S. tuberosum, together with genomic sequence from two of cultivated tomato’s closest relatives, S. galapagense and S. pimpinellifolium, and from an heirloom line, S. lycopersicum ‘Yellow Pear’, we aimed to resolve the phylogenies of these closely related species and to identify phylogenetic discordance in the reference cultivated tomato.

Results. Divergence date estimates suggest that S. lycopersicum, S. galapagense, and S. pimpinellifolium diverged less than 0.5 MYA. Phylogenies based on 8,857 coding sequences support grouping of S. lycopersicum and S. galapagense, although two secondary trees are also highly represented. A total of 29 genes in our analysis showed evidence of selection along the S. lycopersicum lineage. Whole genome phylogenies showed that, although incongruence is prevalent in genomic comparisons between these accessions, likely as a result of incomplete lineage sorting and introgression, a primary phylogenetic history was strongly supported.

Conclusions. Based on analysis of these accessions, S. galapagense appears to be closely related to S. lycopersicum, suggesting that they shared a common ancestor before an S. galapagense ancestor arrived on the Galápagos Islands, but after divergence of the sequenced S. pimpinellifolium. Genes showing selection along the S. lycopersicum lineage may be important in domestication. Further analysis of intraspecific data in these species will help to establish the evolutionary history of cultivated tomato. The use of an heirloom line is helpful in deducing true phylogenetic information for S. lycopersicum and in identifying regions of introgression from wild species.

Supplemental Information

A figure of SNPs and read coverage over all chromosomes for YP-1, S. galapagense, and S. pimpinellifolium.

DOI: 10.7287/peerj.preprints.377v1/supp-1

A table of putative gaps greater than 20 bp in YP-1, S. galapagense, and S. pimpinellifolium.

DOI: 10.7287/peerj.preprints.377v1/supp-2

A table of putatively missing genes in YP-1, S. galapagense, and S. pimpinellifolium.

DOI: 10.7287/peerj.preprints.377v1/supp-3

A table of putative insertions in S. pimpinellifolium.

DOI: 10.7287/peerj.preprints.377v1/supp-4

A VCF of H1706 SNPs and indels.

DOI: 10.7287/peerj.preprints.377v1/supp-5

A graph of phylogenetic topologies over all other chromosomes.

DOI: 10.7287/peerj.preprints.377v1/supp-6

De novo assembly mapping results.

DOI: 10.7287/peerj.preprints.377v1/supp-7

A list of putative gaps in sequenced accessions.

DOI: 10.7287/peerj.preprints.377v1/supp-8

A list of deleted genes and predicted functions.

DOI: 10.7287/peerj.preprints.377v1/supp-9

Putative insertions in LA1589.

DOI: 10.7287/peerj.preprints.377v1/supp-10

PAML site-branch test analysis results.

DOI: 10.7287/peerj.preprints.377v1/supp-11

Whole genome phylogeny results for all chromosomes.

DOI: 10.7287/peerj.preprints.377v1/supp-13

Genes predicted to be in introgressions in H1706 genome.

DOI: 10.7287/peerj.preprints.377v1/supp-14