This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Allopolyploidy combines two progenitor genomes in the same nucleus and is a common mechanism for producing new species, especially in plants. Deciphering the origins of polyploid species is a complex problem because of, among other things, extinct progenitors, multiple origins, gene flow between different polyploid populations, and loss of parental contributions through gene or chromosome loss. In this work, we studied three allopolyploid species in the genus Glycine, which includes the cultivated soybean (G. max). Previous work based on two nuclear sequences showed that these allopolyploids combine the genomes of extant diploid species in the G. tomentella complex. We applied several phylogenetic and population genomics approaches to single nucleotide polymorphism data and a guided transcriptome assembly to clarify the origin of these species. The results support the hypothesis that each of the three polyploid species is a fixed hybrid combining the homoeologous genomes of its two putative parents. Based on mapping to the soybean reference genome, there appear to be no large regions for which one homoeologous contribution is missing. Phylogenetic analyses of 27 selected transcripts using a coalescent approach also indicate multiple origins for the G. tomentella polyploid species and suggest that these origins occurred within the last several hundred thousand years.