VCF2PopTree: a one-click client-side software to construct population phylogeny from genome-wide SNPs
- Subject Areas
- Bioinformatics, Evolutionary Studies, Genetics, Genomics, Population Biology
- Keywords
- VCF, Phylogeny, UPGMA, Neighbour-Joining, MEGA, PHYLIP, Whole genome, SNP
- Copyright
- © 2019 Subramanian et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2019. VCF2PopTree: a one-click client-side software to construct population phylogeny from genome-wide SNPs. PeerJ Preprints 7:e27682v2 https://doi.org/10.7287/peerj.preprints.27682v2
Abstract
Over the past decades, a number of software programs have been developed to infer the phylogenetic relationships between populations. However, these programs are not suited to large-scale whole-genome data. Recently, a few standalone or web applications have been developed to handle genome-wide data, but they are computationally intensive, depend on third-party software, or require significant time and resources on a web server. In the post-genomic era, the falling cost of sequencing and analysis means that researchers can obtain bioinformatically processed, high-quality, publication-ready whole-genome data for many individuals in a population from next-generation sequencing companies. Such genotype data are typically presented in the Variant Call Format (VCF), and no simple software is available that uses these data to construct a population phylogeny in a short time. To address this limitation, we have developed a one-click, user-friendly software, VCF2PopTree, that uses genome-wide SNPs to construct and display phylogenetic trees in seconds to minutes. For example, it reads a 1 GB VCF file and draws a tree in less than 5 minutes. VCF2PopTree reads genotype data from the local machine, constructs a tree using the UPGMA and Neighbour-Joining algorithms, and displays it in a web browser. It also produces a pairwise-diversity matrix in MEGA and PHYLIP file formats, as well as trees in the Newick format, which can be used directly by other popular phylogenetic software programs. The software, including the source code, a test VCF input file and short documentation, is available at: https://github.com/sansubs/vcf2pop.
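The pipeline summarised above (VCF genotypes, then a pairwise-diversity matrix, then a UPGMA tree in Newick format) can be illustrated with a minimal Python sketch. It is not the VCF2PopTree source code. The sample names, the sample-to-population map `POP_OF`, the function names and the toy genotypes are all hypothetical, and the distance used here (the average probability that two alleles drawn from different populations differ) is an assumed stand-in for the diversity measure the software computes. Only UPGMA is sketched; Neighbour-Joining is omitted.

```python
from itertools import combinations

# Toy VCF body (made-up data): 6 diploid samples, 2 biallelic SNPs.
HEADER = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
          "FORMAT", "a1", "a2", "b1", "b2", "c1", "c2"]
ROWS = [
    ["1", "100", ".", "A", "G", ".", "PASS", ".", "GT",
     "0/0", "0/0", "0/0", "0/1", "1/1", "1/1"],
    ["1", "200", ".", "C", "T", ".", "PASS", ".", "GT",
     "0/0", "0/1", "0/0", "0/0", "1/1", "1/1"],
]
VCF_TEXT = "\n".join("\t".join(r) for r in [HEADER] + ROWS)

# Hypothetical sample-to-population assignment.
POP_OF = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C", "c2": "C"}


def parse_vcf(text):
    """Return (sample names, per-site lists of allele lists).

    Assumes GT is the first FORMAT key; missing alleles ('.') are dropped.
    """
    samples, sites = [], []
    for line in text.splitlines():
        if not line.strip() or line.startswith("##"):
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]
            continue
        site = []
        for call in fields[9:]:
            gt = call.split(":")[0].replace("|", "/")
            site.append([int(a) for a in gt.split("/") if a != "."])
        sites.append(site)
    return samples, sites


def between_pop_diversity(samples, sites, pop_of):
    """Mean per-site probability that alleles from two populations differ."""
    pops = sorted(set(pop_of.values()))
    totals = {pair: 0.0 for pair in combinations(pops, 2)}
    used = 0
    for site in sites:
        freq = {}
        for pop in pops:
            alleles = [a for s, g in zip(samples, site)
                       if pop_of[s] == pop for a in g]
            if alleles:
                # Frequency of non-reference alleles in this population.
                freq[pop] = sum(1 for a in alleles if a > 0) / len(alleles)
        if len(freq) < len(pops):
            continue  # skip sites where a whole population is missing
        used += 1
        for x, y in totals:
            totals[(x, y)] += freq[x] * (1 - freq[y]) + freq[y] * (1 - freq[x])
    return pops, {pair: t / max(used, 1) for pair, t in totals.items()}


def upgma(pops, dist):
    """Build a UPGMA tree and return it as a Newick string."""
    d = {frozenset(pair): v for pair, v in dist.items()}
    # Each cluster maps its key to (newick text, number of leaves, node height).
    clusters = {p: (p, 1, 0.0) for p in pops}
    while len(clusters) > 1:
        a, b = min(combinations(sorted(clusters), 2),
                   key=lambda ab: d[frozenset(ab)])
        newick_a, size_a, height_a = clusters.pop(a)
        newick_b, size_b, height_b = clusters.pop(b)
        height = d[frozenset((a, b))] / 2
        key = f"({a},{b})"
        # Distance to the merged cluster is the size-weighted average.
        for c in clusters:
            d[frozenset((key, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[key] = (
            f"({newick_a}:{height - height_a:.4f},{newick_b}:{height - height_b:.4f})",
            size_a + size_b,
            height,
        )
    return next(iter(clusters.values()))[0] + ";"


samples, sites = parse_vcf(VCF_TEXT)
pops, dist = between_pop_diversity(samples, sites, POP_OF)
tree = upgma(pops, dist)
print(dist)
print(tree)
```

On this toy input, populations A and B are joined first because they differ least, and C attaches to that pair. The resulting Newick string can be pasted into any tree viewer that accepts the format.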
Author Comment
The source location of the software has changed. The usage example on page 6 has been modified. One paragraph (the last on page 7) has been inserted to justify not using complex evolutionary models.