MLSTar: automatic multilocus and core genome sequence typing in R
- Subject Areas
- Bioinformatics, Ecology, Genomics, Microbiology, Population Biology
- Keywords
- MLST, cgMLST, Bacterial genomics, R package, Population genomics
- Copyright
- © 2018 Ferrés et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2018. MLSTar: automatic multilocus and core genome sequence typing in R. PeerJ Preprints 6:e26630v2 https://doi.org/10.7287/peerj.preprints.26630v2
Abstract
Multilocus sequence typing (MLST) is a standard tool in population genetics and bacterial epidemiology that assesses the genetic variation present in a small number of housekeeping genes (typically seven) distributed along the genome. This methodology assigns arbitrary integer identifiers to the allelic variants found at these loci, which allows bacterial isolates to be compared efficiently using allele-based methods. The increasing availability of whole-genome sequences for hundreds to thousands of strains of the same bacterial species has motivated efforts to increase the resolution of traditional MLST schemes by using larger gene sets or even the core genome (cgMLST). The PubMLST database is the most comprehensive resource of described MLST and cgMLST schemes available for a wide variety of species. Here we present MLSTar, the first R package that allows users to i) connect to the PubMLST database to select a target scheme, ii) screen a desired set of genomes to assign alleles and sequence types, and iii) interact with other widely used R packages to analyze the data and produce graphical representations of it. We applied MLSTar to a set of 400 Campylobacter coli genomes, obtaining high accuracy and performance comparable to that of previously published command-line tools. MLSTar can be freely downloaded from http://github.com/iferres/MLSTar.
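The allele-based typing idea described in the abstract can be illustrated with a minimal Python sketch. It is not MLSTar's implementation and does not use its API. The function names (`type_isolates`, `allele_distance`), the locus names and the toy sequences are all hypothetical. Note one simplifying assumption: identifiers are assigned on the fly in order of first appearance, whereas in practice they would come from a curated scheme such as those hosted by PubMLST.

```python
# Illustrative sketch of allele-based typing; NOT MLSTar's implementation.
# All names and data here are hypothetical.
from typing import Dict, List, Tuple

Profile = Tuple[int, ...]


def type_isolates(
    isolates: Dict[str, Dict[str, str]],  # isolate name -> locus -> sequence
    loci: List[str],
) -> Tuple[Dict[str, Profile], Dict[str, int]]:
    """Return the allelic profile and sequence type (ST) of each isolate.

    Assumption: integer identifiers are assigned in order of first
    appearance, rather than looked up in a curated scheme.
    """
    allele_ids: Dict[str, Dict[str, int]] = {locus: {} for locus in loci}
    st_ids: Dict[Profile, int] = {}
    profiles: Dict[str, Profile] = {}
    sts: Dict[str, int] = {}

    for name, genes in isolates.items():
        alleles = []
        for locus in loci:
            known = allele_ids[locus]
            seq = genes[locus]
            # A sequence not seen before at this locus gets the next integer.
            if seq not in known:
                known[seq] = len(known) + 1
            alleles.append(known[seq])
        profile = tuple(alleles)
        # Each distinct combination of alleles defines one sequence type.
        if profile not in st_ids:
            st_ids[profile] = len(st_ids) + 1
        profiles[name] = profile
        sts[name] = st_ids[profile]
    return profiles, sts


def allele_distance(a: Profile, b: Profile) -> int:
    """Count the loci at which two allelic profiles differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    toy = {
        "iso1": {"locusA": "ATGA", "locusB": "GGCT"},
        "iso2": {"locusA": "ATGA", "locusB": "GGCA"},
        "iso3": {"locusA": "ATGA", "locusB": "GGCT"},
    }
    profiles, sts = type_isolates(toy, ["locusA", "locusB"])
    print(profiles, sts)
    print(allele_distance(profiles["iso1"], profiles["iso2"]))
```

Comparing integer profiles rather than raw sequences is what makes allele-based methods cheap: two isolates differ at a locus or they do not, however many nucleotides separate the alleles.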
Author Comment
Fixed GitHub repository link. Updated manuscript sections according to PeerJ recommendations.