epcALGO: a home-grown algorithm for entire proteome comparison
- Subject Areas
- Bioinformatics
- Keywords
- proteome comparison, sequence, BLAST, algorithm
- Copyright
- © 2014 Kumar et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
- Cite this article
- 2014. epcALGO: a home-grown algorithm for entire proteome comparison. PeerJ PrePrints 2:e210v1 https://doi.org/10.7287/peerj.preprints.210v1
Abstract
Advances in bioinformatics and genome sequencing projects have made the complete genome and proteome sequences of many organisms available in the public domain. These large data sets are repeatedly compared and explored to find identical and similar sequence patterns. In this paper we use NCBI’s Standalone BLAST program to compare the entire proteomes of any two strains or species, and we describe a simple algorithm, epcALGO, for doing so. The algorithm systematically identifies conserved proteins that are missing from a given proteome, as well as proteins unique to a particular species. It is simple and quick to apply, and it reveals the variation between any two closely related species or strains by identifying the identical and non-identical proteins in their proteomes and locating the mutations in the protein sequences. We applied the algorithm to the proteomes of two strains of Mycobacterium tuberculosis, H37Rv and H37Ra, and describe the methodology used to determine their proteomic variation.
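The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that one proteome has already been searched against the other with BLAST and that the hits are available in the standard 12-column tabular layout (query id, subject id, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, e-value, bit score). The function name classify_proteins, the min_pident cutoff of 90%, and the three category labels are hypothetical choices made for illustration only.

```python
def classify_proteins(query_lengths, subject_lengths, blast_lines, min_pident=90.0):
    """Sort query proteins into identical / variant / unique.

    query_lengths, subject_lengths: dicts mapping protein id -> sequence length.
    blast_lines: iterable of tab-separated BLAST hits (12 standard columns).
    min_pident: illustrative cutoff (an assumption, not from the paper);
                hits below it are not treated as a counterpart.
    """
    # Keep only the highest-scoring hit for each query protein.
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.split("\t")
        qid, sid = f[0], f[1]
        pident, length = float(f[2]), int(f[3])
        mismatch, gapopen = int(f[4]), int(f[5])
        bitscore = float(f[11])
        if pident < min_pident:
            continue
        if qid not in best or bitscore > best[qid][-1]:
            best[qid] = (sid, length, mismatch, gapopen, bitscore)

    result = {"identical": [], "variant": [], "unique": []}
    for qid, qlen in query_lengths.items():
        hit = best.get(qid)
        if hit is None:
            # No acceptable counterpart in the other proteome.
            result["unique"].append(qid)
            continue
        sid, length, mismatch, gapopen, _ = hit
        # Identical means: the alignment covers both proteins end to end
        # and contains no mismatches and no gaps.
        full_length = length == qlen == subject_lengths.get(sid)
        if full_length and mismatch == 0 and gapopen == 0:
            result["identical"].append(qid)
        else:
            # A counterpart exists but differs (substitution, indel, truncation).
            result["variant"].append(qid)
    return result
```

In this sketch a protein with no acceptable hit is reported as unique to its proteome, and a protein whose best hit differs in any residue or in length is reported as a variant. Running the same function with the roles of the two proteomes swapped would give the proteins unique to the other strain.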