Greedy motif-based approach to parsing large and diverge coiled-coil proteins into domains
- Published
- Accepted
- Subject Areas
- Bioinformatics, Infectious Diseases, Data Science
- Keywords
- Host-pathogen interaction, Protein domains, Conserved motifs, S. pyogenes
- Copyright
- © 2017 Khakzad et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2017. Greedy motif-based approach to parsing large and diverge coiled-coil proteins into domains. PeerJ Preprints 5:e3118v2 https://doi.org/10.7287/peerj.preprints.3118v2
Abstract
Bacterial surfaces are complex, built from membranes, peptidoglycans and, importantly, proteins. These proteins play crucial roles as key regulators of how the bacterium interacts with its environment. A full catalog of the motifs in coiled-coil proteins and their relative degree of conservation is a prerequisite for targeting the protein-protein interactions that bacterial surface proteins make with host proteins. Here, we present a greedy approach that iteratively identifies conserved motifs in large sequence collections, finds all occurrences of these motifs and masks them. The remaining unmasked sequence is subjected to further rounds of motif detection until no more significant motifs can be found or all protein segments have been assigned to a motif. We present the results for the S. pyogenes M protein. Given the speed and flexibility of our approach, we believe it will be useful for analyzing surface proteins of pathogens, as these proteins are under high selective pressure and therefore cannot be analyzed using more traditional approaches such as multiple-sequence alignments. Preliminary data indicate that many of the newly discovered motifs are not always present together with adjacent motifs, suggesting that they might have different and independent functions.
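The detect, mask and repeat loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not name the motif finder or the significance criterion, so the sketch assumes a motif is simply the exact k-mer found in the most sequences, and that "significant" means present in at least min_support sequences. The function names, the parameters k and min_support, and the use of "X" as the mask character are all hypothetical choices made for illustration.

```python
from collections import Counter

MASK = "X"  # assumed mask character; input sequences must not contain it


def unmasked_kmers(seq, k):
    """Yield every k-mer of seq that contains no masked position."""
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if MASK not in kmer:
            yield kmer


def most_conserved_kmer(seqs, k):
    """Return (kmer, number of sequences containing it), or None."""
    counts = Counter()
    for seq in seqs:
        # Count each k-mer once per sequence (set), not once per occurrence.
        counts.update(set(unmasked_kmers(seq, k)))
    if not counts:
        return None
    # Highest support first; ties broken alphabetically for determinism.
    return min(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def greedy_motifs(seqs, k=4, min_support=2):
    """Greedily pick the best-supported k-mer, mask it everywhere, repeat.

    Stops when no unmasked k-mer reaches min_support (a stand-in for
    "no more significant motifs") or when nothing unmasked is left.
    """
    seqs = list(seqs)
    motifs = []
    while True:
        best = most_conserved_kmer(seqs, k)
        if best is None or best[1] < min_support:
            break
        kmer, support = best
        motifs.append((kmer, support))
        # Mask all occurrences so later rounds only see unassigned residues.
        seqs = [s.replace(kmer, MASK * k) for s in seqs]
    return motifs, seqs


if __name__ == "__main__":
    toy = ["AAKLEQELQK", "GGKLEQTTTT", "KLEQELQKPP"]  # made-up sequences
    found, masked = greedy_motifs(toy, k=4, min_support=2)
    print(found)
    print(masked)
```

On the made-up toy input, the first round selects KLEQ (present in all three sequences) and the second selects ELQK (present in two); the loop then stops because the only remaining unmasked 4-mer occurs in a single sequence. A realistic pipeline would replace the exact k-mer count with a statistical motif finder that tolerates substitutions, but the greedy control flow would stay the same.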
Author Comment
In this version, the conclusion has been changed and the acknowledgements have been modified.
This is an abstract which has been accepted for the NETTAB 2017 Workshop.