Image based effective feature generation for protein structural class and ligand binding prediction
- Published
- Accepted
- Subject Areas
- Bioinformatics, Computational Biology, Data Mining and Machine Learning
- Keywords
- Protein Structural Class Prediction, Protein Ligand Binding, Image Based Features, Similarity based Supervised Learning Algorithms
- Copyright
- © 2019 Sadique et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2019. Image based effective feature generation for protein structural class and ligand binding prediction. PeerJ Preprints 7:e27743v1 https://doi.org/10.7287/peerj.preprints.27743v1
Abstract
Proteins are the building blocks of the cells of humans and all other living organisms, and they perform most of the work in a living organism. Proteins are macromolecules: polymers of amino acid monomers. The tertiary structure of a protein is its three-dimensional shape, and it governs the protein’s functions, classification and binding sites. If two protein structures are alike, the two proteins are likely to be of the same kind, implying a similar structural class and similar ligand binding properties. In this paper, we use protein structure to generate effective features for structural-similarity applications, namely detecting structural class and ligand binding. First, we analyze the effectiveness of a group of image-based features for predicting the structural class of a protein. These features are derived from the image generated from the distance matrix of the tertiary structure of a given protein. They include the local binary pattern (LBP) histogram, the Gabor-filtered LBP histogram, the separate row multiplication matrix with uniform LBP histogram, the neighbour block subtraction matrix with uniform LBP histogram, and atom bond. The experiments were conducted on a standard benchmark dataset. We demonstrate the effectiveness of these features across a wide variety of supervised machine learning algorithms. The experiments suggest that Random Forest is the best-performing classifier on the selected dataset with this set of features. We believe the excellent accuracy of Hybrid LBP will motivate researchers and practitioners to use it to identify protein structural class. To facilitate this, a classification model using Hybrid LBP is available at http://brl.uiu.ac.bd/PL/.
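As a rough illustration of the first of these features, the sketch below turns a set of 3-D coordinates into a distance-matrix image and computes a basic LBP histogram from it. It is a minimal sketch, not the authors' implementation: the function names, the choice of one point per residue, the linear rescaling to 256 gray levels, the plain 3x3 (non-uniform) LBP variant and the toy coordinates are all assumptions made for illustration.

```python
import math

def distance_matrix(coords):
    """Pairwise Euclidean distances between 3-D points (assumed: one point per residue)."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def to_grayscale(matrix):
    """Linearly rescale a matrix to integer gray levels in 0..255 (an assumed normalisation)."""
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant matrix
    return [[round(255 * (v - lo) / span) for v in row] for row in matrix]

# Offsets of the 8 neighbours, clockwise from the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_histogram(image):
    """Normalised 256-bin histogram of basic 3x3 LBP codes over interior pixels."""
    hist = [0] * 256
    n_rows, n_cols = len(image), len(image[0])
    for r in range(1, n_rows - 1):
        for c in range(1, n_cols - 1):
            centre = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                # A neighbour at least as bright as the centre sets its bit.
                if image[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist) or 1
    return [count / total for count in hist]

# Toy coordinates, invented for illustration only (not taken from any real protein).
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0),
          (0.0, 3.8, 0.0), (0.0, 3.8, 3.8)]
image = to_grayscale(distance_matrix(coords))
features = lbp_histogram(image)
```

The resulting `features` vector has a fixed length of 256 regardless of protein size, which is what allows proteins of different lengths to be fed to the same classifier.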
Protein-ligand binding is responsible for regulating the tasks of biological receptors, which helps in curing diseases, among other things. Predicting binding between a protein and a ligand is therefore important for understanding a protein’s activity and for accelerating docking computations in virtual screening-based drug design. Protein-ligand binding prediction requires the three-dimensional tertiary structure of the target protein, which is searched for ligand binding. In this paper, we propose a supervised learning algorithm for predicting protein-ligand binding: a similarity-based clustering approach that uses the same set of features. Our algorithm performs better than most popular and widely used machine learning algorithms.
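The abstract does not describe the similarity-based algorithm in detail, so the following is only a generic sketch of what a similarity-based supervised classifier over feature vectors can look like. It uses a nearest-centroid rule with cosine similarity, which is a stand-in technique and not the authors' method; the function names, labels and feature values are hypothetical.

```python
import math
from collections import defaultdict

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def fit_centroids(vectors, labels):
    """Group training vectors by label and average each group into one centroid."""
    groups = defaultdict(list)
    for vec, label in zip(vectors, labels):
        groups[label].append(vec)
    return {label: [sum(col) / len(members) for col in zip(*members)]
            for label, members in groups.items()}

def predict(centroids, vector):
    """Return the label whose centroid is most similar to the query vector."""
    return max(centroids, key=lambda label: cosine_similarity(centroids[label], vector))

# Made-up 3-dimensional feature vectors and labels, for illustration only.
train = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0], [0.1, 0.1, 0.8], [0.0, 0.2, 0.8]]
labels = ["binds", "binds", "does_not_bind", "does_not_bind"]
centroids = fit_centroids(train, labels)
prediction = predict(centroids, [0.7, 0.3, 0.0])
```

In practice the vectors would be image-based feature histograms rather than three hand-written numbers, and the similarity measure and grouping rule would follow the full paper.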
Author Comment
This is a submission to PeerJ Computer Science for review.