Prediction of deleterious nonsynonymous SNPs by integrating multiple classifiers – An application to neurodegenerative diseases

Department of Biochemistry, Faculty of Science, King Abdulaziz University, Jeddah, Kingdom of Saudi Arabia
DOI
10.7287/peerj.preprints.994v1
Subject Areas
Bioinformatics, Computational Biology
Keywords
Neurodegenerative Disease, Nonsynonymous SNP, logistic regression, Mendelian disease, UniProt, missense mutation, ROC AUC, deleterious SNP
Copyright
© 2015 Mesbah-Uddin
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Cite this article
Mesbah-Uddin M. 2015. Prediction of deleterious nonsynonymous SNPs by integrating multiple classifiers – An application to neurodegenerative diseases. PeerJ PrePrints 3:e994v1

Abstract

In this study, we propose a logistic regression model to classify deleterious missense mutations from a list of nonsynonymous SNPs (nsSNPs). The model combines multiple features (rank scores of 18 classifiers from dbNSFP v2.5, e.g. SIFT, PolyPhen2, MutationTaster, MutationAssessor, FATHMM, VEST 3.0, RadialSVM, LR, CADD, etc.) for 44,702 UniProt human polymorphisms and disease mutations (19,033 disease and 25,669 neutral). The model is trained and validated on 80% of the data (15,226 disease + 20,535 neutral nsSNPs), tested on the remaining 20% (3,807 disease + 5,134 neutral nsSNPs), and finally applied to a neurodegenerative disease-specific dataset (NeuroTest) from UniProt. The ROC AUC of the model is 0.97 on the test set and 0.92 on the NeuroTest dataset, with accuracies of 0.91 and 0.86, respectively. Our model outperforms SIFT, PolyPhen2, MutationTaster, MutationAssessor, and the two ensemble classifiers of dbNSFP v2.5 on both test sets.
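
The workflow summarised above (combine per-classifier rank scores in a logistic regression, hold out 20% of the data, report ROC AUC and accuracy) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the author's implementation: the feature values are synthetic numbers in [0, 1] standing in for dbNSFP rank scores, only four features are used instead of 18, the model is fitted by plain batch gradient descent, and the helper names (`fit_logistic`, `roc_auc`, `stratified_split`) are hypothetical. The per-class counts quoted above are consistent with an 80/20 split within each class, so the sketch splits each class separately; whether the original study sampled this way is not stated.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def predict_proba(X, w, b):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def stratified_split(X, y, test_frac, rng):
    """Split each class separately so both parts keep the class ratio."""
    train_idx, test_idx = [], []
    for label in (0, 1):
        idx = [i for i, t in enumerate(y) if t == label]
        rng.shuffle(idx)
        cut = int(round(len(idx) * test_frac))
        test_idx += idx[:cut]
        train_idx += idx[cut:]

    def pick(ids):
        return [X[i] for i in ids], [y[i] for i in ids]

    return pick(train_idx), pick(test_idx)


# Synthetic stand-in for rank scores (NOT dbNSFP data): disease variants
# (label 1) are drawn with a higher mean score than neutral ones (label 0).
rng = random.Random(0)
N_FEATURES = 4


def synth(label):
    centre = 0.65 if label else 0.35
    return [min(1.0, max(0.0, rng.gauss(centre, 0.2))) for _ in range(N_FEATURES)]


y_all = [1] * 190 + [0] * 257  # roughly the 19,033 : 25,669 class ratio
X_all = [synth(t) for t in y_all]

(X_tr, y_tr), (X_te, y_te) = stratified_split(X_all, y_all, 0.2, rng)
w, b = fit_logistic(X_tr, y_tr)
p_te = predict_proba(X_te, w, b)
auc = roc_auc(y_te, p_te)
acc = sum((p >= 0.5) == bool(t) for p, t in zip(p_te, y_te)) / len(y_te)
print(f"test AUC={auc:.2f}  accuracy={acc:.2f}")
```

The numbers printed by this sketch reflect only the synthetic data and say nothing about the reported results. With real rank scores, an established library implementation of logistic regression and ROC AUC would normally replace the hand-written helpers.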

Author Comment

This is a submission to PeerJ PrePrints for review.