Microbe-Drug Association Prediction with Bernoulli Random Forest based on Convolutional Neural Network
Abstract
Background. Published research has shown that antibiotic drugs play a key role in treating diseases caused by microbes. However, with the widespread use of these drugs, more and more microbes have become drug-resistant. It is therefore necessary to predict microbe-related drugs to help researchers develop new drugs. Because traditional wet-lab experiments require considerable time and money, computational models offer a novel way to discover potential microbe-drug associations. Methods. In this study, we propose a new computational model, Convolutional Neural Network with Bernoulli Random Forest for Microbe-Drug Association prediction (CNNBRFMDA). CNNBRFMDA first constructs a feature vector for every microbe-drug pair from the known microbe-drug associations, microbe similarity, and drug similarity. A subset of these feature vectors is then randomly selected to form the training set. Next, a Convolutional Neural Network (CNN) reduces the dimensionality of the feature vectors for all microbe-drug pairs, including those in the training set. Finally, a Bernoulli Random Forest (BRF) is trained on the dimensionality-reduced training set to predict potential microbe-drug associations. The innovation lies in the integration of CNN and BRF, which uses a neural network for feature extraction to enhance BRF predictions. The feature vectors undergo nonlinear processing that yields low-dimensional representations of the microbe-drug pairs. This step not only improves computational efficiency but also refines the model's ability to capture intricate patterns and relationships within the data, thereby improving the precision and interpretability of predictions of drug response to various microbes. Using the Bernoulli distribution twice in BRF not only ensures algorithmic consistency but also yields superior performance. Results. 
To evaluate the performance of CNNBRFMDA, we carried out five-fold cross-validation on the MDAD and aBiofilm datasets. In five-fold cross-validation, the mean AUC and standard deviation of CNNBRFMDA were 0.9017 +/- 0.0032 on MDAD and 0.9146 +/- 0.0041 on aBiofilm. We also conducted two different types of case studies to further evaluate the reliability of CNNBRFMDA. In the first, 41 of the top 50 predicted microbes associated with Ciprofloxacin were verified in the published literature. In the second, 38 of the top 50 predicted microbes related to Moxifloxacin were validated in the published literature.
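For illustration only, the pair-level feature construction and the reporting of mean AUC +/- standard deviation over folds could be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify how the feature vector is assembled, so concatenating a microbe's similarity row with a drug's similarity row is an assumption, as are the helper names (pair_features, auc, k_fold_indices, summarize) and the use of the sample standard deviation. The CNN and BRF stages are omitted.

```python
import random
import statistics


def pair_features(microbe_sim, drug_sim, m, d):
    """Feature vector for pair (m, d).

    Assumption: microbe m's similarity row concatenated with
    drug d's similarity row; the source does not give the exact rule.
    """
    return list(microbe_sim[m]) + list(drug_sim[d])


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def summarize(fold_aucs):
    """Mean and sample standard deviation of per-fold AUC values."""
    return statistics.mean(fold_aucs), statistics.stdev(fold_aucs)
```

In use, each fold from k_fold_indices would be held out in turn, a model trained on the remaining pairs would score the held-out pairs, auc would be computed per fold, and summarize would produce the "mean +/- standard deviation" figure of the kind reported above.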