Automatic discovery of transferable patterns in protein-ligand interaction networks

Adrem Data Lab, University of Antwerp, Antwerp, Belgium
Biomedical Informatics Network Antwerp (biomina), University of Antwerp, Antwerp, Belgium
Laboratory of Medicinal Chemistry, University of Antwerp, Wilrijk, Belgium
Computational Biology and Drug Design (CBDD), CRCM (INSERM U1068), F-13009 Marseille, France; Institut Paoli-Calmettes, F-13009 Marseille, France; AMU, F-13284 Marseille, France; CNRS (UMR7258), F-13009 Marseille, France
Laboratory of Protein Chemistry, Proteomics and Epigenetic Signaling (PPES), University of Antwerp, Wilrijk, Belgium
DOI
10.7287/peerj.preprints.27002v1
Subject Areas
Bioinformatics, Drugs and Devices, Computational Science, Data Mining and Machine Learning, Data Science
Keywords
Drug repurposing, Data mining, Drug screening, Protein-ligand interactions, Drug-target interactions, Frequent itemset mining, Frequent pattern mining, Drug repositioning
Copyright
© 2018 Mrzic et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Mrzic A, Van Rompaey D, Naulaerts S, De Winter H, Vanden Berghe W, Meysman P, Laukens K. 2018. Automatic discovery of transferable patterns in protein-ligand interaction networks. PeerJ Preprints 6:e27002v1

Abstract

In recent years, the pharmaceutical industry has been confronted with rising R&D costs paired with decreasing productivity. Attrition rates for new molecules are high, and a substantial number of molecules fail at an advanced stage of development. Repositioning previously approved drugs for new indications can mitigate these issues by reducing both the risk and the cost of development. Computational methods have been developed to predict drug-target interactions, but it remains difficult to branch out into new areas of application where information is scarce.

Here, we present a proof of concept for discovering patterns in protein-ligand data using frequent itemset mining. Two key advantages of our method are that its patterns transfer to different application domains and that its recommendations are easy to interpret. Starting from a set of known protein-ligand relationships, we identify patterns of molecular substructures and protein domains that underlie these interactions. We show that these same patterns also underpin metabolic pathways in humans. We further demonstrate how association rules mined from human protein-ligand interaction patterns can be used to predict which antibiotics are susceptible to bacterial resistance mechanisms.
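As a minimal illustration of the general idea, the sketch below treats each known protein-ligand interaction as a transaction of ligand substructure items and protein domain items, counts itemsets that recur, and derives association rules from them. It is not the authors' implementation (their scripts are in the Supplemental Information). The item labels, the toy interactions, the function names and the thresholds are all hypothetical, and the itemsets are found by brute-force enumeration rather than by an optimized miner such as Apriori or FP-growth.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data (not from the paper): one transaction per known
# protein-ligand interaction, mixing ligand substructure items ("sub:")
# with protein domain items ("dom:").
interactions = [
    {"sub:beta-lactam", "sub:carboxylate", "dom:transpeptidase"},
    {"sub:beta-lactam", "dom:transpeptidase"},
    {"sub:beta-lactam", "sub:carboxylate", "dom:beta-lactamase"},
    {"sub:purine", "dom:kinase"},
    {"sub:purine", "sub:phosphate", "dom:kinase"},
]


def frequent_itemsets(transactions, min_count, max_size=3):
    """Return {itemset: count} for itemsets seen in at least min_count transactions.

    Brute-force enumeration of all item combinations up to max_size;
    adequate for a toy example, not for real data volumes.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(transaction)
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_count}


def association_rules(itemsets, min_confidence):
    """Derive rules (antecedent, consequent, confidence) from frequent itemsets.

    confidence = count(antecedent and consequent together) / count(antecedent).
    Every subset of a frequent itemset is itself frequent, so the
    antecedent lookup always succeeds.
    """
    rules = []
    for itemset, count in itemsets.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                confidence = count / itemsets[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


itemsets = frequent_itemsets(interactions, min_count=2)
rules = association_rules(itemsets, min_confidence=1.0)
for antecedent, consequent, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent), f"(confidence {confidence:.2f})")
```

Because the items are generic substructure and domain labels rather than specific drugs or proteins, a rule mined from one collection of interactions can in principle be checked against transactions built from another, which is the sense in which such patterns are transferable.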

Author Comment

This is a submission to PeerJ for review.

Supplemental Information

Scripts used for data processing and itemset mining

DOI: 10.7287/peerj.preprints.27002v1/supp-1