An extensive survey of molecular docking tools and their applications using text mining and deep curation strategies.
- Subject Areas
- Bioinformatics, Biotechnology, Computational Biology, Data Mining and Machine Learning
- Keywords
- tools, docking, software, side effect prediction, adverse drug reactions prediction, drug repositioning, drug repurposing, drug indication prediction, database, wet lab validations, benchmarking, collaborative writing
- Copyright
- © 2019 Rawal et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2019. An extensive survey of molecular docking tools and their applications using text mining and deep curation strategies. PeerJ Preprints 7:e27538v1 https://doi.org/10.7287/peerj.preprints.27538v1
Abstract
In-silico molecular docking has evolved significantly in recent years and has become a crucial component of the drug discovery process, which includes virtual screening, lead optimization, and side-effect prediction. To date, over 43,000 abstracts and papers have been published on docking, highlighting the importance of this computational approach in drug development. Given the large number of genomic and proteomic consortia active in the public domain, docking can exploit their data on a correspondingly large scale to address a variety of research questions. Over 160 molecular docking tools based on different algorithms have been made available to users across the world, and 109 scoring functions have been reported in the literature to date. Despite these advances, several bottlenecks remain at the implementation stage. They include choosing the right docking algorithm, selecting a binding site in the target protein, assessing the performance of a given docking tool, integrating molecular dynamics information, accounting for ligand-induced conformational changes, handling solvent molecules, choosing a docking pose, and choosing databases. Furthermore, docking results have not always been validated by experimental studies. This review highlights the basic features and key concepts of docking, with particular emphasis on applications such as drug repositioning and prediction of side effects. It also summarizes the use of docking in conjunction with wet lab experiments and epitope prediction. We attempt to address the above-mentioned challenges systematically using expert curation and text mining strategies. Our work demonstrates the use of machine-assisted literature mining to process and analyze large amounts of available information in a short time frame.
With this work, we also propose to build a platform that combines human expertise (deep curation) and machine learning in a collaborative way and thus helps to solve ambitious problems (e.g. building fast, efficient docking systems by combining the best tools, or performing large-scale docking at the human proteome level).
Author Comment
We have built a new hybrid system that combines text mining, machine learning and deep curation techniques to process large amounts of text data in a short time frame. Authors can use this system to write comprehensive review articles in an unbiased and accurate manner, or to carry out the literature review for the introduction and discussion sections of a research article in minimal time. The approach could also be useful to the legal community for mining large datasets to build evidence for cases and to find case citations and arguments. Other potential applications include analysing massive social media datasets, such as tweets and comments, for positive, negative and neutral connotations. The system also integrates automated collection and analysis of reviews to rank a given research article. In this work, we manually screened over 900 research articles on molecular docking. In addition, over 43,000 abstracts were analysed using text mining and machine learning scripts. From this text data, over 160 docking tools and 109 scoring functions were retrieved for subsequent analysis. Next, we compiled information on the comparative performance of docking tools from the literature. We also extracted information on wet lab experiments that had used molecular docking as a tool. Finally, the work was extended to the applications of molecular docking in side effect prediction and drug repurposing.
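As a rough illustration of the kind of abstract-level text mining described above, the following minimal Python sketch counts how many abstracts mention each docking tool from a small lexicon. It is not the authors' pipeline: the function name `count_tool_mentions`, the lexicon `TOOL_LEXICON` (a handful of well-known docking programs chosen only for illustration) and the case-sensitive whole-word matching rule are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical lexicon: a few well-known docking programs, for illustration only.
TOOL_LEXICON = ["AutoDock Vina", "AutoDock", "GOLD", "Glide", "DOCK"]


def count_tool_mentions(abstracts, lexicon=TOOL_LEXICON):
    """Return a Counter mapping tool name -> number of abstracts mentioning it."""
    # Put longer names first so "AutoDock Vina" is not counted as "AutoDock".
    ordered = sorted(lexicon, key=len, reverse=True)
    # Case-sensitive whole-word match, so "gold" or "docking" do not count
    # as mentions of GOLD or DOCK.
    pattern = re.compile(r"\b(" + "|".join(re.escape(name) for name in ordered) + r")\b")
    counts = Counter()
    for text in abstracts:
        # set() ensures each abstract contributes at most once per tool.
        counts.update(set(pattern.findall(text)))
    return counts
```

Calling `count_tool_mentions(list_of_abstract_strings)` returns per-tool document counts that could then be ranked or passed to manual curation. A real pipeline would likely also need synonym handling and disambiguation, which this sketch leaves out.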