Finding pattern in biochemical reaction networks - a sub-graph mining approach

Scientific Databases and Visualization, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
Systems Biology and Bioinformatics, University of Rostock, Rostock, Germany
DOI
10.7287/peerj.preprints.1479v1
Subject Areas
Bioinformatics, Computational Biology, Data Mining and Machine Learning, Data Science
Keywords
Subgraph Mining, Systems Biology, Biochemical Reaction Networks, Pattern Detection, Graph Database, Knowledge Discovery
Copyright
© 2015 Henkel et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Cite this article
Henkel R, Lambusch F, Waltemath D. 2015. Finding pattern in biochemical reaction networks - a sub-graph mining approach. PeerJ PrePrints 3:e1479v1

Abstract

Biological questions today are often answered with the help of simulation models. Many of these models encode biological processes as biochemical reaction networks. The increasing number of published models and the growing size of the encoded reaction networks demand methods to analyse models. Specifically, researchers need to identify recurring and biologically relevant patterns. However, pattern recognition in large networks is a hard problem, and so far only partial solutions exist for very specific biological networks. In addition, while such patterns have already been postulated, identifying them manually is barely feasible given a large set of complex models. This paper examines automatic methods to find recurring patterns and structural similarities in models represented as bipartite graphs. An approach is presented to find the most frequent structures within the models. Appropriate patterns were found, which occur in a major part of the 575 input models. The occurrences of the resulting structures can provide insight into the encoding of certain biological processes, help evaluate the postulated structures, and serve as a reasonable similarity measure for grouping models that share many common structures.
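
As a rough illustration of the bipartite representation and of counting how many models share a structure, the following minimal Python sketch may help. It is not the approach presented in the paper: the toy models, the function names (`to_bipartite`, `reaction_shapes`, `pattern_support`) and the very coarse notion of a "pattern" (the number of reactants and products of a single reaction) are all assumptions made for illustration. A real sub-graph mining method would consider larger connected structures.

```python
from collections import Counter

def to_bipartite(reactions):
    """Build directed bipartite edges: species -> reaction -> species.

    `reactions` maps a reaction id to (reactants, products); this input
    format is a hypothetical stand-in for a parsed model.
    """
    edges = set()
    for rid, (reactants, products) in reactions.items():
        for s in reactants:
            edges.add((("species", s), ("reaction", rid)))
        for s in products:
            edges.add((("reaction", rid), ("species", s)))
    return edges

def reaction_shapes(edges):
    """Return the set of (n_reactants, n_products) shapes in one model."""
    n_in, n_out = Counter(), Counter()
    for src, dst in edges:
        if dst[0] == "reaction":   # species -> reaction: a reactant edge
            n_in[dst[1]] += 1
        else:                      # reaction -> species: a product edge
            n_out[src[1]] += 1
    return {(n_in[r], n_out[r]) for r in set(n_in) | set(n_out)}

def pattern_support(models):
    """Count in how many models each shape occurs at least once."""
    support = Counter()
    for reactions in models:
        support.update(reaction_shapes(to_bipartite(reactions)))
    return support

# Three toy models (assumed data, not taken from the paper).
models = [
    {"r1": (["A", "B"], ["C"]), "r2": (["C"], ["D"])},
    {"r1": (["X"], ["Y"])},
    {"r1": (["E", "S"], ["ES"]), "r2": (["ES"], ["E", "P"])},
]
support = pattern_support(models)
for shape, count in support.most_common():
    print(shape, count)
```

Counting each shape once per model, rather than once per occurrence, gives the share of models in which a structure appears. The sets of shapes per model could also be compared pairwise as a simple basis for grouping similar models.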

Author Comment

This is the initial version of this manuscript. Title and short title are subject to change.