This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Computational models in biology encode molecular and cell biological processes. These models can often be represented as biochemical reaction networks. When studying such networks, researchers are mostly interested in systems that share similar reactions and mechanisms. Typical goals of an investigation include understanding the parts of a model, identifying recurring patterns, and recognising biologically relevant motifs. The large number and size of available models, however, require automated methods to support researchers in achieving these goals. For the problem of finding patterns in large networks in particular, only partial solutions exist. We propose a workflow that identifies frequent structural patterns in biochemical reaction networks encoded in the Systems Biology Markup Language. The workflow utilises a subgraph mining algorithm to detect frequent network patterns. Once patterns are identified, the textual pattern description can automatically be converted into a graphical representation. Furthermore, information about the distribution of patterns among the selected set of models can be retrieved. The workflow was validated with 575 models from the curated branch of BioModels. In this paper, we highlight interesting and frequent structural patterns. Further, we provide exemplary patterns that incorporate terms from the Systems Biology Ontology. Our workflow can be applied to a custom set of models or to models already existing in our graph database MaSyMoS. The occurrences of frequent patterns may give insight into the encoding of central biological processes, help evaluate postulated biological motifs, or serve as a similarity measure for models that share common structures.
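The idea of counting how often a structural pattern occurs across a set of models can be illustrated with a minimal sketch. It is not the subgraph mining algorithm used in the workflow: it only considers single-reaction patterns (the labels of a reaction's reactants and products) and counts the number of models in which each one appears. The toy models, the labels, and the function names `reaction_patterns`, `frequent_patterns` and `distribution` are assumptions made for illustration.

```python
from collections import Counter

# Toy data, invented for illustration: each model maps a reaction id to
# (reactant labels, product labels). Real input would come from SBML files.
MODELS = {
    "model_a": {
        "r1": (["simple chemical"], ["simple chemical"]),
        "r2": (["protein", "simple chemical"], ["complex"]),
    },
    "model_b": {
        "r1": (["simple chemical", "protein"], ["complex"]),
    },
    "model_c": {
        "r1": (["simple chemical"], ["simple chemical"]),
        "r2": (["protein"], ["protein"]),
    },
}


def reaction_patterns(model):
    """Return the set of (reactant labels, product labels) patterns in one model."""
    return {
        (tuple(sorted(reactants)), tuple(sorted(products)))
        for reactants, products in model.values()
    }


def frequent_patterns(models, min_support):
    """Keep patterns that occur in at least min_support models."""
    support = Counter()
    for model in models.values():
        # A set is passed in, so each pattern counts once per model.
        support.update(reaction_patterns(model))
    return {p: n for p, n in support.items() if n >= min_support}


def distribution(models, pattern):
    """List the names of the models that contain the given pattern."""
    return sorted(
        name for name, model in models.items()
        if pattern in reaction_patterns(model)
    )


if __name__ == "__main__":
    for pattern, count in frequent_patterns(MODELS, min_support=2).items():
        print(pattern, count, distribution(MODELS, pattern))
```

Sorting the labels makes the pattern independent of the order in which reactants are listed, which is a simple stand-in for the canonical graph encodings that full subgraph miners rely on. Patterns spanning several connected reactions would require a genuine subgraph mining algorithm.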
Published Version: Fabienne Lambusch, Dagmar Waltemath, Olaf Wolkenhauer, Kurt Sandkuhl, Christian Rosenke, Ron Henkel; Identifying frequent patterns in biochemical reaction networks: a workflow, Database, Volume 2018, 1 January 2018, bay051, https://doi.org/10.1093/database/bay051