Protein complex similarity based on Weisfeiler-Lehman labeling

Bianca K Stöcker; Till Schäfer; Petra Mutzel; Johannes Köster; Nils Kriege; Sven Rahmann

doi:10.7287/peerj.preprints.26612v1

NOT PEER-REVIEWED

"PeerJ Preprints" is a venue for early communication or feedback before peer review. Data may be preliminary.

Bianca K Stöcker^1,2, Till Schäfer^3, Petra Mutzel^3, Johannes Köster^1,2,4, Nils Kriege^3, Sven Rahmann^1,3

1 Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany

2 Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg-Essen, Essen, Germany

3 Department of Computer Science, TU Dortmund University, Dortmund, Germany

4 Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, USA

Published: 2018-03-03
Accepted: 2018-03-03

Subject Areas: Bioinformatics, Scientific Computing and Simulation
Keywords: Protein complexes, Weisfeiler-Lehman labeling, Similarity measure, Graph edit distance, Constrained protein interaction networks, Jaccard similarity

Copyright: © 2018 Stöcker et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.

Cite this article: Stöcker BK, Schäfer T, Mutzel P, Köster J, Kriege N, Rahmann S. 2018. Protein complex similarity based on Weisfeiler-Lehman labeling. PeerJ Preprints 6:e26612v1 https://doi.org/10.7287/peerj.preprints.26612v1

Abstract

Quantifying the similarity between two protein complexes is essential for numerous applications. Prominent examples are searching databases of known complexes with a given query complex, comparing the output of different protein complex prediction algorithms, and summarizing and clustering protein complexes, e.g., for visualization. While the corresponding problems have received much attention for single proteins and protein families, the question of how to model and compute similarity between protein complexes has not yet been studied systematically. Because protein complexes can be naturally modeled as graphs, general graph similarity measures may be used in principle, but these are often computationally hard to obtain and do not take typical properties of protein complexes into account. Here we propose a parametric family of similarity measures based on Weisfeiler-Lehman labeling. We evaluate it on simulated complexes of the extended human integrin adhesome network. Because the connectivity (graph topology) of real complexes is often unknown and hard to obtain experimentally, we use both known protein-protein interaction networks and known interdependencies (constraints) between interactions to simulate more realistic complexes than interaction networks alone would allow. We show empirically that the defined family of similarity measures is in good agreement with edit similarity, a similarity measure derived from graph edit distance, but can be computed much more efficiently. It can therefore be used in large-scale studies and simulations and can serve as a basis for further refinements of modeling protein complex similarity.
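
The abstract does not give the exact definition of the proposed measures, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a complex is an undirected graph whose nodes carry protein names as initial labels, that each Weisfeiler-Lehman round relabels a node with its own label plus the sorted labels of its neighbors, and that two complexes are compared by a Jaccard coefficient on the multisets of labels collected over all rounds. The function names, the `iterations` parameter, and the toy complexes are hypothetical.

```python
from collections import Counter


def wl_label_multiset(adjacency, labels, iterations):
    """Collect the labels of all nodes over WL rounds 0..iterations."""
    current = dict(labels)
    collected = Counter(current.values())
    for _ in range(iterations):
        refined = {}
        for node, neighbors in adjacency.items():
            # New label: own label plus the sorted multiset of neighbor labels.
            neighbor_labels = tuple(sorted(current[n] for n in neighbors))
            refined[node] = (current[node], neighbor_labels)
        current = refined
        collected.update(current.values())
    return collected


def multiset_jaccard(a, b):
    """Jaccard coefficient on multisets: sum of minima over sum of maxima."""
    union = sum((a | b).values())
    if union == 0:
        return 1.0  # assumption: two empty complexes count as identical
    return sum((a & b).values()) / union


def wl_similarity(complex_a, complex_b, iterations=1):
    """Similarity in [0, 1]; iterations=0 compares protein sets only."""
    adj_a, lab_a = complex_a
    adj_b, lab_b = complex_b
    return multiset_jaccard(
        wl_label_multiset(adj_a, lab_a, iterations),
        wl_label_multiset(adj_b, lab_b, iterations),
    )


# Hypothetical toy complexes: two paths that differ in one protein.
complex_a = (
    {0: {1}, 1: {0, 2}, 2: {1}},
    {0: "P1", 1: "P2", 2: "P3"},
)
complex_b = (
    {0: {1}, 1: {0, 2}, 2: {1}},
    {0: "P1", 1: "P2", 2: "P4"},
)

sim_sets = wl_similarity(complex_a, complex_b, iterations=0)
sim_topology = wl_similarity(complex_a, complex_b, iterations=1)
```

In this sketch, `iterations` plays the role of a parameter that controls how much graph topology enters the comparison: with `iterations=0` only the protein sets are compared, and each further round takes a larger neighborhood into account. The measure runs in time roughly linear in the graph size per round, in contrast to graph edit distance, which is computationally hard in general.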

Author Comment

This is a submission to PeerJ for review.
