Increasing the precision of orthology-based complex prediction through network alignment

Joint IRB-BSC Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), Barcelona, Spain
Department of Bioengineering and Therapeutic Sciences, University of California San Francisco (UCSF), San Francisco, CA, USA
Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
DOI
10.7287/peerj.preprints.280v2
Subject Areas
Bioinformatics, Computational Biology, Evolutionary Studies, Molecular Biology, Computational Science
Keywords
Protein complexes, Complex prediction, Network alignment, Macromolecular assemblies, Protein-protein interactions, Evolutionary conservation
Copyright
© 2014 Pache et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Cite this article
Pache R, Aloy P. 2014. Increasing the precision of orthology-based complex prediction through network alignment. PeerJ PrePrints 2:e280v2

Abstract

Macromolecular assemblies play an important role in almost all cellular processes. However, despite several large-scale studies, our current knowledge about protein complexes is still quite limited, which motivates the use of in silico predictions to gather information on complex composition in model organisms. Since protein-protein interactions impose certain constraints on the functional divergence of macromolecular assemblies during evolution, it is possible to predict complexes based on orthology data. Here, we show that incorporating interaction information through network alignment significantly increases the precision of orthology-based complex prediction. Moreover, we performed a large-scale in silico screen for protein complexes in human, yeast and fly, through the alignment of hundreds of known complexes to whole-organism interactomes. Systematic comparison of the resulting network alignments to all complexes currently known in those species revealed many conserved complexes, as well as several novel complex components. In addition to validating our predictions using orthogonal data, we were able to assign specific functional roles to the predicted complexes. In several cases, the incorporation of interaction data through network alignment allowed us to distinguish real complex components from other orthologous proteins. Our analyses indicate that current knowledge of yeast protein complexes exceeds that in other organisms and that predictions of fly complexes based on human data and on yeast data are complementary rather than redundant. Lastly, assessing the conservation of protein complexes of the human pathogen Mycoplasma pneumoniae, we discovered that its repertoire of complexes differs from that of eukaryotes, suggesting new points of therapeutic intervention, whereas targeting the pathogen’s Restriction enzyme complex might lead to adverse effects due to its similarity to ATP-dependent metalloproteases in the human host.
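The central idea of the abstract, that interaction data can separate real complex components from other orthologous proteins, can be illustrated with a toy sketch. This is not the authors' network alignment method: a simple connectivity filter stands in for the alignment step, and all protein identifiers, the ortholog map, the interaction set and the function names below are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, Set


def orthology_candidates(query_complex: Set[str],
                         orthologs: Dict[str, Set[str]]) -> Set[str]:
    """Map every query component to all of its orthologs in the target species."""
    candidates: Set[str] = set()
    for protein in query_complex:
        candidates |= orthologs.get(protein, set())
    return candidates


def interaction_filtered(candidates: Set[str],
                         interactions: Set[FrozenSet[str]]) -> Set[str]:
    """Keep only candidates that interact with at least one other candidate.

    Assumption: this crude filter is a stand-in for network alignment,
    not a reimplementation of it.
    """
    kept: Set[str] = set()
    for a in candidates:
        if any(a != b and frozenset((a, b)) in interactions for b in candidates):
            kept.add(a)
    return kept


def precision(predicted: Set[str], reference: Set[str]) -> float:
    """Fraction of predicted components that belong to the reference complex."""
    return len(predicted & reference) / len(predicted) if predicted else 0.0


# Hypothetical toy data: Q2 has two orthologs, but only T2 is a real component.
query_complex = {"Q1", "Q2", "Q3"}
orthologs = {"Q1": {"T1"}, "Q2": {"T2", "T2b"}, "Q3": {"T3"}}
interactions = {frozenset(("T1", "T2")), frozenset(("T2", "T3"))}
reference_complex = {"T1", "T2", "T3"}

by_orthology = orthology_candidates(query_complex, orthologs)
by_alignment = interaction_filtered(by_orthology, interactions)

print(precision(by_orthology, reference_complex))  # 0.75
print(precision(by_alignment, reference_complex))  # 1.0
```

In this toy case the paralog T2b is an ortholog of a query component but has no interaction with the other candidates, so the interaction-based filter removes it and precision rises from 0.75 to 1.0.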

Author Comment

This article is intended for publication in PeerJ.

Affiliations have been updated compared to the previous version.

Supplemental Information

Supplemental Table 1: Lists of yeast and human complexes predicted through complex-to-interactome alignment.

This table contains four sheets with the following contents:
HC yeast predictions - Prediction of yeast complexes by aligning human complexes to the yeast interactome (high-confidence predictions).
All yeast predictions - Prediction of yeast complexes by aligning human complexes to the yeast interactome (all predictions).
HC human predictions - Prediction of human complexes by aligning yeast complexes to the human interactome (high-confidence predictions).
All human predictions - Prediction of human complexes by aligning yeast complexes to the human interactome (all predictions).

#QCC/#TCC: number of query/target complex components. Only significant GO terms are shown. GO terms that also belong to the set of most abundant significant GO terms of the given query complex are highlighted in capitals. Complexes are sorted by homogeneity, followed by the fraction and number of new components compared to all known complexes in the target species. New components are shown in lower case.

DOI: 10.7287/peerj.preprints.280v2/supp-1

Supplemental Table 2: Lists of fly complexes predicted through aligning human and yeast complexes to the fly interactome.

This table contains four sheets with the following contents:
HC predictions from human - Prediction of fly complexes by aligning human complexes to the fly interactome (high-confidence predictions).
All predictions from human - Prediction of fly complexes by aligning human complexes to the fly interactome (all predictions).
HC predictions from yeast - Prediction of fly complexes by aligning yeast complexes to the fly interactome (high-confidence predictions).
All predictions from yeast - Prediction of fly complexes by aligning yeast complexes to the fly interactome (all predictions).

#QCC/#TCC: number of query/target complex components. Only significant GO terms are shown. GO terms that also belong to the set of most abundant significant GO terms of the given query complex are highlighted in capitals. Complexes are sorted by homogeneity, followed by the fraction and number of new components compared to all known fly complexes. New components are shown in lower case.

DOI: 10.7287/peerj.preprints.280v2/supp-2

Supplemental Table 3: Lists of human, fly and yeast complexes predicted through aligning mycoplasma complexes to the interactome of each species.

This table contains six sheets with the following contents:
HC predictions in human - Prediction of human complexes by aligning mycoplasma complexes to the human interactome (high-confidence predictions).
All predictions in human - Prediction of human complexes by aligning mycoplasma complexes to the human interactome (all predictions).
HC predictions in fly - Prediction of fly complexes by aligning mycoplasma complexes to the fly interactome (high-confidence predictions).
All predictions in fly - Prediction of fly complexes by aligning mycoplasma complexes to the fly interactome (all predictions).
HC predictions in yeast - Prediction of yeast complexes by aligning mycoplasma complexes to the yeast interactome (high-confidence predictions).
All predictions in yeast - Prediction of yeast complexes by aligning mycoplasma complexes to the yeast interactome (all predictions).

#QCC/#TCC: number of query/target complex components. Only significant GO terms are shown. GO terms that also belong to the set of most abundant significant GO terms of the given query complex are highlighted in capitals. Complexes are sorted by homogeneity, followed by the fraction and number of new components compared to all known complexes in the target species. New components are shown in lower case.

DOI: 10.7287/peerj.preprints.280v2/supp-3