Comprehensive comparison of large-scale tissue expression datasets

Cellular Network Biology Group, NNF Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
Digital Productivity, CSIRO, Sydney, NSW, Australia
Ferring Pharmaceuticals, Copenhagen, Denmark
Computational Informatics, CSIRO, Sydney, Australia
Garvan Institute of Medical Research, Sydney, Australia
Cellular Network Biology Group, NNF Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
DOI
10.7287/peerj.preprints.1072v1
Subject Areas
Computational Biology, Genomics
Keywords
Immunohistochemistry, RNA sequencing, Tissue expression, Mass spectrometry, Microarrays, Databases, Tissue-specificity
Copyright
© 2015 Santos et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Cite this article
Santos A, Tsafou K, Stolte C, Pletscher-Frankild S, O’Donoghue SI, Jensen LJ. 2015. Comprehensive comparison of large-scale tissue expression datasets. PeerJ PrePrints 3:e1072v1

Abstract

For tissues to carry out their functions, the right proteins must be present. Several high-throughput technologies have been used to map out which proteins are expressed in which tissues; however, the data have not previously been systematically compared and integrated. We present a comprehensive evaluation of tissue expression data from a variety of experimental techniques and show that these agree surprisingly well with each other and with results from literature curation and text mining. We further find that most datasets support the assumed but not demonstrated distinction between tissue-specific and ubiquitous expression. By developing comparable confidence scores for all types of evidence, we show that it is possible to improve both quality and coverage by combining the datasets. To facilitate use and visualization of our work, we have developed the TISSUES resource (http://tissues.jensenlab.org), which makes all the scored and integrated data available through a single user-friendly web interface.

Author Comment

This is a submission to PeerJ for review.