Decentralized provenance-aware publishing with nanopublications

Department of Computer Science, VU University Amsterdam, Amsterdam, The Netherlands
Nestle Institute of Health Sciences, Lausanne, Switzerland
Yale University School of Medicine, Yale University, New Haven, Connecticut, United States
Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona, Spain
Data Science Lab, Ghent University, Ghent, Belgium
Institute of Informatics and Telecommunications, NCSR Demokritos, Athens, Greece
SciFY Private Not-for-profit Company, Athens, Greece
AKSW Research Group, University of Leipzig, Leipzig, Germany
University of Maryland, College Park, Maryland, United States
Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, United States
DOI
10.7287/peerj.preprints.1760v1
Subject Areas
Bioinformatics, Computer Networks and Communications, Digital Libraries, World Wide Web and Web Science
Keywords
Data publishing, nanopublications, provenance, Linked Data, Semantic Web
Copyright
© 2016 Kuhn et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Cite this article
Kuhn T, Chichester C, Krauthammer M, Queralt-Rosinach N, Verborgh R, Giannakopoulos G, Ngonga Ngomo A, Viglianti R, Dumontier M. 2016. Decentralized provenance-aware publishing with nanopublications. PeerJ PrePrints 4:e1760v1

Abstract

Publication and archival of scientific results is still commonly considered the responsibility of classical publishing companies. Classical forms of publishing, however, which center around printed narrative articles, no longer seem well-suited in the digital age. In particular, there are currently no efficient, reliable, and agreed-upon methods for publishing scientific datasets, which have become increasingly important for science. In this article, we propose to design scientific data publishing as a Web-based bottom-up process, without top-down control of central authorities such as publishing companies. Based on a novel combination of existing concepts and technologies, we present a server network that stores and archives data in a decentralized manner in the form of nanopublications, an RDF-based format to represent scientific data. We show how this approach allows researchers to publish, retrieve, verify, and recombine datasets of nanopublications in a reliable and trustworthy manner, and we argue that this architecture could be used as a low-level data publication layer to serve the Semantic Web in general. Our evaluation of the current network shows that this system is efficient and reliable.

Author Comment

This is a submission to PeerJ Computer Science for review.