Challenges as enablers for high quality linked data: Insights from the semantic publishing challenge

Faculty of Engineering and Architecture, Ghent University, Ghent, Belgium
imec, Leuven, Belgium
Department of Intelligent Systems, University of Bonn, Bonn, Germany
Department of Computer Science and Engineering, University of Bologna, Bologna, Italy
Enterprise Information Systems, Fraunhofer IAIS, Sankt Augustin, Germany
DOI
10.7287/peerj.preprints.2616v1
Subject Areas
Data Science, Digital Libraries, Emerging Technologies, World Wide Web and Web Science
Keywords
Linked Data, Semantic Web, Linked Data publishing, Semantic Publishing, Challenge, survey
Copyright
© 2016 Dimou et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Dimou A, Vahdati S, Di Iorio A, Lange C, Verborgh R, Mannens E. 2016. Challenges as enablers for high quality linked data: Insights from the semantic publishing challenge. PeerJ Preprints 4:e2616v1

Abstract

While most challenges organized so far in the Semantic Web domain focus on comparing tools with respect to different criteria, such as their features and competencies, or on exploiting semantically enriched data, the Semantic Web Evaluation Challenges series, co-located with the ESWC Semantic Web Conference, aims to compare tools based on their output, namely the produced dataset. The Semantic Publishing Challenge is one of these challenges. Its goal is to involve participants in extracting data from heterogeneous sources on scholarly publications and in producing Linked Data that can be exploited by the community itself. This paper reviews lessons learned from (i) the overall organization of the Semantic Publishing Challenge, regarding the definition of the tasks, the construction of the input dataset, and the design of the evaluation, and (ii) the results produced by the participants, regarding the proposed approaches, the tools used, the preferred vocabularies, and the results produced across the three editions of 2014, 2015, and 2016. We compare these lessons to those of other Semantic Web Evaluation Challenges. In this paper, we (i) distill best practices for organizing such challenges that could be applied to similar events, and (ii) report observations on Linked Data publishing derived from the submitted solutions. We conclude that higher quality may be achieved when Linked Data is produced as the result of a challenge, because the competition becomes an incentive, and that solutions adhere better to Linked Data publishing best practices when they are evaluated against the rules of the challenge.

Author Comment

This is a submission to PeerJ Computer Science for review.