Achieving human and machine accessibility of cited data in scholarly publications

California Digital Library, University of California, Office of the President, Oakland, CA, USA
Institute of Quantitative Social Science, Harvard University, Cambridge, MA, USA
Stanford Center for Biomedical Informatics Research, School of Medicine, Stanford University, Palo Alto, CA, USA
Center for International Earth Science Information Network (CIESIN), Columbia University, Palisades, NY, USA
University of Colorado at Boulder, National Snow and Ice Data Center, Boulder, CO, USA
ORCID, Inc., Bethesda, MD, USA
Oregon Health and Science University, Portland, OR, USA
W3C/CWI, Amsterdam, The Netherlands
CODATA (ICSU Committee on Data for Science and Technology), Paris, FR
Solar Data Analysis Center, NASA Goddard Space Flight Center, Greenbelt, MD, USA
Public Library of Science, San Francisco, CA, USA
European Organization for Nuclear Research (CERN), Geneva, Switzerland
Columbia University Libraries/Information Services, New York, NY, USA
SBA Research, Vienna, AT
Institute of Software Technology and Interactive Systems, Vienna University of Technology / TU Wien, Vienna, AT
Journal Information Systems, American Physical Society, Ridge, NY, USA
Elsevier, Oxford, UK
Department of Neurology, Harvard Medical School, Boston, MA, USA
DOI
10.7287/peerj.preprints.697v4
Subject Areas
Human-Computer Interaction, Data Science, Digital Libraries, World Wide Web and Web Science
Keywords
data citation, machine accessibility, data archiving
Copyright
© 2015 Starr et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Cite this article
Starr J, Castro E, Crosas M, Dumontier M, Downs RR, Duerr R, Haak L, Haendel M, Herman I, Hodson S, Hourclé J, Kratz JE, Lin J, Nielsen LH, Nurnberger A, Pröll S, Rauber A, Sacchi S, Smith AP, Taylor M, Clark T. 2015. Achieving human and machine accessibility of cited data in scholarly publications. PeerJ PrePrints 3:e697v4

Abstract

Reproducibility and reusability of research results are important concerns in scientific communication and science policy. A foundational element of reproducibility and reusability is the open and persistently available presentation of research data. However, many common approaches for primary data publication in use today do not achieve sufficient long-term robustness, openness, accessibility or uniformity. Nor do they permit comprehensive exploitation by modern Web technologies. These shortcomings have led several authoritative studies to recommend uniform direct citation of data archived in persistent repositories. Data are to be considered first-class scholarly objects and treated, in many ways, like cited and archived scientific and scholarly literature. Here we briefly review the most current and widely agreed-upon set of principle-based recommendations for scholarly data citation, the Joint Declaration of Data Citation Principles (JDDCP). We then present a framework for operationalizing the JDDCP, and a set of initial recommendations on identifier schemes, identifier resolution behavior, required metadata elements, and best practices for realizing programmatic machine actionability of cited data. The main target audience for the common implementation guidelines in this article consists of publishers, scholarly organizations, and persistent data repositories, including technical staff members in these organizations, but ordinary researchers can also benefit from these recommendations. The guidance provided here is intended to help achieve widespread, uniform human and machine accessibility of deposited data, in support of significantly improved verification, validation, reproducibility and re-use of scholarly/scientific data.

Author Comment

This is an updated version with minor revisions.