To increase transparency, PeerJ operates a system of 'optional signed reviews and history'. This takes two forms: (1) peer reviewers are encouraged, but not required, to provide their names (if they do so, then their profile page records the articles they have reviewed), and (2) authors are given the option of reproducing their entire peer review history alongside their published article (in which case the complete peer review process is provided, including revisions, rebuttal letters and editor decision letters).
Thanks for being open to the reviewer and editorial comments.
This revision nicely addresses the issues raised in the first round of reviews, resulting in a much clearer and more thorough paper.
I have a few small content comments that should be addressed in the final submission, and one question that I would like to raise:
1. At the end of the paragraph starting with "The publishing workflow...", you have a sentence with nested open parens and only one close paren.
2. The last sentence on page 5, describing REST applications, ends without a period. Furthermore, the final claim seems a bit funny - REST is not particularly tied to mobile apps.
3. The first paragraph on page 11 mentions an ORE resource map, without any citation or description of what that might refer to.
I am assuming these issues can be addressed quickly and should require no further review.
My question regards web-linking. I am not familiar with this proposal, but it seems very ad hoc. Are there any alternative approaches that provide for potentially more useful metadata, perhaps through RDFa? I would not suggest revising the manuscript to raise this question, but I am curious...
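For context on the question above, here is a minimal Python sketch of what a Web Linking (HTTP `Link` header) value can look like and how a client might read it. The header value, the example.org URLs, and the `parse_link_header` helper are illustrative assumptions, not taken from the manuscript, and the parser handles only the common `<target>; name="value"` form.

```python
import re


def parse_link_header(value):
    """Parse a simplified HTTP Link header value into a list of dicts.

    Assumption: parameters contain no embedded commas or semicolons.
    A production parser would need to cover the full header grammar.
    """
    links = []
    # Each link is "<URI>" followed by zero or more ";name=value" parameters.
    for match in re.finditer(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)', value):
        target, raw_params = match.group(1), match.group(2)
        params = {}
        for param in raw_params.split(";"):
            if "=" in param:
                name, _, val = param.partition("=")
                params[name.strip().lower()] = val.strip().strip('"')
        links.append({"target": target, **params})
    return links


# Hypothetical header a landing page might send (URLs are placeholders).
example = (
    '<https://example.org/data/123.csv>; rel="item"; type="text/csv", '
    '<https://example.org/data/123.json>; rel="describedby"; '
    'type="application/json"'
)
links = parse_link_header(example)
for link in links:
    print(link["rel"], "->", link["target"])
```

A header of this kind only states typed relations between resources, which is why richer descriptions are usually carried in the linked metadata document or embedded in the page itself.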
The new draft is more thorough, clear, and readable.
Regarding concerns expressed by reviewers 1 and 2, I will start by clarifying that I have been asked to consider this submission to be within scope for PeerJ, so questions of appropriateness have not been considered in this review.
I am very supportive of the spirit and the goals of this work, and of the Force11 effort in general. However, I share Reviewer 3's concerns about the readability and clarity of this paper.
As written, this submission does not provide enough context about the goals of the Force11 effort, the need for human and machine accessibility of data, and the specific mechanisms being discussed. As someone who has followed these efforts, and who is sympathetic to the goals, I found myself a bit befuddled by the content of the JDDCP principles (why not cite all 8?) and some of the acronyms in table 1 (NBN? N2T ARK?). I fear that readers who are less familiar with these topics would be thoroughly confused.
Given that this paper argues for practices that would require many potentially recalcitrant investigators to change how they work, I suggest adding introductory material that more clearly expresses the need for this sort of data description, the existing landscape, and the potential solutions. A strong and clear description that convinces readers that this sort of description is both possible and not unduly burdensome would be most effective for meeting the Force11 goals. I fear that the paper as is would befuddle readers and hinder realization of these goals.
All reviewers provided useful feedback - I suggest accounting for their concerns. Defining acronyms and providing the complete JDDCP definitions would be particularly useful. I also identified two questions that I would like to see discussed:
1. Regarding machine accessibility, is REST the only possible approach? Some repositories might, for example, prefer SPARQL access to triple stores - would that not be considered accessible? Some discussion might help.
2. Regarding descriptions of software, is it reasonable to discuss URIs for software tools?
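To make the first question concrete, the following Python sketch builds two requests for the same dataset metadata, one REST-style and one sent to a SPARQL endpoint. The repository host, URL paths, and function names are hypothetical; the only thing assumed from the SPARQL protocol is that a query can be passed as a `query` parameter on a GET request. Nothing is sent over the network.

```python
from urllib.parse import urlencode

# Hypothetical repository; not a real service.
BASE = "https://repository.example.org"


def rest_metadata_url(dataset_id):
    """REST style: the metadata is a resource with its own URL."""
    return f"{BASE}/datasets/{dataset_id}/metadata"


def sparql_metadata_url(dataset_id):
    """SPARQL style: one endpoint, the request is expressed as a query."""
    query = (
        "SELECT ?property ?value WHERE { "
        f"<{BASE}/datasets/{dataset_id}> ?property ?value . "
        "}"
    )
    return f"{BASE}/sparql?" + urlencode({"query": query})


print(rest_metadata_url("123"))
print(sparql_metadata_url("123"))
```

Both URLs can be dereferenced by a program without human intervention, which is the sense in which either approach might be considered machine accessible.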
There are some confusing aspects to Table 1. Please clarify how the HTTP(s) and PURL URI identifier schemes meet JDDCP criteria if they ‘fail’ upon object removal; this does not appear to agree with Principle 6. How can these ‘achieve persistence’ if they may not persist?
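The distinction at issue can be illustrated with a toy resolver in Python. This is only a sketch under stated assumptions: the `Resolver` class, its `keep_tombstones` flag, and the status strings are invented for illustration and do not describe how HTTP, PURL, or any other scheme in Table 1 is actually implemented.

```python
class Resolver:
    """Toy identifier resolver (hypothetical, for illustration only)."""

    def __init__(self, keep_tombstones):
        # If True, metadata survives removal of the data object.
        self.keep_tombstones = keep_tombstones
        self._records = {}

    def register(self, identifier, metadata):
        self._records[identifier] = {"metadata": metadata, "available": True}

    def remove_object(self, identifier):
        if self.keep_tombstones:
            # Keep the record, but mark the data as no longer available.
            self._records[identifier]["available"] = False
        else:
            # The identifier simply stops resolving.
            del self._records[identifier]

    def resolve(self, identifier):
        record = self._records.get(identifier)
        if record is None:
            return {"status": "not found", "metadata": None}
        status = "available" if record["available"] else "removed"
        return {"status": status, "metadata": record["metadata"]}


plain = Resolver(keep_tombstones=False)
plain.register("id-1", {"title": "Example dataset"})
plain.remove_object("id-1")
print(plain.resolve("id-1"))

persistent = Resolver(keep_tombstones=True)
persistent.register("id-1", {"title": "Example dataset"})
persistent.remove_object("id-1")
print(persistent.resolve("id-1"))
```

In the first case nothing about the dataset can be recovered after removal; in the second the identifier still resolves to its metadata, which is the behaviour the persistence question is probing.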
Further attention to clarity in the text, with an eye to removing possible ambiguities and redundancies, would make the article read better. I urge the authors to consider issues such as the following:
1. Page 2, paragraph 2: use of parentheses
2. Page 2, paragraph 3, sentence 2: what does ‘It’ refer to?
3. Page 2, paragraph 3: introduce the acronym DCIG before using it later on the same page
4. Throughout the manuscript: minor errors in spelling, punctuation, and sentence structure
5. Page 3, paragraph 1, sentence 1: is ‘has’ the correct word here? Perhaps ‘reflects’ or ‘demonstrates’, etc?
6. Page 3, paragraph 4: is ‘vend’ the correct word to use?
7. Page 5: a bullet point related to #3 refers to “b”; do the authors mean “2”?
8. Page 6, paragraph 3: do the authors mean ‘as a draft of NISO JATS version 1.1’, as ‘1.1d2’ denotes the second draft?
Many of these stated criteria for this area are not relevant to the article, which does not describe primary research in the Biological Sciences, Medical Sciences, or Health Sciences. However, the subject matter of the article (guidelines/proposed methods for improving access, citation, and deposition of data related to scholarly publications) is certainly applicable to Biological, Medical, and Health sciences. I leave it to the Academic Editor to determine whether the article is appropriate for this publication.
Please see my comments under Experimental Design above.
The guidelines are generally clear and contain sufficient detail for implementation. The manuscript is well-organized in presenting them.
This paper is a short note that provides practical guidance on the implementation of the standards for data deposition and discoverability formulated in the Joint Declaration of Data Citation Principles. The writing is generally clear, although a bit choppy in places. Please see the attached pdf for specific suggestions.
I found it hard to identify the intended audience for the paper. I suspect that the technical details will make most sense to web designers or database engineers that are making decisions on the design and appearance of data entries and the scholarly work that cites those entries. However, the introduction seems to be aimed at a broader audience, perhaps those in publishing, libraries or research institutions that are encountering the Force11 initiative for the first time and need to be convinced of the need for standardizing data citation and hosting practices. Even if that’s not the intent, it would benefit the article to add a little more detail and explanation throughout, and to strenuously avoid acronyms and other terms that outsiders (like me) may find opaque. A more accessible article would likely reach a broader audience, which can only be a good thing.
There is no original primary research being presented here – the work of choosing these particular standards while rejecting others has clearly gone on beforehand, and the paper presents the conclusions of that work.
I am not well placed to comment on whether the guidance presented here is valid or the best possible practice, particularly because there is no detail on how the presented solution was decided upon.
I'm supportive of this paper, as it's important to have a published version of record that others can point to when considering data citation issues. However, the paper is not obviously within the Biological, Medical or Health Sciences, and seems closest to Computer Science. Moreover, the paper does not present ‘research’ as such, as all the evaluation of various options (and the process behind that) is not presented. The article may therefore not be a good fit to PeerJ’s remit, and this is the reason behind my 'reject' recommendation. The final decision on suitability is, of course, up to the editor.
Although this is intended as a brief piece to provide operational guidance, as it will be published as a journal article rather than a technical report, I suggest providing additional background and explanation throughout the paper. Consider that the relevant audience may be more than just repository managers who are highly proficient with technical jargon, but also other related stakeholders e.g., in managerial or other advisory positions. Additional explanation and clarification would make this paper more accessible and appropriate for the publication venue.
The title is broad considering the specific focus of the article. The current title reflects the overall goal of the Force11 data citation principles, rather than the specific points addressed by this article.
Provide some background on Force11, as not all readers may be intimately familiar with the organization. Why are they trustworthy? What is their mission? Why the JDDCP, given the existence of other guidelines for data citation? How does this relate to other guidelines for data citation and metadata? Consider that the reader might appreciate a listing of all 8 principles in the introduction for additional context.
You state that the JDDCP “deliberately” does not provide implementation guidelines. Why not?
Don’t leave me hanging: what are some other specific implementation issues that we can expect to be addressed in the future?
Provide a parenthetical acronym for Data Citation Implementation Group (DCIG) the first time it is used.
What is Machine Accessibility?:
Only a very cursory definition/description of machine accessibility is provided, yet guidelines for machine accessibility are the main point of the article. Some additional description, with attention to the reasons why machine accessibility is important, would be appreciated. Consider a brief description of RESTful Web services: although this is a standard way of exposing functions for others to use, those functions would still need documentation. Is provision of documentation a best practice?
What do you mean by “long term commitment to persistence”?
If the criteria in Table 1 are important, why are they not introduced and discussed in the main body of the text? Is there a particular recommendation on any of these criteria?
“First, as ‘mandated’ in the JDDCP” – consider the word choice here; ‘mandated’ may be too strong.
The sentence order in the first paragraph here does not flow well. The sentence explaining more about metadata should follow the “First” point, and the sentence explaining credential validation should follow the “Second” point.
“Landing pages should combine human-readable and machine-readable information on a selection of the following items” – What selection should I choose from the list? Are these all optional items?
“Explanatory or contextual information” – Should the documentation be part of the landing page or a separate document?
“Dataset descriptions” – Need better definition of what “description” is in this context.
Regarding persistence and data availability, and the persistence of metadata beyond de-accessioning: should this be parallel to journal articles? Why is data different?
Minimum acceptable information on landing pages:
Under 2, the 6th bullet, is there a particular ISO standard that can be referenced?
Best practices for dataset description:
What kind of description are we talking about here exactly? Why is it safe to say that a standard that has only been very recently released is already widely used and settled? Do you anticipate the release of any additional domain specific standards?
Data access methods:
Consider providing some additional explanation here. Points 1 and 2 are generically applicable. Point 3 follows a linked data model that is applicable to libraries but might not translate to other stakeholders.
Can you make the relationship to persistent identifiers more explicit? This section feels somewhat overreaching. Is this a little much for citation practices? Or is this really a trusted repository issue?
Are there any existing examples out there that already meet these criteria that you can share?
In an article on citation standards, it is imperative that the reference list be correctly formatted and provide all necessary information to easily retrieve the listed documents. Check the author guidelines and make certain that both in-text and full citations in the reference list are done appropriately. For example, provide URLs and access dates for technical reports. It sends a mixed message to promote higher standards for data citation than document citation.
Should the JDDCP itself be included in the reference list?
No comments - covered above.
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.