This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Within the statistics community, a number of guiding principles for sharing data have emerged; however, these principles are not always made clear to the collaborators who generate the data. To bridge this divide, we have established a set of guidelines for sharing data. In these, we highlight the need to provide raw data to the statistician, the importance of consistent formatting, and the necessity of giving the statistician all essential experimental information and a record of any pre-processing steps carried out. With these guidelines we hope to avoid errors and delays in data analysis.
Great to see some easy-to-grasp best practices in data management.
I think the manuscript could gain in clarity and effectiveness if you added:
- one section defining "human readability" and "computer readability" (there are examples throughout the manuscript, but a dedicated section might be more effective). One could use the occasion to specify what is computer readable in an Excel file and what is not.
- one section explaining why the Data Organization should be created as soon as possible (i.e. before data acquisition), and discussing whether the statistician should be involved at that step.
- a word about the reproducibility of data analysis in the discussion, and maybe something about pre-registration too?