As studies into researchers’ attitudes regarding open science/research data management often point to a need for practical tips, this is a welcome guide containing a wealth of useful resources and advice from a researcher’s perspective, especially for those working in the life sciences.
Below are some comments/suggestions:
- Open Research could indeed be a more inclusive term than Open Science (although in some cultural contexts “science” is actually understood in a much broader sense than in the Anglosphere – see for example ‘Literaturwissenschaft’ in German: literally ‘literature science’).
- While the authors rightly focus on the important pillars of open data, open code, open papers and open peer review, I was wondering what their thoughts are on the perhaps more recent calls for open lab notebooks (https://en.wikipedia.org/wiki/Opennotebookscience)? I guess this also touches on previous comments regarding openness of the experimental setup as an important dimension of open science.
- It is suggested that “privacy sensitive data” do not belong in the category of public assets available to the public. Although personal data can indeed not be made openly available for legal and ethical reasons, sometimes it can nevertheless be possible to legally and ethically share research data containing personally identifiable information, albeit usually under more restricted conditions (e.g. some trustworthy data repositories are equipped and have the right procedures in place to offer researchers restricted access to personal or otherwise sensitive data). While such data would of course not constitute fully open data in the sense of the Open Definition, maybe the data sharing story should be presented as somewhat more nuanced than a binary choice between either fully open or fully closed data?
- Technically speaking, there is a third type of data repository, namely the institutional data repository (although this kind is usually less relevant for research domains characterized by greater data volumes and larger degrees of standardization and international collaboration – as these domains often build their own international infrastructures).
- Some (well-known) data repositories focus more on publicly disseminating data than on their preservation, so when selecting a repository it’s usually a good idea to also check whether it has an explicit commitment to/policy regarding long-term preservation (of course, certification will provide a strong indication of this, but not all repositories in re3data.org are certified). Other sensible criteria to take into account can be found here: https://www.openaire.eu/opendatapilot-repository. One that is worth emphasizing is whether the repository assigns persistent and unique identifiers, because this is vital to enabling a culture of data citation (which in turn gives researchers credit for making data available).
- For publicly shared data, standard licenses are in principle more interesting than bespoke licenses, because they allow for legal interoperability. An interesting tool to help you select an appropriate standard license for data (or software) is the EUDAT License Selector (http://ufal.github.io/public-license-selector/), although it also includes licenses that are not conformant with the Open Definition.
- The FAIR data concept is indeed gaining prominence among data sharing advocates, and it may be useful to point out one of its distinctive features, namely its emphasis on making data findable, accessible, interoperable and reusable to humans as well as machines. Although discussions about the FAIR data concept’s implementation and operationalization are still very much ongoing and although appropriate metadata are definitely a crucial element, it also involves other things such as persistent identifiers, user licenses, non-proprietary formats and standard vocabularies. So maybe the FAIR data concept shouldn’t just be mentioned as part of the section on metadata?
- As regards open access to publications: besides posting preprints, another option for authors to make their work open while still publishing in subscription-based journals is to deposit post-prints in their institutional repositories, although some publishers require an embargo period before the post-print can be made open access. As enablers of the self-archiving, “green” route to open access, institutional repositories are nevertheless a vital part of the open access ecosystem.
- The Open Access Directory (http://oad.simmons.edu/oadwiki/Main_Page) and OpenAIRE (https://www.openaire.eu/) might be other useful resources.
- In the context of the general lack of credit for non-traditional research outputs (such as peer reviews), it might also be worthwhile pointing to new initiatives attempting to address this issue such as the RIO Open Science Journal (http://riojournal.com/), which publishes a wide variety of research outputs (including e.g. grant proposals and data management plans).
Overall, a great guide for those who want to start practicing open science but don't know where to start!
Dear Myriam, many thanks for this feedback full of useful comments!
I have drafted some responses here, and we will make sure to address your points in a later version of the article.
- Yes, Open Research is broader than Open Science; other comments have pointed to the same 'semantic' issue. We will perhaps still talk about Open Science, but explicitly mention that it sits within a much bigger context.
- Yes, we are aware of the Open Notebook Science practice (https://en.wikipedia.org/wiki/Opennotebookscience), and indeed this falls into the broader context of experimental setups and tools, which we will certainly discuss in the revised version of the article.
- This is a very good point, yes. We could indeed present the data sharing story as more nuanced than a binary choice between open and closed. If you have an example of such a nuanced situation, I'd love to hear more about it.
- Yes, understood. Just out of curiosity, do institutional data repositories make data accessible or visible to third parties, that is, to people who are not affiliated with the institution? I believe the answer is no, but I would like to be sure.
- Thank you for this great resource! We will definitely place more emphasis on the importance of long-term preservation and on the assignment of unique and persistent IDs.
- The EUDAT License Selector is definitely another great resource, and we will incorporate it in the next round of edits!
- I see why discussing the FAIR concept in the metadata section could cause confusion. Perhaps it is indeed better to elaborate a bit more on it, and move its discussion to a more generic section on open data.
- Yes, we should indeed indicate more explicitly the possibility of depositing postprints in institutional repositories. Thanks for pointing this out.
- Noted!
- Noted again!
Thanks, this is absolutely helpful!
Dear Paola,
- As regards the more nuanced story of data sharing: basically, apart from fully open access, various other levels of access exist. An example is the set of three access levels provided by the UK Data Service (open, safeguarded or controlled access), depending on what is suitable given the nature of the data and the permissions that are in place. See: https://www.ukdataservice.ac.uk/deposit-data/how-to/regular-depositors/negotiate
- Regarding institutional data repositories: usually these are set up to disseminate the digital data assets of an institution, so they will focus on publicly sharing research data under open access conditions. Some institutional repositories will also offer more restricted levels of access to data that cannot be made fully open for legal, contractual or ethical reasons. It is possible that some of those restricted data are only accessible to people within the institution (but even then, the metadata record for the research data may still be publicly visible).
Good luck with the revision of the paper!