I have three major issues with this manuscript:
Firstly, the authors describe five goals:
1. test the usability of the new OA data and methods
2. explore and present data: show OA levels by field, language, country, institution, funder & topic
3. suggest explanations & further research
4. show applicable uses of this method for informing policy development and monitoring their effects
5. suggest improvements in data availability and methods.
re goal #1.
In my opinion, what is lacking to reach this goal is a definition of usability (usable for what purpose?), a description of how usability is tested or measured, and a statement of the, preferably quantitative, criteria used to draw a conclusion on usability.
re goals #2 and #3
I find it a pity that this manuscript only has the ambition to explore and present data and to suggest explanations. The work would profit from more focus and from clear, testable hypotheses and research questions.
Secondly, the authors describe a number of caveats (p9) that ...
Caveat 1c (coverage of IRs by oaDOI) is a major problem for the presented data. As far as I can judge, the green OA numbers for Dutch universities presented in this manuscript are incorrect. I suspect this is due to incomplete harvesting of Dutch IRs, and the authors should check this. On p10 they claim "green OA data reported in Web of Science are restricted to green only, hence quite low". However, the authors have not investigated this and thus cannot make this statement. Dutch universities also report green only, and they report much higher percentages (up to 20%). That is not "a minimal effect on overall OA levels". I'm worried about how this caveat affects the data presented. As it is not clear what the coverage of oaDOI is, and the data are largely presented as overall OA percentages, it is impossible to interpret the results.
On a side note, on p29 the authors claim that "the overall percentage of OA ... for all Dutch universities as reported by the VSNU ... is consistent with the overall figure for the Netherlands in the WoS data (both 42%)". This is not quite true. In the WoS data, the 42% includes a large percentage of bronze OA, and not all Dutch universities include this category in their analysis. The same goes for green OA not included in an IR (but in PMC or arXiv, for instance).
With regard to caveat 2, the authors could easily quantify the impact of this caveat by comparing the results of the OA tool in WoS with the results of oaDOI, for instance for the last year.
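To illustrate, such a comparison could be done with a simple cross-tabulation of OA status per DOI from both sources. This is only a minimal sketch with made-up records; in practice the two dictionaries would be filled from a WoS export and from oaDOI/Unpaywall lookups, and the status labels are assumptions for illustration.

```python
from collections import Counter

# Hypothetical (doi -> oa_status) mappings from each source; real data
# would come from a WoS export and from the oaDOI/Unpaywall service.
wos = {"10.1/a": "gold", "10.1/b": "green", "10.1/c": "closed"}
oadoi = {"10.1/a": "gold", "10.1/b": "closed", "10.1/c": "closed"}

# Cross-tabulate the status pairs for DOIs covered by both sources,
# so agreement sits on the diagonal and discrepancies off it.
agreement = Counter(
    (wos[doi], oadoi[doi]) for doi in wos.keys() & oadoi.keys()
)
for (wos_status, oadoi_status), n in sorted(agreement.items()):
    print(f"WoS={wos_status:7s} oaDOI={oadoi_status:7s} n={n}")
```

The off-diagonal cells of such a table would directly show where and how often the two sources disagree.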
Similarly, for caveat 3, the impact of the inclusion of the ESCI could be easily tested by excluding ESCI from the dataset and comparing the results with those for the full dataset.
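This check amounts to computing the OA share twice, once with and once without ESCI records. A minimal sketch, with entirely hypothetical records and field names:

```python
# Hypothetical publication records with a citation-index flag and an
# OA status; real records would carry these fields from the WoS data.
pubs = [
    {"index": "SCIE", "oa": True},
    {"index": "SCIE", "oa": True},
    {"index": "ESCI", "oa": True},
    {"index": "SSCI", "oa": False},
]

def oa_share(records):
    """Fraction of records that are OA."""
    return sum(r["oa"] for r in records) / len(records)

full = oa_share(pubs)
without_esci = oa_share([p for p in pubs if p["index"] != "ESCI"])
print(f"OA share full: {full:.0%}, without ESCI: {without_esci:.0%}")
```

The difference between the two figures would quantify how much the ESCI inclusion shifts the reported OA levels.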
Caveat 4 could also be addressed in more detail. Gold and hybrid percentages are not expected to change much over time, and it would be interesting to zoom in on those two categories. Also, the hypothesis that time has an effect on both green OA (due to embargo periods) and bronze OA (due to moving walls) could be investigated further by looking at the longitudinal data for those categories separately.
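Such a per-category longitudinal view is straightforward to compute from counts of publications per year and OA category. The counts below are invented purely for illustration:

```python
from collections import defaultdict

# Hypothetical counts of publications per (year, OA category);
# real counts would be aggregated from the WoS/oaDOI dataset.
counts = {
    (2014, "gold"): 120, (2014, "green"): 80, (2014, "bronze"): 150,
    (2014, "hybrid"): 30, (2014, "closed"): 620,
    (2017, "gold"): 200, (2017, "green"): 40, (2017, "bronze"): 60,
    (2017, "hybrid"): 50, (2017, "closed"): 650,
}

# Total publications per year, to normalise each category's share.
totals = defaultdict(int)
for (year, _), n in counts.items():
    totals[year] += n

# Share of each OA category per year; comparing these across years
# shows which categories drive any longitudinal change.
for (year, category), n in sorted(counts.items()):
    print(f"{year} {category:7s} {n / totals[year]:.1%}")
```

Plotting these per-category shares against publication year would separate the embargo effect on green OA and the moving-wall effect on bronze OA from genuine growth in gold and hybrid OA.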
I have a lot of questions about caveat 5. For instance:
- which version of the DOAJ is used in oaDOI? the most recent one only, or different versions for publications of different ages? what would be the consequences of this?
- wouldn't publications in journals that have been removed from the DOAJ show up as hybrid or bronze OA in the analysis? or have those journals been removed from WoS as well? what would be the consequence of that?
More detail in the methodology and more awareness in the results/analysis section would be valuable.
Re caveat 6. It would be informative if the authors indicated where the number of publications is low and specified the threshold they use.
The third issue I have with this manuscript is data quality. The data are incorrect for the Dutch situation, and that does not give me much confidence in the overall numbers for other countries, fields, languages, etc. Given the different initiatives to measure and present OA information, I do welcome thorough analysis of new tools and databases, but an important issue is data quality; this is not addressed in this manuscript, and that is a real loss.