Thanks for your brilliant work. Your approach to a more centralized and transparent way of large-scale Open Access monitoring will be of great use for the scientific community.
While reading your article as well as the comments and having a look at the underlying data, two points occurred to me that might be worth considering:
1. A closer look at the reported accuracy of oaDOI.
Following the links to the 43 reported false negative OA articles (pp. 7 f. and data in accuracy_analysis.xslx), many of them seem to fall into one or more of the following categories:
- miscellaneous document types (letters, editorials, news, tables of contents, etc.)
- old publications (10 years old or much older; for some, copyright has even expired)
- ephemeral or incomplete PDFs (PDFs were not found or showed only an excerpt of the full text)
This short test indicates that oaDOI would probably achieve a much higher recall on a subset restricted to more recent articles, reviews and proceedings, and/or under a stricter definition of false negative errors.
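To make the effect of a stricter false negative definition concrete, here is a minimal sketch of the recall arithmetic. Only the 43 false negatives come from the article; the true positive count and the number of excluded cases are hypothetical placeholders chosen for illustration and are not taken from the paper or its data.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Recall = TP / (TP + FN)."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("recall is undefined when there are no OA articles")
    return true_positives / total


def recall_after_exclusion(true_positives: int, false_negatives: int,
                           excluded: int) -> float:
    """Recall when `excluded` false negatives are treated as out of scope
    (e.g. editorials, very old items, broken or partial PDFs)."""
    if not 0 <= excluded <= false_negatives:
        raise ValueError("excluded must lie between 0 and false_negatives")
    return recall(true_positives, false_negatives - excluded)


# Reported in the article: 43 false negative OA articles.
REPORTED_FALSE_NEGATIVES = 43

# HYPOTHETICAL placeholders, not taken from the paper or its data file.
HYPOTHETICAL_TRUE_POSITIVES = 100
HYPOTHETICAL_EXCLUDED = 30

baseline = recall(HYPOTHETICAL_TRUE_POSITIVES, REPORTED_FALSE_NEGATIVES)
stricter = recall_after_exclusion(HYPOTHETICAL_TRUE_POSITIVES,
                                  REPORTED_FALSE_NEGATIVES,
                                  HYPOTHETICAL_EXCLUDED)

print(f"baseline recall: {baseline:.3f}")
print(f"recall with stricter definition: {stricter:.3f}")
```

Replacing the placeholders with the actual counts from the accuracy data, and with the number of false negatives that fall into the categories listed above, would show how much the reported recall depends on the definition used.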
2. Some thoughts on the OA categorization.
Hybrid. The way I see it, the categorization of Hybrid OA is perfectly fine without any addition and fits well within the established definition of the OA community. Hybridity in the context of OA refers to the coexistence of subscription-based and publication-based pricing schemes within a certain entity, most commonly the journal. While one might equate an OA article in a hybrid journal with an article in a full OA journal based on their accessibility and license, it is not true that there is no inherent difference between them, as one of the commentators claimed. For example, their underlying business models and pricing mechanisms differ substantially, which makes their conceptual differentiation and empirical identification a vital contribution to OA monitoring.
Bronze. This novel category is interesting but, as also mentioned in the article, still a bit too blurred. The manual inspection of a small sample of Bronze articles shows that nearly half of the articles are akin to Gold but rather hidden (p. 13).
The rest sounds more like a sort of Grey OA to me. Given their characteristics (no OA license, delayed access on commercial platforms, and questionable persistence of content that depends on unforeseeable restrictions by publishers), these articles are, from a non-publisher perspective, actually quite similar to the content shared on academic social networks (disregarding the legal differences, of course). For the sake of consistency, this Grey OA should either be excluded from the oaDOI service or included in its entirety, together with academic social network content. In order to develop oaDOI into a comprehensive OA monitoring tool, I would personally prefer the latter because it leaves the ex post removal of the Grey OA type to the discretion of the monitoring analyst.