This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Both classical taxonomy and DNA barcoding are engaged in the task of digitising the living world. Much of the taxonomic literature remains undigitised. The rise of open access publishing this century, and the freeing of older literature from the shackles of copyright, have greatly increased the online availability of taxonomic descriptions, but much of the literature of the mid- to late twentieth century remains offline ("dark texts"). DNA barcoding is generating a wealth of computable data that in many ways is much easier to work with than classical taxonomic descriptions, but many of the sequences are not identified to species level. These "dark taxa" hamper the classical method of integrating biodiversity data using shared taxonomic names. Voucher specimens are a potential common currency of both the taxonomic literature and sequence databases, and could be used to help link names, literature, and sequences. An obstacle to this approach is the lack of stable, resolvable specimen identifiers. The paper concludes with an appeal for a global "digital dashboard" to assess the extent to which biodiversity data is available online.
Preprint of an invited contribution to the DNA barcoding themed issue of Philosophical Transactions of the Royal Society B.