Evolution of anatomical concept usage over time: Mining 200 years of biodiversity literature
- Published
- Accepted
- Subject Areas
- Biodiversity, Bioinformatics, Computational Biology, Computational Science
- Keywords
- data mining, biodiversity literature, BHL, anatomy ontology
- Copyright
- © 2017 Manda et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2017. Evolution of anatomical concept usage over time: Mining 200 years of biodiversity literature. PeerJ Preprints 5:e2747v1 https://doi.org/10.7287/peerj.preprints.2747v1
Abstract
The scientific literature contains a historical record of the changing ways in which we describe the world. Shifts in understanding of scientific concepts are reflected in the introduction of new terms and the changing usage and context of existing ones. We conducted an ontology-based temporal data mining analysis of biodiversity literature from the 1700s to the 2000s to quantitatively measure how the context of usage for vertebrate anatomical concepts has changed over time. The corpus of literature was divided into nine non-overlapping time periods with comparable amounts of data, and context vectors of anatomical concepts were compared to measure the magnitude of concept drift both between adjacent time periods and cumulatively relative to the initial state. Surprisingly, we found that while anatomical concept drift between adjacent time periods was substantial (55% to 68%), it was of the same magnitude as cumulative concept drift across multiple time periods. Such a process, bounded by an overall mean drift, fits the expectations of a mean-reverting process.
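The two drift measurements described in the abstract (between adjacent periods, and cumulatively against the first period) can be illustrated with a minimal sketch. The abstract does not say how context vectors are built or compared, so everything here is an assumption for illustration: the co-occurrence window, the use of cosine distance as the drift measure, and the function names `context_vector`, `cosine_drift`, and `drift_series` are all hypothetical and are not taken from the authors' pipeline.

```python
from collections import Counter
import math


def context_vector(tokens, concept, window=2):
    """Count the words appearing within +/- `window` tokens of `concept`.

    Assumption: a simple bag-of-words co-occurrence count stands in for
    whatever context representation the study actually used.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == concept:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Add every neighbour in the window except the concept occurrence itself.
            counts.update(t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i)
    return counts


def cosine_drift(a, b):
    """Drift as 1 - cosine similarity (assumed metric).

    0 means identical context; 1 means no shared context words.
    """
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # an empty vector shares no context with anything
    return 1.0 - dot / (norm_a * norm_b)


def drift_series(vectors):
    """Return (adjacent, cumulative) drift lists for time-ordered vectors."""
    adjacent = [cosine_drift(vectors[i], vectors[i + 1])
                for i in range(len(vectors) - 1)]
    cumulative = [cosine_drift(vectors[0], v) for v in vectors[1:]]
    return adjacent, cumulative


# Toy, made-up token streams standing in for three time periods.
periods = [
    "the dorsal fin is long and the anal fin is short".split(),
    "the dorsal fin bears spines and the fin rays branch".split(),
    "fin development depends on gene expression in the fin bud".split(),
]
vectors = [context_vector(p, "fin") for p in periods]
adjacent, cumulative = drift_series(vectors)
```

If adjacent and cumulative drift come out at similar magnitudes, as the abstract reports for the real corpus, the context is fluctuating around a stable mean rather than moving steadily away from its starting point.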
Author Comment
This is a preprint submission to PeerJ Preprints.