Evolution of anatomical concept usage over time: Mining 200 years of biodiversity literature

Department of Computer Science, University of North Carolina at Greensboro, Greensboro, North Carolina, United States
Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States
DOI
10.7287/peerj.preprints.2747v1
Subject Areas
Biodiversity, Bioinformatics, Computational Biology, Computational Science
Keywords
data mining, biodiversity literature, BHL, anatomy ontology
Copyright
© 2017 Manda et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Manda P, Vision TJ. 2017. Evolution of anatomical concept usage over time: Mining 200 years of biodiversity literature. PeerJ Preprints 5:e2747v1

Abstract

The scientific literature contains a historical record of the changing ways in which we describe the world. Shifts in the understanding of scientific concepts are reflected in the introduction of new terms and in the changing usage and context of existing ones. We conducted an ontology-based temporal data mining analysis of biodiversity literature from the 1700s to the 2000s to quantitatively measure how the context of usage for vertebrate anatomical concepts has changed over time. The corpus was divided into nine non-overlapping time periods with comparable amounts of data. Context vectors of anatomical concepts were then compared to measure the magnitude of concept drift, both between adjacent time periods and cumulatively relative to the initial state. Surprisingly, we found that while anatomical concept drift between adjacent time periods was substantial (55% to 68%), it was of the same magnitude as cumulative concept drift across multiple time periods. Such a process, bounded by an overall mean drift, fits the expectations of a mean-reverting process.
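The comparison described in the abstract, adjacent-period drift versus cumulative drift relative to the first period, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say how context vectors are built or which distance is used, so this sketch uses co-occurrence counts in a fixed token window and cosine distance, and the names `context_vector`, `cosine_drift`, and `drift_series` are hypothetical.

```python
from collections import Counter
from math import sqrt


def context_vector(sentences, concept, window=2):
    """Count tokens co-occurring with `concept` within `window` positions.

    Assumption: whitespace tokenization and a symmetric window; the
    preprint's actual construction is not given in the abstract.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok != concept:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


def cosine_drift(u, v):
    """Drift as 1 - cosine similarity (an assumed measure, not the paper's)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an empty context as maximally different
    return 1.0 - dot / (norm_u * norm_v)


def drift_series(vectors):
    """Return (adjacent, cumulative) drift lists for period-ordered vectors."""
    adjacent = [cosine_drift(vectors[i], vectors[i + 1])
                for i in range(len(vectors) - 1)]
    cumulative = [cosine_drift(vectors[0], v) for v in vectors[1:]]
    return adjacent, cumulative
```

Under these assumptions, a mean-reverting pattern would show up as a cumulative series that stays near the level of the adjacent series instead of growing with the number of elapsed periods.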

Author Comment

This is a preprint submission to PeerJ Preprints.