The Vertebrate Taxonomy Ontology: A framework for reasoning across model organism and species phenotypes
- Subject Areas
- Bioinformatics, Evolutionary Studies, Paleontology, Taxonomy
- Keywords
- paleontology, data integration, taxonomic rank, evolutionary biology
- Copyright
- © 2013 Midford et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
- Cite this article
- 2013. The Vertebrate Taxonomy Ontology: A framework for reasoning across model organism and species phenotypes. PeerJ PrePrints 1:e28v1 https://doi.org/10.7287/peerj.preprints.28v1
Abstract
Background: A hierarchical taxonomy of organisms is a prerequisite for semantic integration of biodiversity data. Ideally, there would be a single, expansive, authoritative taxonomy that includes extinct and extant taxa, information on synonyms and common names, and monophyletic supraspecific taxa that reflect our current understanding of phylogenetic relationships.
Description: As a step towards development of such a resource, and to enable large-scale integration of phenotypic data across the vertebrates, we created the Vertebrate Taxonomy Ontology (VTO), a semantically defined taxonomic resource derived from the integration of existing taxonomic compilations, and freely distributed under a Creative Commons Zero (CC0) public domain waiver. The VTO includes both extant and extinct vertebrates and currently contains 106,927 taxonomic terms, 23 taxonomic ranks, 104,506 synonyms, and 162,132 taxonomic cross-references. Key challenges in constructing the VTO included (1) extracting and merging names, synonyms, and identifiers from heterogeneous sources; (2) replacing subgroups with more authoritative local taxonomies; and (3) automating this process as much as possible to accommodate updates in source taxonomies.
Conclusions: The VTO is the primary source of taxonomic information used by the Phenoscape Knowledgebase (http://phenoscape.org/), which integrates genetic and evolutionary phenotype data across both model and non-model vertebrates. The VTO supports coarse-grained inference of phenotypic changes on the vertebrate tree of life, which in turn enables queries for candidate genes associated with different episodes in vertebrate evolution.
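
Challenge (1) in the Description, merging names, synonyms, and identifiers from heterogeneous sources, can be illustrated with a minimal Python sketch. This is not the VTO build pipeline: the record layout, the `Taxon` class, the `merge_sources` function, and the `SRC_A`/`SRC_B` identifier prefixes are all hypothetical, and matching records on a case-normalized name is a simplifying assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Taxon:
    """One merged taxonomic term (hypothetical structure, for illustration)."""
    name: str
    rank: str
    synonyms: set = field(default_factory=set)
    xrefs: set = field(default_factory=set)  # identifiers from the source taxonomies


def merge_sources(*sources):
    """Merge records from several sources, keyed on a normalized name.

    Each record is assumed to be a dict with "id" and "name" keys and
    optional "rank" and "synonyms" keys.
    """
    merged = {}
    for source in sources:
        for record in source:
            clean_name = record["name"].strip()
            key = clean_name.lower()
            taxon = merged.get(key)
            if taxon is None:
                # First time this name is seen: create the merged term.
                taxon = Taxon(name=clean_name, rank=record.get("rank", "unranked"))
                merged[key] = taxon
            # Accumulate synonyms and keep the source identifier as a cross-reference.
            taxon.synonyms.update(record.get("synonyms", []))
            taxon.xrefs.add(record["id"])
    return merged


# Illustrative input only; the identifiers below are made up.
source_a = [
    {"id": "SRC_A:1", "name": "Danio rerio", "rank": "species",
     "synonyms": ["Brachydanio rerio"]},
]
source_b = [
    {"id": "SRC_B:42", "name": "danio rerio ", "rank": "species",
     "synonyms": ["Cyprinus rerio"]},
    {"id": "SRC_B:43", "name": "Tiktaalik roseae", "rank": "species"},
]

merged = merge_sources(source_a, source_b)
```

In this sketch the two records for Danio rerio collapse into a single term that carries both synonyms and both source identifiers, while the extinct Tiktaalik roseae, present in only one source, is kept as its own term. A real integration would also need to resolve homonyms and conflicting ranks, which name matching alone does not handle.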