The Vertebrate Taxonomy Ontology: A framework for reasoning across model organism and species phenotypes

Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas, USA
National Evolutionary Synthesis Center, Durham, North Carolina, USA
Department of Biology, University of South Dakota, Vermillion, South Dakota, USA
Department of Biology, University of North Carolina, Chapel Hill, North Carolina, USA
Department of Organismal Biology and Anatomy, University of Chicago, Chicago, Illinois, USA
Academy of Natural Sciences, Philadelphia, Pennsylvania, USA
Institute of Neuroscience, University of Oregon, Eugene, Oregon, USA
Department of Vertebrate Zoology and Anthropology, California Academy of Sciences, San Francisco, California, USA
DOI
10.7287/peerj.preprints.28v1
Subject Areas
Bioinformatics, Evolutionary Studies, Paleontology, Taxonomy
Keywords
paleontology, data integration, taxonomic rank, evolutionary biology
Copyright
© 2013 Midford et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Cite this article
Midford PE, Dececchi TA, Balhoff JP, Dahdul WM, Ibrahim N, Lapp H, Lundberg JG, Mabee PM, Sereno PC, Westerfield M, Vision TJ, Blackburn DC. 2013. The Vertebrate Taxonomy Ontology: A framework for reasoning across model organism and species phenotypes. PeerJ PrePrints 1:e28v1

Abstract

Background: A hierarchical taxonomy of organisms is a prerequisite for semantic integration of biodiversity data. Ideally, there would be a single, expansive, authoritative taxonomy that includes extinct and extant taxa, information on synonyms and common names, and monophyletic supraspecific taxa that reflect our current understanding of phylogenetic relationships.

Description: As a step towards development of such a resource, and to enable large-scale integration of phenotypic data across the vertebrates, we created the Vertebrate Taxonomy Ontology (VTO), a semantically defined taxonomic resource derived from the integration of existing taxonomic compilations, and freely distributed under a Creative Commons Zero (CC0) public domain waiver. The VTO includes both extant and extinct vertebrates and currently contains 106,927 taxonomic terms, 23 taxonomic ranks, 104,506 synonyms, and 162,132 taxonomic cross-references. Key challenges in constructing the VTO included (1) extracting and merging names, synonyms, and identifiers from heterogeneous sources; (2) replacing subgroups with more authoritative local taxonomies; and (3) automating this process as much as possible to accommodate updates in source taxonomies.
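
The first challenge listed above, merging names, synonyms, and identifiers from heterogeneous sources, can be illustrated with a minimal sketch. This is not the VTO build pipeline: the `Taxon` structure, the `merge_sources` function, the source labels `SRC_A` and `SRC_B`, and the record identifiers are all hypothetical, and matching records on lower-cased name plus rank is an assumed simplification of whatever matching the actual process uses.

```python
from dataclasses import dataclass, field


@dataclass
class Taxon:
    """One merged taxonomic term (hypothetical structure, not the VTO schema)."""
    name: str
    rank: str
    synonyms: set = field(default_factory=set)
    xrefs: set = field(default_factory=set)  # cross-references to source identifiers


def merge_sources(sources):
    """Merge records from several sources, keyed by lower-cased name and rank.

    `sources` maps a source label to a list of record dicts with the keys
    "id", "name", "rank", and optionally "synonyms" (assumed input layout).
    """
    merged = {}
    for source_name, records in sources.items():
        for rec in records:
            key = (rec["name"].lower(), rec["rank"])
            taxon = merged.get(key)
            if taxon is None:
                # First time this name/rank pair is seen: create the term.
                taxon = Taxon(name=rec["name"], rank=rec["rank"])
                merged[key] = taxon
            # Pool synonyms and record where each matching record came from.
            taxon.synonyms.update(rec.get("synonyms", []))
            taxon.xrefs.add(f"{source_name}:{rec['id']}")
    return merged


# Illustrative records only; the source labels and identifiers are made up.
sources = {
    "SRC_A": [{"id": "1", "name": "Danio rerio", "rank": "species",
               "synonyms": ["Brachydanio rerio"]}],
    "SRC_B": [{"id": "x9", "name": "Danio rerio", "rank": "species",
               "synonyms": ["Cyprinus rerio"]}],
}
merged = merge_sources(sources)
zebrafish = merged[("danio rerio", "species")]
```

In this sketch the two source records collapse into a single term that carries both synonyms and one cross-reference per source. A real merge would also need to handle homonyms, conflicting ranks, and names that differ between sources, none of which is attempted here.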

Conclusions: The VTO is the primary source of taxonomic information used by the Phenoscape Knowledgebase (http://phenoscape.org/), which integrates genetic and evolutionary phenotype data across both model and nonmodel vertebrates. The VTO is useful for crudely inferring phenotypic changes on the vertebrate tree of life, which enables queries for candidate genes for different episodes in vertebrate evolution.