Biotea, semantics for Pubmed Central

Ontology Engineering Group, Universidad Politécnica de Madrid, Madrid, Spain
Escuela de Ingeniería de Sistemas y Computación, Universidad del Valle, Cali, Colombia
Temporal Knowledge Bases Group, Department of Computer Languages and Systems, Universitat Jaume I, Castelló de la Plana, Spain
Maastricht University, Institute of Data Science, Maastricht, The Netherlands
DOI
10.7287/peerj.preprints.3469v1
Subject Areas
Bioinformatics, Data Mining and Machine Learning, Data Science
Keywords
semantic web, ontology, linked data, RDF, SPARQL, semantic
Copyright
© 2017 Garcia et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Garcia A, Lopez F, Garcia L, Giraldo O, Bucheli V, Dumontier M. 2017. Biotea, semantics for Pubmed Central. PeerJ Preprints 5:e3469v1

Abstract

A significant portion of the biomedical literature is represented in a manner that makes it difficult for consumers to find or aggregate content through computational queries. One approach to facilitating reuse of the scientific literature is to structure this information as linked data using standardized web technologies. In this paper we present the second version of Biotea, a semantic, linked data version of the open-access subset of PubMed Central, enhanced with specialized annotation pipelines that use existing infrastructure from the National Center for Biomedical Ontology. We expose our models, services, software and datasets. Our infrastructure enables manual and semi-automatic annotation; the resulting data are represented as RDF-based linked data and can be readily queried using the SPARQL query language. We illustrate the utility of our system with several use cases. Availability: Our datasets, methods and techniques are available at http://biotea.github.io
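To make the idea of querying annotated articles as linked data concrete, the following is a minimal sketch in plain Python, not Biotea's actual model or tooling. It stores a few triples in memory and joins triple patterns in the way a SPARQL basic graph pattern does. Every identifier with the `ex:` prefix, the sample titles, and the helper names `match`, `query` and `articles_about` are hypothetical and chosen only for illustration.

```python
# Toy illustration only: the "ex:" identifiers and titles below are
# hypothetical and are NOT taken from the Biotea data model.
TRIPLES = [
    ("ex:article1", "ex:title", "Article one"),
    ("ex:article1", "ex:hasAnnotation", "ex:annotation1"),
    ("ex:annotation1", "ex:refersTo", "ex:conceptA"),
    ("ex:article2", "ex:title", "Article two"),
    ("ex:article2", "ex:hasAnnotation", "ex:annotation2"),
    ("ex:annotation2", "ex:refersTo", "ex:conceptB"),
]


def match(pattern, bindings):
    """Yield extended bindings for one triple pattern; '?x' terms are variables."""
    for triple in TRIPLES:
        new = dict(bindings)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it agrees with an earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield new


def query(patterns):
    """Join triple patterns left to right, like a SPARQL basic graph pattern."""
    results = [{}]
    for pattern in patterns:
        results = [b for bound in results for b in match(pattern, bound)]
    return results


def articles_about(concept):
    """Return sorted titles of articles with an annotation pointing at `concept`."""
    rows = query([
        ("?article", "ex:hasAnnotation", "?ann"),
        ("?ann", "ex:refersTo", concept),
        ("?article", "ex:title", "?title"),
    ])
    return sorted(row["?title"] for row in rows)
```

Calling `articles_about("ex:conceptA")` walks from an ontology concept through an annotation to the article that carries it, which is the kind of aggregation a SPARQL endpoint over annotated literature is meant to support. A real deployment would use an RDF store and the published vocabularies rather than this in-memory list.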

Author Comment

This is a submission to PeerJ for review.