rentrez: An R package for the NCBI eUtils API

Institute of Fundamental Sciences, Massey University, Palmerston North, Manawatu, New Zealand
DOI
10.7287/peerj.preprints.3179v2
Subject Areas
Bioinformatics, Genetics, Genomics, Molecular Biology, Data Mining and Machine Learning
Keywords
Rstats, NCBI, Pubmed, reproducible research, genetics, genomics, data-mining
Copyright
© 2017 Winter
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Winter DJ. 2017. rentrez: An R package for the NCBI eUtils API. PeerJ Preprints 5:e3179v2

Abstract

The United States National Center for Biotechnology Information (NCBI) is one of the world’s most important sources of biological information. NCBI databases such as PubMed and GenBank contain millions of records describing bibliographic, genetic, genomic, and medical data. Here I present rentrez, a package that provides an R interface to 50 NCBI databases. The package is well documented, contains an extensive suite of unit tests and has an active user base. The programmatic interface to the NCBI provided by rentrez allows researchers to query databases and download or import particular records into R sessions for subsequent analysis. The completeness of the package, its extensive test suite and its implementation of the NCBI’s usage policies all make rentrez a powerful aid to developers of new packages that perform more specific tasks.

Author Comment

This version was submitted to The R Journal and fixes a number of typographical errors.