GenomeHubs: Simple containerised setup of a custom Ensembl database and web server for any species
- Subject Areas: Bioinformatics, Genomics
- Keywords: Genome databasing, Ensembl, Genome browser, Pipeline, Docker, Containerisation
- Copyright: © 2017 Challis et al.
- Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article: 2017. GenomeHubs: Simple containerised setup of a custom Ensembl database and web server for any species. PeerJ Preprints 5:e2401v2 https://doi.org/10.7287/peerj.preprints.2401v2
Abstract
As the generation and use of genomic datasets is becoming increasingly common in all areas of biology, the need for resources to collate, analyse and present data from one or more genome projects is becoming more pressing. The Ensembl platform is a powerful tool to make genome data and cross-species analyses easily accessible through a web interface and a comprehensive API. Here we introduce GenomeHubs, which provide a containerised environment to facilitate the setup and hosting of custom Ensembl genome browsers. This simplifies mirroring of existing content and import of new genomic data into the Ensembl database schema. GenomeHubs also provide a set of analysis containers to decorate imported genomes with results of standard analyses and functional annotations and support export to flat files, including EMBL format for submission of assemblies and annotations to INSDC.
Database URL: http://GenomeHubs.org
Author Comment
The focus of this article has changed to reflect the inclusion of the EasyMirror and EasyImport pipelines in GenomeHubs, which uses Docker to provide a set of tools for setting up a custom, Ensembl-based resource for hosting genomic data.
Supplemental Information
Supplementary File 1
Example GenomeHubs commands and configuration files to mirror and import into an Ensembl site
Supplementary File 2
The entire script to extract CDS and domain counts from the Ensembl Perl API (Figure 2; Table 1)
