tmod: an R package for general and multivariate enrichment analysis

January Weiner 3rd; Teresa Domaszewska

doi:10.7287/peerj.preprints.2420v1

Department of Immunology, Max Planck Institute for Infection Biology, Berlin, Germany

Published: 2016-09-04
Accepted: 2016-09-04

Subject Areas: Bioinformatics, Immunology, Statistics, Metabolic Sciences, Computational Science
Keywords: gene set enrichment analysis, Fisher's method, gsea, metabolomics, systems biology, functional multivariate analysis, visualization

Copyright: © 2016 Weiner 3rd et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.

Cite this article: Weiner 3rd J, Domaszewska T. 2016. tmod: an R package for general and multivariate enrichment analysis. PeerJ Preprints 4:e2420v1 https://doi.org/10.7287/peerj.preprints.2420v1

Abstract

“Omics” studies generate long lists of genes, proteins, metabolites or other features that can be difficult to interpret. Feature set enrichment analysis, which uses annotated groups or classes of features (such as pathways, gene ontology terms or gene/metabolic modules), can provide a powerful gateway for associating data with phenotypes such as disease process or treatment progression. At the same time, the increasing use of technologies that generate multidimensional omics data sets based on specific cell types or responses to stimuli increases the number and breadth of annotated feature sets available for enrichment analysis, making it easier to draw biologically relevant conclusions. However, existing tools and applications for enrichment analysis are adapted specifically to gene set enrichment and lack the functionality needed to analyze rapidly growing amounts of metabolomics and other data. Moreover, such tools often provide only a limited range of statistical methods, rely on permutation tests, lack suitable visualization tools to facilitate result interpretation in complex experimental setups, and lack standalone versions usable in semi-automated workflows. Here, we present tmod, an R package which implements powerful statistical methods for enrichment analysis. tmod includes definitions of widely used feature sets for transcriptomic and metabolomic profiling and also allows the use of custom user-provided feature sets. Moreover, it provides novel and intuitive visualization methods which facilitate interpretation of complex data sets. The implemented statistical tests allow the significance of enrichment within sorted feature lists to be calculated without randomization tests and thus are suitable for combining functional analysis with multivariate techniques.
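
To make concrete how enrichment in a sorted feature list can be tested without randomization, the following is a minimal sketch of a Fisher-style rank combination (Fisher's method is listed among the keywords). It is an illustration under stated assumptions, not tmod's implementation or API: the function name `rank_enrichment_pvalue` and the toy feature names are hypothetical, and the code is written in Python with the standard library only. The assumption is that, under the null hypothesis, the scaled rank of each set member is approximately uniform on (0, 1], so that minus twice the sum of the log scaled ranks approximately follows a chi-square distribution with 2k degrees of freedom for a set of k features.

```python
import math


def rank_enrichment_pvalue(ranked_features, feature_set):
    """Approximate p-value for enrichment of feature_set near the top of
    ranked_features (hypothetical helper, not part of tmod).

    Each set member contributes its scaled rank r/N, which is treated as an
    approximately uniform p-value under the null and combined with
    Fisher's method.
    """
    n = len(ranked_features)
    ranks = [i + 1 for i, f in enumerate(ranked_features) if f in feature_set]
    k = len(ranks)
    if k == 0:
        return 1.0  # no set member present in the list: no evidence

    # Fisher's combined statistic: -2 * sum(ln(r_i / N))
    statistic = -2.0 * sum(math.log(r / n) for r in ranks)

    # Chi-square survival function with 2k degrees of freedom has a closed
    # form for even df: exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return math.exp(-half) * total


# Toy usage: 100 hypothetical features already sorted by some score
features = [f"f{i}" for i in range(1, 101)]
p_top = rank_enrichment_pvalue(features, {"f1", "f2", "f3"})
p_bottom = rank_enrichment_pvalue(features, {"f98", "f99", "f100"})
print(p_top < p_bottom)
```

Because the p-value comes from a closed-form distribution rather than from permutations, the same calculation can be repeated cheaply for many orderings of the same features, which is the property that makes this kind of test convenient to pair with multivariate analyses.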

Author Comment

tmod combines state-of-the-art approaches to variable set enrichment tests with novel visualizations and analytical statistical solutions. Furthermore, tmod integrates seamlessly with limma as well as with multivariate techniques, and includes gene sets and metabolite sets for rapid analysis of transcriptomic and metabolomic data.

Supplemental Information

tmod source package (R language)

The R source package containing tmod sources as well as the associated data sets.

DOI: 10.7287/peerj.preprints.2420v1/supp-1


tmod package vignette

The vignette of the tmod package containing usage examples and tutorials.

DOI: 10.7287/peerj.preprints.2420v1/supp-2
