TED toolkit: a comprehensive approach for convenient transcriptomic profiling as a clinically-oriented application

Thahmina Ali; Baekdoo Kim; Carlos Lijeron; Olorunseun O Ogunwobi; Raja Mazumder; Konstantinos Krampis

doi:10.7287/peerj.preprints.3385v1

Thahmina Ali¹, Baekdoo Kim¹, Carlos Lijeron¹, Olorunseun O Ogunwobi¹,²,³, Raja Mazumder⁴,⁵, Konstantinos Krampis¹,²,⁶

1 Hunter College of The City University of New York, Weill Cornell Medicine - Belfer Research Building, New York, NY, USA

2 Department of Biological Sciences, City University of New York, Hunter College, New York, New York, United States

3 Joan and Sanford I. Weill Department of Medicine, Weill Medical College of Cornell University, New York, New York, United States

4 The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, George Washington University, Washington, DC, United States

5 The McCormick Genomic and Proteomic Center, George Washington University, Washington, DC, United States

6 Department of Physiology and Biophysics, Institute for Computational Biomedicine, Weill Medical College of Cornell University, New York, New York, United States

Published: 2017-11-01
Accepted: 2017-11-01

Subject Areas: Bioinformatics, Genomics, Translational Medicine, Computational Science, Data Science
Keywords: Transcriptome, Bioinformatics, Galaxy, RNA-sequencing, Workflow, Data Analysis

Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.

Cite this article: Ali T, Kim B, Lijeron C, Ogunwobi OO, Mazumder R, Krampis K. 2017. TED toolkit: a comprehensive approach for convenient transcriptomic profiling as a clinically-oriented application. PeerJ Preprints 5:e3385v1 https://doi.org/10.7287/peerj.preprints.3385v1

Abstract

In translational medicine, RNA sequencing (RNA-seq) continues to prove powerful, and transforming RNA-seq data into biological insights has become increasingly important. We present the Transcriptomics profiler for Easy Discovery (TED) toolkit, a comprehensive approach to processing and analyzing RNA-seq data. TED is divided into three major modules (data quality control, transcriptome data analysis, and data discovery) comprising eleven pipelines in total. These pipelines cover the preliminary steps of assessing and correcting the quality of the RNA-seq data, the simultaneous analysis of five transcriptomic features (differentially expressed coding, non-coding, novel isoform genes, gene fusions, alternative splicing events, and genetic variants of somatic and germline mutations), and ultimately the translation of the RNA-seq analysis findings into actionable, clinically relevant reports. We evaluated TED on previously published prostate cancer transcriptome data, where we observed previously reported outcomes and also created a knowledge database of highly integrated, biologically relevant reports, demonstrating that the toolkit is well positioned for clinical applications. TED is implemented on an instance of the Galaxy platform (Galaxy page: http://galaxy.hunter.cuny.edu/u/bioitcore/p/transcriptomics-profiler-for-easy-discovery-ted-toolkit ; documentation manual: http://ted.readthedocs.io/en/latest/index.html ) as intuitive and reproducible pipelines, giving bioinformatics researchers and clinicians alike a manageable strategy for conducting substantial transcriptome analysis in a routine and sustainable fashion.

Author Comment

This is a preprint submission to PeerJ Preprints.