Interpreting and integrating big data in the life sciences

Serghei Mangul

doi:10.7287/peerj.preprints.27603v1

University of California, Los Angeles, Los Angeles, CA, United States

Published: 2019-03-19
Accepted: 2019-03-19

Subject Areas: Computational Biology, Genomics, Science and Medical Education, Computational Science
Keywords: omics, NGS, big data, computational algorithms, command line interface

Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.

Cite this article: Mangul S. 2019. Interpreting and integrating big data in the life sciences. PeerJ Preprints 7:e27603v1 https://doi.org/10.7287/peerj.preprints.27603v1

Abstract

Recent advances in omics technologies have made computational techniques broadly applicable across many domains of life science and medical research. These technologies provide an unprecedented opportunity to collect omics data from hundreds of thousands of individuals and to study gene-disease associations without prior assumptions about the biology of the trait. Despite the many advantages of modern omics technologies, interpreting the big data they produce requires advanced computational algorithms. Below I outline the key challenges that biomedical researchers face when interpreting and integrating big omics data. I discuss the reproducibility of big data analysis in the life sciences and review current practices in reproducible research. Finally, I explain the skills that biomedical researchers need to acquire in order to analyze big omics data independently.
