Interpreting and integrating big data in the life sciences

University of California, Los Angeles, Los Angeles, CA, United States
DOI
10.7287/peerj.preprints.27603v1
Subject Areas
Computational Biology, Genomics, Science and Medical Education, Computational Science
Keywords
omics, NGS, big data, computational algorithms, command line interface
Copyright
© 2019 Mangul
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Mangul S. 2019. Interpreting and integrating big data in the life sciences. PeerJ Preprints 7:e27603v1

Abstract

Recent advances in omics technologies have led to the broad applicability of computational techniques across various domains of life science and medical research. These technologies provide an unprecedented opportunity to collect omics data from hundreds of thousands of individuals and to study gene-disease associations without relying on prior assumptions about the trait biology. Despite the many advantages of modern omics technologies, interpreting the big data they produce requires advanced computational algorithms. Below I outline key challenges that biomedical researchers face when interpreting and integrating big omics data. I discuss the reproducibility of big data analysis in the life sciences and review current practices in reproducible research. Finally, I explain the skills that biomedical researchers need to acquire in order to analyze big omics data independently.
