The democratization of data science education

Abstract
Over the last three decades, data has become ubiquitous and cheap. This transition has accelerated over the last five years, and training in statistics, machine learning, and data analysis has struggled to keep up. In April 2014 we launched a program of nine courses, the Johns Hopkins Data Science Specialization, which has had more than 4 million enrollments over the past three years. Here we describe the program and compare it to both standard and more recently developed data science curricula. We show that novel pedagogical and administrative decisions introduced in our program are now standard in online data science programs. We also discuss the impact of the Data Science Specialization on data science education in the US. We conclude with some thoughts about the future of data science education in a data-democratized world.
Cite this as
2017. The democratization of data science education. PeerJ Preprints 5:e3195v1 https://doi.org/10.7287/peerj.preprints.3195v1
Author comment
The first version of our submission to The American Statistician.
Additional Information
Competing Interests
There are no competing interests.
Author Contributions
Sean Kross conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.
Roger D Peng conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools.
Brian S Caffo conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, reviewed drafts of the paper.
Ira Gooding conceived and designed the experiments, performed the experiments, contributed reagents/materials/analysis tools.
Jeffrey T Leek conceived and designed the experiments, performed the experiments, contributed reagents/materials/analysis tools, wrote the paper, reviewed drafts of the paper.
Data Deposition
The following information was supplied regarding data availability:
The research in this article did not generate any data or code. In this article we describe our experiences creating an online academic program.
Funding
This work was supported by NIH R01 GM115440. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.