Jupyter notebooks in science gateways

San Diego Supercomputer Center, University of California, San Diego, San Diego, California, United States
DOI
10.7287/peerj.preprints.2577v2
Subject Areas
Data Science, Scientific Computing and Simulation
Keywords
High Performance Computing, Scientific Gateways, Jupyter Notebooks
Copyright
© 2016 Zonca
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Zonca A. 2016. Jupyter notebooks in science gateways. PeerJ Preprints 4:e2577v2

Abstract

Jupyter Notebooks let scientists create executable documents that combine text, equations, code and figures. Notebooks are a simple way to create reproducible and shareable workflows. The Jupyter developers have also released a multi-user notebook environment: JupyterHub. JupyterHub provides an extensible platform for handling user authentication and spawning the Notebook application for each user. I developed a plugin for JupyterHub to spawn notebooks on a Supercomputer and integrated the authentication with CILogon and XSEDE. Scientists can authenticate in their browser and connect to a Jupyter Notebook instance running on a compute node of a Supercomputer, in my test deployment SDSC Comet. JupyterHub can benefit Science Gateways by providing an expressive interface to a centralized environment with many software tools pre-installed and by allowing scientists to access Gateway functionality via a web API. Scientists can then define their own workflows with maximum flexibility: they can mix data processing in a programming language of their choice, e.g. Python, R or Julia, add their own software modules and call the Gateway web API for the more resource-intensive operations. Workflows written as notebooks and executed in such a standard environment improve reproducibility with minimal effort from the scientists. Such notebooks can be attached to publications to ensure reproducibility of past research, and they can also be modified and improved by other interested research groups. In this talk I will introduce the functionality of JupyterHub, give an overview of the architecture of the interface with XSEDE authentication and with Comet, and finally present different scenarios where Jupyter Notebooks can be integrated into Science Gateways.
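
To make the idea of spawning a notebook on a Supercomputer more concrete, the following is a minimal sketch of one piece such a plugin might need: rendering a batch job that starts a single-user notebook server on a compute node. It is not the plugin described in the abstract. The SLURM-style `#SBATCH` directives, the function name `render_batch_script`, and the default partition, walltime and port are all assumptions made for illustration; only the `jupyterhub-singleuser` command comes from JupyterHub itself.

```python
from string import Template

# Hypothetical batch script template. The scheduler directives assume a
# SLURM-style scheduler, which the abstract does not name.
BATCH_TEMPLATE = Template("""#!/bin/bash
#SBATCH --job-name=jupyter-$username
#SBATCH --partition=$partition
#SBATCH --nodes=1
#SBATCH --time=$walltime

# Start the single-user notebook server on the allocated compute node.
jupyterhub-singleuser --port=$port
""")


def render_batch_script(username, partition="compute",
                        walltime="01:00:00", port=8888):
    """Return the text of a batch job that launches one user's notebook server.

    All parameter names and defaults are illustrative assumptions.
    """
    # Reject anything that is not a plain alphanumeric name, so that the
    # username cannot inject extra shell or scheduler syntax.
    if not username.isalnum():
        raise ValueError("username must be alphanumeric")
    return BATCH_TEMPLATE.substitute(
        username=username, partition=partition,
        walltime=walltime, port=port,
    )


# Example: the job that would be submitted for a hypothetical user "alice".
script = render_batch_script("alice")
print(script)
```

In a real deployment the rendered script would be handed to the scheduler's submission command, and the hub would then need to learn which node and port the job landed on so it can route the user's browser there; both steps are outside this sketch.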

Author Comment

This is a preprint submission to PeerJ Preprints (Proceedings of the 8th International Workshop on Science Gateways (IWSG 2016), 8-10 June 2016). The first upload had a malformed PDF manuscript.