Raincloud plots: a multi-platform tool for robust data visualization

Department of Psychiatry, University of Cambridge, Cambridge, United Kingdom
Department of Mathematics, University of Padua, Padova, Italy
Padova Neuroscience Center, University of Padua, Padova, Italy
Alan Turing Institute, London, United Kingdom
Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom
Department of Psychology, University of Cambridge, Cambridge, United Kingdom
Max-Planck Centre for Computational Psychiatry and Aging, University College London, University of London, London, United Kingdom
DOI
10.7287/peerj.preprints.27137v1
Subject Areas
Data Science, Graphics
Keywords
raincloud plots, robust visualization, data science, violin plots, barbarplots, Matlab, Python, RStudio
Copyright
© 2018 Allen et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Allen M, Poggiali D, Whitaker K, Marshall TR, Kievit R. 2018. Raincloud plots: a multi-platform tool for robust data visualization. PeerJ Preprints 6:e27137v1

Abstract

Across scientific disciplines, there is a rapidly growing recognition of the need for more statistically robust, transparent approaches to data visualization. Complementary to this, many scientists have realized the need for plotting tools that accurately and transparently convey key aspects of statistical effects and raw data with minimal distortion. Previously common approaches, such as plotting conditional mean or median barplots together with error bars, have been criticized for distorting effect size, hiding underlying patterns in the raw data, and obscuring the assumptions upon which the most commonly used statistical tests are based. Here we describe a data visualization approach that overcomes these issues, providing maximal statistical information while preserving the desired ‘inference at a glance’ nature of barplots and other similar visualization devices. These “raincloud plots” can visualize raw data, probability density, and key summary statistics such as median, mean, and relevant confidence intervals in an appealing and flexible format with minimal redundancy. In this tutorial paper we provide basic demonstrations of the strength of raincloud plots and similar approaches, outline potential modifications for their optimal use, and provide open-source code for their streamlined implementation in R, Python and Matlab (https://github.com/RainCloudPlots/RainCloudPlots). Readers can investigate the R and Python tutorials interactively in the browser using Binder by Project Jupyter.
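
To make the three layers named above concrete (probability density, summary statistics, raw data), the following is a minimal Python sketch built only on NumPy and Matplotlib. It is not the code from the authors' repository: the function name raincloud and its parameters groups, labels and seed are hypothetical, chosen for illustration, and the half-violin is obtained by clipping an ordinary Matplotlib violin rather than by any method described in the paper.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def raincloud(ax, groups, labels, seed=0):
    """Illustrative raincloud: half-violin, narrow boxplot, jittered raw points.

    Hypothetical helper for this sketch; not part of the RainCloudPlots repository.
    """
    rng = np.random.default_rng(seed)
    positions = np.arange(len(groups))

    # "Cloud": kernel density estimate, clipped to the right-hand half.
    violins = ax.violinplot(groups, positions=positions,
                            showextrema=False, widths=0.8)
    for pos, body in zip(positions, violins["bodies"]):
        verts = body.get_paths()[0].vertices
        verts[:, 0] = np.clip(verts[:, 0], pos, None)  # keep only x >= pos
        body.set_alpha(0.6)

    # Summary statistics: median, quartiles and whiskers in a narrow boxplot.
    ax.boxplot(groups, positions=positions, widths=0.1,
               showfliers=False, manage_ticks=False)

    # "Rain": every raw observation, jittered to the left of the baseline.
    for pos, values in zip(positions, groups):
        jitter = rng.uniform(-0.35, -0.10, size=len(values))
        ax.scatter(pos + jitter, values, s=8, alpha=0.6)

    ax.set_xticks(positions)
    ax.set_xticklabels(labels)
    return ax


if __name__ == "__main__":
    # Simulated data, used only to exercise the sketch.
    rng = np.random.default_rng(42)
    data = [rng.normal(0.0, 1.0, 80), rng.normal(1.0, 1.5, 80)]
    fig, ax = plt.subplots(figsize=(5, 4))
    raincloud(ax, data, ["Group A", "Group B"])
    fig.savefig("raincloud_sketch.png", dpi=150)
```

Placing the density on one side and the jittered points on the other is what keeps the layers from overlapping, so the distribution shape and the individual observations can be read separately. The boxplot could be swapped for a mean with confidence interval without changing the rest of the sketch.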

Author Comment

In this preprint, we provide code and tutorials for making your own "Raincloud Plots" across a variety of popular software platforms (R, Python, and Matlab). This article details the overall reasoning and motivation for the use of raincloud plots, modular functions for their creation, and interactive web-based code tutorials (see the GitHub link below). Additionally, we invite readers to contribute their own additions and variations to our GitHub repository.

Code Repository:

https://github.com/RainCloudPlots/RainCloudPlots

Guide to contributing:

https://github.com/RainCloudPlots/RainCloudPlots/blob/master/CONTRIBUTING.md