Beyond p-values in the evaluation of brain-computer interfaces
- Subject Areas
- Neuroscience, Statistics, Human-Computer Interaction
- Keywords
- brain-computer interface (BCI), Bayesian inference, classification accuracy, p-values, generalized linear model (GLM), hierarchical models, Bayesian estimation, null hypothesis significance testing (NHST)
- Copyright
- © 2016 Melinscak et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2016. Beyond p-values in the evaluation of brain-computer interfaces. PeerJ Preprints 4:e1828v1 https://doi.org/10.7287/peerj.preprints.1828v1
Abstract
To statistically evaluate the performance of brain-computer interfaces (BCIs), researchers usually rely on null hypothesis significance testing (NHST), i.e., p-values. However, over-reliance on NHST is often identified as one of the causes of the recent reproducibility crisis in psychology and neuroscience. In this paper we propose Bayesian estimation as an alternative to NHST in the analysis of BCI performance data. For the three most common experimental designs in BCI research (those that would usually be analyzed using a t-test, a linear regression, or an ANOVA), we develop hierarchical models and estimate their parameters using Bayesian inference. Furthermore, we show that the described models are special cases of the hierarchical generalized linear model (HGLM), which we propose as a general framework for the analysis of BCI performance. The HGLM framework allows the analysis of complex experimental designs with multiple levels of hierarchy (e.g., multiple sessions, multiple subjects, multiple groups) and can accommodate different types of non-normal data (e.g., classification accuracy), which are often analyzed under inappropriate assumptions with NHST. We demonstrate the effectiveness of the proposed models on three real datasets and show how the results obtained with Bayesian estimation can give a more nuanced insight into BCI performance data than NHST. We therefore believe that a wider adoption of the Bayesian estimation approach in BCI studies could bring about greater transparency in data analysis, allow accumulation of knowledge across studies, and reduce questionable practices such as "p-hacking". To support this goal, we provide all the data and code necessary to reproduce the presented results, allowing BCI researchers to use Bayesian estimation in their own work.
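As a rough illustration of the general idea, and not the hierarchical models or code from the paper, the sketch below estimates a single subject's classification accuracy with a Beta-Binomial model on a grid. The data (70 correct out of 100 trials), the uniform Beta(1, 1) prior, the chance level of 0.5, and all function names are assumptions made for this example. Instead of a p-value, it reports a posterior mean, a 95% credible interval, and the posterior probability that accuracy exceeds chance.

```python
import math


def accuracy_posterior(k, n, a=1.0, b=1.0, grid_size=2000):
    """Grid approximation of the posterior over accuracy theta.

    Assumes k correct out of n independent trials (binomial likelihood)
    and a Beta(a, b) prior. Returns grid points and normalized weights.
    """
    # Midpoint grid on (0, 1); avoids log(0) at the endpoints.
    thetas = [(i + 0.5) / grid_size for i in range(grid_size)]
    # Unnormalized log posterior: Beta(k + a, n - k + b) kernel.
    log_post = [
        (k + a - 1) * math.log(t) + (n - k + b - 1) * math.log(1 - t)
        for t in thetas
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return thetas, [w / total for w in weights]


def summarize(thetas, probs, chance=0.5, mass=0.95):
    """Posterior mean, P(theta > chance), and equal-tailed credible interval."""
    mean = sum(t * p for t, p in zip(thetas, probs))
    p_above = sum(p for t, p in zip(thetas, probs) if t > chance)
    tail = (1.0 - mass) / 2.0
    cumulative = 0.0
    lower = upper = None
    for t, p in zip(thetas, probs):
        cumulative += p
        if lower is None and cumulative >= tail:
            lower = t
        if upper is None and cumulative >= 1.0 - tail:
            upper = t
            break
    return mean, p_above, (lower, upper)


# Hypothetical data: 70 correct classifications out of 100 trials.
thetas, probs = accuracy_posterior(k=70, n=100)
mean, p_above, interval = summarize(thetas, probs)
print(f"posterior mean accuracy: {mean:.3f}")
print(f"95% credible interval: ({interval[0]:.3f}, {interval[1]:.3f})")
print(f"P(accuracy > chance): {p_above:.4f}")
```

A hierarchical version would additionally place a shared group-level distribution over the per-subject accuracies, which typically requires a sampler rather than a one-dimensional grid.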
Author Comment
This is a preprint submission to PeerJ Preprints.