Language workbench user interfaces for data analysis
- Subject Areas
- Bioinformatics, Human-Computer Interaction, Computational Science
- Keywords
- Biological Data Analysis, Bioinformatics Training, Biomarker Development, Language Workbench, Data Analysis Abstractions
- Copyright
- © 2014 Benson et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
- Cite this article
- 2014. Language workbench user interfaces for data analysis. PeerJ PrePrints 2:e511v1 https://doi.org/10.7287/peerj.preprints.511v1
Abstract
Biological data analysis is frequently performed with command line software. While this practice provides considerable flexibility for computationally savvy individuals, such as investigators trained in bioinformatics, it also creates a barrier to the widespread use of data analysis software by investigators trained as biologists and/or clinicians. Dataflow systems such as Galaxy and Taverna have been developed to provide generic user interfaces that can wrap command line analysis software. These solutions are useful for problems that can be solved with the dataflow abstraction and that do not require specialized user interfaces. Some analyses fall outside this scope: for instance, developing biomarker models from high-throughput data is a type of analysis that cannot be directly expressed with the dataflow model. In contrast, we show here that Language Workbench (LW) technology can be used to model the biomarker development and validation process. We developed a language that models the concepts of Dataset, Endpoint, Feature Selection Method and Classifier. These high-level language concepts map directly to abstractions that are familiar to analysts who develop biomarker models. We found that user interfaces developed in the Meta-Programming System (MPS) LW provide a convenient means to configure a biomarker development project, to train models and to view the validation statistics. We discuss several advantages of developing user interfaces for data analysis with an LW, including increased interface consistency, portability and extension by language composition. The language developed during this experiment is distributed as an MPS plugin (available at http://campagnelab.org/software/bdval-for-mps/).
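To make the four concepts named above more concrete, the following is a minimal illustrative sketch in Python, not the MPS language distributed with the plugin. All class names, field names, file paths and method labels here are hypothetical assumptions chosen for illustration; the sketch only shows how a project might combine one Dataset and one Endpoint with several Feature Selection Methods and Classifiers.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical names throughout: these classes only illustrate the four
# concepts named in the abstract; they are not the actual MPS language.


@dataclass
class Dataset:
    name: str
    path: str  # assumed location of the high-throughput data file


@dataclass
class Endpoint:
    name: str
    positive_label: str  # assumed label for the class of interest
    negative_label: str


@dataclass
class FeatureSelectionMethod:
    name: str
    num_features: int  # assumed: number of features kept by the method


@dataclass
class Classifier:
    name: str


@dataclass
class BiomarkerProject:
    dataset: Dataset
    endpoint: Endpoint
    feature_selections: List[FeatureSelectionMethod] = field(default_factory=list)
    classifiers: List[Classifier] = field(default_factory=list)

    def model_conditions(self) -> List[Tuple[str, str]]:
        """List every (feature selection, classifier) pair in the project."""
        return [
            (fs.name, clf.name)
            for fs in self.feature_selections
            for clf in self.classifiers
        ]


# Example project; every value below is made up for illustration.
project = BiomarkerProject(
    dataset=Dataset(name="training-set", path="data/training.tsv"),
    endpoint=Endpoint(name="relapse", positive_label="yes", negative_label="no"),
    feature_selections=[
        FeatureSelectionMethod(name="t-test", num_features=50),
        FeatureSelectionMethod(name="fold-change", num_features=50),
    ],
    classifiers=[Classifier(name="svm"), Classifier(name="random-forest")],
)

for feature_selection_name, classifier_name in project.model_conditions():
    print(feature_selection_name, classifier_name)
```

In this sketch, a project with two feature selection methods and two classifiers yields four combinations to train and validate. A language workbench would expose the same concepts through a projectional editor rather than through hand-written code.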
Author Comment
11 pages, 6 figures. Comment on this manuscript on Twitter or Google+ with this handle: #BDValForMPS. A revised version of this manuscript will be submitted to PeerJ for peer-review.