Computational reproducibility refers to the possibility of reconstructing every step of the workflow that connects raw data, processed data and results. It is a fundamental issue in omics studies because of the complex, high-dimensional nature of the data involved. The analysis of omics data relies on multi-step workflows that include pre-processing, processing, statistical validation, interpretation and presentation. Although some analysis platforms can ensure computational reproducibility for different omics studies, they do not provide explicit information about the executed code. Making the code available improves the quality of research in terms of transparency and knowledge transfer. Moreover, it allows other researchers to reproduce the results on a local system, compare results and reuse the code to analyze different datasets.
Geena 2 is a robust web tool for pre-processing MALDI-ToF mass spectra. Its main output is the list of common peaks identified by aligning average spectra derived from groups of replicates of different samples. Intermediate results are also made available. GeenaR is an extension of Geena 2 that is still under development. Its objective is to integrate into the platform R libraries that can provide advanced statistical analyses, thus enriching the current output. Notably, many R packages follow the reproducible research philosophy. For the aims of GeenaR, the following R packages and tools have been considered: R-Markdown, knitr and spin. Implementing these resources on an existing web platform can add value to its reporting features, since it improves the creation of a report on the work carried out, especially with reference to the code.
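The idea of common peaks obtained from averaged replicate spectra can be illustrated with a minimal, language-neutral sketch in Python. It is not the Geena 2 algorithm: the function names, the local-maximum peak detection, the intensity threshold and the m/z tolerance are all assumptions made for illustration, and the spectra are toy values on a shared m/z grid.

```python
from statistics import mean


def average_spectrum(replicates):
    """Average intensities point-wise across replicates sharing one m/z grid."""
    return [mean(values) for values in zip(*replicates)]


def detect_peaks(mz, intensity, min_intensity):
    """Return the m/z values of local maxima at or above min_intensity (assumed rule)."""
    peaks = []
    for i in range(1, len(intensity) - 1):
        is_local_max = intensity[i - 1] < intensity[i] >= intensity[i + 1]
        if is_local_max and intensity[i] >= min_intensity:
            peaks.append(mz[i])
    return peaks


def common_peaks(peak_lists, tolerance):
    """Keep peaks of the first sample matched within tolerance in every other sample.

    Each common peak is reported as the mean m/z of its matched peaks.
    """
    common = []
    for peak in peak_lists[0]:
        matched = [peak]
        for other in peak_lists[1:]:
            near = [q for q in other if abs(q - peak) <= tolerance]
            if not near:
                break  # this peak is missing in one sample, so it is not common
            matched.append(min(near, key=lambda q: abs(q - peak)))
        else:
            common.append(mean(matched))
    return common


# Toy data (hypothetical): two samples, two replicates each, on one m/z grid.
mz = [1000 + 0.5 * i for i in range(9)]
sample_a = [[1, 2, 9, 2, 1, 1, 7, 1, 1], [1, 2, 11, 2, 1, 1, 9, 1, 1]]
sample_b = [[1, 1, 2, 8, 2, 1, 1, 1, 1], [1, 1, 2, 10, 2, 1, 1, 1, 1]]

peaks_a = detect_peaks(mz, average_spectrum(sample_a), min_intensity=5)
peaks_b = detect_peaks(mz, average_spectrum(sample_b), min_intensity=5)
common = common_peaks([peaks_a, peaks_b], tolerance=0.5)
print(common)  # [1001.25]
```

In this toy case the peak near m/z 1001 appears in both samples and is kept, while the peak at m/z 1003 appears only in the first sample and is dropped.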
Results and Discussion
One of the aims of both Geena 2 and GeenaR is to help users analyze MALDI-ToF mass spectra by providing a web interface that allows them to upload data, select algorithms and parameters, and execute the analysis to obtain results that meet a specific need. Thanks to the novel reproducible research module implemented in GeenaR, the system generates a report containing all the steps performed. In more detail, the report will provide: the date and time of execution, the R libraries used in the process, chunks of code for the main processing steps, the selected parameters (chosen either by the user or by the system), the uploaded data as MALDIquant 'MassSpectrum' objects, numerical and graphical results, a short explanation of the workflow, and the versions of the system and of the packages. GeenaR delivers the results as a compressed archive containing separate log and graphical results, together with a report in both R-Markdown and HTML format. It should be emphasized that reproducible research is not optional but a fundamental component of good computational practice, and that it becomes essential in computational biology.
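The report contents listed above can be pictured with a short sketch that assembles an R-Markdown-style text from a run record. This is only an illustration in Python, not the GeenaR implementation: `build_report`, the parameter names, the package version string and the R line inside the chunk are hypothetical placeholders.

```python
from datetime import datetime, timezone

# Built at runtime so that this listing contains no literal code fence.
FENCE = "`" * 3


def build_report(title, run_time, parameters, packages, chunks):
    """Assemble an R-Markdown-style report as a single string.

    parameters and packages are dicts; chunks is a list of (heading, R code) pairs.
    """
    lines = ["---", f'title: "{title}"', f'date: "{run_time.isoformat()}"', "---", ""]
    lines += ["## Parameters", ""]
    lines += [f"- {name}: {value}" for name, value in sorted(parameters.items())]
    lines += ["", "## Packages", ""]
    lines += [f"- {name} {version}" for name, version in sorted(packages.items())]
    for heading, code in chunks:
        # Each processing step is recorded with the code that produced it.
        lines += ["", f"## {heading}", "", FENCE + "{r}", code, FENCE]
    return "\n".join(lines) + "\n"


# Hypothetical run record; every value here is a placeholder.
run_time = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)
report = build_report(
    title="Example run",
    run_time=run_time,
    parameters={"tolerance": 0.5, "min_intensity": 5},
    packages={"MALDIquant": "0.0.0"},  # placeholder version string
    chunks=[("Peak detection", "peaks <- detectPeaks(avg_spectra)")],
)
print(report)
```

A file of this shape could then be rendered to HTML with R tooling such as rmarkdown or knitr, which is consistent with the two report formats mentioned above.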