A workflow to integrate pre-processing, analysis and comparison of MALDI-ToF mass spectra in GeenaR
- Subject Areas
- Bioinformatics, Computational Science
- Keywords
- MALDI-TOF MS, Computational Proteomics, Advanced Statistics, Web application, LAMP & R
- Copyright
- © 2016 Del Prete et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2016. A workflow to integrate pre-processing, analysis and comparison of MALDI-ToF mass spectra in GeenaR. PeerJ Preprints 4:e2255v1 https://doi.org/10.7287/peerj.preprints.2255v1
Abstract
Many large-scale proteomics studies have been performed in recent years, and this field of investigation is expanding. While the analysis of a single spectrum can be performed with the tools supplied along with the mass spectrometry (MS) instrumentation, the comparison of spectra on a large scale remains a complex aspect of the analysis and interpretation of a study. We recently developed Geena 2, a tool for the automation of different steps in MALDI-ToF MS data analysis. Further tools can be integrated to improve several aspects of the whole workflow: the input of more data formats, the implementation of new algorithms for data cleaning, the graphical visualization and reporting of the results, and the use of advanced statistics for the comparison of mass spectra. For these reasons, we are now developing GeenaR, a new robust web tool for pre-processing, analysing, visualizing and comparing a set of MALDI-ToF mass spectra. The aim of this work is to present the on-going developments. The first results will be presented at the conference.
GeenaR is being written in PHP, Perl (from Geena 2) and R. The R packages used are MALDIquant and MALDIquantForeign for mass spectra pre-processing and analysis, OrgMassSpecR for mass spectra comparison, dendextend and pvclust for clustering, and sda and crossval for variable selection. The system is being implemented in a LAMP (Linux, Apache, MySQL, PHP) environment. Appropriate interfaces between PHP on one side and Perl and R on the other are implemented accordingly.
The aim of GeenaR is to provide users with a wider range of statistical methods and graphical results, without making the tool more difficult to use for researchers with little expertise in programming. To achieve this goal, we take advantage of several available R packages for the statistical analysis of mass spectrometry data, which are going to be integrated in the system.
The complete pipeline of GeenaR includes some features already available in Geena 2, plus others under development thanks to the integration of the R environment. Geena 2 already provides an original set of heuristic algorithms: the identification of isotopic peaks, taking into account the molecular weight of the signals and the related trend of abundances; normalization on the basis of a reference standard molecule; peak selection by means of a threshold line, built by linearly interpolating values provided for given m/z values; and alignment, by selecting the nearest peaks, within a limited m/z difference, in the different mass spectra. By means of some R packages, GeenaR adds new statistical methods that are highly relevant for mass spectra analysis.
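As an illustration only, three of the heuristics described above (normalization to a reference standard, peak selection against a linearly interpolated threshold line, and nearest-peak alignment) could be sketched as follows. This is a minimal Python sketch under stated assumptions, not the Geena 2 or GeenaR implementation, which is written in Perl, PHP and R. All function names, the tolerance parameters and the greedy one-to-one matching rule are hypothetical choices made for the example.

```python
from bisect import bisect_left

# Hypothetical helpers: a spectrum is a list of (m/z, intensity) tuples.

def normalize_to_reference(peaks, ref_mz, tol=0.5):
    """Scale intensities so that the peak closest to ref_mz (within tol) becomes 1.0."""
    candidates = [(abs(mz - ref_mz), inten) for mz, inten in peaks
                  if abs(mz - ref_mz) <= tol]
    if not candidates:
        raise ValueError("reference peak not found within tolerance")
    ref_intensity = min(candidates)[1]  # intensity of the nearest candidate
    return [(mz, inten / ref_intensity) for mz, inten in peaks]

def threshold_at(mz, anchors):
    """Threshold at mz, linearly interpolated between (m/z, threshold) anchors.

    Anchors must be sorted by strictly increasing m/z. Outside the anchor
    range the nearest anchor value is used (an assumption of this sketch).
    """
    xs = [x for x, _ in anchors]
    if mz <= xs[0]:
        return anchors[0][1]
    if mz >= xs[-1]:
        return anchors[-1][1]
    i = bisect_left(xs, mz)  # first anchor with m/z >= mz
    (x0, y0), (x1, y1) = anchors[i - 1], anchors[i]
    return y0 + (y1 - y0) * (mz - x0) / (x1 - x0)

def select_peaks(peaks, anchors):
    """Keep only peaks whose intensity reaches the threshold line."""
    return [(mz, inten) for mz, inten in peaks
            if inten >= threshold_at(mz, anchors)]

def align_nearest(mzs_a, mzs_b, max_diff):
    """Pair each m/z in mzs_a with the nearest unused m/z in mzs_b within max_diff."""
    pairs, used = [], set()
    for a in mzs_a:
        best = None  # (distance, index in mzs_b)
        for j, b in enumerate(mzs_b):
            d = abs(a - b)
            if j not in used and d <= max_diff and (best is None or d < best[0]):
                best = (d, j)
        if best is not None:
            used.add(best[1])
            pairs.append((a, mzs_b[best[1]]))
    return pairs
```

With a made-up spectrum `[(1000.0, 50.0), (1500.0, 200.0), (2000.0, 30.0), (2500.0, 80.0)]`, normalizing to the peak at m/z 1500 and applying a threshold line that falls from 0.3 at m/z 1000 to 0.1 at m/z 3000 would retain the peaks at m/z 1500 and 2500. These could then be aligned to the nearest peaks of a second spectrum within a chosen m/z difference.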
(Abstract truncated at 3,000 characters - the full version is available in the pdf file)
Author Comment
This is an abstract which has been accepted for the BITS2016 Meeting.