From raw ion mobility measurements to disease classification: a comparison of analysis processes

Abstract
Ion mobility spectrometry (IMS) is a technology for detecting volatile compounds in exhaled breath that is increasingly used in medical applications. One major goal is to classify patients into disease groups, for example diseased versus healthy, from simple breath samples. Raw IMS measurements are data matrices in which the peak regions that represent compounds have to be identified and quantified. A typical analysis process consists of pre-processing and peak detection in single experiments, peak clustering to obtain consensus peaks across several experiments, and classification of samples based on the resulting multivariate peak intensities. Several automated algorithms for peak detection and peak clustering have recently been introduced to overcome the current need for manual analysis, which is slow, subjective and sometimes not reproducible. We present an unbiased comparison of a large number of combinations of peak processing and multivariate classification algorithms on a disease dataset. The specific combination of algorithms chosen for the different analysis steps determines the classification accuracy, with the encouraging result that certain fully automated combinations perform even better than current manual approaches.
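The analysis process outlined in the abstract (peak detection per measurement, peak clustering into consensus peaks, classification on the resulting peak intensities) can be illustrated with a minimal Python sketch. This is a toy under stated assumptions, not one of the algorithms compared in the paper: the local-maximum detector, the greedy clustering rule, the nearest-centroid classifier, the tolerance and threshold values, and all function names are hypothetical choices made for illustration, and the tiny matrices are made-up data.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]
Peak = Tuple[int, int, float]  # (row index, column index, intensity)


def detect_peaks(matrix: Matrix, threshold: float) -> List[Peak]:
    """Toy peak detection: strict local maxima (8-neighbourhood) above a threshold."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    peaks = []
    for r in range(n_rows):
        for c in range(n_cols):
            value = matrix[r][c]
            if value < threshold:
                continue
            neighbours = [
                matrix[rr][cc]
                for rr in range(max(0, r - 1), min(n_rows, r + 2))
                for cc in range(max(0, c - 1), min(n_cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c, value))
    return peaks


def cluster_peaks(peak_lists: List[List[Peak]], tol: int) -> List[Tuple[int, int]]:
    """Toy greedy clustering: a peak opens a new consensus position unless an
    existing one lies within `tol` in both coordinates."""
    consensus: List[Tuple[int, int]] = []
    for peaks in peak_lists:
        for r, c, _ in peaks:
            if not any(abs(r - cr) <= tol and abs(c - cc) <= tol for cr, cc in consensus):
                consensus.append((r, c))
    return consensus


def feature_vector(peaks: List[Peak], consensus: List[Tuple[int, int]], tol: int) -> List[float]:
    """One intensity per consensus peak; 0.0 if the measurement has no matching peak."""
    return [
        max((v for r, c, v in peaks if abs(r - cr) <= tol and abs(c - cc) <= tol), default=0.0)
        for cr, cc in consensus
    ]


def nearest_centroid(train: List[List[float]], labels: List[str], sample: List[float]) -> str:
    """Assign the label whose mean feature vector is closest to the sample."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for vec, lab in zip(train, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[lab] = counts.get(lab, 0) + 1

    def dist2(lab: str) -> float:
        return sum((s / counts[lab] - x) ** 2 for s, x in zip(sums[lab], sample))

    return min(sums, key=dist2)


def single_peak_matrix(row: int, col: int, value: float, size: int = 5) -> Matrix:
    """Made-up measurement: a zero matrix with a single nonzero cell."""
    m = [[0.0] * size for _ in range(size)]
    m[row][col] = value
    return m


tol = 1  # hypothetical matching tolerance, in matrix cells
runs = [single_peak_matrix(1, 1, 9.0), single_peak_matrix(3, 3, 8.0)]
labels = ["diseased", "healthy"]  # made-up labels for the toy data

peak_lists = [detect_peaks(m, threshold=1.0) for m in runs]
consensus = cluster_peaks(peak_lists, tol)
train = [feature_vector(p, consensus, tol) for p in peak_lists]

new_peaks = detect_peaks(single_peak_matrix(1, 2, 7.0), threshold=1.0)
prediction = nearest_centroid(train, labels, feature_vector(new_peaks, consensus, tol))
print(prediction)
```

Each stage is a separate function so that any one of them could be swapped for a different method while the others stay fixed, which mirrors the idea of comparing combinations of algorithms across the analysis steps.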
Cite this as
2015. From raw ion mobility measurements to disease classification: a comparison of analysis processes. PeerJ PrePrints 3:e1294v1 https://doi.org/10.7287/peerj.preprints.1294v1
Author comment
The two last authors, Jörg Rahnenführer and Sven Rahmann, contributed equally. This work has been presented at the German Conference on Bioinformatics 2015.
Additional Information
Competing Interests
Sven Rahmann is an Academic Editor for PeerJ.
Author Contributions
Salome Horsch analyzed the data, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.
Dominik Kopczynski analyzed the data, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.
Jörg Ingo Baumbach contributed reagents/materials/analysis tools, reviewed drafts of the paper.
Jörg Rahnenführer analyzed the data, wrote the paper, reviewed drafts of the paper.
Sven Rahmann analyzed the data, wrote the paper, reviewed drafts of the paper.
Funding
Salome Horsch, Jörg Ingo Baumbach, Jörg Rahnenführer and Sven Rahmann are supported by the Collaborative Research Center (Sonderforschungsbereich, SFB) 876 "Providing Information by Resource-Constrained Data Analysis", projects TB1 and C1; see http://sfb876.tu-dortmund.de. Sven Rahmann acknowledges support from Mercator Research Center Ruhr, project MERCUR Pe-2013-0012 (UA Ruhr Professorship "Computational Biology"). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.