This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Anslan S, Nilsson H, Wurzbacher C, Baldrian P, Tedersoo L, Bahram M.2018. Great differences in performance and outcome of high-throughput sequencing data analysis platforms for fungal metabarcoding. PeerJ Preprints6:e27019v2https://doi.org/10.7287/peerj.preprints.27019v2
Along with recent developments in high-throughput sequencing (HTS) technologies and thus fast accumulation of HTS data, there has been a growing need and interest for developing tools for HTS data processing and communication. In particular, a number of bioinformatics tools have been designed for analysing metabarcoding data, each with specific features, assumptions and outputs. To evaluate the potential effect of the application of different bioinformatics workflow on the results, we compared the performance of different analysis platforms on two contrasting high-throughput sequencing data sets. Our analysis revealed that the computation time, quality of error filtering and hence output of specific bioinformatics process largely depends on the platform used. Our results show that none of the bioinformatics workflows appear to perfectly filter out the accumulated errors and generate Operational Taxonomic Units, although PipeCraft, LotuS and PIPITS perform better than QIIME2 and Galaxy for the tested fungal amplicon data set. We conclude that the output of each platform require manual validation of the OTUs by examining the taxonomy assignment values.
Some minor edits were made to the text and Figure 2 (our main conclusions remained unchanged).