MetaCRAST: Reference-guided extraction of CRISPR spacers from unassembled metagenomes
- Published
- Accepted
- Subject Areas
- Bioinformatics, Genomics, Microbiology, Computational Science
- Keywords
- Metagenomics, repetitive sequences, CRISPR, microbial ecology, virus-host interactions
- Copyright
- © 2017 Moller et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2017. MetaCRAST: Reference-guided extraction of CRISPR spacers from unassembled metagenomes. PeerJ Preprints 5:e2278v3 https://doi.org/10.7287/peerj.preprints.2278v3
Abstract
Summary: Clustered regularly interspaced short palindromic repeat (CRISPR) systems are prokaryotic adaptive immune systems that defend against viral infection. CRISPR spacer sequences can provide valuable ecological insights by linking environmental viruses to their microbial hosts. Despite this importance, detecting CRISPRs in metagenomes remains a major challenge. Here we present a reference-guided CRISPR spacer detection tool (Metagenomic CRISPR Reference-Aided Search Tool - MetaCRAST) that constrains searches to user-specified direct repeats (DRs). These DRs can be chosen based on those expected from assembly or taxonomic profiles of a metagenome. Our evaluation shows that MetaCRAST improves CRISPR spacer detection in real metagenomes compared to de novo CRISPR detection methods, and simulations show that it outperforms de novo tools on Illumina metagenomes.
Availability and implementation: MetaCRAST is implemented in Perl and takes metagenomic sequence reads and direct repeat queries (FASTA) as input. It is freely available for download at https://github.com/molleraj/MetaCRAST.
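To illustrate the general idea of a reference-guided search, the following minimal Python sketch pulls out the sequence lying between consecutive copies of a known direct repeat within a single read. It is not MetaCRAST's implementation (which is in Perl): the function name `extract_spacers`, the exact-match strategy, and the `min_len`/`max_len` spacer length bounds are all assumptions made for illustration, since the abstract does not specify how repeats are matched or how spacers are filtered.

```python
import re


def extract_spacers(read, direct_repeat, min_len=20, max_len=60):
    """Return sequences found between consecutive exact copies of
    direct_repeat in read.

    Hypothetical helper: exact matching and the length bounds are
    illustrative assumptions, not MetaCRAST's documented behaviour.
    """
    # Start positions of every non-overlapping exact match of the repeat.
    starts = [m.start() for m in re.finditer(re.escape(direct_repeat), read)]

    spacers = []
    # Each pair of neighbouring repeats brackets one candidate spacer.
    for left, right in zip(starts, starts[1:]):
        spacer = read[left + len(direct_repeat):right]
        # Keep only candidates whose length falls inside the assumed bounds.
        if min_len <= len(spacer) <= max_len:
            spacers.append(spacer)
    return spacers


# Usage with made-up sequences (not taken from any real organism or dataset).
dr = "GTTTTAGAGCTATGCTGTTTTG"
read = "AC" + dr + "ACGTACGTACGTACGTACGTACGTAC" + dr + "TTGACCGATTGACCGATTGACCGA" + dr + "GT"
for spacer in extract_spacers(read, dr):
    print(spacer)
```

Restricting the search to known repeats means a read needs only two repeat copies flanking a spacer to yield a candidate. Under the assumptions above, a read with a single repeat copy yields nothing.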
Author Comment
This is an updated version of the manuscript that describes new features (FASTQ parsing and new methods for parallelization). The manuscript was recently accepted at PeerJ.