MetaCRAST: Reference-guided extraction of CRISPR spacers from unassembled metagenomes

Department of Microbiology and Immunology, Emory University, Atlanta, Georgia, United States
Department of Biology, Miami University of Ohio, Oxford, Ohio, United States
DOI
10.7287/peerj.preprints.2278v3
Subject Areas
Bioinformatics, Genomics, Microbiology, Computational Science
Keywords
Metagenomics, repetitive sequences, CRISPR, microbial ecology, virus-host interactions
Copyright
© 2017 Moller et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Moller AG, Liang C. 2017. MetaCRAST: Reference-guided extraction of CRISPR spacers from unassembled metagenomes. PeerJ Preprints 5:e2278v3

Abstract

Summary: Clustered regularly interspaced short palindromic repeat (CRISPR) systems are prokaryotic adaptive immune systems against viral infection. CRISPR spacer sequences can provide valuable ecological insights by linking environmental viruses to their microbial hosts. Despite this importance, metagenomic CRISPR detection remains a major challenge. Here we present a reference-guided CRISPR spacer detection tool (Metagenomic CRISPR Reference-Aided Search Tool - MetaCRAST) that constrains searches based on user-specified direct repeats (DRs). These DRs can be predicted from assemblies or taxonomic profiles of the metagenomes. Our evaluation shows that MetaCRAST improves CRISPR spacer detection in real metagenomes compared to de novo CRISPR detection methods. Simulations show that it also performs better than de novo tools on Illumina metagenomes.

Availability and implementation: MetaCRAST is implemented in Perl and takes metagenomic sequence reads and direct repeat queries (FASTA) as input. It is freely available for download at https://github.com/molleraj/MetaCRAST.
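To illustrate the idea of constraining a spacer search with a known direct repeat, a minimal Python sketch follows. It is not MetaCRAST's implementation (which is in Perl). The function names `find_all` and `extract_spacers`, the exact-match search, the spacer length bounds and the example repeat are all assumptions made for illustration only.

```python
def find_all(read, repeat):
    """Return start positions of exact, non-overlapping occurrences of repeat in read."""
    positions = []
    start = read.find(repeat)
    while start != -1:
        positions.append(start)
        # Resume the search just past the current hit so that hits never overlap.
        start = read.find(repeat, start + len(repeat))
    return positions


def extract_spacers(read, repeat, min_len=20, max_len=60):
    """Collect sequences flanked on both sides by the given direct repeat.

    min_len and max_len are illustrative bounds (assumptions), used to discard
    inter-repeat stretches that are implausibly short or long for a spacer.
    """
    hits = find_all(read, repeat)
    spacers = []
    # Each pair of consecutive repeat hits brackets one candidate spacer.
    for left, right in zip(hits, hits[1:]):
        spacer = read[left + len(repeat):right]
        if min_len <= len(spacer) <= max_len:
            spacers.append(spacer)
    return spacers


if __name__ == "__main__":
    # Illustrative 28 bp repeat and a synthetic read; not data from the paper.
    dr = "GTTCACTGCCGTATAGGCAGCTAAGAAA"
    read = dr + "A" * 30 + dr + "C" * 25 + dr
    for spacer in extract_spacers(read, dr):
        print(len(spacer), spacer)
```

Exact matching keeps the sketch short. Tolerating sequencing errors in the repeat would require an approximate matching step, which is not shown here.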

Author Comment

This is an updated version of the manuscript that describes new features (FASTQ parsing and new methods for parallelization). The manuscript was recently accepted at PeerJ.