NxRepair: Error correction in de novo sequence assembly using Nextera mate pairs

Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
Illumina Cambridge, Chesterford Research Park, Cambridge, United Kingdom
DOI
10.7287/peerj.preprints.747v1
Subject Areas
Bioinformatics, Genetics, Genomics, Statistics, Computational Science
Keywords
de novo assembly, mate pair, genome assembly, error correction, scaffolding, insert size, misassembly, misassembly detection, assembly quality, automated error detection
Copyright
© 2014 Murphy et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Cite this article
Murphy RR, O'Connell JM, Cox AJ, Schulz-Trieglaff OB. 2014. NxRepair: Error correction in de novo sequence assembly using Nextera mate pairs. PeerJ PrePrints 2:e747v1

Abstract

Scaffolding errors and incorrect traversals of the de Bruijn graph during de novo assembly can result in large-scale misassemblies in draft genomes. Nextera mate pair sequencing data provide additional information to resolve assembly ambiguities during scaffolding. Here, we introduce NxRepair, an open-source toolkit for error correction in de novo assemblies that uses Nextera mate pair libraries to identify and correct large-scale errors. We show that NxRepair can identify and correct large scaffolding errors without the use of a reference sequence, resulting in quantitative improvements in assembly quality. NxRepair can be downloaded from GitHub; a tutorial and user documentation are also available.

Author Comment

This is a submission to PeerJ for review.