Standardizing unique molecular identifiers in SAM flags would benefit more than RNA-Seq

Departments of Medicine & Genetics, Division of Rheumatology, Washington University, Saint Louis, Missouri, United States
DOI
10.7287/peerj.preprints.2465v1
Subject Areas
Bioinformatics, Genetics, Genomics
Keywords
SAM, UMI, RNA-Seq, ATAC-Seq, ChIP-Seq, mark duplicates, unique molecular identifiers, amplicon-Seq, FASTQ
Copyright
© 2016 Roberson
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Roberson ED. 2016. Standardizing unique molecular identifiers in SAM flags would benefit more than RNA-Seq. PeerJ Preprints 4:e2465v1

Abstract

Unique molecular identifiers (UMIs) have been incorporated into RNA-Seq experiments to overcome problems with abundance estimation in samples that undergo many PCR amplification cycles. However, UMIs could also benefit many other types of sequencing experiments, including amplicon sequencing, ATAC-Seq, and ChIP-Seq. Furthermore, UMIs help to overcome artifacts in high-coverage DNA-Seq and would enable more accurate RNA-Seq genotyping and allele-specific expression calculation. The main advantage of UMIs is that true PCR duplicates of a single molecule can be distinguished from unique molecules that happen to share identical break points.

Author Comment

This preprint is an author opinion on the use of UMIs in sequencing experiments. I argue that their usage should be expanded and that UMIs should become a standardized part of aligned sequencing formats, such as SAM, to facilitate their use in marking duplicate molecules.
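
The distinction drawn above, between true PCR duplicates and unique molecules that share break points, can be illustrated with a minimal sketch. It is not the preprint's method or any existing tool's implementation. The `Read` record, its field names, and the `mark_duplicates` function are hypothetical, and the sketch assumes each aligned read already carries an error-free UMI string.

```python
from collections import namedtuple

# Hypothetical minimal read record; the field names are illustrative
# and do not correspond to a real SAM parser or schema.
Read = namedtuple("Read", ["name", "chrom", "pos", "strand", "umi"])


def mark_duplicates(reads, use_umi=True):
    """Return the names of reads flagged as duplicates.

    The first read seen for each key is kept; any later read with the
    same key is flagged. With use_umi=False the key is the alignment
    position only; with use_umi=True the UMI is part of the key.
    """
    seen = set()
    duplicates = set()
    for read in reads:
        key = (read.chrom, read.pos, read.strand)
        if use_umi:
            key += (read.umi,)
        if key in seen:
            duplicates.add(read.name)
        else:
            seen.add(key)
    return duplicates


reads = [
    Read("r1", "chr1", 1000, "+", "ACGT"),
    Read("r2", "chr1", 1000, "+", "ACGT"),  # same position, same UMI
    Read("r3", "chr1", 1000, "+", "TTGA"),  # same position, different UMI
]

position_only = mark_duplicates(reads, use_umi=False)
with_umi = mark_duplicates(reads, use_umi=True)
```

With position-only keys, both `r2` and `r3` are flagged. Once the UMI is part of the key, only `r2` is flagged, and `r3` is retained as a distinct molecule that shares the same break point.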