A simple scaling normalization for comparing ChIP-Seq samples

Paul Manser; Mark Reimers

doi:10.7287/peerj.preprints.175v2

Paul Manser^1,2, Mark Reimers^1,2,3

1 Department of Biostatistics, Virginia Commonwealth University, Richmond, VA, USA

2 Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA, USA

3 Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA

Published: 2014-04-10
Accepted: 2014-04-10

Subject Areas: Bioinformatics, Genomics, Neuroscience, Statistics
Keywords: ChIP-Seq, K4me3, K27ac, DNase-Seq, normalization, ENCODE

Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Cite this article: Manser P, Reimers M. 2014. A simple scaling normalization for comparing ChIP-Seq samples. PeerJ PrePrints 2:e175v2 https://doi.org/10.7287/peerj.preprints.175v2

Abstract

In ChIP-Seq and DNase-Seq experiments, the density of background reads can vary from sample to sample. Differences in background read density between samples do not necessarily correspond to proportional changes in read density within true ChIP-Seq peaks. Scaling by total library size may therefore be ineffective as a means of normalizing called ChIP-Seq peaks across samples. We suggest a simple, easily implemented alternative: scaling only by the total number of reads mapped to called peaks. We then demonstrate the effectiveness of this modified scaling on K4me3 and K27ac ChIP-Seq data from the BrainSpan project and on DNase-Seq data from the ENCODE project.
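The contrast between the two scalings described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the peaks-by-samples matrix layout, the choice to express scale factors relative to their mean, and the toy numbers are all assumptions made for illustration only.

```python
import numpy as np

def scale_by_library_size(peak_counts, library_sizes):
    """Scale each sample (column) by its total mapped reads.

    Assumption: factors are expressed relative to the mean library size.
    """
    peak_counts = np.asarray(peak_counts, dtype=float)
    library_sizes = np.asarray(library_sizes, dtype=float)
    factors = library_sizes / library_sizes.mean()
    return peak_counts / factors

def scale_by_reads_in_peaks(peak_counts):
    """Scale each sample (column) by its total reads mapped to called peaks.

    Assumption: factors are expressed relative to the mean in-peak total.
    """
    peak_counts = np.asarray(peak_counts, dtype=float)
    in_peak_totals = peak_counts.sum(axis=0)
    factors = in_peak_totals / in_peak_totals.mean()
    return peak_counts / factors

# Hypothetical toy data: rows are called peaks, columns are samples.
# Both samples carry identical signal in peaks, but the second sample
# has many more background reads and hence a larger library.
peak_counts = np.array([[100, 100],
                        [50, 50],
                        [50, 50]])
library_sizes = np.array([1000, 1800])

by_library = scale_by_library_size(peak_counts, library_sizes)
by_peaks = scale_by_reads_in_peaks(peak_counts)
```

In this toy case, scaling by library size makes the two samples' peak values differ even though their in-peak counts are identical, because the extra background reads in the second sample inflate its scale factor. Scaling by reads in peaks leaves the two columns equal.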