This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
In ChIP-Seq and DNase-Seq experiments, the density of background reads can vary from sample to sample. Differences in background read densities between samples do not necessarily correspond to proportional changes of read densities in true ChIP-Seq peaks. Therefore, scaling by total library size as a means of normalizing called ChIP-Seq peaks across samples may be ineffective. We suggest a simple, easily implemented alternative to scaling by total library size: scaling only by the total number of reads mapped to called peaks. We then demonstrate the effectiveness of the modified scaling in K4me3 and K27ac ChIP-Seq data from the BrainSpan project as well as DNase-Seq data from the ENCODE project.
This paper looks like it has the potential to be quite useful. However, it seems a bit short and I think it would help to expand on the following details:
1) Do you provide a package to implement your recommended normalization and peak calling pipeline? I think this would considerably increase the impact of this paper.
2) How does the increased power affect the false positive rate?
3) Somewhat similar to #2, I think it would help to provide experimental validation, especially for candidates that are uniquely identified using the proposed algorithm.
4) The theoretical CDF plots look very different for the different datasets. Although the green lines for mapped reads do consistently seem to be shifted to the left, the significance of the normalization appears to be modest in some cases, with K4me3 changing much more than K27Ac or DNase-I. I think this warrants some discussion to emphasize and explain the conditions where the proposed normalization is most appropriate.
1) At the moment we don't provide a package. After calling peaks in MACS, building a common set of peaks, and getting mapped read counts for each peak, which I believe is pretty standard, it's basically a line or two of code, in the language of your choice, to divide each sample's peak counts by the sum of that sample's counts over all peaks.
2) This should not change the false positive rate. Rather, it should increase the total number of calls at a given error level when a method like Benjamini-Hochberg is used to control the FDR.
3) We currently have experimental validation for only a few called peaks, which we may try to incorporate later.
4) I think the normalization works better for K4me3 than for K27ac because K4me3 has more well-defined peaks that are better recognized by MACS, so the difference between signal and background is clearer. K27ac is less sparse than K4me3, and it is less clear what is and is not a peak when using MACS.
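To make the "line or two of code" in response 1 concrete, here is a minimal sketch in Python with NumPy. It is not a released package or the authors' exact pipeline. It assumes the mapped read counts have already been collected into a peaks-by-samples matrix over a common peak set. The function name `scale_by_peak_reads` and the toy counts are hypothetical and only for illustration.

```python
import numpy as np

def scale_by_peak_reads(counts):
    """Divide each sample's peak counts by that sample's total reads in peaks.

    counts: array of shape (n_peaks, n_samples) holding mapped read counts
    for a common set of called peaks (assumed layout, not from the paper).
    """
    counts = np.asarray(counts, dtype=float)
    # Total reads falling in called peaks, one value per sample (column).
    peak_totals = counts.sum(axis=0)
    if np.any(peak_totals == 0):
        raise ValueError("at least one sample has no reads in any called peak")
    # Broadcasting divides every column by its own total.
    return counts / peak_totals

# Hypothetical toy data: 3 peaks (rows) x 2 samples (columns).
# Sample 2 has exactly twice the reads of sample 1 in every peak.
counts = np.array([[10, 20],
                   [30, 60],
                   [60, 120]])

normalized = scale_by_peak_reads(counts)
print(normalized)
```

In this toy case the two samples have identical normalized profiles, because they differ only by an overall factor within peaks. Scaling by total library size would give the same result only if the background reads had scaled by that same factor.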