All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
The reviewer is satisfied with the current revision.
One minor comment: I noticed that you wrote that "The [22] methods use z-values, as opposed to the other methods, which use p-values; as a result, in this case we input the t-statistics into the Scott T approach, leading to a more pronounced anti-conservative behavior in some cases. " It seems more reasonable to transform the p-value into z-value (i.e., \phi^{-1}(1 - pvalue)) before applying FDRreg.
# PeerJ Staff Note - this decision was reviewed and approved by Elena Papaleo, a PeerJ Section Editor covering this Section #
No comment.
No comment.
No comment.
I am satisfied with the revision and would recommend it for publication.
Both reviewers agree that the work is well-motivated, technically sound and practically useful. They also offered many suggestions to improve the method/manuscript including de-emphasizing of the proposed modified BH procedure, comparing to some state-of-the-art of method (IHW), conducting more robustness studies, providing guidance against potential FDR inflation and citing relevant literature. Please address these comments by additional simulations or further explanations.
[# PeerJ Staff Note: It is PeerJ policy that additional references suggested during the peer-review process should only be included if the authors are in agreement that they are relevant and useful #]
Please see the report attached.
Please see the report attached.
Please see the report attached.
Please see the report attached.
no comment
no comment
no comment
The authors present a framework for estimating the false discovery rate conditionally on covariates, when the p-values are independent of these under the null hypothesis. In the analysis of large scale datasets, this is an important, challenging and potentially valuable task. The paper is well-written, well-motivated, and technically sound.
The underlying theme of this paper could be split into two parts: First, the authors propose an estimator of the proportion of null hypotheses conditionally on covariates. Second, they use the estimator to modify the standard Benjamini-Hochberg procedure. We remark on both aspects:
A. Estimation of $\pi_0(X_i)$:
---------------------------
The first part, i.e., estimation of $\pi_0(X_i)$ is, in our opinion, the selling point of this paper. The authors phrase the task of $\pi_0(X_i)$ estimation as a regression task. This key observation opens up many opportunities; in this paper the authors explore the use of linear and logistic regression potentially in conjunction with regression splines. However, their idea also enables the use of machine learning methods (say random forests, boosting, neural networks) to estimate $\pi_0(X_i)$, for example, if the covariates are complicated (say sequences or images) and high-dimensional. This is not mentioned explicitly in the paper, and we believe it would be worth mentioning.
Furthermore, the authors emphasize the importance of estimating $\pi_0(X_i)$ for tasks not directly related to down-stream multiple testing. For example, in the discussion they attribute the conservative behaviour in the right panel of Fig. 1 to publication bias. It would be interesting if this could be discussed in more detail.
Some missing references for this section include: the SABHA procedure [1], which also uses the thresholded p-values to estimate $\pi_0(X_i)$ but employs a maximum likelihood approach + convex optimization instead; AdaPT [2], which uses EM and logistic regression to estimate $\pi_0(X_i)$; and for categorical covariates, as in supplement S2.2., the GBH [3] procedure, which uses a standard $\pi_0$ estimator within each level.
It could also be interesting to see the following: How does averaging all the $\hat{\pi}_0(X_i)$ perform compared to $\hat{\pi_0}$, if we are just interested in estimating the global $\pi_0$?
B. Downstream multiple testing with FDR control:
----------------------------------------------
Another large part of this paper is concerned with modifying the BH procedure to account for $\pi_0(X_i)$ estimation. This is achieved by multiplying the adjusted p-values by the corresponding conditional $\pi_0$ estimate. We feel that this part of the paper is not as well motivated and could potentially be drastically deemphasized or even removed, similarly to an older version of this manuscript [4].
The main caveats of this section are as follows: The proposed scheme of modifying BH is essentially the same as that proposed in the SABHA procedure [1]. Second, as mentioned in [2] and [5], this scheme is inadmissible even under oracle knowledge of $\pi_0(X_i)$. Instead one should use a weighted BH scheme. One choice would be to use weighted BH with weights equal to $(1-\pi_0(X_i))/\pi_0(X_i)/(1-\pi_{0,\text{global}})$. This is the scheme proposed for categorical covariates in the GBH [3] procedure, and it leads to power increase both by upweighting the right hypotheses and by being adaptive (alpha-exhaustive). It would be important to see how such a scheme performs compared to the method presented here; we think it would further illustrate why task A. is so important. A further option would be to also use weights proportional to $(1-\pi_0(X_i))/\pi_0(X_i)$, but normalized to average to 1. Then the procedure would no longer be adaptive, but it could also be used with the cross-weighting scheme in [5] and would automatically enjoy finite sample FDR control guarantees.
Minor remarks:
----------------
* After Eqn. (1): "Namely the expected fraction": Conditional on at least making one rejections?
* $\theta_i = 1$ when null is a bit confusing and non-standard notation, maybe change it to $\theta_i = 0$.
* Page 5 typo "In section S3 we provide theoretical results for this estimator in Section S2 of the suppl. materials"
* Why not try the model with the cubic spline for Scenario 1 as well?
* It would be nice to see simulations under the global null as well.
* In the discussion it is mentioned that the method of Scott (2015) often does not control FDR. This was also mentioned/shown in [6]
* Proof of Supplement S3: Some parentheses missing after "then we can combine them as follows"
This review was prepared by Nikos Ignatiadis together with Wolfgang Huber.
References:
------------
[1] Li, Ang, and Rina Foygel Barber. "Multiple testing with the structure adaptive Benjamini-Hochberg algorithm." arXiv preprint arXiv:1606.07926 (2016).
[2] Lei, Lihua, and William Fithian. "AdaPT: an interactive procedure for multiple testing with side information." Journal of the Royal Statistical Society: Series B (Statistical Methodology) (2018).
[3] Hu, James X., Hongyu Zhao, and Harrison H. Zhou. "False discovery rate control with groups." Journal of the American Statistical Association 105.491 (2010): 1215-1227.
[4] Boca, Simina M., and Jeffrey T. Leek. "A regression framework for the proportion of true null hypotheses." bioRxiv (2015): 035675.
[5] Ignatiadis, Nikolaos, and Wolfgang Huber. "Covariate-powered weighted multiple testing." arXiv preprint arXiv:1701.05179 (2018).
[6] Ignatiadis, Nikolaos, et al. "Data-driven hypothesis weighting increases detection power in genome-scale multiple testing." Nature Methods 13.7 (2016): 577.
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.