Review History


All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.

Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.


Summary

  • The initial submission of this article was received on March 25th, 2025 and was peer-reviewed by 2 reviewers and the Academic Editor.
  • The Academic Editor made their initial decision on June 18th, 2025.
  • The first revision was submitted on July 25th, 2025 and was reviewed by 2 reviewers and the Academic Editor.
  • A further revision was submitted on October 7th, 2025 and was reviewed by 2 reviewers and the Academic Editor.
  • The article was Accepted by the Academic Editor on November 13th, 2025.

Version 0.3 (accepted)

Academic Editor

Accept

This contribution is extremely valuable and is particularly relevant to cancer screening programs.

[# PeerJ Staff Note - this decision was reviewed and approved by Sonia Oliveira, a PeerJ Section Editor covering this Section #]

Reviewer 1

Basic reporting

see the "Additional comments" section

Experimental design

see the "Additional comments" section

Validity of the findings

see the "Additional comments" section

Additional comments

The quality of the manuscript has improved thanks to the changes made. I think it could be of interest to readers and, in my opinion, it deserves priority for publication.

Reviewer 2

Basic reporting

We thank Wood et al. for their acknowledgment of the published two-protein model in Medina, Annapragada et al. and for removing claims that the two-protein model proposed in their study had superior performance (it did not). The authors make a statement that is true in general, namely that simpler models tend to exhibit better generalizability on new data, but one that cannot be verified on fully unblinded data with published performance of a model that is nearly equivalent to the one proposed in this study. Selecting a classifier based on performance in a validation set will, by definition, yield better performance in that set but can also lead to overfitting and lack of generalizability.

It is worth commenting on the complexity of the classifier that Medina, Annapragada et al. developed. They trained a logistic regression model with a penalty for complexity that was simply allowed to see and potentially incorporate fragmentation features (chromosomal copy number and fragmentation profiles) that have been previously described as well as measures of protein abundance for CA-125 and HE-4. The machine learner selected a subset of features from cfDNA and levels of both proteins that had high performance for detecting ovarian cancer. In general, this approach to feature selection creates simple, parsimonious linear models that are less likely to lead to overfitting.
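As a general illustration of the kind of penalized feature selection described above (a minimal sketch on synthetic data, not the published pipeline), an L1-penalized logistic regression shrinks uninformative coefficients to exactly zero, yielding a sparse linear classifier:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in data: many candidate features (think fragmentation
# profiles, copy-number z-scores) of which only two carry signal, playing
# the role of the two informative protein measurements.
n, p = 300, 50
X = rng.normal(size=(n, p))
y = (0.9 * X[:, 0] + 1.1 * X[:, 1] + rng.normal(size=n) > 0).astype(int)

# The L1 (lasso) penalty drives coefficients of uninformative features
# to exactly zero; C controls the penalty strength.
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)

selected = np.flatnonzero(model.coef_[0])
print("features retained:", selected)
```

In this toy setting the learner retains the two informative features and discards most of the rest, which is the sense in which penalized feature selection produces parsimonious models less prone to overfitting.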

Finally, Medina, Annapragada et al. provided code and data that enable independent groups to verify their original analyses as well as to explore alternatives. Such reproducibility is essential in the age of large-scale genomic studies involving clinical data, as the time and expense of an independent clinical trial and the costs of sequencing make replication unattainable in the short term. To this end, Medina, Annapragada et al. were highly successful. We applaud Wood et al. for their efforts to improve biomarkers for early detection of this deadly disease and look forward to continued advancements in this field.

Experimental design

No Comment

Validity of the findings

No Comment

Additional comments

No Comment

Version 0.2

Academic Editor

Major Revisions

Please follow the requests and comments received. In particular, one reviewer's comments need to be addressed adequately.

**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.

Reviewer 1

Basic reporting

The quality of the manuscript has improved thanks to the changes made. I think it could be of interest to readers and, in my opinion, it deserves priority for publication.

Experimental design

The quality of the manuscript has improved thanks to the changes made. I think it could be of interest to readers and, in my opinion, it deserves priority for publication.

Validity of the findings

The quality of the manuscript has improved thanks to the changes made. I think it could be of interest to readers and, in my opinion, it deserves priority for publication.

Additional comments

The quality of the manuscript has improved thanks to the changes made. I think it could be of interest to readers and, in my opinion, it deserves priority for publication.

Reviewer 2

Basic reporting

We thank the authors for the inclusion of additional sensitivity analyses and discussion of study limitations. However, some concerns remain with the revised text regarding the following topics.

Concerns related to acknowledging the previously published 2-protein model in the DELFI-Pro publication.
• While the authors have acknowledged the 2-protein model in their revised manuscript, for clarity they should include the reference to the publication showing the 2-protein model (Medina, Annapragada et al., Cancer Discovery, 2024) at the end of the sentence on line 228 after “another model created by the DELFI-PRO authors”.
• Given their acknowledgement of the inclusion of the two-protein model in the prior publication, the authors should refer to this model as “Published Two-Protein model” rather than “DELFI-noDNA model”, in keeping with their references to the “Published DELFI-Pro” model throughout the text and figures.
• As the two-protein model is the central aim of this study, the authors should include the previously “Published Two-Protein model” in Figure 1, alongside the other models for comparison.

Concerns related to description of performance of the two protein model and other models
• The authors have added point estimates of AUCs to some of their analyses, but these point estimates are insufficient for understanding differences in performance between models. To assess whether their model is an improvement over any of the previously published models currently included in Figure 1 (as well as the published two-protein model we have asked also to be included in Figure 1), the authors should include 95% confidence intervals for all AUCs and estimates of sensitivity throughout the study. For clarity, these confidence intervals should be included in Figures 1, S2, S4, and S6 as well as in the sections of the text where these comparisons are made (pages 6-7).
• The authors assert superior performance of their model to the previously published approach in the abstract, but it appears unlikely that this claim could be substantiated on statistical grounds. Point estimates and confidence intervals for the difference in performance at clinically relevant specificities (partial AUCs or sensitivities) would enable these comparisons and should be provided in the abstract. In the absence of a statistical justification for superior performance, the authors should remove the text "superior performance” in the abstract and throughout the text.
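For reference, the confidence intervals requested above are commonly obtained by bootstrapping subjects. A minimal percentile-bootstrap sketch on synthetic scores (not either study's data) using scikit-learn:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)

# Synthetic classifier scores: 100 controls and 100 cases, with cases
# scoring higher on average.
y = np.r_[np.zeros(100, dtype=int), np.ones(100, dtype=int)]
scores = np.r_[rng.normal(0.0, 1.0, 100), rng.normal(1.0, 1.0, 100)]

point = roc_auc_score(y, scores)

# Percentile bootstrap: resample (label, score) pairs with replacement
# and recompute the AUC on each resample.
boot = []
for _ in range(2000):
    idx = rng.integers(0, len(y), len(y))
    if len(np.unique(y[idx])) < 2:  # a resample needs both classes
        continue
    boot.append(roc_auc_score(y[idx], scores[idx]))

lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"AUC {point:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```

The same resampling applied to the difference in paired AUCs (or in sensitivities at a fixed specificity) yields the interval for the between-model comparison the reviewers request.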

Experimental design

No comment.

Validity of the findings

No comment.

Version 0.1 (original submission)

Academic Editor

Major Revisions

There are significant concerns about the manuscript. Please address the reviewers' criticisms adequately.

**PeerJ Staff Note:** It is PeerJ policy that additional references suggested during the peer-review process should only be included if the authors agree that they are relevant and useful.

**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.

[# PeerJ Staff Note: Reviewer 2 declared a potential Conflict of Interest, and the Editor was aware of this when making their decision #]

Reviewer 1

Basic reporting

I read with great interest the manuscript titled “Simpler predictive models provide higher accuracy for ovarian cancer detection”, a topic interesting enough to attract readers' attention.
Although the manuscript can be considered of good quality, I would suggest the following recommendations:
- I suggest a round of language revision to correct a few typos and improve readability.
- In recent years, various mutations have been detected, and several molecular inhibitors have been developed for the treatment of ovarian cancer, whose antitumor potential is currently being assessed in different clinical trials. Considering the results and topic of this study, I suggest reading and adding recent references referring to PMID: 38768941
- Growing evidence has demonstrated the role of mutations of tumor biomarkers in diagnosing and managing epithelial ovarian cancer. I suggest analyzing recent literature on the correlation between tumor biomarkers and chemotherapy in ovarian cancer, providing suggestions for personalized treatment approaches. I suggest reading and adding recent references referring to PMID: 39457020

For these reasons, the article should be revised and completed. Considering all these points, I think it could be of interest to readers and, in my opinion, it deserves priority for publication after minor revisions.

Experimental design

-

Validity of the findings

-

Reviewer 2

Basic reporting

Thank you for the opportunity to review this manuscript. It is written in professional English and is structured acceptably. The authors share their code in a Zenodo repository. They copy and redeposit the original Medina, Annapragada et al., Cancer Discovery, 15(1):105-118, 2024 data and supplementary tables in Zenodo as well. It would have been preferable for them not to redeposit these tables in another repository like Zenodo but rather to indicate that these previously published data should be obtained from the original source publication or GitHub repository. At the very least, the authors should include a reference to the Cancer Discovery paper for each of the items that are copied and redeposited in Zenodo. While the authors provide references, we believe they omit mention of prior studies of CA-125 and HE-4 in ovarian cancer, which are critical to acknowledge and would provide additional validation data for the model proposed here. We have described this point in greater detail in the General Comments section.

Experimental design

Not applicable. The authors perform a re-analysis of existing data from Medina, Annapragada et al., Cancer Discovery, 15(1):105-118, 2024. While the authors provide code to re-train their model, they provide neither locked model weights nor final scores for their model (such as those provided in the Medina, Annapragada et al. study repository), which hampers reproducibility.

Validity of the findings

We have the following concerns regarding the validity of findings:

1. The authors assert that a logistic regression model combining only CA125/HE4 proteins was not provided in Medina, Annapragada et al. Cancer Discovery, 2024 study. We refer the authors to the previously published Supplemental Fig. 7, which shows an ROC curve of CA125 and HE4 proteins combined together in the Medina, Annapragada et al. Cancer Discovery, 2024 study and the Rmarkdown file “analysis/S7.Rmd” in the associated GitHub repository for reproducing the ROC curve from this model.

2. The authors fail to recognize multiple other published studies using CA125 and HE4 for early cancer detection, and thereby miss the opportunity to validate their locked model on these datasets.

3. The authors claim that the CA125/HE4 model they developed is likely to be more generalizable than DELFI-Pro from the Medina, Annapragada et al. Cancer Discovery, 2024 study. However, their model was evaluated on the same validation set where results from the published CA125/HE4 model and DELFI-Pro were already provided. Such a post-hoc analysis does not establish generalizability.

4. There is insufficient evidence to suggest that this 2-protein strategy would be successful in enabling high sensitivity at clinically relevant specificities, thereby achieving a PPV high enough to justify screening for ovarian cancer.

5. The authors assert that a batch effect in the Medina, Annapragada et al. study “impacts previously reported results”. Excluding samples identified by the authors, we found that the specificity, stage-wise sensitivities, and overall sensitivity of a model trained with the same DELFI-Pro features were unchanged from these measures of performance in the published DELFI-Pro classifier.

Additional discussion of these points is provided in the Additional Comments.

Additional comments

In their manuscript, Wood et al. have performed an analysis of two protein biomarkers (CA125 and HE4) using the data and code from our study “Early Detection of Ovarian Cancer Using Cell-Free DNA Fragmentomes and Protein Biomarkers” (Medina, Annapragada et al., Cancer Discovery, 15(1):105-118, 2024). The authors have expertise in genomic and bioinformatic analyses, and given the challenges of early detection of ovarian cancer, additional efforts like this study have the potential to make progress that can be helpful for the field. Nevertheless, the study has a number of significant limitations, as indicated below.

1. The central thesis of the manuscript by Wood and colleagues is that a logistic regression model limited to the two protein biomarkers CA125 and HE4 could be a more parsimonious alternative to DELFI-Pro, a penalized logistic regression model that incorporates fragmentation-derived features in addition to these two proteins. However, the authors have failed to observe that the analysis and code for a logistic regression model combining these exact two proteins alone was already published in the original study (Medina, Annapragada et al., Cancer Discovery, 2024, Supplementary Fig. S7) and is readily available in the code repository (https://github.com/cancer-genomics/delfipro2024/blob/main/code/proteins_screening_model.r). While the authors have framed their research as having devised a simpler and more parsimonious classifier, their classifier is no simpler or more parsimonious than the already published two-protein classifier. The only difference between the model in the manuscript and the previously published model is a z-log transformation that the authors apply to the protein measurements. The performance of this model is also nearly identical to the originally published two-protein classifier (Medina, Annapragada et al., Cancer Discovery, 2024, Supplementary Fig. S7). Overall, the authors have successfully reproduced the same analyses, models, and figures as described from the data and code of the Medina, Annapragada et al. study, contributing neither new analyses for ovarian cancer detection nor new data.
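The z-log transformation mentioned above (as we read it: log-transform, then standardize; this is our interpretation, not the authors' exact code) can be sketched as:

```python
import numpy as np

def z_log(x, eps=1e-9):
    """Log-transform positive biomarker values, then z-score the result."""
    logged = np.log(np.asarray(x, dtype=float) + eps)
    return (logged - logged.mean()) / logged.std()

# Hypothetical right-skewed CA-125-like values (U/mL): the log step
# compresses the long right tail before standardization.
ca125 = np.array([8.0, 12.0, 15.0, 35.0, 600.0, 2400.0])
z = z_log(ca125)
```

Because the transform is monotone, it leaves a single marker's ROC curve unchanged; it can alter the fit of a multivariable logistic model, but not in a way that makes the resulting classifier any simpler than one fit on untransformed values.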

2. In addition to our own study that combined CA125 and HE4 for early cancer detection, there are additional large-scale studies that have used these two biomarkers alone or in combination for cancer detection. The European Prospective Investigation into Cancer and Nutrition Study (EPIC) (Terry et al., Clinical Cancer Research, 22(18), 4664–4675, 2016) evaluated these two biomarkers in combination as well as separately, while the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial (PLCO) (Urban et al., Journal of the National Cancer Institute, 103(21), 1630–1634, 2011) and the National Health and Nutrition Examination Survey (NHANES) (Cramer et al., Gynecologic Oncology, 161(1), 282–290, 2021) examined both of these protein biomarkers individually. The authors should examine these additional available studies or perform their own validation study to determine the generalizability and performance of a two-protein biomarker in additional cohorts.

3. The approach used in the Wood et al. manuscript raises a general concern regarding the attempt to identify an improved classifier post hoc using the same set of fully unblinded data, when the performance of all classifiers in question is already known. This type of approach is likely to compromise the generalizability of any conclusions. The purpose of locking a classifier prior to its evaluation on unseen data is, in part, to remove the ability to rationalize decisions after having observed results in the validation set. The lack of any additional external validation of a classifier developed based on the authors’ re-analysis of an existing study with known performance is a weakness that further reduces enthusiasm for this effort. In our previous comment, we have suggested opportunities for external validation.

4. In addition to the concerns above, it is important to note that protein-based classifiers for ovarian cancer detection have not historically provided an avenue for population-scale screening due to their low sensitivity at clinically relevant specificities. Large prospective trials of protein-based approaches (Menon et al., The Lancet, 397(10290), 2182–2193, 2021; Prorok et al., Controlled Clinical Trials, 21(6), 273S-309S, 2000) have failed to deliver clinically impactful performance metrics in ovarian cancer screening. Given the low prevalence of this disease and the large number of individuals to be screened, high sensitivity at near-perfect specificity (>99%) is necessary to achieve a positive predictive value high enough to justify screening. Ovarian cancer is a molecularly heterogeneous disease with known alterations in chromosomal copy number, methylation, and protein biomarkers (Bell et al., Nature, 474(7353), 609–615, 2011; Labidi-Galy et al., Nature Communications, 8(1), 1093, 2017; Papp et al., Cell Reports, 25(9), 2617–2633, 2018). While including too many features can lead to overfitting and lower generalizability, omitting features, including those related to chromosomal copy number changes and fragment length, can lead to an attenuation in performance. In the screening setting, both the Medina, Annapragada et al. protein-only CA125/HE4 classifier and the Wood et al. CA125/HE4 classifier have lower cross-validated sensitivity at >99% specificity as compared to the published DELFI-Pro performance that includes these two proteins together with genome-wide fragmentation profiles and chromosomal changes. The lower performance of the CA125/HE4 classifier is not surprising given the historic performance of these proteins (Menon et al., The Lancet Oncology, 10(4), 327–340, 2009; Menon et al., The Lancet, 397(10290), 2182–2193, 2021; Menon et al., The Lancet Oncology, 24(9), 1018–1028, 2023).
As individual proteins in historic studies, CA125/HE4 in the Medina, Annapragada et al. study, and CA125/HE4 in the authors’ reproduction of Medina, Annapragada et al., all demonstrate insufficient performance metrics for an appropriate early cancer detection classifier for ovarian cancer, the authors would be well-served to focus the manuscript on the limitations of a classifier based on CA125 or HE4 alone or in combination in this setting. Additionally, given the lower sensitivity at the clinically relevant >99% specificity of their classifier compared to the DELFI-Pro performance, the authors should remove sentences like “This approach exhibits superior performance on an external validation cohort compared to a more complex method” from the abstract and similar text throughout the manuscript.

5. The authors raise questions about batch effects in the chromosomal arm copy number features, demonstrating that a subset of 42 samples from the Medina, Annapragada et al. study have a markedly different z-score from the majority of samples used to train the classifier, and use this observation as a further rationale for focusing on their two protein classifier and removing all genomic features. However, as z-scores for chromosomal arm copy number changes observed in the ovarian cohort are highly concordant with chromosomal gains and losses in TCGA ovarian tumor samples (Figure 2C), removing these features, as the authors suggest, would remove an important source of biological signal. The authors find that the sequencing lab prefix available in the code repository (or the set of genomic library batches provided in Supplementary Table S2 of Medina, Annapragada et al.) provides a surrogate for delineating these samples. We agree with the authors that removing features that have high variance across batches can mitigate the risk of learning pre-analytical (whether clinical or technical) and analytical sources of variation. In this case, a specific sample subset drives the high variance in the features noted by Wood and colleagues, but these same features appear robust to batch-to-batch variation in the remaining 234 samples that were processed across 22 genomic library batches. A sensitivity analysis to examine whether pre-analytical differences in these samples influenced the performance of the classifier on the validation set could have been performed by the authors by re-training the classifier without these samples. We have completed such an analysis in the screening setting, and found no differences in the sensitivity between the new and old classifiers (p=1, all stages, two-sided χ2 test), specificity (p=1), or AUC (p=0.38). 
The lack of impact on the performance of the previously published approach for ovarian cancer detection stands in contrast to the authors' statement in the abstract that “confounding technical variation … impacts previously reported results”. The authors should tone down the language in their abstract and throughout the text related to the performance of previous classifiers.
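The two-sided χ² comparison of classifier sensitivities described above can be reproduced in outline with scipy (the counts below are hypothetical, for illustration only):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table: rows = retrained vs. published classifier,
# columns = cancers detected vs. missed at a fixed specificity.
table = np.array([[70, 30],
                  [68, 32]])

# chi2_contingency applies a two-sided chi-square test of independence
# (with Yates continuity correction for a 2x2 table by default).
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.3f}, p={p:.3f}, dof={dof}")
```

With detection counts this close, the test finds no evidence of a difference, which is the form of the null result the reviewers report for their sensitivity, specificity, and stage-wise comparisons.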

Review by

Akshaya Annapragada, Shashikant Koul, Jillian Phallen, Robert Scharpf, and Victor Velculescu

All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.