All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
The authors have correctly addressed my latest requests, and I can therefore recommend this article for acceptance and publication.
[# PeerJ Staff Note - this decision was reviewed and approved by Xiangjie Kong, a PeerJ Section Editor covering this Section #]
The results of the binary classification measured through MCC, F1 score, accuracy, sensitivity, specificity, precision, and NPV are still missing. The authors should include them in Table 5 and discuss them in the text.
In addition, what the authors call "AUC" should be called "ROC AUC" or "AUROC".
The authors added tests on additional datasets but completely ignored my previous comment:
> 1. In particular, if this study is a binary classification project, only binary classification metrics should be employed (MCC, F1 score, accuracy, sensitivity, specificity, precision, NPV, ROC AUC, PR AUC, etc) and not the regression analysis metrics such as RMSE.
Please add results measured through the indicated metrics.
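For reference, every metric listed above can be computed from the confusion matrix and the classifier's predicted scores. Below is a minimal, non-authoritative sketch using scikit-learn; the variable names (`y_true`, `y_pred`, `y_score`) and the toy values in the usage example are hypothetical and not taken from the manuscript.

```python
# Minimal sketch: the binary classification metrics requested in the review.
# Assumes scikit-learn; y_true/y_pred are 0/1 labels, y_score are predicted
# probabilities for the positive class (hypothetical names).
from sklearn.metrics import (
    accuracy_score, average_precision_score, confusion_matrix,
    f1_score, matthews_corrcoef, precision_score, roc_auc_score,
)

def binary_classification_report(y_true, y_pred, y_score):
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "MCC": matthews_corrcoef(y_true, y_pred),
        "F1 score": f1_score(y_true, y_pred),
        "accuracy": accuracy_score(y_true, y_pred),
        "sensitivity": tp / (tp + fn),              # recall / true positive rate
        "specificity": tn / (tn + fp),              # true negative rate
        "precision": precision_score(y_true, y_pred),
        "NPV": tn / (tn + fn),                      # negative predictive value
        "ROC AUC": roc_auc_score(y_true, y_score),  # needs scores, not labels
        "PR AUC": average_precision_score(y_true, y_score),  # average precision
    }

# Toy usage with made-up predictions:
print(binary_classification_report(
    y_true=[0, 1, 1, 0, 1, 0],
    y_pred=[0, 1, 0, 0, 1, 1],
    y_score=[0.2, 0.9, 0.4, 0.1, 0.8, 0.6],
))
```

Note that ROC AUC and PR AUC are computed from continuous scores, while the remaining metrics use thresholded 0/1 predictions.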
**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.
The authors correctly addressed some of the issues raised by the reviewers, but several problems remain.
In particular, if this study is a binary classification project, only binary classification metrics should be employed (MCC, F1 score, accuracy, sensitivity, specificity, precision, NPV, ROC AUC, PR AUC, etc) and not the regression analysis metrics such as RMSE.
Moreover, the authors show results on only a single dataset; the analysis should be repeated on at least one alternative validation cohort dataset. Other datasets of cervical cancer EHRs can be found on Google Dataset Search, re3data.org, Zenodo, Kaggle, FigShare, the UC Irvine ML Repository, and other resources.
A series of issues are still present in the article, so it cannot be accepted for publication. The authors should address the points described by the reviewers and prepare a new version of the manuscript.
**PeerJ Staff Note:** Please ensure that all review and editorial comments are addressed in a response letter and any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.
**Language Note:** The review process has identified that the English language must be improved. PeerJ can provide language editing services - please contact us at [email protected] for pricing (be sure to provide your manuscript number and title). Alternatively, you should make your own arrangements to improve the language quality and provide details in your response letter. – PeerJ Staff
There are a few typos:
Line 26: Please replace "Cancer positioning a major disease" with "Cancer is positioned as a major disease."
Line 33-34: Please replace "the traditional ML technique handles" with "traditional ML techniques handle."
Line 87: Ensure the statement "(WHO)" is formatted correctly as a citation or note. The parentheses are hard to understand.
Line 95: Please replace "machine ML algorithms" with either "ML algorithms" or "machine learning algorithms".
The Methods section of the paper needs some improvements.
What are the features of the dataset? Please give a few examples, or explain in plain words what the features typically represent.
In Materials & Methods, please cite the GitHub repository or website of the open-source program h2o.ai. If you meant the h2o.ai website, then the model was trained in the cloud; in that case, please indicate the hardware environment of the cloud cluster.
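For concreteness, here is a minimal sketch of an H2O AutoML run of the kind the paper appears to describe; the file path and the `Biopsy` target column are assumptions based on the Kaggle/UCI risk-factors dataset mentioned in these reviews, not details confirmed by the manuscript.

```python
# Minimal sketch: H2O AutoML on a local cluster (assumed setup).
import h2o
from h2o.automl import H2OAutoML

h2o.init()                    # starts or connects to an H2O cluster
h2o.cluster().show_status()   # prints the cluster details (nodes, cores, memory)
                              # that the hardware-environment request refers to

df = h2o.import_file("risk_factors_cervical_cancer.csv")  # hypothetical path
df["Biopsy"] = df["Biopsy"].asfactor()  # treat the target as categorical

aml = H2OAutoML(max_models=10, seed=1)
aml.train(y="Biopsy", training_frame=df)  # remaining columns used as features
print(aml.leaderboard)
```

Reporting the output of `h2o.cluster().show_status()` (or the cloud instance type) would answer the hardware-environment question.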
The paper highlights the LIME method for model explanation. Please give examples of the explanation results and the findings behind them.
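To illustrate what such examples could look like, here is a minimal, self-contained LIME sketch on synthetic data; the model, data, and feature names are all stand-ins, not the authors' actual pipeline.

```python
# Minimal sketch: tabular LIME on a synthetic binary classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

rng = np.random.default_rng(0)
X = rng.random((200, 4))                    # 200 synthetic records, 4 features
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)   # synthetic binary target
model = RandomForestClassifier(random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    training_data=X,
    feature_names=["Age", "Num pregnancies", "Smokes (years)", "Hormonal contraceptives"],
    class_names=["negative", "positive"],
    mode="classification",
)

# Explain a single record: which features pushed its prediction up or down?
exp = explainer.explain_instance(X[0], model.predict_proba, num_features=3)
print(exp.as_list())   # feature-weight pairs, e.g. [("Age <= 0.24", -0.05), ...]
```

The per-feature weights returned by `exp.as_list()` are exactly the kind of "explanation result" this comment asks the authors to present and interpret.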
The authors mention that the model outperforms previous studies in cervical cancer research; please explain in what respects the model proposed in the paper is better, introduce these methods, and run experiments for comparison.
The manuscript currently references outdated cancer statistics which are at least 5 years old. Updating the paper with the most recent data on cancer statistics is recommended to maintain the relevance and accuracy of the information presented.
The use of "however" in line 69 does not align well with the context. A careful review of the grammar and spelling throughout the manuscript will enhance its readability. Moreover, the introduction contains repetitive statements, which should be refined to avoid redundancy.
The coherence between paragraphs needs to be improved. Presently, there is a lack of clear demarcation between the introduction and literature review sections, resulting in an overlap of information. It is advised to clearly delineate the boundaries by reserving epidemiological data, statistics, and screening issues for the introduction, and focusing on comparing traditional approaches with ML & DL models in the literature review.
The dataset derived from Kaggle originally comes from the UCI Repository, a fact that should be acknowledged to give proper credit to the primary source.
The methodology section lacks clarity in explaining how the automated tool for prediction was utilized. Additionally, the paper seems to lack innovative elements, as even the model selection process is automated. It would benefit the study to introduce more depth and originality in the approach.
The use of the DX:cancer column in the training phase raises concerns about the validity of the predictions, especially when the objective is to predict the necessity of a biopsy. It is advisable to reassess the inclusion of this variable in the initial training to maintain the integrity of the predictions.
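As an illustration of the leakage concern, here is a minimal sketch of the recommended check, assuming a pandas DataFrame loaded from the Kaggle/UCI risk-factors file; the column names follow that dataset, and the path is hypothetical.

```python
# Minimal sketch: remove diagnosis columns that can leak the outcome.
import pandas as pd

df = pd.read_csv("risk_factors_cervical_cancer.csv")  # hypothetical path

# "Dx:Cancer" records an existing cancer diagnosis; keeping it as a feature
# while predicting the "Biopsy" target lets the outcome leak into the inputs.
leaky = ["Dx:Cancer"]          # possibly also "Dx", "Dx:CIN", "Dx:HPV"
X = df.drop(columns=leaky + ["Biopsy"])
y = df["Biopsy"]
```

Retraining without these columns and comparing the resulting metrics would show how much of the reported performance depends on the leaked diagnosis.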
The claim regarding the screening ability of the model is contentious given the nature of the dataset utilized. Traditional cervical cancer screening entails processing image data, a component conspicuously missing in the current approach which instead focuses on assessing the risk of developing cancer. The manuscript should revisit this claim to align the narrative with the actual capabilities of the developed model.
The title "Integrated modeling approach to predict cervical cancer in women" seems to overstate the capabilities of the model given that it primarily assesses the risk factors rather than directly predicting cervical cancer occurrences. It would be prudent to modify the title to accurately reflect the scope and the objective of the study.
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.