Review History
Machine learning analysis of TCGA cancer data

Summary

The initial submission of this article was received on March 3rd, 2021 and was peer-reviewed by 2 reviewers and the Academic Editor.
The Academic Editor made their initial decision on April 7th, 2021.
The first revision was submitted on April 16th, 2021 and was reviewed by 1 reviewer and the Academic Editor.
The article was Accepted by the Academic Editor on May 17th, 2021.

Version 0.2 (accepted)

Li Zhang · May 17, 2021 · Academic Editor

Accept

There is no further comment from the reviewer.

[# PeerJ Staff Note - this decision was reviewed and approved by Jun Chen, a PeerJ Section Editor covering this Section #]

Reviewer 2 · May 5, 2021

Basic reporting

no comment

Experimental design

no comment

Validity of the findings

no comment

Additional comments

The authors have addressed all my previous comments.

Cite this review as

Anonymous Reviewer (2021) Peer Review #2 of "Machine learning analysis of TCGA cancer data (v0.2)". PeerJ Computer Science https://doi.org/10.7287/peerj-cs.584v0.2/reviews/2

Download Version 0.2 (PDF) Download author's response letter (v0.2) - submitted Apr 16, 2021

Version 0.1 (original submission)

Li Zhang · Apr 7, 2021 · Academic Editor

Minor Revisions

Please address the reviewers' comments and provide point-by-point response.

Reviewer 1 · Mar 14, 2021

Basic reporting

The authors present a review of the state of the art in machine learning, particularly as applied to data from The Cancer Genome Atlas (TCGA) Program.
First, some of the work of the TCGA consortium itself is described to give the reader a good foundation. However, the main goal of this work is to identify and discuss those works that have used the TCGA data to train different ML approaches. The authors classify them according to three main criteria: the type of tumor, the type of algorithm, and the predicted biological problem. One of the conclusions drawn in this work is that studies concentrate heavily on two main algorithms, Random Forest and Support Vector Machines, with a growing use of deep artificial neural networks. An increase in integrative models for multi-omic data analysis is also reported. Biological problems were classified into five types: prognosis prediction, tumor subtypes, microsatellite instability, immunological aspects, and specific signaling pathways.
A clear trend was found in which conditions are predicted for which tumor type: the largest number of papers focus on the BRCA cohort, while survival studies, for example, focus on the GBM cohort because of its large number of events.

Any work dedicated to fighting cancer is important. This work is a very good contribution and very useful for the international research community and it is very well written, well motivated and good to read. For all these reasons, this reviewer endorses this work, recommends acceptance, and makes some recommendations below to help further improve the work:

1) For the beginner in this field, a list of all abbreviations used would be very helpful, e.g. MSI (microsatellite instability) is not defined anywhere, and newcomers should be able to find their way quickly.

2) Figure 2, the descriptions are very hard to read - maybe the image can be optimized

3) Figure 2 b - is practically illegible

4) Figure 4 also very hard to read

5) In the summary section or before it, a brief outlook on important future research topics should be given, e.g. comprehensibility and interpretability to further explore causal relationships, which is eminently important in cancer research. Here it is important to point out that in the medical field several different components always contribute to a result, yet this is often neglected in current machine learning. It would therefore be good to say so, and to point to a recent work that deals with exactly this matter [x].
[x] Holzinger, A., Malle, B., Saranti, A. & Pfeifer, B. 2021. Towards Multi-Modal Causability with Graph Neural Networks enabling Information Fusion for explainable AI. Information Fusion, 71, (7), 28-37, doi:10.1016/j.inffus.2021.01.008

6) Readers of the subsection "A general perspective of unsupervised learning with TCGA data" may also be interested in this recent work [y]:
[y] Jean-Quartier, C., et al. 2021. Mutation-based clustering and classification analysis reveals distinctive age groups and age-related biomarkers for glioma. BMC Medical Informatics and Decision Making, 21, (77), 1-14, doi:10.1186/s12911-021-01420-1

Experimental design

nothing to add - very good

Validity of the findings

nothing to add - very good

Additional comments

please see comments above.

Cite this review as

Anonymous Reviewer (2021) Peer Review #1 of "Machine learning analysis of TCGA cancer data (v0.1)". PeerJ Computer Science https://doi.org/10.7287/peerj-cs.584v0.1/reviews/1

Reviewer 2 · Mar 29, 2021

Basic reporting

No comment

Experimental design

No comment

Validity of the findings

No comment

Additional comments

Summary of Paper:
In this paper, the authors provide a comprehensive review of the literature that uses machine learning techniques to analyze different types of cancer with TCGA data. After explaining the methodology used in this survey, the authors present the main results obtained by the TCGA consortium and then review the capabilities of machine learning algorithms to solve biological problems. Overall, the paper is well written. Below are my minor concerns about the paper.

Minor Concerns:
• On line 102, 'Machine learning as a source o new knowledge' -> 'Machine learning as a source of new knowledge.'
• On line 107, ‘To his aim …’ -> ‘To this aim’.
• It is advisable to state the article inclusion/exclusion criteria on lines 127-136 more clearly. For example, I am assuming that the last point ‘Article/conference manuscripts using machine learning marginally or without solid biological conclusions’ is meant to be an exclusion criterion.
• ‘neuron networks’ on lines 241 and 244 should be replaced by ‘neural networks’.
• Consider rephrasing the sentence ‘In this Figure 4 is interesting to observe how the different problems addressed are distributed differently in the types of tumour.’ on lines 391-392.
• ‘Immnulogical phenotype prediction’ -> ‘Immunological phenotype prediction’ on line 441
• ‘immunologicc phenotype’ -> ‘immunologic phenotype’ on line 455.

Cite this review as

Anonymous Reviewer (2021) Peer Review #2 of "Machine learning analysis of TCGA cancer data (v0.1)". PeerJ Computer Science https://doi.org/10.7287/peerj-cs.584v0.1/reviews/2

Download Original Submission (PDF) - submitted Mar 3, 2021

All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
