Review History


All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.

Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.

Summary

  • The initial submission of this article was received on April 13th, 2021 and was peer-reviewed by 2 reviewers and the Academic Editor.
  • The Academic Editor made their initial decision on May 13th, 2021.
  • The first revision was submitted on June 11th, 2021 and was reviewed by 1 reviewer and the Academic Editor.
  • A further revision was submitted on July 8th, 2021 and was reviewed by 1 reviewer and the Academic Editor.
  • The article was Accepted by the Academic Editor on July 29th, 2021.

Version 0.3 (accepted)

· Jul 29, 2021 · Academic Editor

Accept

As you can see, the reviewer was satisfied with your responses to the critiques and with the revisions. I am therefore pleased to inform you that your amended manuscript is now acceptable.

[# PeerJ Staff Note - this decision was reviewed and approved by Paula Soares, a PeerJ Section Editor covering this Section #]

Reviewer 2

Basic reporting

Background is provided, the literature is referenced, and the writing is clear.

Experimental design

The recommendations have been addressed and the methods are described in detail.

Validity of the findings

The conclusions are supported by the results.

Additional comments

The recommendations have been addressed.

Version 0.2

· Jun 25, 2021 · Academic Editor

Minor Revisions

As you can see, the reviewer still thinks that the manuscript has some linguistic issues and requires additional editorial work. Please address these remaining concerns and make sure that the manuscript is edited by professional editors or fluent English speakers.

[# PeerJ Staff Note: The Academic Editor has identified that the English language must be improved. PeerJ can provide language editing services - please contact us at [email protected] for pricing (be sure to provide your manuscript number and title) #]

Reviewer 2

Basic reporting

Background context and literature references have been provided.

Experimental design

The additional details and the flowchart presented aid understanding.

Validity of the findings

The conclusions are supported by the results.

Additional comments

The manuscript is better structured and the issues have been addressed. However, a few sections still lack correct sentence formation and are sometimes confusing. Improving these sections is recommended.

Version 0.1 (original submission)

· May 13, 2021 · Academic Editor

Major Revisions

Please address the critiques of both reviewers and revise the manuscript accordingly.

[# PeerJ Staff Note: Please ensure that all review comments are addressed in a response letter and any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.  It is a common mistake to address reviewer questions in the response letter but not in the revised manuscript. If a reviewer raised a question then your readers will probably have the same question so you should ensure that the manuscript can stand alone without the response letter.  Directions on how to prepare a response letter can be found at: https://peerj.com/benefits/academic-rebuttal-letters/ #]

Reviewer 1

Basic reporting

No comment

Experimental design

1. Regarding the comparison of methods in the survey, I do not think it is a fair comparison of the mentioned tools: the benchmark set CAFA3 was published in 2019, DeepGOPlus was trained on more recent data released in 2020, and other tools, such as PFP and PANNZER2, were trained on different datasets (their training data are not mentioned in the manuscript). The results of DEEPred were extracted from the DEEPred paper, so its training data are also unclear. Because deep-learning models are prone to overfitting, the sequence similarity between the training data and the benchmark set strongly affects measured performance. It is not clear whether the sequences in CAFA3 were used to train these deep-learning methods, so the performance of DeepGOPlus and DEEPred may be overestimated. Since DeepGOPlus and DEEPred both provide source code, they should be re-trained on the same dataset. Because of this problem, the conclusions are not well supported. I suggest the authors conduct stricter experiments or discuss the performance evaluation in more detail.
2. For the machine-learning-based methods, a clear problem formulation is lacking. For example, is function prediction treated as a multi-class classification problem over all GO terms, or as a binary classification problem for each GO term? This also affects the performance evaluation: metrics such as precision and recall are designed for binary predictions, so how is a single score per method assigned? (A sketch of the standard aggregation is given after this list.)
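
For context, CAFA-style evaluations typically answer this question with protein-centric precision and recall, averaged over proteins at a score threshold t and summarized by the maximum F-measure. A minimal sketch of these standard definitions (the notation here is assumed, not taken from the manuscript):

```latex
% Protein-centric CAFA metrics. For protein i at threshold t, let
% P_i(t) be the set of GO terms predicted with score >= t, and T_i
% the set of true (benchmark) terms for protein i.
\[
\mathrm{pr}_i(t) = \frac{\lvert P_i(t) \cap T_i \rvert}{\lvert P_i(t) \rvert},
\qquad
\mathrm{rc}_i(t) = \frac{\lvert P_i(t) \cap T_i \rvert}{\lvert T_i \rvert}
\]
% Precision is averaged over the m(t) proteins with at least one
% predicted term at threshold t; recall over all n benchmark proteins:
\[
\mathrm{pr}(t) = \frac{1}{m(t)} \sum_{i=1}^{m(t)} \mathrm{pr}_i(t),
\qquad
\mathrm{rc}(t) = \frac{1}{n} \sum_{i=1}^{n} \mathrm{rc}_i(t)
\]
% A single score per method is the maximum F-measure over thresholds:
\[
F_{\max} = \max_{t \in [0,1]}
  \frac{2\,\mathrm{pr}(t)\,\mathrm{rc}(t)}{\mathrm{pr}(t) + \mathrm{rc}(t)}
\]
```

This is why a single number (F_max) can be reported per method even though the underlying task is a set of per-term binary decisions.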

Validity of the findings

1. Why were only sequence-based methods selected and compared in this paper? Readers would expect comparisons covering as many deep-learning methods as possible.
2. As the authors point out, imbalanced GO classes and the multi-label nature of the task are challenges for protein function prediction, but what is the current state of the field regarding these challenges? Do the authors have any insights into coping with them?

Additional comments

In this review paper, the authors briefly review conventional approaches and focus on recent deep-learning-based methods for protein function prediction. They present an overview of current automated protein function prediction methods, conduct a small comparison of several available tools, and highlight the challenges of the field. I have two other major concerns, listed below:
1. Beyond the deep-learning methods reviewed in this paper, several more advanced and recently published methods should be covered, such as the transformer-based method at https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btab198/6182677 and the GNN-based method at https://arxiv.org/abs/2007.12804.
2. The data used as the benchmark are not clearly explained in the Data section. I still do not understand what NK (no-knowledge) and LK (limited-knowledge) mean. Do they mean that a sequence had no, or only limited, public annotations but was then fully labeled in the benchmark dataset? What do partial mode and full mode mean? The authors should make these two terms explicit; I cannot understand them from this sentence in the current version: “partial mode, for a set of proteins with at least one prediction, and full mode, computed for all benchmark proteins.” (See the note after this list.)
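
For context, the two modes in the quoted sentence come from the CAFA evaluation protocol and differ only in the set of proteins over which recall is averaged; a sketch using the notation of the block above:

```latex
% Let \mathcal{P} be the set of benchmark proteins for which the method
% returned at least one prediction, with n_p = |\mathcal{P}| <= n.
\[
\mathrm{rc}_{\mathrm{partial}}(t) = \frac{1}{n_p} \sum_{i \in \mathcal{P}} \mathrm{rc}_i(t),
\qquad
\mathrm{rc}_{\mathrm{full}}(t) = \frac{1}{n} \sum_{i=1}^{n} \mathrm{rc}_i(t)
\]
% Full mode counts unpredicted proteins as zero recall, so it also
% penalizes low coverage; partial mode scores only attempted proteins.
```

Full mode therefore rewards methods that make predictions for every benchmark protein, while partial mode measures prediction quality only on the proteins a method attempts.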

Reviewer 2

Basic reporting

1) Professional English used throughout
2) Background context and literature references have been provided

Experimental design

An investigation into the methods has been performed and the results are presented.
However, the explanation of the methods needs more structured detail; as written, some details are confusing.

Validity of the findings

The conclusions are supported by the results and the overall review.

Additional comments

The review is well written and gives an overview of conventional and new methods for protein function prediction.
Recommendation: more structured detail is needed in the explanation of the methods.

All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.