All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
All concerns have been addressed. This manuscript can be accepted.
[# PeerJ Staff Note - this decision was reviewed and approved by Sebastian Ventura, a PeerJ Section Editor covering this Section #]
There are some minor concerns that need to be addressed.
Since the authors have addressed all the previous comments, I don't have any further suggestions; the paper is ready to be published.
Since the authors have addressed all the previous comments, I don't have any further suggestions; the paper is ready to be published.
Since the authors have addressed all the previous comments, I don't have any further suggestions; the paper is ready to be published.
no comment
no comment
no comment
The authors have revised the paper according to the reviewers' comments and provided reasonable explanations. I agree that the paper should be accepted and published.
All my concerns have been well addressed. Table 5, Table 6, Table 8, Table 9, Table 11, Table 13, Table 15 and Table 17 for hyperparameter tuning can be used as supplementary material. Then the manuscript is ready to be published.
The study has detailed and reasonable experimental design.
Results are good and clear.
The reviewers have substantial concerns about this manuscript. The authors should provide point-by-point responses to address all the concerns and submit a revised manuscript with the revised parts marked in a different color.
**PeerJ Staff Note:** Please ensure that all review and editorial comments are addressed in a response letter and any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.
**Language Note:** The review process has identified that the English language must be improved. PeerJ can provide language editing services - please contact us at copyediting@peerj.com for pricing (be sure to provide your manuscript number and title). Alternatively, you should make your own arrangements to improve the language quality and provide details in your response letter. – PeerJ Staff
This paper presents a novel approach for the classification of anticancer peptides (ACPs) by integrating word embedding techniques, namely Word2Vec and FastText, with deep learning models such as CNN, LSTM, and BiLSTM. Given the surge in peptide sequences, accurate prediction models are essential. Tested on the widely used ACPs250 and Independent datasets, the proposed FastText+BiLSTM combination achieved record accuracies of 92.50% and 96.15%, surpassing existing methods. However, some issues should be addressed before publication.
Line 240: Begin this line with a statement like "Before diving into specific methodologies, it's essential to understand the foundations of word embedding models and their deep learning counterparts." This provides a clearer introduction to the purpose of the section and the relevance of the ensuing models.
Line 244-245: While the sentence explains that Word2Vec assigns vector embeddings based on contextual usage, it could be clearer for readers unfamiliar with NLP. Consider refining to: "The model processes input text (typically a vast corpus) to understand and represent words as vectors based on the context in which they appear."
Line 248-249: The explanation of the output layer is a bit passive. Revise to be more active and precise, such as: "The output layer then predicts which words are most likely to appear near a given target word."
Line 252: The description of Word2Vec's working logic can benefit from a more concrete example. A brief illustrative example showcasing the skip-gram or CBOW technique might clarify how the model predicts neighboring words.
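To make this request concrete, a minimal sketch of how the two Word2Vec training objectives slice the same sentence into training examples (the toy corpus and window size are invented for illustration):

```python
# Toy corpus and window size are assumptions chosen for illustration only.
sentence = ["the", "peptide", "binds", "the", "receptor"]
window = 1

# Skip-gram: the model predicts each context word FROM the target word,
# so training examples are (target, context) pairs.
skipgram_pairs = []
for i, target in enumerate(sentence):
    for j in range(max(0, i - window), min(len(sentence), i + window + 1)):
        if j != i:
            skipgram_pairs.append((target, sentence[j]))

# CBOW: the model predicts the target word FROM its (averaged) context,
# so training examples are (context words, target) pairs.
cbow_pairs = []
for i, target in enumerate(sentence):
    context = [sentence[j]
               for j in range(max(0, i - window), min(len(sentence), i + window + 1))
               if j != i]
    cbow_pairs.append((context, target))
```

For the word "peptide", skip-gram produces the pairs ("peptide", "the") and ("peptide", "binds"), while CBOW produces the single example (["the", "binds"], "peptide"); a short worked pair like this in the manuscript would make the distinction immediate.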
Line 256-259: The introduction of FastText feels redundant given that the Word2Vec introduction already explains the general idea of word embeddings in NLP. Consider streamlining this by focusing more on what differentiates FastText from Word2Vec, such as its capability to generate vectors for out-of-vocabulary words using subword information.
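The differentiating point could be illustrated with a short sketch of FastText-style character n-grams and how they yield a vector for an out-of-vocabulary word (the subword vectors below are invented toy values, not trained):

```python
def char_ngrams(word, n=3):
    # FastText-style subwords: add boundary markers, then slide a window of size n.
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

# An out-of-vocabulary word still gets a vector: average the vectors of
# those of its subwords seen during training (toy 2-dim vectors, invented).
subword_vecs = {"<AC": [0.2, 0.1], "ACP": [0.4, -0.3], "CP>": [0.0, 0.5]}

def oov_vector(word):
    hits = [subword_vecs[g] for g in char_ngrams(word) if g in subword_vecs]
    return [sum(dim) / len(hits) for dim in zip(*hits)] if hits else None
```

Here `char_ngrams("ACP")` yields `['<AC', 'ACP', 'CP>']`, so even a word never seen as a whole token can be embedded from its pieces, which Word2Vec cannot do.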
Please include more comparisons with peer work in the experimental results section.
Please double-check the font type in all figures; in some figures the font is too small and blurry.
The article is written in English and conforms to professional standards of courtesy and expression; the text is clear, unambiguous, and technically correct. However, there are some grammatical and lexical errors.
The article includes a sufficient introduction and background to demonstrate how the work fits into the broader field of knowledge. But there are some errors in the references.
"Holohan, C., Van Schaeybroeck, S., Longley, D. B., and Johnston, P. G. (2013). Cancer drug resistance: An evolving paradigm." should be "Holohan C, Van Schaeybroeck S, Longley DB, Johnston PG. Cancer drug resistance: an evolving paradigm. Nat Rev Cancer. 2013 Oct;13(10):714-26. doi: 10.1038/nrc3599. PMID: 24060863."
"Haykin, S. (2010). Neural Networks and Learning Machines, 3/E. Pearson Education, London." should be "Haykin, S. (2009). Neural Networks and Learning Machines, 3/E. Pearson Education, London."
Word embedding is a key technology in generative artificial intelligence; it maps words or phrases from a vocabulary (which may contain thousands or millions of words) to a real-valued vector space.
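A minimal sketch of what such a mapping looks like, using an invented four-word vocabulary with 3-dimensional vectors (a trained model would learn these values; here they only illustrate the lookup and why geometry matters):

```python
import math

# Invented 3-dim vectors for a toy 4-word vocabulary.
embedding = {
    "cancer":  [0.21, -0.43, 0.10],
    "tumor":   [0.19, -0.40, 0.12],
    "car":     [-0.52, 0.33, 0.61],
    "peptide": [0.18, -0.39, 0.07],
}

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Semantically related words end up closer in the vector space.
sim_related = cosine(embedding["cancer"], embedding["tumor"])
sim_unrelated = cosine(embedding["cancer"], embedding["car"])
```

The point of the mapping is exactly this geometry: related tokens receive nearby vectors, which downstream classifiers can exploit.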
The most advanced word embedding models currently available are the "GloVe" and "FastText" models. Why is the "GloVe" model not studied in the paper?
The model proposed by the researchers has improved classification accuracy: the proposed FastText+BiLSTM combination achieves an accuracy of 92.50% on the ACPs250 dataset and 96.15% on the Independent dataset. What factors affect the classification accuracy of the model? What are the shortcomings of the current model?
No comments
The study "An efficient consolidation of word embedding and deep learning techniques for classifying anticancer peptides: FastText+BiLSTM" evaluated several combinations of embedding methods and neural network models with different hyperparameters for binary classification of ACPs. However, some items need to be revised and improved.
One-hot encoding, one of the most important embedding methods for peptide sequences, should be added and compared.
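For reference, a minimal sketch of the baseline being requested, one-hot encoding over the 20 standard amino acids (the residue alphabet and unknown-symbol handling are conventional choices, not taken from the manuscript):

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def one_hot(seq):
    # Each residue becomes a 20-dim indicator vector (all zeros for an
    # unknown symbol), so a peptide of length L becomes an L x 20 matrix.
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    return [[1 if index.get(aa) == i else 0 for i in range(len(AMINO_ACIDS))]
            for aa in seq]

encoded = one_hot("ACD")
```

Unlike learned embeddings, this representation is fixed and sparse, which is precisely why it makes a useful comparison baseline.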
Hyperparameter tuning should be combined, and then your best model can be given and compared directly.
Why do you use accuracy instead of MCC, AUC or F1 score to compare model performance?
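To illustrate why accuracy alone can mislead on imbalanced data, a small sketch computing the Matthews correlation coefficient from confusion-matrix counts (the class split and classifier behavior below are invented for the example):

```python
import math

def mcc(tp, tn, fp, fn):
    # Matthews correlation coefficient from confusion-matrix counts;
    # conventionally defined as 0 when any marginal count is zero.
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

# Invented imbalanced split: a classifier that predicts all-negative
# on 95 negatives and 5 positives.
tp, tn, fp, fn = 0, 95, 0, 5
accuracy = (tp + tn) / (tp + tn + fp + fn)  # 0.95 -- looks strong
skill = mcc(tp, tn, fp, fn)                 # 0.0  -- no predictive skill
```

The degenerate all-negative classifier scores 95% accuracy yet has an MCC of zero, which is the usual argument for reporting MCC, AUC, or F1 alongside accuracy.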
You mentioned “efficient” in the title; however, model complexity and computational performance were not calculated or compared.
The whole manuscript should be rewritten; too much basic and irrelevant information has been included. For example, you should state your hypothesis and list your contributions in “INTRODUCTION” instead of merely describing basic concepts of ML, DL, etc.; likewise, at line 87, previous work should be summarized rather than just listing citations. The same applies to “RELATED WORK”: summarize the most important and related works and discuss their limitations.
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.