All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
Thank you for addressing the reviewers' concerns and congratulations again.
[# PeerJ Staff Note - this decision was reviewed and approved by Vladimir Uversky, a PeerJ Section Editor covering this Section #]
ok
ok
ok
No more comments. Thanks for the revision.
While the reviewers were generally positive, a number of concerns were raised that should be addressed in the resubmission.
[# PeerJ Staff Note: Please ensure that all review comments are addressed in a rebuttal letter and any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate. It is a common mistake to address reviewer questions in the rebuttal letter but not in the revised manuscript. If a reviewer raised a question then your readers will probably have the same question so you should ensure that the manuscript can stand alone without the rebuttal letter. Directions on how to prepare a rebuttal letter can be found at: https://peerj.com/benefits/academic-rebuttal-letters/ #]
It is fine. No comment.
It is fine. No comment.
There are many algorithms and machine learning approaches to predict protein-protein interactions, hot spots, drug binding affinity, etc. Most of these methods differ only slightly in various metrics. The new features added in this study (features related to amino acid composition, dipeptide composition, etc., beyond the pre-existing data) should be listed in a table. Also, in the final results (as in Figure 1), are the new features in the top list, and how does adding the new features change the overall distribution of feature importance?
The second question concerns the identification of the potential medications listed in the paper. Figures 5-8 should be combined into a larger figure with four panels. More description of the identification and selection process should be provided. Details of the top screened drugs belong in the main text, with additional entries to show that these drugs stand out in the virtual screen. What are the rankings of these drugs among the hits? Otherwise, it will appear that these drugs were simply hand-picked.
Overall, it is important to improve the accuracy of machine learning in predicting various protein-protein interactions. This paper presents a rigorous approach that adds more relevant features to screen hot-spot-related factors.
See 'general comments'
See 'general comments'
See 'general comments'
The authors presented a method for predicting protein-protein interface hotspots using supervised machine learning algorithms trained using an extensive number of sequence and structure derived features. The viability of the developed method has been demonstrated using a virtual drug screening analysis of the predicted hotspots on the EphB2-ephrinB2 complex.
Dataset: The final dataset is slightly different from the original SpotOn dataset. Therefore, the authors need to report the number of hotspot and non-hotspot residues in their dataset.
Performance evaluation: Since the dataset is imbalanced and has fewer samples than extracted features, the estimated performance metrics might be sensitive to the partitioning of the data into train and test sets. To address this issue and allow a fair comparison with SpotOn, the authors need to evaluate their models using 10 runs of 10-fold CV experiments.
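The repeated cross-validation the reviewer asks for can be sketched as follows with scikit-learn. The classifier and feature matrix here are placeholders (the actual hotspot features and model are not part of this review), so only the evaluation protocol is illustrated:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Placeholder data standing in for the hotspot feature matrix:
# imbalanced classes, more features relative to sample count
X, y = make_classification(n_samples=200, n_features=50,
                           weights=[0.7, 0.3], random_state=0)

# 10 runs of stratified 10-fold CV, matching the SpotOn evaluation protocol
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=0)
clf = RandomForestClassifier(n_estimators=50, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")

# Report mean and standard deviation over the 100 folds
print(f"AUC: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Reporting the spread across the 100 folds also reveals how sensitive the metrics are to the train/test partitioning.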
Table 3: The authors' claim that their model outperforms SpotOn (based on MCC, F1, and sensitivity) is not accurate. First, SpotOn has an AUC of 0.83 while the authors' model has an AUC of only 0.75. Second, while the authors' model has better sensitivity, its specificity is 0.08 lower than SpotOn's. Third, MCC and F1 tend to be higher for the model with the higher specificity. Fourth, the authors' model has been evaluated using a single run of 10-fold CV, while the SpotOn model was evaluated using 10 runs. Ideally, the two models should be compared by inspecting their ROC curves. Since it is challenging to obtain the ROC curve for SpotOn, the authors might instead adjust the decision threshold of their model to achieve a specificity of 0.88 and show that, at this specificity, their model has better sensitivity and hence better F1 and MCC metrics.
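The threshold adjustment suggested above can be sketched as follows: compute the ROC curve from the model's predicted probabilities, then pick the threshold whose specificity is closest to SpotOn's reported 0.88 and read off the sensitivity there. The labels and probabilities below are synthetic stand-ins for the model's actual outputs:

```python
import numpy as np
from sklearn.metrics import roc_curve

# Synthetic predicted probabilities and true labels standing in
# for the authors' model outputs
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 500)
y_prob = np.clip(y_true * 0.3 + rng.random(500) * 0.7, 0.0, 1.0)

# ROC curve: fpr = 1 - specificity at each candidate threshold
fpr, tpr, thresholds = roc_curve(y_true, y_prob)

# Pick the threshold whose specificity is closest to SpotOn's 0.88
target_specificity = 0.88
idx = np.argmin(np.abs((1.0 - fpr) - target_specificity))
print(f"threshold={thresholds[idx]:.3f}, "
      f"specificity={1.0 - fpr[idx]:.3f}, sensitivity={tpr[idx]:.3f}")
```

Matching the operating point in this way makes sensitivity, F1, and MCC directly comparable between the two models.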
Feature importance: Lines 248-251 show that the Relative Complex ASA and Complex ASA features are among the top 15 features. These two features will assign the same value to hotspot and non-hotspot residues in the complex. If this is the case, how can these features carry any discriminative signal? Moreover, to demonstrate the improvement in predictive performance from the newly added features, a sensitivity analysis quantifying the change in model performance before and after adding these novel features is needed.
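The sensitivity analysis requested above amounts to an ablation: evaluate the model with and without the new feature columns and compare. The sketch below uses placeholder data and an arbitrary column split (the real split between pre-existing and newly added features would come from the authors' feature table):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Placeholder matrix: assume the first 40 columns are pre-existing
# features and the last 10 are the hypothetical newly added ones
X, y = make_classification(n_samples=200, n_features=50, random_state=0)
baseline_cols = slice(0, 40)

clf = RandomForestClassifier(n_estimators=50, random_state=0)
auc_all = cross_val_score(clf, X, y, cv=10, scoring="roc_auc").mean()
auc_base = cross_val_score(clf, X[:, baseline_cols], y, cv=10,
                           scoring="roc_auc").mean()

# The difference quantifies the contribution of the new features
print(f"with new features: {auc_all:.3f}, without: {auc_base:.3f}")
```

Reporting both numbers (ideally with their fold-wise spread) would directly support the claim that the added features improve performance.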
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.