Review History


All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.

Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.

View examples of open peer review.

Summary

  • The initial submission of this article was received on March 10th, 2025 and was peer-reviewed by 2 reviewers and the Academic Editor.
  • The Academic Editor made their initial decision on May 28th, 2025.
  • The first revision was submitted on August 14th, 2025 and was reviewed by 2 reviewers and the Academic Editor.
  • A further revision was submitted on October 16th, 2025 and was reviewed by 1 reviewer and the Academic Editor.
  • The article was Accepted by the Academic Editor on October 24th, 2025.

Version 0.3 (accepted)

Academic Editor

Accept

Thank you for your valuable contribution!

[# PeerJ Staff Note - this decision was reviewed and approved by Xiangjie Kong, a PeerJ Section Editor covering this Section #]

Reviewer 2

Basic reporting

No comments

Experimental design

No comments

Validity of the findings

No comments

Additional comments

No comments

Version 0.2

Academic Editor

Minor Revisions

Please read and reply precisely to the last comments of the reviewers.

Reviewer 1

Basic reporting

This revision is improved and addresses most of my previously raised issues. However, the following minor issues remain:
1. The paper proposes separating foreground and background features through "feature decoupling" for more effective knowledge distillation from a dense teacher to a sparse student model. Feature decoupling can typically be implemented via learnable network layers or algorithmic constraints. The paper's description is ambiguous and does not provide precise information on its approach. Therefore, the authors should provide a specific and clear explanation of how feature decoupling is actually implemented.
2. The core algorithm, particularly the "Feature Decoupling Schematic", should be accompanied by its core implementation code to enhance clarity and reproducibility.
3. It is recommended that the revised manuscript be presented as a clean version. The current version with tracked changes appears cluttered and impedes readability.
4. Equation (4) introduces a hyperparameter, η. However, its impact, sensitivity, and the value used in the experiments are not discussed or analyzed in the experimental section. For the best settings of hyperparameters including α, β, γ and η, please see Taguchi’s experimental design method, grid search, and intelligent optimization methods by consulting, e.g., Dendritic neuron model with effective learning algorithms for classification, approximation and prediction; and An online fault detection model and strategies based on SVM-grid in clouds.
5. Presentation needs further improvement. For example, in Figs. 1, 2, etc., please use contrasting color pairs, e.g., blue text on a yellow background, white text on a green background, etc. “sgd” in the reference “Ding, X., Zhou, X., Guo, Y., Han, J., and Liu, J. (2019). Global sparse momentum sgd for pruning very deep neural networks. In Advances in Neural Information Processing Systems (NeurIPS), pages 6382–6394” should be capitalized as “SGD”. Please correct the many similar errors in the references.
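To illustrate what the reviewer is asking for in points 1 and 2, the following is a minimal sketch of *one* possible algorithmic realization of feature decoupling for distillation: a binary foreground mask splits teacher and student feature maps into foreground and background parts, and a separate MSE distillation loss is computed on each region before being combined with weights. The function names, the masking mechanism, and the weights `alpha` and `beta` are all assumptions for illustration; the manuscript's actual implementation (which could instead use learnable layers) may differ.

```python
import numpy as np

def decouple_features(feat, fg_mask):
    """Split a feature map (C, H, W) into foreground and background parts
    using a binary mask (1, H, W). Hypothetical helper: a purely
    algorithmic masking, not necessarily the paper's mechanism."""
    fg = feat * fg_mask          # foreground features
    bg = feat * (1.0 - fg_mask)  # background features
    return fg, bg

def decoupled_distill_loss(t_feat, s_feat, fg_mask, alpha=1.0, beta=0.5):
    """MSE distillation computed separately on the decoupled regions,
    normalized by region size, then combined with illustrative weights
    alpha (foreground) and beta (background)."""
    t_fg, t_bg = decouple_features(t_feat, fg_mask)
    s_fg, s_bg = decouple_features(s_feat, fg_mask)
    n_fg = max(fg_mask.sum(), 1.0)          # avoid division by zero
    n_bg = max((1.0 - fg_mask).sum(), 1.0)
    l_fg = ((t_fg - s_fg) ** 2).sum() / n_fg
    l_bg = ((t_bg - s_bg) ** 2).sum() / n_bg
    return alpha * l_fg + beta * l_bg
```

A sketch like this, adapted to the paper's actual design, would let readers see unambiguously whether the decoupling is learned or mask-based, and how the weights enter the loss.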

Experimental design

OK.

Validity of the findings

OK.

Additional comments

OK.

Reviewer 2

Basic reporting

No more comments needed.

Experimental design

No more comments needed.

Validity of the findings

No more comments needed.

Additional comments

No more comments needed.

Version 0.1 (original submission)

Academic Editor

Major Revisions

Please follow the reviewers' detailed requests and criticisms.

**PeerJ Staff Note:** It is PeerJ policy that additional references suggested during the peer-review process should only be included if the authors agree that they are relevant and useful.

**PeerJ Staff Note:** Please ensure that all review and editorial comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.

**Language Note:** The review process has identified that the English language must be improved. PeerJ can provide language editing services - please contact us at [email protected] for pricing (be sure to provide your manuscript number and title). Alternatively, you should make your own arrangements to improve the language quality and provide details in your response letter. – PeerJ Staff

Reviewer 1

Basic reporting

See below

Experimental design

See below

Validity of the findings

See below

Additional comments

This paper presents a sparsity-friendly knowledge distillation approach that uses feature decoupling, focusing on sparse CNNs. It contains some interesting ideas, but it must address the following issues:

1. The authors should pay more attention to the uniformity of concepts, common expressions, and writing format in the manuscript. For example, spaces are missing in “methods.Depending” (line 123) and “it.Feature-based” (line 128); the symbols in line 143 are inconsistent with those in Equation (1); the expressions in lines 124–126 are not entirely consistent with those in lines 127 and 137, and the former are too colloquial; and in lines 199 and 218 the authors use the term “top-1”, while Table 2 and line 224 use “top1”. The phrase “precipitates a shift in focus from that of its dense counterparts” in line 53 makes the weakness of concern unclear; the authors should rephrase it in clearer terms.

2. The authors should review some recent developments in knowledge distillation and assess whether they have good potential to solve the problems of concern. Examples include: A Novel Tensor Decomposition-Based Efficient Detector for Low-Altitude Aerial Objects With Knowledge Distillation Scheme; Learning From Human Educational Wisdom: A Student-Centered Knowledge Distillation Method; and Feature Map Distillation of Thin Nets for Low-Resolution Object Recognition.

3. There are blank entries in Table 1; please explain the reason for them. In addition, the authors mention the symbols r, a, and b in the ablation experiments, but these are not defined in “METHODOLOGIES”. Also, this section’s title should be “PROPOSED METHODOLOGY”.

4. The authors state in their contributions that the proposed method includes an adaptive weighting mechanism, but this is not elaborated in “METHODOLOGIES” or “EXPERIMENTS”. Instead, manual adjustment of the hyperparameter b is demonstrated in Table 6. This makes one wonder whether the proposed method actually contains such a mechanism.

5. It is suggested that the authors add to the “CONCLUSIONS” a discussion of the shortcomings of the proposed methodology and future research directions.
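For context on point 4, one common way a claimed adaptive weighting mechanism can be realized, as opposed to manually tuning a hyperparameter such as b, is to derive the per-term weights from the loss terms themselves. The sketch below is generic and purely illustrative (a softmax over recent loss magnitudes); it is not the manuscript's mechanism, and the function name and `temperature` parameter are assumptions.

```python
import numpy as np

def adaptive_weights(losses, temperature=1.0):
    """Illustrative adaptive weighting: assign each loss term a weight via
    a softmax over its magnitude, so larger (harder) terms receive
    proportionally more focus. Weights sum to 1 by construction."""
    l = np.asarray(losses, dtype=float) / temperature
    w = np.exp(l - l.max())  # subtract max for numerical stability
    return w / w.sum()
```

Demonstrating something of this kind (or whatever scheme the authors actually use) in “METHODOLOGIES” would resolve the apparent contradiction with the manual tuning shown in Table 6.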

Reviewer 2

Basic reporting

The manuscript introduces a meaningful advancement in sparse CNN training.

Experimental design

However, clarity and scientific rigor can be further improved.

Validity of the findings

Pending major revisions and added visual explanations, this paper would make a valuable contribution to the field.

Additional comments

**PeerJ Staff Note:** It is PeerJ policy that additional references suggested during the peer-review process should only be included if the authors agree that they are relevant and useful.

Annotated reviews are not available for download in order to protect the identity of reviewers who chose to remain anonymous.

All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.