All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
After carefully reviewing your rebuttal and revised paper, I find the paper acceptable for publication. Thanks for your interest in the journal.
[# PeerJ Staff Note - this decision was reviewed and approved by Xiangjie Kong, a PeerJ Section Editor covering this Section #]
The reviewers have made many good suggestions. Each is relatively simple to implement, but there are a large number of them. Please address them and provide a point-by-point response in a separate document. I will review the responses and may engage the reviewers again before making a final decision.
Thanks
[# PeerJ Staff Note: It is PeerJ policy that additional references suggested during the peer-review process should *only* be included if the authors are in agreement that they are relevant and useful #]
The paper is well-structured and easy to read. The narrative progresses logically: the introduction outlines the problem, the second section reviews related work on the topic, and subsequent sections describe the methodology and present the results.
Minor comments:
1. When numbering equations in LaTeX, it is recommended to use the "{equation}" environment to ensure consistent formatting. In the current document, equations are manually numbered with a space following the formula.
2. Figures 1, 2, 3, 4, 5, 6, 9, and 10 are presented in relatively low resolution. Pixelation is visible, and in some cases (e.g., Figures 9 and 10), it is difficult to discern the objects detected by the model.
3. Figure 8 appears to be a screenshot, as indicated by the presence of a gray element in the lower-left corner.
4. It is unclear why the abbreviation for "Multiscale Attention Feature Pyramid Structure" is "NSAFPS" rather than "MSAFPS." The reasoning behind the use of "N" instead of "M" should be clarified.
5. Line 32 contains a possible typographical error: "prowess" should likely be "process."
6. Many instances of incorrect capitalization appear throughout the text. For example:
- Articles are capitalized unnecessarily, such as on line 23, several times between lines 262 and 263, and in other parts of the document.
- "We" is incorrectly capitalized on line 30.
7. Conversely, some terms and sentences should be capitalized but are not:
- Line 58: "RoI" should be "ROI."
- Figure captions (e.g., for Figures 6, 9, 10, and 11) begin with lowercase letters but should start with uppercase.
- Line 268: The sentence starts with a lowercase letter, or a period is used in place of a comma.
Similar issues are present throughout the document and require thorough proofreading.
8. Most references to figures in the text appear after the corresponding figure. It would be more logical to reference the figure before presenting it.
9. Figures 3 and 6 are included in the article, but there are no references to them in the text.
10. On line 62, multiple citations are listed separately. It would be better to combine them using a single command, e.g., "\cite{ref-book8, ref-book9, ref-book10, ref-book11, ref-book12, ref-book13, ref-book14}" instead of writing each citation individually.
11. There are several instances where no space precedes a citation or a parenthetical explanation, such as in "NSAFPS(Multiscale Attention Feature Pyramid Structure)" on line 28 or "conduct\cite{ref-book1}" on line 39. Conversely, there are cases where unnecessary spaces precede punctuation marks, such as on line 39 and between lines 262 and 263. Thorough proofreading is required to address these inconsistencies.
12. The term "Multiscale" is inconsistently written as "Multi-scale" in some places. Consistent spelling should be used throughout the document.
13. For numbered lists, it is recommended to use the "enumerate" environment in LaTeX for clarity and formatting consistency (e.g., lines 108, 113, and 118).
14. Line 170 contains a typographical error: "BoumdingBox" should be "BoundingBox," or even "Bounding Box" with a space between the two words.
15. Line 248: "BackBone" should be "Backbone."
16. Abbreviations should be defined upon their first occurrence. For example, "EMA" is first expanded on line 306, but it appears earlier on line 293 without explanation. Similarly, the origin of the "E" in "Multiscale Attention" (abbreviated as "MA") is unclear. This issue is repeated for other abbreviations and should be addressed.
17. In Figure 6, uppercase letters are used for labels, but the same labels appear in lowercase in the description. Furthermore, the labels on the figure and in the image description differ significantly and should be aligned.
18. Formulas 6–8 appear to require reordering for clarity. For example, Formula 6 introduces $L_{WIOUv1}$, which has not been previously defined, and combines it with a formula calculating $\gamma$. These two formulas should probably be placed on separate lines. Formula 7 uses $R_{WIOU}$, which is not defined until Formula 8. This arrangement forces the reader to search ahead to understand the notation.
19. Lines 512–513: Are you certain that the reference is to Figure 8? It is better to use the "\ref{}" command in the LaTeX document for referencing figures and tables instead of manual references.
20. Table 7 is not referenced in the text.
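To illustrate the LaTeX-related suggestions in comments 1, 10, 13, and 19, a minimal sketch is given below. The equation, list items, and label names (eq:example, fig:example) are placeholders chosen for illustration and are not taken from the manuscript; only the citation keys are those quoted in comment 10.

```latex
% Comment 1: let LaTeX number the equation instead of typing the number by hand.
% The formula is a placeholder, and the label name is hypothetical.
\begin{equation}
  y = f(x)
  \label{eq:example}
\end{equation}

% Comment 10: several sources in one \cite command.
Earlier studies~\cite{ref-book8, ref-book9, ref-book10} report similar findings.

% Comment 13: a numbered list via the enumerate environment.
\begin{enumerate}
  \item First item (placeholder text).
  \item Second item (placeholder text).
\end{enumerate}

% Comment 19: reference figures and equations through labels, not typed numbers.
% This assumes the figure environment contains \label{fig:example} after its \caption.
As shown in Figure~\ref{fig:example} and Equation~(\ref{eq:example}), ...
```

With labels in place, figure, table, and equation numbers update automatically when items are reordered, which would also help with the issues raised in comments 8, 9, and 20.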
After a review of the article, the following questions remain:
1. What is the execution time of the models? Is the proposed method intended for real-time analysis of student behavior during lectures?
2. Student behavior can change throughout a lecture. However, the experiments were conducted on static images. How is the proposed algorithm intended to work when applied to lecture video recordings? Specifically, how should the model's output be interpreted if, at one moment, a student is attentively listening, then becomes distracted by their phone, and later resumes paying attention?
3. How will the proposed approach perform in scenarios involving varying lighting conditions in the classroom and differing input photo/video quality?
1. To evaluate the proposed algorithm, the authors conducted comparative experiments with other well-known models. However, results obtained by other researchers on the SCB-Dataset3-S dataset were not included. It would be valuable to see a comparison with those results, which could be added to Table 7.
2. Figure 11 presents the model's output. Why were many faces not detected?
3. Will the code be available on GitHub or another platform for researchers to reproduce the results presented in the article?
A significant contribution of this work is the publication of a new dataset containing images of students in a classroom exhibiting various behavior patterns: hand-raising, reading, writing, using a phone, bowing their head, etc.
Thank you for the invitation to review the study. The study proposes a deformable multi-scale adaptive approach to automating classroom behavior recognition. After reading the study several times, I must commend the effort of the authors. The authors conducted sound analysis and presented the results systematically. My only minor concerns center on the lack of citations for strong claims and the clarity of the gaps in the literature.
Several strong claims were made in the study, particularly in the introduction, with no citations to justify them. I quote a few below:
"Classroom behavior patterns of students serve as crucial indicators of their learning progress……"
"Conventional approaches to analyzing classroom teaching behaviors rely heavily on subjective self-evaluations, direct oversight, and manual coding procedures."
"The adoption of deep learning, renowned for its robust feature extraction and autonomous learning prowess, has markedly enhanced the precision and speed of recognizing behaviors within classroom settings."
I found some relevant studies that can be used to justify the first and second claims: Yusuf et al. (2024): https://doi.org/10.1007/s10639-023-12079-8
Akcapinar, G., & Hasnine, M. H. (2022): https://doi.org/10.1016/j.procs.2022.09.443
Andrade, A., Delandshere, G., & Danish, J. A. (2016): https://doi.org/10.18608/jla.2016.32.14
[# PeerJ Staff Note: It is PeerJ policy that additional references suggested during the peer-review process should *only* be included if the authors are in agreement that they are relevant and useful #]
Some studies that employed deep learning architectures for this purpose were reviewed. However, it is still not clear how these prior studies differ from the present study or how their limitations motivated the present study.
Based on the minor issues observed, I qualify this as a minor revision. Once again, I commend the authors' efforts.
Relatively good and novel.
Relatively good.
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.