Dear Authors,
Thank you for addressing the comments. Your manuscript now seems sufficiently improved.
Best wishes,
[# PeerJ Staff Note - this decision was reviewed and approved by Mehmet Cunkas, a PeerJ Section Editor covering this Section #]
Dear Authors,
Although all reviewers have concluded that the paper has been sufficiently improved and can be accepted in its current form, references [5], [6], [8], [9], [12], [17], and [36] contain only titles, suggesting a lack of careful reference preparation. Furthermore, there are inconsistencies in the order and numbering of the section headings. The "Conclusion and Limitations" section, in particular, contains statements that appear to have been generated by artificial intelligence.
Because of this, your paper needs a minor revision.
Best wishes,
**PeerJ Staff Note:** Your submission appears to have been at least partially authored or edited by a generative AI/large language model. When you submit your revision, please detail whether (and if so, how) AI was used in the construction of your manuscript in your response letter, AND mention it in the Acknowledgements or other relevant section of your manuscript.
The authors have made all the suggested changes in the revised version.
The authors have made all the suggested changes in the revised version.
The authors have made all the suggested changes in the revised version.
The authors have addressed all the concerns. The paper can be accepted in its current state.
NA
NA
NA
The authors have made the required changes.
The authors have made the required changes.
The authors have made the required changes.
The authors have made the required changes.
Dear Authors,
Thank you for your submission. Feedback from the reviewers is now available. Your article has not been recommended for publication in its current form; however, we encourage you to address the reviewers' concerns and criticisms and resubmit your article once you have updated it accordingly. It is also recommended that some long paragraphs be divided into two or more parts to enhance their readability. All equations should be referred to by their correct equation numbers; please do not use "as follows", "given as", etc. The explanations of the equations should be checked, and the definitions and bounds of all variables should be provided, together with the necessary references. Many of the equations form part of the surrounding sentences, so attention is needed to ensure correct sentence construction.
Please also pay special attention to the usage of abbreviations.
Best wishes,
**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.
This paper presents a novel dual-stage ensemble framework for facial expression recognition (FER), incorporating multiple CNN architectures and a secondary statistical module for enhanced performance. The work is well-motivated and addresses a significant challenge in FER — balancing accuracy and efficiency across complex emotional categories.
The idea of combining traditional deep learning feature extractors with a secondary layer that leverages ensemble learning and feature selection is interesting and contributes meaningfully to current research.
However, several aspects of the paper would benefit from further clarification, improvement in presentation quality, and deeper experimental analysis. I commend the authors on the depth of technical implementation but suggest revisions before acceptance.
The paper would benefit from a clearer articulation of why this dual-stage approach significantly outperforms single-stage models beyond reporting results — a short theoretical or intuitive explanation would strengthen the contribution claim.
The description of the Secondary Statistical Module (SSM) needs to be clearer. Please define its operations with a formal algorithm or flow diagram.
It is unclear whether data augmentation or cross-validation strategies were consistently applied across all models during evaluation.
Baseline Comparisons: While improvements are shown, the paper lacks statistical tests (e.g., paired t-test) to confirm whether the improvements are statistically significant.
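As a minimal illustration of what I have in mind, a paired t-test over cross-validation folds could be computed with SciPy (the per-fold accuracies below are placeholders, not values from the manuscript):

```python
import numpy as np
from scipy import stats

# Hypothetical per-fold accuracies for the proposed model and one baseline
# (placeholder values, not results from the manuscript).
proposed = np.array([0.912, 0.905, 0.918, 0.910, 0.907])
baseline = np.array([0.894, 0.889, 0.901, 0.896, 0.892])

# Paired t-test over folds: is the mean difference in accuracy significant?
t_stat, p_value = stats.ttest_rel(proposed, baseline)
print(f"paired t-test: t = {t_stat:.3f}, p = {p_value:.4f}")
```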
Efficiency Metrics: Time comparisons are promising; however, detail the environment (e.g., GPU model, RAM, software versions) clearly in the experiments section.
Additional Metrics: Adding ROC-AUC curves, F1-scores, or precision/recall tables would better validate robustness.
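To make this concrete, a short scikit-learn sketch could report these metrics (the arrays below are illustrative placeholders rather than the paper's predictions):

```python
import numpy as np
from sklearn.metrics import classification_report, roc_auc_score

# Placeholder labels, predictions, and class probabilities for a 3-class example.
y_true = np.array([0, 1, 2, 1, 0, 2, 1, 0])
y_pred = np.array([0, 1, 2, 0, 0, 2, 1, 1])
y_score = np.array([
    [0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.1, 0.2, 0.7], [0.5, 0.3, 0.2],
    [0.7, 0.2, 0.1], [0.1, 0.1, 0.8], [0.2, 0.6, 0.2], [0.3, 0.5, 0.2],
])

# Per-class precision, recall, and F1-score.
print(classification_report(y_true, y_pred, digits=3))
# Macro-averaged one-vs-rest ROC-AUC for the multi-class setting.
print("macro ROC-AUC:", roc_auc_score(y_true, y_score, multi_class="ovr"))
```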
Figure captions should be self-contained — briefly describe what each figure shows without relying on the main text.
Minor grammar errors and typos are present (e.g., "DesNet" should be "DenseNet", "avaialbe" should be "available").
Some technical terms (e.g., MFCC, GMM, feature stacking) are used without definition upon first appearance — please add a brief explanation for a broader audience.
Suggest improving paragraph transitions between "Materials and Methods" and "Results" to make the paper more fluid.
The related work section could conclude with a comparative analysis table showing key differences between the proposed method and existing methods (e.g., performance, architecture, limitations).
Please include recent references from 2023–2024, especially on the use of Vision Transformers (ViT) or lightweight CNNs in FER, to strengthen the paper's relevance.
Summary of Strengths
- Novel dual-stage ensemble design with strong empirical results.
- Evaluation on multiple datasets demonstrating effectiveness.
- Good balance between model complexity and real-world applicability.
Summary of Weaknesses
- Lack of clarity in explaining some modules (e.g., SSM).
- Minor issues with figure quality, typos, and baseline result analysis.
- Insufficient discussion of limitations and ethical considerations.
This is a well-written and technically sound research paper. It focuses on a dual-ensemble framework for facial emotion recognition. Performance gains are demonstrated using ensemble learning and deep feature extraction, and the empirical validation is evident. However, the following suggestions could further strengthen the paper.
1- There is a need to justify why these specific models (DenseNet, VGG, ResNet, Xception, ViT) were selected over alternative architectures.
2- Briefly describe the generalizability of the model.
3- The selection of XGBoost as the meta-learner is not well explained. Both simpler and more complex alternatives exist, but the authors did not evaluate or discuss them (see the stacking sketch after point 8).
4- Some of the equations are misnumbered (e.g., "equation xx"), which creates ambiguity in the paper. The authors should revise the manuscript accordingly.
5- The authors have utilized ensemble learning; however, there is a need to provide information about the additional steps taken to prevent overfitting. Please also provide more details about augmentation and cross-validation (see the cross-validation sketch after point 8).
6- There is no information about handling class imbalance specifically for mixed emotions.
7- In the paper, DenseNet221 is used in ECER; however, the justification for this choice is missing.
8- The authors used IMED in the paper for blended emotions, but details about how the labels were created have not been provided. Were the emotions manually annotated, crowd-sourced, or synthesized?
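Regarding point 3, the following is a minimal sketch of the kind of stacking configuration in question, using scikit-learn's StackingClassifier with an XGBoost meta-learner; the base estimators are illustrative stand-ins, not the paper's CNN feature extractors:

```python
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from xgboost import XGBClassifier

# Illustrative base learners standing in for the CNN-derived predictors.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("svm", SVC(probability=True, random_state=0)),
]

# XGBoost as the meta-learner discussed in point 3.
stack_xgb = StackingClassifier(
    estimators=base_learners,
    final_estimator=XGBClassifier(n_estimators=200, learning_rate=0.1),
    cv=5,
)

# A simpler meta-learner that could be compared against, per the comment above.
stack_lr = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
# stack_xgb.fit(X_train, y_train); stack_xgb.predict(X_test)  # hypothetical data
```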
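Regarding point 5, explicitly reporting the cross-validation protocol, for example along the lines of the stratified k-fold sketch below, would help (the data, class count, and fold count are placeholders):

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Placeholder features and labels (seven emotion classes), not the paper's data.
X = np.random.default_rng(42).random((100, 64))
y = np.tile(np.arange(7), 15)[:100]

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    # Any augmentation (flips, crops, jitter) should be applied only to the
    # training split, never to the validation split, to avoid leakage.
    print(f"fold {fold}: {len(train_idx)} train / {len(val_idx)} validation samples")
```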
NA
This manuscript is fairly well written and introduces novelty in facial emotion recognition. It can nevertheless be further improved.
1- In the paper, there are numerous typos, such as "avaialbe" instead of "available", "Exception" instead of "Xception", "DesNet" instead of "DenseNet", etc. This indicates a lack of proofreading.
2- In Figure 7, the (d) label is repeated, and there is no label (e), which is referred to in the text.
3- Equation numbers such as "equation xx" are incorrect, reflecting a rushed submission.
4- In several places, content is unnecessarily repeated (e.g., the description of the ensemble structure), which creates redundancy.
5- Similarly, the references are repeated without contextual description.
6- The authors use the terms "simple emotions," "typical emotions," and "common emotions" for the same concept. The terminology should be standardized.
7- In ECER and PMP, the “SSM module” is mentioned but poorly described. More explanation is needed.
8- The details of the training parameters should be reported, such as learning rates, batch sizes, optimizers, number of epochs, etc., as sketched below.
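For example, a configuration summary of the following kind would suffice (the values are placeholders, not taken from the manuscript):

```python
# Hypothetical training configuration of the sort that should be reported
# explicitly in the manuscript (placeholder values only).
training_config = {
    "optimizer": "Adam",
    "learning_rate": 1e-4,
    "batch_size": 32,
    "epochs": 50,
    "lr_schedule": "reduce on plateau (factor 0.1, patience 5)",
    "early_stopping_patience": 10,
}

for name, value in training_config.items():
    print(f"{name}: {value}")
```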