All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
Dear Authors, thank you for addressing all reviewers' concerns.
I carefully evaluated your rebuttal, and I believe that your work can proceed to the next editorial process.
[# PeerJ Staff Note - this decision was reviewed and approved by Mehmet Cunkas, a PeerJ Section Editor covering this Section #]
The authors have condensed excessively lengthy phrases, rectified minor grammatical inaccuracies, and enhanced overall clarity. The introduction offers sufficient context and rationale for the investigation, while the literature evaluation is thorough and current.
Figures and tables have been enhanced: captions are now more informative, and the primary figures (notably Figures 1–2) are crisper and accompanied by high-resolution counterparts. Errors in reference formatting (e.g., “Ekman & Cordaro, 1798”) have been rectified. The manuscript adheres to the structural norms of PeerJ.
No other modifications necessary.
The article aligns with the journal's scope and demonstrates methodological rigor. The authors have elucidated their preprocessing choices, including image resizing to 100×100 and the utilization of YOLOv8 augmentation, while also rationalizing the selection of YOLOv8-S. Definitions of mathematical terminology (e.g., residual mapping F in Eq. 1) have been incorporated to enhance reader understanding.
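For context, the preprocessing described here (resizing to 100×100 plus normalization) can be sketched in plain NumPy. This is a minimal illustration with a hypothetical `preprocess` helper; the authors' actual pipeline presumably uses OpenCV or the YOLOv8 data loader:

```python
import numpy as np

def preprocess(img: np.ndarray, size: int = 100) -> np.ndarray:
    """Nearest-neighbour resize to size x size, then scale pixels to [0, 1]."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size   # source row for each output row
    cols = np.arange(size) * w // size   # source column for each output column
    resized = img[rows][:, cols]
    return resized.astype(np.float32) / 255.0

# Example: a dummy 480x640 RGB frame
frame = (np.random.rand(480, 640, 3) * 255).astype(np.uint8)
x = preprocess(frame)
print(x.shape, x.dtype)  # (100, 100, 3) float32
```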
The experimental results are robust and properly articulated, with accuracy levels documented across various benchmark datasets. The uniqueness of the adaptive feedback systems is now highlighted, setting them apart from previous ITS methods.
The authors have strengthened the discussion of cultural bias by examining demographic disparities in the datasets (RAF-DB and AffectNet) and outlining mitigation techniques (augmentation, fairness metrics). Limitations are openly acknowledged, and overstatements from previous versions have been tempered. The framework is now characterized as "a promising step" rather than a definitive standard.
Dear authors, please address the concerns raised by Reviewers 1 and 3.
The article maintains clear, professional English throughout
Introduction effectively establishes context and motivation for developing emotion-aware ITS for cognitive disabilities
Structure follows discipline norms with logical flow from problem statement to solution
Literature review is comprehensive and relevant
Some technical terms in equations (e.g., residual mapping F in Eq. 1) could benefit from brief intuitive explanations
Figure quality (especially Figures 1-2) could be improved with higher resolution and larger labels
Research fits well within journal's scope of AI for inclusive smart cities
Methods are rigorously described with sufficient detail for replication
Code and datasets are publicly available (GitHub and dataset links provided)
Data preprocessing is adequately described (resizing, normalization)
Evaluation metrics are comprehensive and appropriate
More justification for choosing YOLOv8-S over other variants would strengthen the model selection discussion
Ethical considerations section could be expanded with specifics about IRB approval for future real-world testing
Experiments are thorough with cross-dataset validation
Results support claims with strong performance metrics
Limitations are honestly discussed with mitigation strategies
EigenCam explainability is a novel and well-validated contribution
Future directions are clearly outlined
Could better emphasize novelty of adaptive feedback mechanisms compared to prior ITS solutions
Discussion of cultural bias could be strengthened with analysis of dataset demographics
While the manuscript has been significantly improved, two minor enhancements would further strengthen it:
Figure quality: Figures 1-2 remain somewhat small in the current version. Please either:
Increase their size in the main document, or
Provide them as separate high-resolution files (e.g., 300 DPI PNG/TIFF).
Future work specificity: The discussion of physiological sensors (Section "Preliminary Mitigation Strategies") would benefit from concrete examples (e.g., ECG for stress detection, GSR for arousal monitoring) and planned testing protocols.
These adjustments would improve readability and implementation clarity without requiring major revisions.
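On the figure-export point above, producing 300 DPI files is typically a one-line option in common plotting tools. A minimal matplotlib sketch (the filename and plot contents are placeholders, not the manuscript's figures):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; safe for batch export
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot([0, 1], [0, 1], label="example")
ax.legend()
# dpi=300 yields a print-quality PNG; bbox_inches="tight" trims extra margins
fig.savefig("figure1_hires.png", dpi=300, bbox_inches="tight")
```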
The manuscript is clearly written, well-structured, and professionally presented. The literature is relevant and comprehensive.
The research is methodologically robust, with comprehensive descriptions of datasets, model design, and evaluation metrics.
The results are substantial and supported by both quantitative and qualitative evidence.
The paper is generally well written and professionally organized; however, some sections contain overly long sentences and minor grammatical errors that reduce readability and should be fixed through careful editing or professional language review. The introduction clearly explains the motivation for the study and relevant context, though certain passages are overly long and could be more concise. The literature review is sufficient and incorporates recent references, but coverage of existing emotion-aware assistive systems designed specifically for cognitive disabilities could be broadened. Figures and tables are suitable but need more descriptive captions (e.g., specifying the context and purpose of each dataset). Some references appear to contain errors (e.g., "Ekman & Cordaro, 1798") and should be verified.
The paper suggests a novel integration of YOLOv8 and EigenCam explainability in an ITS environment that is technically sound and in accordance with the journal's scope. The framework design and mathematical model specifications are exhaustively described.
The dataset preprocessing steps are briefly discussed but lack justification (e.g., why images were resized to 100×100, and why no augmentation beyond the YOLOv8 defaults was applied). Ethical considerations such as consent, data security, and bias mitigation are mentioned but require additional attention. Although the evaluation methods, performance metrics, and model selection process are comprehensively explained, the absence of pilot testing or real-world validation limits the robustness of conclusions about the system's practical applicability.
The model's interpretability is supported by EigenCam visualizations, and experimental results indicate robust performance (e.g., 95.8% accuracy on RAF-DB, 93.95% on AffectNet, and 100% on CK+48). These findings are encouraging; however, they are based entirely on publicly available benchmark datasets. Claims about assisting individuals with cognitive disabilities during transit are somewhat exaggerated in the absence of real-world field testing or user studies. The discussion of limitations addresses bias and trust but could be further developed, particularly regarding real-world deployment and usability challenges. The results generally agree with the conclusions; however, greater caution is warranted when referring to the framework as "a new standard for inclusive AI-driven solutions."
This study is both pertinent and innovative, with significant technical potential. To strengthen the manuscript, the authors should: (1) incorporate simulated or pilot real-world data, (2) broaden the ethical and privacy discussion, (3) improve the legibility of figure captions, and (4) fix reference formatting errors. With these enhancements, the work would make a valuable contribution to inclusive AI applications in intelligent transportation systems.
Dear Authors,
Thank you for the contribution.
The reviewers highlighted gaps and concerns, which I describe below. Please excuse any typos; I just broke my workstation while writing this letter.
Following reviewers' comments, I agree that the study is technically strong and socially relevant.
However, it would be useful to demonstrate that the system scales beyond a small workstation and performs reliably in realistic transportation settings.
Also, please ensure that you document all code, preprocessing, and dependencies so that results can be reproduced easily.
Please support performance claims with statistical tests and confidence intervals.
Finally, smooth section transitions, define core terms for non-specialists, enrich figure captions, proofread residual language issues, and add brief discussions on data privacy, bias mitigation, and how EigenCam compares to other explainability tools.
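On the statistical-reporting request, a bootstrap confidence interval over per-fold accuracies is straightforward to compute. A minimal sketch (the accuracy values are placeholders, not results from the manuscript):

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical per-fold accuracies from k-fold cross-validation (placeholders)
fold_acc = np.array([0.958, 0.951, 0.962, 0.949, 0.955])

# Percentile bootstrap: resample folds with replacement, collect the means
boot_means = np.array([
    rng.choice(fold_acc, size=fold_acc.size, replace=True).mean()
    for _ in range(10_000)
])
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])
print(f"mean accuracy {fold_acc.mean():.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
```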
**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.
The manuscript is written in excellent, professional English, making it understandable to an international audience. The terminology is consistent, and the wording is concise.
Introduction and Context: This section explains the obstacles that people with cognitive disorders experience while using transportation systems. The study's motivation is clearly described, emphasizing both societal importance and research shortages.
The literature review includes substantial and pertinent background references. The authors discuss cutting-edge emotion detection technologies, such as machine learning and explainable AI techniques (e.g., EigenCam). Key research and gaps are indicated.
Areas for Improvement:
1. Improved Structure and Flow: The work respects PeerJ criteria, but several parts would benefit from clearer topic transitions. For example, the transition from related works to methodology seems abrupt. Transitional sentences to guide the reader would increase cohesion.
2. Proofread minor issues: Although the general language quality is good, there are some small grammatical flaws and awkward phrasing. Example: Line 61 should read "Integrating these emotions into social interactions is integral..." instead of "Integrating these emotions with social."
Line 63's phrase "foster relationships and build meaningful connections" is slightly redundant. Consider "foster relationships and build connections."
A thorough proofreading run could resolve these minor errors.
3. Definitions of Key Concepts: The explanation of concepts such as ITS, YOLOv8, fog computing, and EigenCam assumes prior knowledge. Consider briefly defining these for a broader readership; for example, a short overview of fog computing in the introduction would help non-specialist readers.
4. Results Presentation: The findings include numerous metrics and tables, but some captions (for example, the ROC curve figures) should be more descriptive to aid interpretation, particularly for non-expert readers. Including a brief interpretation in figure captions would improve clarity.
Conclusion: The article fulfills Basic Reporting criteria, but may benefit from some language refining, cleaner transitions between sections, and more accessible definitions for better clarity.
The article aligns with PeerJ Computer Science's AI focus on emotion recognition and ITS for cognitive impairments.
The use of YOLOv8 for emotion recognition, along with fog computing, cloud infrastructure, and explainable AI (EigenCam), is a technically sound method. The experimental evaluation across different datasets (RAF-DB, AffectNet, and CK+48) increases the study's robustness.
Areas for Improvement:
1. Reproducibility (Partial): - The code repository is shared (GitHub link supplied), which is great. However, the reproduction directions should be more specific. Include a README with setup steps, such as the Python environment and dependencies.
Provide example scripts or notebooks for reproducing key results, such as training and evaluation on RAF-DB. The computing infrastructure is described, but it has modest specifications (Intel Core i7 with 8 GB RAM). This raises the question of whether the larger YOLOv8 models could realistically be trained on large datasets such as AffectNet with this setup. Clarifying whether external compute resources (for example, cloud GPUs) were used would increase transparency.
2. Data Preprocessing (Sufficient but might be improved): - The authors provide thorough descriptions of dataset selection and characteristics (RAF-DB, AffectNet, CK+48), but preprocessing methods such as scaling, normalization, and data augmentation are not explicitly described. Including them would improve replicability and allow others to match their experimental settings more closely.
3. Evaluation Methods: The metrics used (accuracy, precision, recall, F1, specificity, IoU, BAC, MCC) are suitable for classification and localization tasks.
While cross-dataset testing is discussed, further details on model selection, hyperparameter tuning, and overfitting mitigation methods (e.g., early stopping, validation splits) might be provided.
Conclusion: The experimental design is fundamentally sound but requires more specific reproducibility instructions, data preprocessing details, and clarification of computational resources. These enhancements would improve the study's technical transparency and reproducibility.
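As an illustration of the overfitting-mitigation details requested in point 3, a generic early-stopping loop might look as follows (`train_one_epoch` and `validate` are hypothetical stand-ins for the authors' routines, not code from the paper):

```python
def fit_with_early_stopping(train_one_epoch, validate,
                            max_epochs: int = 100,
                            patience: int = 10,
                            min_delta: float = 1e-4):
    """Stop training once validation loss has not improved for `patience` epochs."""
    best_loss, best_epoch, wait = float("inf"), -1, 0
    for epoch in range(max_epochs):
        train_one_epoch()
        val_loss = validate()
        if val_loss < best_loss - min_delta:          # meaningful improvement
            best_loss, best_epoch, wait = val_loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:                      # patience exhausted
                break
    return best_loss, best_epoch

# Simulated validation losses: improve for four epochs, then plateau
losses = iter([0.9, 0.7, 0.6, 0.55] + [0.56] * 20)
best, when = fit_with_early_stopping(lambda: None, lambda: next(losses), patience=5)
print(best, when)  # 0.55 3
```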
The experiments align with the study objectives of establishing a robust and explainable emotion detection framework for individuals with cognitive limitations in intelligent transportation systems (ITS). The combination of YOLOv8 models, EigenCam explainability, and adaptive feedback mechanisms is consistently assessed across relevant datasets, hence confirming the study's objectives.
Evaluation across datasets: Performance metrics for RAF-DB, AffectNet, and CK+48 show satisfactory results, with extensive reporting on many performance indices (accuracy, precision, recall, F1, IoU, BAC, MCC). This extensive evaluation provides confidence in the framework's strength.
Areas for Improvement:
1. Generalization and Real-World Applicability: The model performs well on standard datasets, but its applicability to real-world ITS situations (e.g., busy transit hubs, variable lighting, occlusions) is uncertain. The report discusses simulated transit scenarios but does not provide solid real-world testing data. Future studies should include pilot deployments or more extensive simulations to ensure real-world applicability.
2. Limitations are acknowledged but could be expanded: The paper recognizes limitations including occlusions, lighting conditions, cultural bias, and reliance on facial cues, but mitigation techniques are largely deferred to future research. Including preliminary measures or suggestions (for example, incorporating physiological sensors and data augmentation for underrepresented populations) would better address these challenges.
3. Conclusion and Future Work: The conclusion effectively summarizes findings but may benefit from a more detailed explanation of future directions. For example, describing next stages such as longitudinal research with users, expanding to additional cognitive disorders, or combining multimodal data (speech, physiological signals) will help to concretely define the roadmap.
4. Replication and statistical rigor: While cross-dataset testing is stated, statistical tests (e.g., paired t-tests, Wilcoxon signed-rank tests) that establish the significance of improvements over baselines are only briefly mentioned. The explicit reporting of these tests and their outcomes would increase confidence in the conclusions.
Conclusion: The study's findings are mainly valid and support its goals. However, real-world applicability, limitation handling, and statistical validation require improvement. This would strengthen the results and increase their practical applicability.
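On point 4, a paired significance test over per-fold scores could be reported along these lines (a sketch assuming SciPy is available; the score arrays are placeholders, not the paper's results):

```python
import numpy as np
from scipy import stats

# Hypothetical per-fold F1 scores for the proposed model vs. a baseline (placeholders)
proposed = np.array([0.958, 0.951, 0.962, 0.949, 0.955, 0.960])
baseline = np.array([0.948, 0.939, 0.948, 0.933, 0.937, 0.940])

# Paired tests: each fold yields one matched pair of scores
t_stat, t_p = stats.ttest_rel(proposed, baseline)   # paired t-test
w_stat, w_p = stats.wilcoxon(proposed, baseline)    # Wilcoxon signed-rank test
print(f"paired t-test p={t_p:.4g}, Wilcoxon p={w_p:.4g}")
```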
The use of EigenCam for explainability is a key feature of this work, particularly for sensitive applications involving cognitive disabilities. However, the topic of trust calibration (for example, over-trust and under-trust in AI systems) could be expanded with more concrete techniques. For example, adding interactive or user-adaptive explainability techniques might improve the system's usability across a wide range of cognitive profiles.
The framework acknowledges AR devices and mobile apps, but provides few technical details on their integration. Future iterations could benefit from incorporating user interface design considerations, latency testing on wearable devices, and usability research with target audiences.
Dataset Bias and Diversity: The research acknowledges cultural bias in facial expression datasets but stops short of concrete remedies. Consider using data augmentation techniques to simulate varied scenarios, or discuss the idea of creating new datasets tailored to cognitive disorders and diverse demographics.
While the study focuses on ITS, the framework may have broader implications (e.g., therapeutic, educational). Exploring or identifying these broader use cases may make the work more compelling.
Figures, especially the ROC curves and EigenCam visualizations, are clear. However, figure captions should be more detailed so that each figure can be understood on its own. Including key facts directly in the captions would improve readability.
The introduction successfully contextualizes the problem of cognitive disability support within Intelligent Transportation Systems (ITS), highlighting the need for emotion-aware frameworks. The literature review is comprehensive, up-to-date, and appropriately referenced. A final proofreading pass is advised to fix minor grammatical issues and improve the flow of certain sentences, especially in the introduction and related works section.
The experimental design is sound and appropriately scoped for an AI Application article. The authors clearly describe the use of YOLOv8 variants for emotion recognition, including integration with fog computing and explainability through EigenCam. The methodology is detailed, with mathematical formulations, architectural insights, and visual explanations that support reproducibility. More specific details on data preprocessing (e.g., face alignment, augmentation, input resolution) should be included to support full replication.
The findings presented in the manuscript are well supported by the experimental results. The model consistently demonstrates high accuracy across all three datasets, with the YOLOv8 Small variant achieving up to 95.8% accuracy on RAF-DB and 100% on CK+48. The use of cross-dataset validation strengthens the argument for generalizability. A brief discussion on how ethical concerns (e.g., privacy of facial data) will be addressed in deployment would enhance the practical applicability of the findings.
The manuscript’s contribution is significant, offering a comprehensive ITS framework that is not only technically robust but also socially impactful. The preprocessing steps should be briefly expanded, and a short comparison between EigenCam and other XAI tools (e.g., Grad-CAM) should be included.
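To make the suggested comparison concrete: Grad-CAM weights activation channels by class-specific gradients, whereas EigenCam projects the activations onto their first principal component, requiring no gradients or class labels. The core idea can be sketched in NumPy (a minimal EigenCAM-style illustration on a random activation tensor, not the authors' implementation):

```python
import numpy as np

def eigen_cam(activations: np.ndarray) -> np.ndarray:
    """EigenCAM-style saliency: project C x H x W activations onto their
    first principal component. Gradient-free, unlike Grad-CAM."""
    c, h, w = activations.shape
    flat = activations.reshape(c, h * w).T          # (H*W, C): one row per pixel
    flat = flat - flat.mean(axis=0, keepdims=True)  # center the channel features
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    cam = (flat @ vt[0]).reshape(h, w)              # projection on 1st component
    cam = np.maximum(cam, 0)                        # keep positive evidence only
    return cam / (cam.max() + 1e-8)                 # normalize to [0, 1]

# Example: a dummy feature map with 64 channels on a 13x13 grid
feats = np.random.default_rng(0).standard_normal((64, 13, 13))
heatmap = eigen_cam(feats)
print(heatmap.shape)  # (13, 13)
```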
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.