Review History


All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.

Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.


Summary

  • The initial submission of this article was received on May 15th, 2025 and was peer-reviewed by 3 reviewers and the Academic Editor.
  • The Academic Editor made their initial decision on July 1st, 2025.
  • The first revision was submitted on August 8th, 2025 and was reviewed by 2 reviewers and the Academic Editor.
  • The article was Accepted by the Academic Editor on September 4th, 2025.

Version 0.2 (accepted)

· Sep 4, 2025 · Academic Editor

Accept

Thank you for your valuable contribution to science.

[# PeerJ Staff Note - this decision was reviewed and approved by Xiangjie Kong, a PeerJ Section Editor covering this Section #]

**PeerJ Staff Note:** Although the Academic and Section Editors are happy to accept your article as being scientifically sound, a final check of the manuscript shows that it would benefit from further editing. Therefore, please identify necessary edits and address these while in proof stage.

·

Basic reporting

Clear, professional English; well-structured with appropriate figures/tables; sufficient background and references; results are self-contained.

Minor presentation fixes (editorial only):
• Define symbols/abbreviations at first use.
• Figure 4: Relabel panels to A–F and use the caption: “(A–C) Train/Val accuracy for original, +5k, +10k; (D–F) Train/Val loss for original, +5k, +10k.”
• Model figure: Please delete the label “Proposed ENeTAMIB model” from Figure 4.
• (Optional) Near the model figure, add input size and key hyperparameters for quick reference.

Experimental design

Original research with a clear question and a well-motivated gap; methods and training details are adequate for replication; ablation and explainability analyses strengthen interpretation. No additional experiments are required for acceptance.

Validity of the findings

Data and analyses are appropriate and controlled; cross-validation, external test, and reported metrics support the conclusions, which are properly bounded by the evidence.
Optional (not required for acceptance): add a single calibration plot (e.g., a reliability diagram with expected calibration error, ECE).
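
A minimal sketch of the computation behind such a plot, assuming the model's validation softmax outputs and integer labels are available as NumPy arrays (the names probs and y are hypothetical, not taken from the manuscript):

```python
import numpy as np

def expected_calibration_error(probs, y, n_bins=10):
    """ECE: bin predictions by confidence, then compare each bin's
    average confidence with its empirical accuracy."""
    conf = probs.max(axis=1)       # confidence of the predicted class
    pred = probs.argmax(axis=1)    # predicted class index
    correct = (pred == y).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece

# Toy demo with random outputs (replace with real validation predictions).
rng = np.random.default_rng(0)
logits = rng.normal(size=(500, 57))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
y = rng.integers(0, 57, size=500)
print(f"ECE: {expected_calibration_error(probs, y):.3f}")
```

Plotting per-bin accuracy against per-bin confidence from the same binning yields the reliability diagram.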

Additional comments

* Strong, well-organized manuscript suitable for publication after the two minor figure edits noted above.
* These are production-level corrections and do not affect the scientific content.

Production notes (for the journal):
- Update Figure 4 panel markers to A–F and apply the revised caption.
- Remove the figure label text “Proposed ENeTAMIB model” from Figure 4.

·

Basic reporting

-

Experimental design

-

Validity of the findings

-

Additional comments

Thank you for your thorough and constructive revisions. I have carefully reviewed the updated manuscript along with your detailed responses to the reviewers’ comments. It is clear that you have made significant efforts to address each concern raised.

I appreciate the improvements in the clarity of the methodology, the inclusion of ablation studies, the integration of external validation (HAM10000 dataset), and the addition of hyperparameter details and architectural diagrams. These changes have strengthened the technical rigor, reproducibility, and scientific contribution of your work.

Your revisions to the explainability analysis and the more critical interpretation of Grad-CAM and LIME visualizations add meaningful depth to the clinical applicability of your study. The updated manuscript now presents a clear pipeline, justifies its novelty, and demonstrates robustness through cross-validation, SMOTE balancing, and statistical tests.

Overall, I am satisfied with the quality of the revision.

Version 0.1 (original submission)

· Jul 1, 2025 · Academic Editor

Major Revisions

**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.

**Language Note:** The review process has identified that the English language must be improved. PeerJ can provide language editing services - please contact us at [email protected] for pricing (be sure to provide your manuscript number and title). Alternatively, you should make your own arrangements to improve the language quality and provide details in your response letter. – PeerJ Staff

·

Basic reporting

-

Experimental design

-

Validity of the findings

-

Additional comments

1. Title Simplification: The current title is overly complex and difficult to interpret. Please consider revising it to be more concise and understandable to a broader audience.

2. Dataset Reference Missing: In the Materials and Methods section, there is no mention or link to the benchmark dataset used. It is essential to provide a direct reference or citation to ensure reproducibility.

3. Lack of Technical Depth in Augmentation: The augmentation techniques used are limited to basic operations like rotation and flipping, despite the dataset being relatively small. This raises concerns about whether such minimal augmentation is sufficient for robust model training. Please justify this choice technically.

4. Diverse Image Modalities - No Unified Preprocessing Justification: The dataset comprises images from different sources (dermoscopic, lab cameras, regular photographs), yet no explanation is given on how preprocessing was uniformly applied across these varying modalities. It is critical to clarify how such diverse data was standardized and processed consistently.

5. Unclear Contributions and Lack of Novelty: The manuscript vaguely mentions that methods were used, explored, and applied without explicitly identifying the novel contributions. It is unclear whether the novelty lies in the ensemble, preprocessing, or augmentation. A clear technical pipeline or algorithm must be presented to demonstrate originality beyond conventional approaches.

6. Class Imbalance and Overfitting: There is a significant imbalance among classes, especially between classes 50–56 and the first few classes. The manuscript does not explain how overfitting was prevented or what optimization strategies were employed to handle this imbalance. Please provide technical clarifications; one standard mitigation, class-weighted loss, is sketched after this list.

7. Generic Methodology Section: The methodology lacks specific details and appears generic. It does not highlight the novel aspects of your work, making it hard to distinguish your approach from existing conventional models. The section should be revised to emphasize your unique contributions.

8. Missing Model Weights and Visualizations: Include the model weights and a visual summary (e.g., attention blocks, model architecture) in the manuscript to help readers understand customizations and interpretability aspects of your model.

9. Unexplained Notations in Equations: The notations used in Equations (1) to (4) are not clearly explained, especially in the context of how they were adapted to your dataset. Please clarify.

10. Missing Hyperparameter Table: A separate table summarizing all hyperparameters used during training is expected. This will enhance the clarity of your model's setup and allow better reproducibility.

11. Inconsistencies Across Manuscript:
• Table 2 mentions only 10 epochs, while the corresponding graph in Figure 3 shows 100 epochs; these are contradictory.
• In Table 3, classes are indexed from 0 to 56, but in the confusion matrix they appear from 1 to 57, which is likewise inconsistent.
Please ensure all such discrepancies are corrected for coherence.

12. Integration of Explainable AI (XAI): It is unclear how and where XAI (e.g., Grad-CAM) was integrated into the model. Also, justify the choice of Grad-CAM over other interpretability methods within the manuscript.

13. Table Labeling Correction: In Table 4, replace the label "This work" with "Proposed work" for consistency and professionalism.

14. Weak Comparative Evaluation: The comparison shown in Table 4 is not convincing. It is unclear how comparisons with other works that used balanced data were made against your highly imbalanced dataset. Please justify the fairness and validity of such comparisons.

15. Need for Ablation Studies: Ablation trials are required to validate the importance and impact of each component in your proposed model. This is necessary to support your claims of performance improvement.

16. Figure Arrangement: Figures are taking up excessive space in the manuscript. Please consider arranging related figures side by side with sub-labels (e.g., Fig. 6a and 6b, Fig. 7a and 7b). Similarly, Figures 4 and 5 can be combined as subplots to improve layout and readability.
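
For point 6 above, a minimal sketch of one standard mitigation for the class imbalance, class-weighted cross-entropy, assuming a PyTorch pipeline; the label array is a stand-in, not the authors' data:

```python
import numpy as np
import torch
from sklearn.utils.class_weight import compute_class_weight

# Stand-in labels: 888 training images spread over 57 classes
# (replace with the real training labels).
train_labels = np.arange(888) % 57

# Inverse-frequency ("balanced") weights so sparse classes such as
# 50-56 still contribute meaningfully to the gradient.
classes = np.unique(train_labels)
weights = compute_class_weight(class_weight="balanced",
                               classes=classes, y=train_labels)

# The weighted loss penalizes mistakes on minority classes more heavily.
criterion = torch.nn.CrossEntropyLoss(
    weight=torch.tensor(weights, dtype=torch.float32))
```

Reporting the chosen weighting (or an alternative such as focal loss or oversampling) would directly answer this concern.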

·

Basic reporting

The manuscript is generally well-structured and written in understandable English. However, it would benefit from a thorough language edit to eliminate redundancies and improve clarity, especially in sections discussing data augmentation and model components. The background and literature review cover a reasonable number of sources, but some recent and relevant works in skin lesion classification and explainable AI are missing or under-cited. Additionally, several figures and tables are inadequately explained or formatted. For example, the architectural details of the proposed ENeTAMIB model are not clearly visualized, which limits readers’ understanding of the network design. Moreover, the dataset description and preprocessing steps are fragmented and should be better organized. Finally, although performance results are presented, the manuscript lacks a link to source code or full data to ensure reproducibility.

Experimental design

The research aligns with the journal's scope and addresses a relevant problem in medical image analysis. The research question is clearly motivated, aiming to improve skin disease classification using a hybrid CNN model enhanced with attention and inception blocks. However, the methodology needs further refinement. The manuscript lacks sufficient technical detail about the structure and implementation of the proposed modules (e.g., how the attention mechanism interacts with the inception block, or how these components are positioned within EfficientNet-B2). Training settings, data partitioning strategies, and evaluation protocols (e.g., number of epochs, early stopping criteria) are insufficiently described. Most importantly, the reliance on a heavily augmented dataset derived from only 888 original images raises concerns about overfitting and weak generalization. To improve scientific rigor, the authors should include ablation studies, clearer architectural visualizations, and a discussion of how their method compares to baseline models under identical training conditions.

Validity of the findings

The reported results demonstrate high classification accuracy, but their validity is questionable due to limited dataset size, extensive augmentation, and a lack of external validation. The authors do not sufficiently control for overfitting, and no statistical tests are applied to confirm the significance of performance differences. Furthermore, while explainable AI tools like Grad-CAM and LIME are used, the discussion around these visualizations is minimal and lacks depth. There is no real analysis of how these explainability maps correlate with clinical features or improve model trust. The conclusions are too confident given the limitations of the experimental setup. To strengthen the study, the authors should validate their model on external datasets, provide results across multiple random splits or cross-validation folds, and integrate statistical assessments (e.g., confidence intervals, t-tests) to support their claims.

Additional comments

The integration of deep learning and explainable AI for skin disease diagnosis is highly relevant. However, to improve the quality and scientific clarity of your manuscript, I recommend major revisions based on the following concerns:

1- Architecture Novelty and Justification:
The proposed model is composed of well-known components (EfficientNet-B2, attention module, inception block). While the integration may be effective, the manuscript lacks a clear technical justification or innovation that sets it apart from other architectures in the literature.

2- Small Dataset and Augmentation Concerns:
The original dataset includes only 888 samples. The reported accuracy of over 98% raises concerns about overfitting, particularly since most results are based on aggressively augmented data (up to 10,000 images). Additional discussion of generalization, limitations, and validation strategies (e.g., cross-validation or external datasets) is needed; a minimal cross-validation sketch is given after this list.

3- Methodological Clarity and Replicability:
The implementation of modules (attention and inception) should be explained with more clarity and detail, ideally including architectural diagrams and pseudocode. Moreover, training configurations (e.g., batch size, optimizer, learning rate, and early stopping) are not comprehensively reported.

4- Explainability Analysis:
While Grad-CAM and LIME visualizations are included, the interpretation of these results is superficial. Please provide a more critical assessment of how XAI contributes to model trust, reliability, or clinical applicability.

5- Presentation and Language:
The manuscript would benefit from language editing. Redundancies, overly long explanations (especially in the augmentation section), and occasional grammatical errors affect the paper’s clarity.

6- Benchmarking Against Literature:
While a comparison with state-of-the-art models is included, the analysis is mostly descriptive. A more critical and quantitative discussion—perhaps including statistical significance testing—would improve the paper’s scientific value.
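
Regarding point 2, stratified k-fold cross-validation over the original (pre-augmentation) images is one concrete validation strategy; the sketch below uses stand-in labels and a dummy scoring function, so every name is hypothetical:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def train_and_evaluate(train_idx, val_idx, labels):
    """Stand-in for the authors' training routine: train on train_idx
    (augmenting only those images) and return validation accuracy.
    Returns a dummy score here so the sketch runs end to end."""
    return 0.0

# Stand-in labels: 888 original images over 57 classes.
labels = np.arange(888) % 57

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = [train_and_evaluate(tr, va, labels)
          for tr, va in skf.split(np.zeros(len(labels)), labels)]
print(f"5-fold accuracy: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```

The essential point is that augmentation happens inside each training fold, so augmented copies of one image never leak into its validation fold.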

We encourage you to revise the manuscript accordingly. With better justification of the model’s design, improved presentation, and more rigorous experimentation, this work could be a valuable contribution to the field.

·

Basic reporting

The manuscript is generally well-structured and adheres to PeerJ’s format guidelines. However, there are multiple issues with grammar, phrasing, and sentence construction throughout the text that affect clarity and readability. The writing would benefit from thorough language editing, preferably by a native English speaker or a professional editing service. Moreover, some figure captions lack sufficient descriptive detail, making them difficult to interpret without referring to the main text. Additionally, while the dataset used is appropriately cited, the manuscript does not provide a DOI or direct URL, which would be necessary for reproducibility and data transparency.

Revise the manuscript for improved English usage and readability. Enhance figure captions to be more informative and include a direct link or DOI for the dataset used.

Experimental design

The proposed model architecture is novel and technically sound, combining EfficientNet-B2 with attention and inception modules. However, the experimental setup lacks critical details. Key training parameters such as learning rate, batch size, optimizer type, and data splitting strategies are not explicitly stated, which limits reproducibility. Furthermore, there is no ablation study to assess the individual contributions of the attention and inception components. While data augmentation was used to overcome limited dataset size, the original dataset comprises only 888 samples, raising concerns about the model’s ability to generalize.

Include detailed training parameters and a complete description of the model implementation. Conduct ablation experiments to justify the added components of the architecture. Discuss strategies used to mitigate potential overfitting due to small sample size, such as regularization or early stopping.
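
On mitigating overfitting, one example of what could be reported is an explicit early-stopping criterion; a minimal, framework-agnostic sketch (an illustration, not the authors' implementation):

```python
class EarlyStopping:
    """Stop training when validation loss has not improved by at least
    min_delta for `patience` consecutive epochs."""

    def __init__(self, patience=10, min_delta=1e-4):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience  # True -> stop training
```

Calling step(val_loss) once per epoch returns True when training should halt; stating the patience and tolerance actually used would address the reproducibility concern.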

Validity of the findings

The results presented are impressive, with very high performance metrics across 57 skin disease classes. However, the absence of cross-validation or testing on an external dataset raises questions about the generalizability of the findings. Given the small original dataset and heavy reliance on augmentation, these metrics may not reflect real-world performance. Additionally, the manuscript does not provide confidence intervals, statistical tests, or p-values to support the significance of the results. While explainability techniques like Grad-CAM and LIME are appropriately used, their interpretations are not validated by dermatologists or clinical experts.

Perform k-fold cross-validation and/or test the model on an independent dataset (e.g., HAM10000 or ISIC-2019) to demonstrate robustness. Include statistical analyses such as confidence intervals or hypothesis testing. If feasible, involve clinical experts to validate the relevance of the XAI visualizations.
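
For the statistical analyses, a percentile-bootstrap confidence interval for test accuracy is a simple option; a minimal sketch assuming arrays of true and predicted labels (names hypothetical):

```python
import numpy as np

def bootstrap_accuracy_ci(y_true, y_pred, n_boot=10_000, alpha=0.05, seed=0):
    """Resample test cases with replacement and take the
    (alpha/2, 1 - alpha/2) quantiles of the resampled accuracies."""
    rng = np.random.default_rng(seed)
    correct = (np.asarray(y_true) == np.asarray(y_pred)).astype(float)
    n = len(correct)
    boots = [correct[rng.integers(0, n, n)].mean() for _ in range(n_boot)]
    lo, hi = np.quantile(boots, [alpha / 2, 1 - alpha / 2])
    return correct.mean(), (lo, hi)

# Toy demo: 200 stand-in test cases with a 10% error rate.
y_true = np.arange(200) % 57
y_pred = y_true.copy()
y_pred[:20] = (y_pred[:20] + 1) % 57
acc, (lo, hi) = bootstrap_accuracy_ci(y_true, y_pred)
print(f"accuracy {acc:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

The same resampling applies to macro-F1 or AUC by swapping in the corresponding metric.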

Additional comments

This manuscript introduces a promising model architecture for multi-class skin disease classification and integrates explainable AI techniques effectively. The scope of the work aligns well with current trends in AI-driven medical diagnostics. However, improvements in methodological transparency, evaluation rigor, and clarity of writing are needed to strengthen the scientific contribution and reproducibility of the study. Furthermore, the manuscript would benefit from a clearly stated objective at the end of the introduction, a discussion on the study's limitations, and a forward-looking conclusion highlighting potential applications in clinical practice.

Clarify the research objective and contributions early in the introduction. Add a limitations section discussing dataset size and generalization. Expand the conclusion to include implications for clinical adoption and directions for future research.

All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.