All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
Thank you for submitting your revised manuscript.
After a careful review of your revisions and point-by-point replies, I can confirm that all of the reviewers’ comments have been fully and satisfactorily addressed. I have assessed the revised manuscript myself and am delighted with the result. The changes have significantly enhanced the clarity, readability, and overall impact of your work.
I am happy to recommend this manuscript for publication in its current form.
[# PeerJ Staff Note - this decision was reviewed and approved by Shawn Gomez, a PeerJ Section Editor covering this Section #]
After a careful review of the four reviewer reports, we are requesting Major Revisions for the following reasons:
- A critical concern is the application of SMOTE to the entire dataset, which introduces data leakage and invalidates the reported results. Please correct this by applying SMOTE only to the training data after the train-test split, and update all related experiments and results;
- The manuscript does not sufficiently describe: (i) the architecture of the Tab Transformer (layers, heads, embedding strategies); (ii) the rationale and design of the ensemble/meta-learner; (iii) hyperparameter tuning strategies and search space; and (iv) hardware setup and training time. Please expand these sections to ensure reproducibility;
- The extremely high accuracy (99%) raises concerns. Please provide additional details on how it was obtained (data split, cross-validation protocol) and verify that it is not an artifact of data leakage;
- Clearly state your manuscript’s novelty in the abstract and introduction, including the research gap addressed and why your method is an advance over existing approaches;
- Discuss the small dataset size and its impact on generalizability. Testing on additional or external datasets is strongly encouraged, or at least explicitly acknowledge this limitation.
**PeerJ Staff Note:** Please ensure that all review and editorial comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.
**Language Note:** The review process has identified that the English language must be improved. PeerJ can provide language editing services - please contact us at [email protected] for pricing (be sure to provide your manuscript number and title). Alternatively, you should make your own arrangements to improve the language quality and provide details in your response letter. – PeerJ Staff
1. The novelty of the proposed method is not clearly stated; revise the introduction to highlight the specific contribution.
2. The PIMA dataset is not described in enough detail; include feature definitions and preprocessing steps.
3. Recent related works (2022–2024) on Tab Transformers and ensemble learning are missing; update the literature review.
Liang, J., He, Y., Huang, C., Ji, F., Zhou, X., ... Yin, Y. (2024). The Regulation of Selenoproteins in Diabetes: A New Way to Treat Diabetes. Current Pharmaceutical Design, 30(20), 1541-1547. doi: https://doi.org/10.2174/0113816128302667240422110226
Liu, Z., Sang, X., Liu, Y., Yu, C., & Wan, H. (2024). Effect of psychological intervention on glycemic control in middle-aged and elderly patients with type 2 diabetes mellitus: A systematic review and meta-analysis. Primary Care Diabetes, 18(6), 574-581. doi: https://doi.org/10.1016/j.pcd.2024.09.006
Xiang, Y., Xu, Z., Xiao, R., Yao, Y., Tang, X., Fu, L., ... Ding, Y. (2025). Interacting and joint effects of assisted reproductive technology and gestational diabetes mellitus on preterm birth and the mediating role of gestational diabetes mellitus: a cohort study using a propensity score. Journal of Assisted Reproduction and Genetics, 42(2), 489-498. doi: https://doi.org/10.1007/s10815-024-03342-z
Ding, Y., Cai, X., Ou, Y., Liang, D., Guan, Q., Zhong, W., ... Lin, X. (2025). The Burden of Diabetes in the Southeastern Coastal Region of China From 1990 to 2019 and Projections for 2030: A Systematic Analysis of the 2019 Global Burden of Disease Study. Diabetes/Metabolism Research and Reviews, 41(1), e70031. doi: https://doi.org/10.1002/dmrr.70031
**PeerJ Staff Note:** It is PeerJ policy that additional references suggested during the peer-review process should only be included if the authors are in agreement that they are relevant and useful.
4. Figures and tables are not consistently discussed in the text; ensure each is referenced and explained.
5. Technical details such as SMOTE parameters, model architecture, and hyperparameter settings are insufficient; add for reproducibility.
6. The conclusion should better separate key findings, limitations, and future directions.
1. The justification for using the Tab Transformer model is weak, especially given its lower standalone performance; explain its inclusion more clearly.
2. The ensemble model design lacks detail; clarify how the base learners and meta-learner were selected and integrated.
3. The experimental setup does not mention cross-validation or statistical tests; add to validate model robustness and performance claims.
4. The impact of SMOTE on model performance is not analyzed; consider including an ablation or comparison without SMOTE.
5. The hyperparameter tuning strategy is unclear; specify the search method, parameter ranges, and evaluation criteria used.
6. The experimental design should address the small dataset size and its implications on generalizability.
1. The performance improvement over baseline models is marginal; include statistical significance testing to validate the results.
2. The Tab Transformer’s poor individual performance raises concerns about its contribution; provide justification or comparative analysis.
3. The use of a single small dataset (PIMA) limits generalizability; recommend testing on additional or larger datasets.
4. No error analysis or discussion of misclassifications is provided; include to support the reliability of findings.
5. The impact of SMOTE on model performance and potential overfitting is not evaluated; add supporting analysis.
6. Confidence intervals or variance measures for evaluation metrics are missing; include them to demonstrate result stability.
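For the confidence intervals requested in point 6, a simple percentile bootstrap over the test-set predictions is one standard option. A minimal sketch; the predictions below are hypothetical placeholders, not the authors' results:

```python
import numpy as np

def bootstrap_metric_ci(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for test-set accuracy."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    n = len(y_true)
    # Resample the test set with replacement, recomputing accuracy each time.
    idx = rng.integers(0, n, size=(n_boot, n))
    accs = (y_true[idx] == y_pred[idx]).mean(axis=1)
    return np.quantile(accs, [alpha / 2, 1 - alpha / 2])

# Hypothetical predictions: 90 correct out of 100 test samples.
y_true = np.zeros(100, dtype=int)
y_pred = np.array([0] * 90 + [1] * 10)
lo, hi = bootstrap_metric_ci(y_true, y_pred)
print(f"accuracy 0.90, 95% CI [{lo:.3f}, {hi:.3f}]")
```

Reporting the interval alongside the point estimate directly addresses the result-stability concern.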
The overall idea, using machine learning to predict diabetes, is clear but not very novel.
One major flaw is that SMOTE, or any data preprocessing fitted to the data, should ONLY be applied to the training set, not the WHOLE dataset.
Given the flaw in the method design, the findings are questionable.
The manuscript is written in clear and professional English, with appropriate academic tone throughout, although some sentences in the introduction and related work sections could benefit from simplification for improved readability. The introduction provides adequate background on diabetes and presents a general overview of machine learning approaches used in its prediction. However, the specific motivation for employing Tab Transformer and meta-ensemble learning is not stated clearly at the outset and could be emphasized more effectively by highlighting the limitations of existing models. The literature cited is relevant and fairly comprehensive, although integration of more recent works and clearer articulation of the research gap would strengthen the narrative. The article follows PeerJ’s structural standards with well-organized sections, visual illustrations, and a logical flow. Formal results, while not requiring theorems or proofs due to the applied nature of the study, are supported by well-defined evaluation metrics such as accuracy, precision, recall, F1-score, and ROC-AUC. The methodology, including preprocessing, model architecture, and tuning strategies, is described in sufficient technical detail to support reproducibility. Overall, the study presents a sound implementation of a hybrid deep learning ensemble framework, though refinement in the introduction and discussion of model interpretability would further enhance the manuscript.
Although the article fits within the journal's scope and applies technically sound methods, several aspects could still be improved. There is no explicit mention of ethical considerations or data usage permissions, even though the dataset is publicly available. The authors also do not provide access to source code or reproducibility scripts, which are important for transparency and scientific validation. The explanation of the Tab Transformer architecture is rather general and lacks technical details such as the number of layers or attention heads used. While data preprocessing is described, it lacks supporting visualizations or summary statistics to demonstrate how class balancing impacted the data distribution. The rationale behind the choice of base models in the ensemble is not discussed in depth. The evaluation uses a standard 80:20 train-test split, which is acceptable; however, incorporating k-fold cross-validation is recommended as a potential enhancement for future work to strengthen the generalizability of results. Additionally, some of the references cited are relatively outdated and could be replaced with more recent studies to improve the literature’s relevance and currency.
The manuscript presents a technically sound implementation of a hybrid ensemble learning approach combining Tab Transformer and meta-ensemble methods for diabetes prediction. While the paper does not explicitly claim or assess its novelty in comparison to state-of-the-art methods, the integration of attention-based architectures into tabular medical data classification is a growing area of interest, and the study provides a valuable example of such application. The rationale for replication is implied through the use of a publicly available dataset and common evaluation metrics, though it would be stronger if accompanied by open access to code and experiment scripts. The conclusions are clearly stated and generally align with the results, highlighting the strong performance of the hybrid model. The experiments, particularly the comparison of multiple models and metrics, are performed adequately. However, the argument could be strengthened by more critical analysis of why certain models underperform and how the Tab Transformer contributes to ensemble success. The conclusion mentions limitations briefly but could benefit from a more explicit discussion on unresolved challenges and future research directions, such as testing on clinical datasets or improving model interpretability.
Overall, this manuscript presents a promising application of hybrid ensemble learning in medical data classification. The integration of Tab Transformer into a meta-ensemble framework is technically well-executed and contributes to ongoing efforts to improve structured data modeling in healthcare contexts. That said, the manuscript would benefit from improved clarity in certain sections, particularly regarding the role and configuration of the Tab Transformer component. Additionally, providing public access to the source code and experimental scripts would enhance the reproducibility and impact of the work. Minor editorial improvements in language and organization, especially in the introduction and literature review, would further strengthen the paper. We encourage the authors to expand future work toward interpretability, ethical considerations in real-world deployment, and validation across diverse clinical datasets to increase the practical value of this approach.
Professional language but vague in key sections. It could be more specific.
More recent literature coverage is needed.
Architectural and ensemble design details need to be added.
SMOTE strategy needs clarification.
Claims are too broad given the current evidence. Need to add more specifics.
Add an ablation study.
Significance tests are missing; they are needed when accuracy is reported at 99%.
This research applies a state-of-the-art neural architecture (Tab Transformer) and meta-ensemble learning to a diabetes prediction problem. The hybrid modeling approach is fairly novel in my opinion. Thank you for this article.
Refer below for peer review comments:
Abstract:
1. The abstract does not clearly state what is novel about using a Tab Transformer and meta-ensemble for diabetes prediction. You may need to add a sentence explicitly stating the unique contribution of this approach.
2. Can you add a brief explanation of “meta-learner” and “Tab Transformer” for a broader audience?
3. You mention an “accuracy of 99%”; can you clarify whether this is on the test set, under cross-validation, or on a hold-out set? 99% is very high, so these specifics are needed to understand the context.
Introduction:
1. The introduction discusses diabetes prevalence. Can you please add a paragraph with a direct research gap and problem statement? This helps readers.
2. I don't see many recent studies on deep learning. Can you expand the related works section to include recent deep learning models for tabular data?
3. You may need to add the rationale for choosing a Tab Transformer over classic models. Can you explain its architectural advantage for tabular features? This rationale is important if you want to expand the reach of this article.
Methods:
1. Can you describe the Tab Transformer architecture in more detail, e.g., tokenization, embedding strategy, number of attention heads, and number of layers? These are very important details.
2. The ensemble construction using stacking and XGBoost needs a clearer explanation. How did you train XGBoost? What was the major consideration in training?
3. You mention SMOTE, but no details on normalization, feature selection, or outlier handling are provided. These details are very important given the small size of your dataset.
4. GridSearchCV and RandomizedSearchCV: please specify the parameter search space and validation strategy; this is important.
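One way to document the search space and validation strategy is shown below, sketched with scikit-learn's GridSearchCV on a RandomForest stand-in and synthetic data; the authors' actual estimator and grid are unknown and would need to be reported:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Toy stand-in for the PIMA features (8 columns, binary target).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Example of an explicitly documented search space.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
    scoring="f1",   # the evaluation criterion should also be reported
)
search.fit(X, y)
print("best params:", search.best_params_)
```

Stating the grid, the cross-validation scheme, and the scoring metric in the manuscript is what makes the tuning reproducible.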
Experimental Setup:
1. Clarify whether SMOTE was applied before or after the train-test split; this is not stated. Applying it before the split causes data leakage, so it must be applied only after the split.
2. This section does not mention cross-validation. Was it k-fold, stratified, or simple hold-out?
3. How did you tune the baseline models? Were default settings used? If there were any biases or notable observations, please include them.
4. Can you document the hardware setup (ex., CPU/GPU specs) and training time for Tab Transformer for fair comparison with classical models?
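Stratified k-fold cross-validation, as asked about in point 2, can be reported as a mean and standard deviation across folds. A minimal sketch on synthetic imbalanced data, not the authors' setup:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Imbalanced toy data (~65% negative class) standing in for PIMA.
X, y = make_classification(n_samples=300, n_features=8,
                           weights=[0.65], random_state=1)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f} "
      f"over {len(scores)} folds")
```

Stratification keeps the class ratio roughly constant in every fold, which matters for an imbalanced medical dataset.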
Results and Discussion:
1. 99% accuracy seems suspiciously high for the UCI diabetes dataset. Can you add statistical tests (e.g., a paired t-test or McNemar's test) comparing models to confirm significance?
2. If your accuracy is 99%, you must include an ablation study. Can you show how each component (e.g., Tab Transformer alone, ensemble alone, ensemble with XGBoost) performs? Without this, it is hard to understand each component's contribution to this high accuracy.
3. Error analysis is missing; it is crucial in medical applications. Please include a confusion matrix, F1-scores, and ROC curves, and comment on false positives/negatives.
4. The UCI dataset is small (only 768 records) and may not generalize to real-world data. Consider external validation datasets; evaluation on a larger dataset would make the results more convincing.
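McNemar's test, suggested in point 1, needs only the discordant-pair counts from two models' predictions on the same test set. A self-contained exact-binomial sketch with hypothetical counts (not values from the manuscript):

```python
from math import comb

def mcnemar_exact(b, c):
    """Exact (binomial) McNemar test on the discordant pairs:
    b = cases where model A is right and model B wrong; c = the reverse.
    Returns a two-sided p-value for H0: both models err equally often."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Two-sided exact binomial tail probability at p = 0.5.
    p = 2 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, p)

# Hypothetical counts: the ensemble fixes 10 of a baseline's errors
# while introducing 2 new ones.
print(f"p = {mcnemar_exact(2, 10):.4f}")
```

A small p-value here would support the claim that the ensemble's improvement over the baseline is not due to chance.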
Conclusion:
1. You mention “significantly outperforming individual models”; this needs to be backed up with quantification. Can you please be more specific in the Conclusion?
2. You may need to state that the study is limited by dataset size and may require clinical validation for real-world deployment.
3. The phrase “future studies may explore...” is too general; please provide concrete, specific suggestions for future work.
4. Can you add how this model might reduce diagnostic delay or improve early intervention if deployed? This is useful for societal impact.
5. You need to link the findings back to the research objective. Can you restate how the ensemble technique addressed the original research gap and how it improved predictive power over classical ML models?
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.