Review History


All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.

Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.


Summary

  • The initial submission of this article was received on June 7th, 2023 and was peer-reviewed by 2 reviewers and the Academic Editor.
  • The Academic Editor made their initial decision on June 26th, 2023.
  • The first revision was submitted on July 14th, 2023 and was reviewed by 2 reviewers and the Academic Editor.
  • A further revision was submitted on July 31st, 2023 and was reviewed by the Academic Editor.
  • The article was Accepted by the Academic Editor on August 14th, 2023.

Version 0.3 (accepted)

· Aug 14, 2023 · Academic Editor

Accept

The authors have made efforts to address all of the reviewers' comments. I have assessed the revised manuscript and acknowledge that the authors have adequately incorporated the feedback. It is not necessary to send this version to the reviewers. Based on this assessment, the manuscript is now ready for publication.

[# PeerJ Staff Note - this decision was reviewed and approved by Claudio Ardagna, a PeerJ Section Editor covering this Section #]

Version 0.2

· Jul 24, 2023 · Academic Editor

Minor Revisions

Please revise the manuscript to address the comments from the two reviewers.

Reviewer 1 ·

Basic reporting

The revised manuscript has undergone significant improvements, with a stronger emphasis on scientific evidence to support the study's motivation. The authors addressed all of the major concerns raised about their work, resulting in a version that is more comprehensible to readers. I believe the current version is now suitable for publication with the following minor modifications:
- Line 98: “.... For a feature-target pair (x, y) where x ...”: “(x, y)” and “x” need to be italicised like other mathematical signs, variables, and operators.
- Line 128: “... In our modeling experiment ...” -> In our modeling “experiments” (plural)
- Line 130-131: “... One epoch required around 1.2 seconds to train and 0.2 seconds for testing.” -> Your sentence should be written in a parallel structure. “to train” -> “for training”.
- Line 147: ... epoch 27. -> “... epoch 27th”
- Line 153-154: “... than the other setup models (b), and (c).” -> “... than the models of setups (b) and (c).”
- Line 175: “... value of 0.38, whereas other methods obtains an AUCPR value” -> value of 0.38, whereas other methods “obtain” (fixed verb) AUCPR “values” (plural)
- Line 178-179: “Table 3 gives information on the performance of all models over multiple trials.” -> “Table 3 gives information on the performance of all models over “ten” (concrete number) trials.
- Line 186: “The p-values” -> The p-values (italicised “p”)
- Line 190, 192, 197, 204: “Transformers” -> “Transformer-based models”

Experimental design

The experiments were well-designed to achieve the study’s objectives.

Validity of the findings

The newly added statistics provide more insights into the model's robustness and applicability.

Additional comments

I have no additional comments for this article.


Reviewer 2 ·

Basic reporting

I appreciate the authors’ efforts in adding more details and conducting additional experiments to improve the quality of the manuscript. The manuscript is now well-structured and understandable to readers. I recommend that this work be considered for publication once the authors have addressed the following minor points.
(1) Figure 2. It is recommended to move the legend of Figure B to the top-right position so that it does not overlap the text.
(2) Figure 2. The axis names in Figure B are wrong. It should be "Precision" and "Recall" instead of "TPR" and "FPR".
(3) The limitations of the method should be discussed.
(4) In the statistical analysis section, which significance threshold did you choose (0.05, 0.01, etc.)? Is it a one-tailed or two-tailed test?
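For readers unfamiliar with the distinction the reviewer raises: a two-tailed test asks whether two means differ in either direction, while a one-tailed test asks whether one mean specifically exceeds the other. A minimal sketch with SciPy (the scores below are illustrative, not taken from the manuscript):

```python
from scipy import stats

# Illustrative AUCPR scores from two models over repeated trials
# (made-up numbers, not the manuscript's results).
model_a = [0.38, 0.36, 0.39, 0.37, 0.40]
model_b = [0.31, 0.33, 0.30, 0.32, 0.29]

# Two-tailed: the alternative hypothesis is "the means differ".
t_stat, p_two = stats.ttest_ind(model_a, model_b)

# One-tailed: the alternative hypothesis is "model_a's mean is greater".
t_stat, p_one = stats.ttest_ind(model_a, model_b, alternative="greater")

# For a positive t statistic, the one-tailed p-value is half the two-tailed one,
# so the choice of tail (and of threshold, 0.05 vs 0.01) changes the verdict.
print(p_two, p_one)
```

Reporting which tail and which threshold were used, as the reviewer requests, makes the test reproducible.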

Experimental design

The description of the experiments is clear and simple. It would be much better if the authors could add a flowchart describing the major steps in conducting their experiments.

Validity of the findings

The work provides sufficient results in terms of experiments and statistical testing to evaluate the validity of the findings. Based on their results, I agree that their proposed method works better than other methods.

Additional comments

- Line 75: "All selected algorithms for" should be read "All algorithms selected for".
- Line 185: "compare the performance of our model to each machine learning model" should be read "compare the performance of our model to that of each machine learning model".


Version 0.1 (original submission)

· Jun 26, 2023 · Academic Editor

Major Revisions

Please address the comments from the two reviewers, especially the comments about the stability of the model (Reviewer 1) and conducting statistical testing (Reviewer 2), and then revise the manuscript accordingly.

Reviewer 1 ·

Basic reporting

This article discusses utilising advanced computing technology to predict employee attrition and manage business profits more accurately. It presents a study that applied a Transformer-based neural network to the IBM HR Employee Attrition dataset, resulting in improved prediction efficiency. The study also underscores the promise of deep learning for handling unbalanced tabular data.

The document is well-structured and exhibits proficient use of language. While the background information is generally adequate, it would be better if the authors could cite some of the most recent studies that applied this model. The presentation, such as figures and tables, is clear; however, the authors should give more detail in the figure legends to make them self-explanatory.

Experimental design

The study is in line with the journal's Aims and Scope and contains concise research questions. The model's structure is thoroughly described, and the provided methodologies are sufficiently informative. Still, the "Assessment Metrics" section needs additional explanation of the significance of these metrics.

Validity of the findings

Model robustness and stability are still questionable. Conducting experiments using different data sampling trials is highly recommended to obtain more statistical evidence supporting the study's conclusions.
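The reviewer's suggestion can be implemented by repeating the train/test split with different random seeds and reporting the spread of the metric. A minimal sketch with scikit-learn (the dataset and model here are placeholders, not the study's actual setup):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split

# Placeholder imbalanced dataset standing in for the IBM HR data
# (roughly 16% positive class, as attrition data tends to be skewed).
X, y = make_classification(n_samples=600, weights=[0.84, 0.16], random_state=0)

scores = []
for seed in range(10):  # ten resampling trials
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    clf = RandomForestClassifier(random_state=seed).fit(X_tr, y_tr)
    scores.append(average_precision_score(y_te, clf.predict_proba(X_te)[:, 1]))

print(f"AUCPR over 10 trials: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```

Reporting the mean and standard deviation over trials, rather than a single split, is what gives the statistical evidence the reviewer asks for.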

Additional comments

In addition, there are still a few significant issues that need to be addressed:
- What is the significance of the metrics mentioned in the Assessment Metrics? Also, why did you choose these metrics for assessment?
- Table 2: What are ERT, GB, RF, and XGB? Please spell out these abbreviations in the introduction and the table legend.
- Line 175: “Mitigation strategies include transfer learning, data augmentation, regularization techniques, and domain adaptation.” Why can these strategies address the limitation of the Transformer? I suggest presenting this information in a separate paragraph with more detail.


Reviewer 2 ·

Basic reporting

The manuscript is well-organized and follows formal English writing conventions. The literature review covers the subject in enough depth to give a full picture of it. The study results are presented clearly and concisely using well-designed tables and figures. Still, there are some issues that need to be fixed before the manuscript is ready for publication. The list below shows specific questions that need to be answered.

Experimental design

The experimental design appears to be sound; nevertheless, there are certain areas that require clarification. Specifically, it is important to ascertain which datasets were utilized for training and tuning the GB, XGB, RF, and ERT models. Although the manuscript mentions the utilization of grid search, it remains unclear whether it was employed in conjunction with cross-validation or solely based on the validation set. Furthermore, considering the imbalanced nature of the dataset, it would be beneficial for the authors to explore the implementation of data balancing techniques during the development of the models.
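The distinction the reviewer draws can be made concrete: grid search combined with cross-validation selects hyperparameters using folds of the training data only, never the held-out test set. A minimal scikit-learn sketch (placeholder data and grid; the study's actual splits and parameters are not specified here, and class balancing would additionally need to happen inside the CV folds):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Placeholder imbalanced data; not the study's actual dataset.
X, y = make_classification(n_samples=400, weights=[0.84, 0.16], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Grid search *with* cross-validation: hyperparameters are chosen by
# 5-fold CV on the training portion only.
search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [2, 3]},
    cv=5,
    scoring="average_precision",  # AUCPR, appropriate for imbalanced data
)
search.fit(X_tr, y_tr)

# The test set is touched only once, after model selection is finished.
print(search.best_params_, round(search.score(X_te, y_te), 3))
```

Stating which of these protocols was followed would answer the reviewer's question directly.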

Validity of the findings

The findings of the study are useful for researchers across diverse fields with an interest in related subjects. The provision of a publicly available dataset and code repository facilitates result replication and further exploration of the study. However, as the dataset is relatively small, it raises concerns about potential bias in data splitting. Consequently, it becomes challenging to ascertain whether the high performance stems from a robust model or biased random sampling. To address this, it is necessary to conduct statistical testing, such as a t-test, to compare the proposed model with other baseline models. This would provide a more comprehensive evaluation and enhance the validity of the study's claims.
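The comparison the reviewer asks for is typically a paired t-test, since both models can be evaluated on the same sequence of data splits. A sketch with SciPy (the per-trial scores below are illustrative, not the manuscript's results):

```python
from scipy import stats

# Illustrative per-trial AUCPR scores on identical data splits
# (made-up numbers, not taken from the study).
proposed = [0.38, 0.36, 0.39, 0.37, 0.40, 0.35, 0.38, 0.41, 0.37, 0.39]
baseline = [0.31, 0.33, 0.30, 0.32, 0.29, 0.30, 0.34, 0.31, 0.28, 0.32]

# Paired t-test: because each trial uses the same split for both models,
# the per-trial score differences, not the raw scores, are what is tested.
t_stat, p_value = stats.ttest_rel(proposed, baseline)
print(f"t = {t_stat:.2f}, p = {p_value:.4g}")
```

A small p-value here would indicate that the performance gap is unlikely to be an artifact of a single lucky split, which is exactly the concern the reviewer raises about biased random sampling.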


All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.