All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
The reviewers seem satisfied with the recent changes to the manuscript and therefore I can recommend this article for acceptance.
[# PeerJ Staff Note - this decision was reviewed and approved by Xiangjie Kong, a PeerJ Section Editor covering this Section #]
All suggested changes have been addressed.
This version of the article has been revised in line with all of the given suggestions.
**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.
**PeerJ Staff Note:** It is PeerJ policy that additional references suggested during the peer-review process should only be included if the authors agree that they are relevant and useful.
-The sample size (47) is acknowledged as a limitation. The authors should provide a clearer rationale for its sufficiency in terms of model robustness, perhaps by citing literature that uses similarly small datasets or by offering a power analysis.
-The bibliography, although adequate, is essentially old. Only a few papers (I counted around 5) are from 2024 or 2025, one of which is by the authors themselves. Some examples that the authors should consider including in Section 2 to update the work are listed below (I declare that I am not an author of any of them).
-The results section would benefit from reporting confidence intervals (or standard deviations) for performance metrics, especially given the small dataset and the bootstrapping approach. In addition, the discussion section should better distinguish between statistical significance and clinical relevance.
-Although the model uses balanced class weights to mitigate the class imbalance, how does the imbalance (7 vs. 40 samples) impact precision, recall, and potential overfitting? (A sketch of how this could be probed appears after the reference list below.)
[1] Liu, L., Sun, Y., & Ge, X. (2025). A hybrid multi-person fall detection scheme based on optimized YOLO and ST-GCN. International Journal of Interactive Multimedia and Artificial Intelligence, 9(2), 26-38.
[2] Lupión, M., González-Ruiz, V., Sanjuan, J. F., & Ortigosa, P. M. (2025). Privacy-aware fall detection and alert management in smart environments using multimodal devices. Internet of Things, 101526.
[3] Chen, L., Zanotto, T., Fang, J., Scharf, E., Garcia, N., Luzania, A., ... & Sosnoff, J. J. (2025). Role of the upper limb in limiting head impact during laboratory-induced falls in at fall-risk older adults. The Journals of Gerontology, Series A: Biological Sciences and Medical Sciences, 80(1), glae267.
**PeerJ Staff Note:** It is PeerJ policy that additional references suggested during the peer-review process should only be included if the authors agree that they are relevant and useful.
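To illustrate the two points above on confidence intervals and class imbalance, here is a minimal sketch: fit a class-weighted logistic regression and bootstrap precision and recall to expose the wide intervals that a 7 vs. 40 split produces. The features below are synthetic stand-ins shaped like the study's dataset (47 samples, 5 features), not the authors' actual data or pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

rng = np.random.default_rng(0)
X = rng.normal(size=(47, 5))                 # synthetic stand-in features
y = np.array([1] * 7 + [0] * 40)             # 7 fallers vs. 40 non-fallers

# Out-of-fold predictions from a class-weighted logistic regression.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
clf = LogisticRegression(class_weight="balanced", max_iter=1000)
pred = cross_val_predict(clf, X, y, cv=cv)

# Bootstrap 95% intervals for precision and recall over resampled pairs.
stats = {"precision": [], "recall": []}
for _ in range(2000):
    idx = rng.integers(0, len(y), len(y))
    if y[idx].sum() == 0 or pred[idx].sum() == 0:
        continue                             # skip degenerate resamples
    stats["precision"].append(precision_score(y[idx], pred[idx]))
    stats["recall"].append(recall_score(y[idx], pred[idx]))

for name, vals in stats.items():
    low, high = np.percentile(vals, [2.5, 97.5])
    print(f"{name}: 95% bootstrap interval [{low:.2f}, {high:.2f}]")
```

With only 7 positive cases, the resulting intervals are typically very wide, which is exactly why reporting them (rather than point estimates alone) matters here.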
Check section 1.
Please see the general comments for the editor.
This manuscript aims to develop a machine learning-based classifier to identify high or low risk of falls in older adults based on five input variables. However, this reviewer finds that the authors have focused on variables that represent a considerably restricted approach, given the complexity of fall risk in this population. Fall risk in older adults is widely recognized as multifactorial, involving interactions between physical, cognitive, sensory, environmental, and psychosocial factors. The choice of only two predictors, both cognitive and educational in nature, disregards this complexity and reduces the ecological validity of the proposed model. This significantly limits its predictive power and clinical utility by underestimating the multifactorial nature of falls.
Although the authors have highlighted that the Timed Up and Go (TUG), Berg Balance Scale, Tinetti, and Unipodal Stance tests require special equipment and physical space and are much more time-consuming than other tests, this reviewer does not see it that way, especially in a clinical context. Including motor information in the proposed predictive model could increase its robustness and bring it closer to the complexity of falls at a favorable cost-benefit. Furthermore, a model based only on self-reported outcomes may have even more limitations than one that includes some physical and functional performance-based tests, such as the TUG, the Unipodal Stance test, and the Berg Balance Scale.
Additionally, this reviewer finds the choice of machine learning algorithms limited and not very robust. The authors used three classic algorithms (Logistic Regression, Decision Tree, and K-Nearest Neighbors) without justifying this choice over more robust methods suited to potential nonlinearities and interactions between variables, such as Random Forest, Gradient Boosting, or SVM. The lack of more rigorous cross-validation and of additional metrics (such as ROC curves, AUC, and F1-score) also limits the reliability of the results.
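A minimal sketch of the kind of evaluation this comment calls for, using scikit-learn with stratified cross-validation and AUC/F1 reporting. The data are synthetic stand-ins (47 samples, 5 features, an imbalanced 40 vs. 7 split mirroring the study's description), not the authors' dataset:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.svm import SVC

# Synthetic stand-in: 47 samples, 5 features, roughly 40 vs. 7 classes.
X, y = make_classification(n_samples=47, n_features=5, n_informative=3,
                           weights=[40 / 47], random_state=0)

models = {
    "Logistic Regression": LogisticRegression(class_weight="balanced", max_iter=1000),
    "Random Forest": RandomForestClassifier(class_weight="balanced", random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "SVM": SVC(class_weight="balanced", random_state=0),
}

# Stratified 5-fold cross-validation with AUC, F1, and accuracy.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv,
                            scoring=("roc_auc", "f1", "accuracy"))
    print(f"{name}: AUC={scores['test_roc_auc'].mean():.2f}, "
          f"F1={scores['test_f1'].mean():.2f}, "
          f"accuracy={scores['test_accuracy'].mean():.2f}")
```

Stratified folds keep the 7 minority cases spread across splits, which is essential at this sample size; without stratification some folds could contain no fallers at all.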
Thus, the methodological proposal of the study, in its current form, does not present relevant innovations in predicting falls in older adults. The literature already includes multiple models with greater predictive complexity, a more comprehensive set of variables, and more sophisticated algorithms. Consequently, the present study does not add significant advances to the state of the art.
Given the methodological limitations, the weak adherence to the multifactorial nature of falls in older adults, and the lack of technical originality, I do not recommend accepting the manuscript in its current form. I suggest that the authors reconsider the conceptual and technical design, incorporating a more representative set of multifactorial variables and more robust algorithms, to align the study with current standards in the scientific literature on fall prediction.
This study successfully demonstrates the potential of machine learning techniques—particularly Logistic Regression (LR)—for classifying the risk of falls among community-dwelling older adults using cognitive and demographic variables. Among the models tested, LR classifiers achieved the highest accuracy, with a peak performance of 71.4%. A key contribution of the study is the identification of two variables—educational level and Trail Making Test (TMT) part B—as the most significant predictors of fall risk. These features yielded strong classification performance, possess clinical relevance, and are easily assessable in real-world settings. The study further refines its practical application by proposing simple, interpretable thresholds: older adults with at least eight years of education or a TMT part B time of less than 212 seconds are, on average, at lower risk of falling.
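The proposed rule is simple enough to state directly in code. A minimal sketch, with hypothetical variable names and the thresholds reported in the summary above:

```python
def lower_fall_risk(education_years: float, tmt_b_seconds: float) -> bool:
    """True if the participant falls in the lower-risk group under the
    paper's reported thresholds (>= 8 years of education OR TMT-B < 212 s)."""
    return education_years >= 8 or tmt_b_seconds < 212

print(lower_fall_risk(education_years=10, tmt_b_seconds=250))  # True (education)
print(lower_fall_risk(education_years=4, tmt_b_seconds=180))   # True (TMT-B time)
print(lower_fall_risk(education_years=4, tmt_b_seconds=300))   # False (neither)
```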
After reading the manuscript, I offer the following suggestions to help improve its clarity and comprehensiveness.
1. While the introduction touches upon a wide range of relevant factors (e.g., epidemiology, cognitive domains, balance, motor tests, and executive function tests), it delays the specific focus of the paper—namely, the use of executive function tests and machine learning to classify fall risk. The introduction would benefit from earlier narrowing of the topic. A more focused transition from general statistics to cognitive assessment and ML-based approaches could enhance reader engagement and conceptual clarity. Furthermore, several sentences repeat similar points about the relationship between executive function and fall risk (lines 54–64), creating redundancy without introducing new insights or evidence. These points could be streamlined or consolidated to maintain a concise flow, allowing more space to expand on underdeveloped areas (e.g., the role of each executive test in detail or the novelty of combining cognitive testing with ML).
1. The models (LR, DT, K-NN) are relatively simple and may not capture non-linear relationships or complex interactions among features. This could limit prediction performance and model robustness, especially in larger or more heterogeneous datasets. As data increases, explore more advanced models (e.g., Random Forest, XGBoost, or Neural Networks) with proper cross-validation and regularization techniques.
2. The study does not compare ML performance with standard clinical assessments (e.g., Mini-BESTest or TUG) as standalone predictors. Assessing whether ML models provide added value over existing practices is difficult without such a baseline. For context, include baseline performance from conventional screening tools (a sketch of such a baseline comparison follows this list).
1. The authors acknowledge that the sample size (47 participants) is small. While they mention that small samples are common in similar studies, they do not explore the statistical consequences, such as wide confidence intervals, high model variance, and reduced generalizability. The discussion should explicitly state that limited sample size reduces statistical power, increases the likelihood of overfitting, and restricts reproducibility in real-world settings.
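A minimal sketch of the baseline comparison suggested in point 2 above: score a conventional screening cut-off against a cross-validated logistic regression on the same data. The TUG threshold of 13.5 s is a commonly cited cut-off used here purely as an illustrative assumption, and all data below are synthetic stand-ins, not the study's measurements:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

rng = np.random.default_rng(0)
n = 47
tug_seconds = rng.normal(12.0, 3.0, n)        # synthetic TUG times
other_features = rng.normal(size=(n, 4))      # synthetic covariates
# Label the 7 slowest participants (plus noise) as fallers so the labels
# correlate weakly with TUG, mimicking the imbalanced 7 vs. 40 split.
score = tug_seconds + rng.normal(0.0, 2.0, n)
y = (score >= np.sort(score)[-7]).astype(int)

# Baseline: flag high risk when TUG >= 13.5 s (illustrative cut-off).
baseline_pred = (tug_seconds >= 13.5).astype(int)
print("TUG cut-off AUC:", round(roc_auc_score(y, baseline_pred), 2))

# ML comparison: cross-validated logistic regression on all features.
X = np.column_stack([tug_seconds, other_features])
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
proba = cross_val_predict(LogisticRegression(class_weight="balanced", max_iter=1000),
                          X, y, cv=cv, method="predict_proba")[:, 1]
print("Logistic regression AUC:", round(roc_auc_score(y, proba), 2))
```

Reporting both numbers side by side makes explicit whether the ML model adds value over the conventional screen, which is the reviewer's core concern.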
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.