Review History


All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.

Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.


Summary

  • The initial submission of this article was received on July 4th, 2025 and was peer-reviewed by 2 reviewers and the Academic Editor.
  • The Academic Editor made their initial decision on September 18th, 2025.
  • The first revision was submitted on October 14th, 2025 and was reviewed by 2 reviewers and the Academic Editor.
  • The article was Accepted by the Academic Editor on November 6th, 2025.

Version 0.2 (accepted)

Academic Editor

Accept

The authors have addressed all the reviewers' concerns in the revised manuscript.

[# PeerJ Staff Note - this decision was reviewed and approved by Mehmet Cunkas, a PeerJ Section Editor covering this Section #]

Reviewer 1

Basic reporting

All my comments and concerns have been adequately addressed by the authors. The revisions improved the clarity and quality of the manuscript, and I am satisfied with the current version. Therefore, I recommend that the paper be accepted in its present form.

Experimental design

All my comments and concerns have been adequately addressed by the authors. The revisions improved the clarity and quality of the manuscript, and I am satisfied with the current version. Therefore, I recommend that the paper be accepted in its present form.

Validity of the findings

All my comments and concerns have been adequately addressed by the authors. The revisions improved the clarity and quality of the manuscript, and I am satisfied with the current version. Therefore, I recommend that the paper be accepted in its present form.

Additional comments

All my comments and concerns have been adequately addressed by the authors. The revisions improved the clarity and quality of the manuscript, and I am satisfied with the current version. Therefore, I recommend that the paper be accepted in its present form.

Reviewer 2

Basic reporting

N/A

Experimental design

N/A

Validity of the findings

N/A

Additional comments

N/A

Version 0.1 (original submission)

Academic Editor

Major Revisions

**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.

**Language Note:** When preparing your next revision, please ensure that your manuscript is reviewed either by a colleague who is proficient in English and familiar with the subject matter, or by a professional editing service. PeerJ offers language editing services; if you are interested, you may contact us at [email protected] for pricing details. Kindly include your manuscript number and title in your inquiry. – PeerJ Staff

Reviewer 1

Basic reporting

The manuscript is generally well-structured, detailed, and aligned with the existing literature. The proposed GA+BO hybrid ensemble presents a valuable contribution in terms of novelty. However, the temporal and computational complexity of this system should also be discussed. If the author(s) do not evaluate the method on additional datasets, the lack of cross-corpus generalizability should at least be explicitly acknowledged as a limitation. I am in favor of accepting the paper for publication, provided that the suggested minor revisions are implemented.

Experimental design

- The model's performance has been comprehensively evaluated using multiple metrics, including MCC, F1-score, and Accuracy. The TQWT+KNN combination demonstrates statistically significant performance, outperforming many existing studies. The comparative presentation of ensemble methods (GA, BO, GA–BO) constitutes a strong methodological contribution. However, no variance or statistical confidence intervals are provided for the results of the “best-performing model.” At a minimum, ± standard deviation values should be included in the tables. Additionally, the number of selected features is sometimes fixed at 100, and in other cases, it is expressed as a retention ratio (e.g., 20%, 40%, 60%). The impact of these variations on classification performance should be illustrated with a plot (e.g., a line plot) to enhance interpretability.

- The terms “Baseline+MFCC” and “Wavelet-based” should be defined more clearly and enriched with examples at their first mention. Additionally, for the SVM classifier, only the polynomial kernel (degree = 3) has been tested. A comparative evaluation using RBF or linear kernels should also be included to ensure a more comprehensive analysis.
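As an illustration of the fold-wise reporting requested above, a minimal sketch of computing mean ± standard deviation across 10-fold cross-validation results (the fold accuracies below are made up for illustration, not the authors' numbers):

```python
import numpy as np

# Hypothetical per-fold accuracies from a 10-fold cross-validation run
# (illustrative values, not results from the manuscript).
fold_acc = np.array([0.97, 0.95, 0.96, 0.98, 0.94, 0.96, 0.97, 0.95, 0.96, 0.96])

mean_acc = fold_acc.mean()
std_acc = fold_acc.std(ddof=1)  # sample standard deviation across folds

print(f"Accuracy: {mean_acc:.3f} ± {std_acc:.3f}")
```

Reporting this form in each table cell (e.g., "0.960 ± 0.012") would directly address the variance concern.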

Validity of the findings

- The parametric details of feature extraction methods, such as TQWT (e.g., Q, r, and J values), are not clearly specified—whether they were fixed or optimized is not mentioned. Furthermore, the hyperparameter settings used in the GA–BO Ensemble strategy (e.g., population size of 75, 150 generations, mutation rate of 0.15) are not justified. The rationale behind choosing these values should be explained, either based on experimental results or theoretical grounds. Additionally, for the KNN classifier, only k = 1 was evaluated. A comparative analysis using alternative k values should be conducted to ensure robustness.

Additional comments

- Some references are repeated (e.g., the citation “Chen et al. (2016)” appears more than once in the same form). There are typographical inconsistencies in the formatting and alignment used in tables and figures. Additionally, table captions should be made clearer to enhance readability and understanding.

Reviewer 2

Basic reporting

This manuscript presents a framework for classifying Parkinson's disease (PD) from vocal features. The authors conduct a comprehensive evaluation of multiple filter-based feature selection methods (e.g., F-score, Mutual Information, Chi-square, and several devised variants) across three distinct feature domains (Wavelet, Baseline+MFCC, and TQWT). The outputs of classifiers (k-NN and SVM) trained on these feature subsets are then integrated using various ensemble strategies. The primary contribution is a novel two-stage hybrid ensemble method (GA-BO-Ensemble) that first uses a Genetic Algorithm (GA) to select an optimal subset of classifiers and then employs Bayesian Optimization (BO) to assign weights for a final prediction. The proposed GA-BO-Ensemble method achieves a classification accuracy of 96.4%, demonstrating a competitive, computationally efficient alternative to deep learning models.

1- The introduction contains the necessary elements but could be structured more clearly for impact. I strongly recommend restructuring the end of the Introduction to include explicit, distinct subsections titled "Motivations" and "Contributions," as is common practice. This would replace the current, less-defined "aims" and "contributions" bullet points, improving readability and immediately clarifying the paper's novelty for the reader.

2- Acronyms must be defined upon their first use in both the abstract and the main body of the text.

3- In the title and abstract, "GA" and "BO" are used without being spelled out. They should be defined as "Genetic Algorithm (GA)" and "Bayesian Optimization (BO)" at their first appearance. Please check for consistency with all other acronyms (e.g., TQWT, MFCC, MCC).

4- The Conclusion effectively summarizes the results, but it could be strengthened.

5- The authors should elaborate on the practical, clinical implications. How would this system be deployed? What are the next steps to move from a research framework to a clinical decision support tool?

6- The current "Limitations" section is somewhat generic. The authors should add more specific limitations tied to their methodology. For instance, does the sequential nature of the GA-BO model introduce specific biases? Is the reliance on the 1-NN classifier a potential weakness in datasets with higher noise levels?

7- While the manuscript is generally understandable, it would benefit from a thorough round of language editing by a native English speaker or a professional service.

8- Certain sentences are overly long and complex. The tone should be consistently formal and precise. For example, phrases like "making it perfect for the implementation" could be revised to more academic language, such as "making it highly suitable for the implementation."

9- The "Devised Feature Selection Variants" section would benefit from more detailed descriptions. For instance, for "Score Fusion," how are the scores normalized and aggregated (e.g., simple average, weighted average)?
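On the normalization/aggregation question for "Score Fusion," one common baseline is min-max scaling of each selector's scores followed by a simple average; a sketch under that assumption (the feature scores are invented for illustration and are not from the manuscript):

```python
import numpy as np

def min_max(scores):
    """Scale a score vector to [0, 1] so scores from different selectors are comparable."""
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo) if hi > lo else np.zeros_like(scores)

def fuse_scores(score_lists):
    """Simple-average fusion of normalized per-feature relevance scores."""
    normed = [min_max(np.asarray(s, dtype=float)) for s in score_lists]
    return np.mean(normed, axis=0)

# Hypothetical scores for 4 features from three filter methods
f_score     = [2.0, 0.5, 1.0, 3.0]
mutual_info = [0.30, 0.05, 0.20, 0.10]
chi2        = [10.0, 2.0, 8.0, 6.0]

fused = fuse_scores([f_score, mutual_info, chi2])
ranking = np.argsort(fused)[::-1]  # feature indices ranked by fused relevance
```

Whether the manuscript uses this scheme, a weighted average, or rank-based fusion is exactly what the revised description should make explicit.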

Experimental design

The manuscript addresses an important problem and proposes a methodologically interesting solution. The experimental design is systematic and thorough in many respects, and the commitment to reproducibility via publicly available code and data is highly commendable. However, the manuscript requires substantial revisions to address significant shortcomings in experimental validation, analytical depth, and overall justification of its core claims before it can be considered for publication.

10- A central premise of this work is that the proposed framework is "computationally efficient" and "better suited for real-world deployment and edge-computing scenarios" compared to deep learning (DL) alternatives. This claim is repeatedly made but is never substantiated with empirical evidence. The authors must conduct a computational complexity analysis. This should include, at a minimum:

- Training and Inference Times: Provide a comparative table of the training and inference times for the proposed GA-BO-Ensemble framework versus at least two of the DL-based SOTA models cited in Table 8 (e.g., Ouhmida et al., 2024; Gunduz, 2019).

- Algorithmic Complexity: Discuss the theoretical complexity (Big-O notation) of the feature selection, GA, and BO stages. While potentially complex during the training/optimization phase, this must be quantified to support the claim of efficiency, especially concerning inference.
Without this analysis, the primary motivation for preferring this method over higher-accuracy DL models is unsubstantiated.
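One lightweight way to gather the requested timing evidence is wall-clock benchmarking with `time.perf_counter`; a sketch using a trivial stand-in function where the real fit/predict calls of the GA-BO ensemble and a DL baseline would go:

```python
import time

def benchmark(fn, *args, repeats=5):
    """Return best-of-`repeats` wall-clock time for fn(*args), in seconds."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - t0)
    return best

# Hypothetical stand-in for a pipeline's inference step; substitute the
# actual predict call of each model being compared.
def ensemble_infer(x):
    return sum(x) / len(x)

samples = list(range(10_000))
t_infer = benchmark(ensemble_infer, samples)
print(f"inference: {t_infer * 1e3:.2f} ms per batch")
```

Taking the best of several repeats reduces noise from OS scheduling; the resulting times, reported per model in a table, would substantiate or refute the efficiency claim.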

11- The experimental section contains interesting results that are reported but never discussed, representing a significant missed opportunity for scientific insight. Specifically, the results from Tables 3-5 (percentage-based feature retention) show that TQWT features are consistently dominant. However, the controlled experiment in Table 6 (fixed 100 features) shows that Baseline+MFCC features perform best. This is a critical finding. The authors must dedicate a paragraph to discussing this discrepancy. Does it imply that TQWT features have their discriminative information spread more thinly across a larger number of features, whereas MFCC features are more information-dense? This discussion is essential for understanding the nature of the features themselves.

Validity of the findings

12- The paper presents numerous tables comparing model performances based on accuracy, F1-score, and MCC. However, it fails to report whether the observed differences are statistically significant. For example, in Table 7, the proposed GA-BO-Ensemble achieves 96.4% accuracy, while the BO-Ensemble achieves 95.4%. Is this 1% difference statistically meaningful or simply due to random variation in the cross-validation splits? The authors must perform appropriate statistical significance testing. Given the use of 10-fold cross-validation, a paired test like the Wilcoxon signed-rank test should be used to compare the performance of the proposed model against the other ensemble methods and the best-performing individual models. The results of these tests should be reported to validate claims of superiority (e.g., "our model significantly outperforms... (p < 0.05)").
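A sketch of the suggested paired test using `scipy.stats.wilcoxon` on per-fold scores (the fold accuracies below are invented for illustration, not the authors' results):

```python
from scipy.stats import wilcoxon

# Hypothetical per-fold accuracies from the same 10-fold CV splits,
# paired by fold (illustrative values only).
ga_bo   = [0.97, 0.96, 0.95, 0.98, 0.96, 0.97, 0.95, 0.96, 0.97, 0.96]
bo_only = [0.95, 0.95, 0.94, 0.96, 0.95, 0.96, 0.94, 0.95, 0.96, 0.95]

# Paired, non-parametric test over the fold-wise differences
stat, p = wilcoxon(ga_bo, bo_only)
print(f"Wilcoxon statistic={stat}, p={p:.4f}")
```

Because the same folds are used for both models, a paired test is appropriate; reporting the resulting p-values alongside the tables would let readers judge whether the 1% gap is meaningful.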

13- The novelty of the work hinges on the GA-BO hybrid ensemble. While the components are described, the manuscript lacks a deep analysis of why this specific combination is advantageous. A formal ablation study is required. The authors have the components in Table 7 (GA-only, BO-only), but they need to expand the discussion. They must analyze why the two-stage approach of selection (GA) followed by weighting (BO) is superior. Does GA effectively prune poorly performing or redundant classifiers, allowing BO to optimize weights on a more complementary subset? This interaction must be explicitly discussed.
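To make the requested ablation concrete, a toy sketch of the selection-then-weighting pattern, with exhaustive subset search standing in for the GA and a generic numerical optimizer (Nelder-Mead) standing in for BO; all labels and classifier outputs are synthetic, not the authors' pipeline:

```python
from itertools import combinations
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Synthetic validation-set probabilities from 4 base classifiers
# (rows = samples, cols = classifiers); the last column is pure noise.
y = rng.integers(0, 2, size=200)
probs = np.column_stack([
    np.clip(y + rng.normal(0, 0.3, 200), 0, 1),  # strong classifier
    np.clip(y + rng.normal(0, 0.4, 200), 0, 1),  # decent classifier
    np.clip(y + rng.normal(0, 0.5, 200), 0, 1),  # weaker classifier
    rng.uniform(0, 1, 200),                      # noise classifier
])

def acc(weights, cols):
    """Accuracy of a weighted soft vote over the chosen classifier columns."""
    p = probs[:, cols] @ (weights / weights.sum())
    return np.mean((p > 0.5) == y)

# Stage 1 (stand-in for the GA): pick the classifier subset whose
# equal-weight vote scores best on validation data.
best_subset = max(
    (c for r in range(1, 5) for c in combinations(range(4), r)),
    key=lambda c: acc(np.ones(len(c)), list(c)),
)

# Stage 2 (stand-in for BO): tune the weights on the selected subset only.
res = minimize(lambda w: -acc(np.abs(w) + 1e-9, list(best_subset)),
               x0=np.ones(len(best_subset)), method="Nelder-Mead")
```

The point of the two stages is visible even in this toy: stage 1 can discard the noise classifier entirely, so stage 2 only has to weight a complementary subset; this is the kind of interaction the manuscript should discuss explicitly.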

14- The performance of GA and BO is highly dependent on their hyperparameters (e.g., population size, mutation rate for GA; number of iterations for BO). The manuscript provides the values used but offers no justification for their selection or analysis of their impact. A sensitivity analysis is needed to demonstrate that the reported performance is robust and not the result of meticulous, dataset-specific tuning that may not generalize.

Additional comments

The manuscript presents a novel and promising GA-BO ensemble framework and is supported by a comprehensive, systematic experimental setup and an excellent commitment to reproducibility. However, the current version is not acceptable for publication due to several major flaws:
- The central claim of "computational efficiency" is entirely unsupported by empirical data or analysis.
- The results lack statistical validation, making it impossible to determine if the claimed performance improvements are significant.
- The analysis of the core contribution—the GA-BO method—is superficial, lacking both an ablation study and a hyperparameter sensitivity analysis.

These are not minor issues; they go to the core of the paper's scientific validity and contributions. If the authors can thoroughly address these points, particularly by providing computational benchmarks and statistical tests, the manuscript could become a strong contribution to the field.

All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.