All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
Thank you for your valuable contribution.
[# PeerJ Staff Note - this decision was reviewed and approved by Massimiliano Fasi, a PeerJ Section Editor covering this Section #]
no comment
no comment
no comment
The manuscript has greatly improved, and my concerns have been addressed.
Particularly,
1. The restructuring of the manuscript provides a clearer narrative flow and avoids redundancies.
2. The results are presented clearly with the required details.
3. My question on data standardization/normalization has been clearly addressed.
4. The rationale behind the ablation study is now clear.
Therefore, I suggest paper acceptance.
As a minor note that I think can be easily addressed during proofreading: in Tables 5 and 6 you highlight (in bold) the best results for each subject, so I suggest doing the same in Table 7 for consistency with the rest of the table formatting.
Please address the requests and comments thoroughly.
Please, see detailed comments in the "Additional comments" section. I will just provide a brief summary related to the checklist of "Basic reporting".
1. The English use is clear and unambiguous.
2. The background and context have improved but require some re-organization.
3. The article structure has improved but requires revision to ensure clarity and content coherence.
4. The results are more accessible but require revision.
5. The problem statement is clear.
Please, see detailed comments in the "Additional comments" section. I will just provide a brief summary related to the checklist of "Experimental design".
1. The paper is in the scope of the journal.
2. The knowledge gap is clearer.
3. The investigations should be revised.
4. The methods' description can be further improved.
Please, see detailed comments in the "Additional comments" section. I will just provide a brief summary related to the checklist of "Validity of the findings".
1. The impact is clearer.
2. The use of the third dataset is not completely clear.
3. Conclusions seem to be in line with the presented results.
I thank the Authors for their clear responses. While the manuscript has improved, I have some additional concerns that I think need to be addressed before suggesting the paper for publication.
The following concerns refer to the highlighted version of the manuscript.
1. I think that the “Problem definition” section could be merged with the similar content now added in the “Introduction” to provide a better understanding of the problem and your approach, and a better manuscript flow.
2. Consider revising the introductory paragraph of the “Materials & Methods” section, making the description of the datasets the first subsection. Moreover, please reference MOABB not as a benchmark dataset but as a repository providing both a benchmark of algorithms and available datasets. Braindecode is also a toolbox and should therefore be referenced accordingly.
3. At line 196, you say that there are “m SDs”, but subsequently you have 1 to M subscripts. Consider verifying the notation.
4. The newly added text in the “Motivation of G-softmax DDG” section seems to me more suited to the “Introduction” considering the possible redundancies with what has been added in the “Introduction” and the fact that reporting the motivations at the very beginning would immediately let the reader understand your point of view and the novelty of your proposal. I would leave the “Materials & Methods” section for methodology description in its formalism. I suggest describing in detail Figure 1, highlighting the input form for your architecture, and so on.
5. When reporting averaged results, the standard deviation is usually reported too, to show that the model works consistently regardless of the training and test data it is fed. Please consider revising this point in Table 3. For the benchmark case you seem to use a subject-independent approach with leave-one-subject-out validation. Is that correct? I suggest clarifying this point in the text.
6. At the end of the paragraph concerned with the results of the benchmark models, you correctly note that direct cross-subject learning has limitations. Have you considered a data pre-processing step, such as standardization or normalization, to improve the comparability of data coming from different subjects?
7. Consider linking the newly added benchmark result description more clearly with your own results starting at line 347.
8. The Algorithm should be reported in the “Materials & Methods” section instead of remaining in the “Results” section.
9. In the “Experimental Results” you should start with the text and not with tables and figures.
10. In lines 374-375 you state that “G-Softmax DDG demonstrates superior cross-subject MI classification performance across the three datasets”.
a. I suggest being fairer and highlighting that you have less variability in the results, considering that for some subjects there are models that work quite well. I agree that your model seems to generalize better, but I would highlight it in these terms from the very beginning.
b. Why does Table 6 report only your results and not the benchmark ones? Without a clear comparison, it is not possible to tell whether the dataset is simply a very lucky one (cooperative subjects and good signal quality) or whether some hidden mechanism brings the model to achieve perfect accuracy.
The code reports only the parts related to G-Softmax and not an example of application to the data, so I am unable to assess further why this could have happened.
Figure 5 also reinforces these doubts, showing a very different trend compared to Figures 3 and 4. I suggest checking that you are correctly feeding the model. Could it be due to redundancy?
It could be useful to understand if this trend is maintained with subsequent epochs or not.
11. Considering the ablation study, I am quite confused. Did you verify that the results were statistically significantly different? It seems that the second proposal is better than the one that you use for comparison in the previous section. Please clarify this point, as it could justify presenting the subject-related results for these modified G-Softmax variants rather than the regular one.
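To make the suggestions in points 5 and 6 concrete, here is a minimal sketch of the kind of protocol I have in mind: leave-one-subject-out evaluation combined with per-subject z-score standardization. All function names and array shapes are my own illustrative assumptions, not taken from the Authors' code.

```python
import numpy as np

def zscore_per_subject(X):
    """Standardize each subject's trials with that subject's own statistics.

    X: dict mapping subject id -> array of shape (n_trials, n_channels, n_samples).
    Returns a dict with the same keys, z-scored per channel within each subject.
    """
    out = {}
    for subj, trials in X.items():
        mu = trials.mean(axis=(0, 2), keepdims=True)           # per-channel mean
        sigma = trials.std(axis=(0, 2), keepdims=True) + 1e-8  # avoid division by zero
        out[subj] = (trials - mu) / sigma
    return out

def loso_splits(subjects):
    """Leave-one-subject-out: yield (train_subjects, held_out_subject) pairs."""
    for held_out in subjects:
        yield [s for s in subjects if s != held_out], held_out
```

Standardizing each subject with its own statistics removes per-subject offset and scale differences before cross-subject training, which is precisely the comparability issue raised in point 6.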
Minor comments
12. Please ensure that acronyms are reported in their extended form at their first appearance (even after having been presented in the abstract). For example, see DDA at line 58 and DDG at line 70.
13. I think at line 125 you mean “In fact” more than “Instead”. Please, see if my interpretation is correct.
Overall Evaluation
The manuscript presents a novel G-softmax deep domain generalization framework for cross-subject motor imagery EEG classification. The study is well-motivated, addressing the significant challenge of inter-subject variability in brain–computer interface research. The paper is generally well-written, provides adequate background, and includes thorough experimental validation on multiple benchmark datasets. The proposed approach appears to be technically sound and demonstrates promising performance compared with existing methods.
However, there are several areas that require clarification and improvement before the manuscript can be considered for publication. Some sections are overly technical without sufficient intuitive explanation, and certain experimental results raise concerns about generalizability. Additionally, the discussion of limitations and future work could be expanded to strengthen the contribution.
Specific Comments
1. The description of the improved G-softmax function (Section 3) is mathematically clear, but the intuitive explanation of why it outperforms the original G-softmax is limited. Please add more conceptual insight.
2. The experimental results on the Lee2019-MI dataset report near 100% accuracy, which seems unusually high. It would be helpful to provide more justification, e.g., whether this might be due to dataset characteristics or model overfitting.
3. In the methodology, the choice of hyperparameters (Table 2) is not well justified. Please explain whether they were tuned systematically or chosen empirically.
4. While the comparison with DDA methods is comprehensive, the ablation study (Figure 6) does not fully isolate the contribution of each proposed component. More granular ablation or sensitivity analysis would strengthen the claims.
5. The discussion of limitations focuses mainly on subject variability, but practical deployment issues (e.g., computational cost, online adaptability) are only briefly mentioned. Please expand on these aspects to give a more balanced view.
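Regarding point 4 and the question of whether the reported differences between variants are meaningful: a paired test over per-subject accuracies is a simple way to check this. Below is a hedged sketch using a sign-flip permutation test; the per-subject accuracies are invented numbers, purely for illustration, not results from the manuscript.

```python
import numpy as np

def paired_permutation_test(acc_a, acc_b, n_perm=10000, seed=0):
    """Two-sided sign-flip permutation test on paired per-subject accuracies.

    Under the null hypothesis (no difference between variants), each
    subject's accuracy difference is equally likely to have either sign,
    so we compare the observed |mean difference| against a null
    distribution obtained by random sign flips.
    """
    rng = np.random.default_rng(seed)
    d = np.asarray(acc_a, dtype=float) - np.asarray(acc_b, dtype=float)
    observed = abs(d.mean())
    flips = rng.choice([-1.0, 1.0], size=(n_perm, d.size))
    null = np.abs((flips * d).mean(axis=1))
    return (np.sum(null >= observed) + 1) / (n_perm + 1)

# Invented per-subject accuracies for two model variants (illustrative only):
variant_a = [0.72, 0.68, 0.81, 0.75, 0.70, 0.79, 0.74, 0.77, 0.73]
variant_b = [0.71, 0.69, 0.80, 0.74, 0.71, 0.78, 0.73, 0.76, 0.72]
p = paired_permutation_test(variant_a, variant_b)
```

With only nine subjects, a non-parametric paired test of this kind (or a Wilcoxon signed-rank test) is generally more defensible than a t-test.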
none
none
none
This is a revised version of a manuscript.
The language of the manuscript is clear. An adequate literature review has been conducted, and the quality of the figures has improved from the previous version.
The experimental design is appropriate. The aims are in line with the scope of the journal. The problem definition and research gap have been identified. The investigation and results are sufficient.
The results support the conclusion.
The manuscript has been revised and all concerns have been addressed.
Please strictly follow the requested changes; these requests are mandatory.
**PeerJ Staff Note**: Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.
Please, see detailed comments in the "Additional comments" section. I will just provide a brief summary related to the checklist of "Basic reporting".
1. The English use is clear and unambiguous.
2. The background and context should be revised to clarify the Authors' goal, the research gaps addressed, and how different approaches are used in the literature.
3. The article structure should be revised to ensure clarity and content coherence. Figures, tables, and code should be revised.
4. The results are not clearly accessible due to the lack of descriptions of both the framework and the performed experiments.
5. The problem statement is clear.
Please, see detailed comments in the "Additional comments" section. I will just provide a brief summary related to the checklist of "Experimental design".
1. The paper is in the scope of the journal.
2. The research questions should be clearly reported and should highlight how they address the knowledge gap.
3. The investigations should be more rigorous.
4. The methods need to be described in more detail to allow a better understanding of the framework as well as reproducibility.
Please, see detailed comments in the "Additional comments" section. I will just provide a brief summary related to the checklist of "Validity of the findings".
1. The manuscript would benefit from a clear presentation of the approach's benefits to the literature.
2. While the data use is sufficiently clear, the presentation of both methods and results is not sufficient to provide a clear assessment of the proposed work.
3. Conclusions need to be revised according to the previous points.
The Authors propose a framework based on G-Softmax deep domain generalization, intended to address the issues deriving from cross-subject, inter-subject, and inter-class variability and cross-domain shifts in electroencephalographic (EEG) signals recorded during a motor imagery (MI) experimental paradigm. Particularly, the Authors exploit three widely known publicly available datasets and report the link to their codes, potentially allowing researchers to replicate their approach.
While the paper presents a hot topic for the EEG community, important details are missing, especially on the framework description, executed experiments, and critical analysis of the results.
Please see my additional comments in what follows.
1- The shared code seems sufficiently clear and succinct; however, it may benefit from additional comments both in the GitHub readme and in the code itself.
2- Introduction. While the main point is clear, the introduction is extremely short and may require a revision of its content. In this case, it could be integrated with the literature descriptions present in the “Related Study” section, which is succinct as well. Please, consider adding at the end of the introduction your hypothesis/research questions and provide an overview of your proposal.
a. Lines 48-51. While it is true that inter-subject variability may lead learning models to misinterpret the EEG signals, it is also true that motor imagery accuracy is usually considered very good when it is around 80%. Moreover, this value must be interpreted by considering the number of classes the model is dealing with. Could you please comment on this point? I agree with what you say about the higher performance of subject-dependent decoding strategies, but I think the introduction may benefit from providing this kind of information.
b. I suggest adding at the end of the introduction an anticipation of your proposal to provide a better understanding of where your discourse is going during the literature analysis and better detail why you consider specific techniques.
3- Related Study. This section is very brief and does not seem to justify its presence. Moreover, its division into subtopics should better respect the paper template.
Are there no studies unrelated to MI but in the EEG field that use similar approaches?
4- Materials & Methods. In this section, I would expect an in-depth presentation of your proposal and the data used and, eventually, of literature models, approaches, and so on. However, some paragraphs report content that should instead be distributed across the “Introduction” and “Related Study” sections.
a. For example, in the sub-section “G-Softmax DDG”, I suggest adding more details on the method and reporting the problems in the literature and why you propose a certain framework in the Introduction/State-of-the-art.
This may provide a better understanding of the topics and your choices.
b. To improve the understanding of your framework, I suggest adding a graphical representation of your framework pipeline/workflow. It is difficult to understand the connection between the different parts that are presented.
c. Since Figure 1 is taken verbatim from the original work of Chen, Xia, et al., “Toward reliable signals decoding for electroencephalogram: A benchmark study to EEGNeX,” Biomedical Signal Processing and Control 87 (2024): 105475, I suggest removing the figure and referring the reader to the original work. You could instead provide a more in-depth description of EEGNeX and of how you use it (the latter point is especially important to provide a clear understanding of how you use literature models in your framework).
d. Importantly, the framework needs further description. You should also ensure that it is clear from the introduction how your proposal fills a knowledge gap that should be clearly reported.
e. The datasets should be part of the “Materials” section. Please provide a brief description of the differences between the datasets, commenting on Table 1, and also add information on the class distribution.
f. Please provide a brief explanation of the benchmark.
5- Results. Please see the related comments at point 4, and better present the performed experiments.
a. How do you configure the hyperparameters? Do you use an optimization strategy or an empirical approach?
b. Please provide in-depth comments on the figures and tables. It is important to have a correct interpretation of the results in terms of models and performance. Additionally, is it necessary to report the “evaluation” column in Table 3, considering that its values are always the same?
c. Is accuracy sufficient to provide a clear understanding of the model performance?
d. The “Experimental Results” must be discussed and not just reported as tables in a table-only section. Please consider revising the structuring and eventually using the “Discussion” to better highlight your results.
6- Discussion. This section requires a revision of the content, also following the previously reported comments.
a. I understand that you provide a comparison with literature works using DDA, but I think you should restructure this discussion to provide a better narrative flow. Why is this not reported as one of the experiments in the results?
b. Please provide more insights on 100% accuracy. Is it possible? Is the model biased? Is the training behaviour correct?
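On point 5a, if the hyperparameters were tuned systematically, it would help to state the protocol explicitly. A minimal sketch of the kind of exhaustive grid search I mean (the parameter names and the `fake_eval` stand-in are placeholders, not taken from the Authors' work):

```python
from itertools import product

def grid_search(train_and_eval, grid):
    """Exhaustive search: evaluate every combination in `grid` and return
    the best-scoring configuration together with its validation score.

    train_and_eval: callable(config_dict) -> validation accuracy.
    grid: dict mapping hyperparameter name -> list of candidate values.
    """
    best_cfg, best_score = None, float("-inf")
    for values in product(*grid.values()):
        cfg = dict(zip(grid.keys(), values))
        score = train_and_eval(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Toy stand-in for model training (purely illustrative):
def fake_eval(cfg):
    return 1.0 - abs(cfg["lr"] - 1e-3) * 100 - abs(cfg["batch"] - 32) / 1000

cfg, score = grid_search(fake_eval, {"lr": [1e-4, 1e-3, 1e-2], "batch": [16, 32, 64]})
```

Whether the search is exhaustive, random, or empirical, reporting the explored ranges and the selection criterion is what matters for reproducibility.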
In this study, the G-softmax DDG framework is utilized to dynamically balance the intra-class and inter-class distances through multi-source domain joint training and an enhanced G-softmax function, to improve the model performance. Finally, it is verified on public datasets such as BNCI2014-001, BNCI2014-004, and Lee2019-MI. The manuscript is well-written and presents a solid contribution. Below are some comments and suggestions regarding the main contributions, experimental design, and discussion:
1. In multi-source domain training, are data from different subjects input with equal weights? Have you considered dynamically adjusting weights based on the distribution similarity between source domains and the target domain?
2. The paper mentions using the EEGNeX architecture but does not specify whether network structure modifications (such as the depth of feature extraction layers or specific parameters of dilated convolutions) were made for G-softmax DDG. How do these adjustments impact cross-domain generalizability?
3. The paper replaces the Gaussian CDF in the original G-softmax with an exponential decay function to reduce computational cost, but does not demonstrate whether this simplification compromises the original method's advantage in feature distribution modeling. Can you provide ablation experiments to compare the classification performance, convergence speed, and feature separability before and after the improvement?
4. In the experimental design, why were the BNCI2014-001, BNCI2014-004, and Lee2019-MI datasets chosen, and how do their respective characteristics influence the experimental results?
5. Compared with DDA methods such as DRDA and DJDAN, G-softmax DDG shows slightly lower average accuracy on the BNCI2014-001 dataset but achieves 100% accuracy on Lee2019-MI. How do you explain this performance discrepancy?
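To make point 3 concrete, the comparison in question is between the Gaussian-CDF shaping of the original G-softmax and a cheaper exponential surrogate. The sketch below is a hypothetical rendering of the two transforms (the manuscript's exact formulation may differ); it only illustrates that the two curves agree at zero and saturate similarly, with the largest deviation at moderate logit values, which is exactly where an ablation would be informative.

```python
import math

def gaussian_cdf(x, mu=0.0, sigma=1.0):
    """Gaussian CDF, as used in the original G-softmax shaping."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def exp_decay_surrogate(x, lam=1.0):
    """Cheaper exponential surrogate: a saturating curve with a similar
    S-shape (hypothetical form; the manuscript's variant may differ)."""
    return 1.0 - 0.5 * math.exp(-lam * x) if x >= 0 else 0.5 * math.exp(lam * x)

# Both transforms map values to (0, 1), agree at x = 0, and saturate for
# large |x|; the maximum pointwise gap occurs at moderate logit values.
gap = max(abs(gaussian_cdf(x) - exp_decay_surrogate(x))
          for x in [i / 10 for i in range(-40, 41)])
```

An ablation comparing the two would quantify whether this mid-range deviation actually affects feature separability, convergence, and accuracy, as requested above.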
-
-
The language is clear, but the manuscript organization needs improvement; see details in section 4.
The design is clear; however, the presentation can be improved. Refer to section 4.
-
In this study, an improved G-softmax (Gaussian-based softmax function) Deep Domain Generalization (G-softmax DDG) framework is proposed. This framework aims to overcome the limitations of traditional DDG methods in handling inter-class differences and cross-domain distribution shifts. This advantage of DDG makes it well-suited for medical applications, as it eliminates the need for prior data collection from subjects.
1. The organization of the manuscript needs improvement; the problem definition comes at the end of the literature review and not in the methods section.
2. Under the experimental results section, no description is provided, and only tables and figures are given.
3. It is not clear why only “precision” is evaluated and not any other performance metrics.
4. Tables 2-7 are not cited in text, and similarly, Figures 3 onwards are not cited in text.
5. While looking at Figure 3, the subject 5 pattern shows lots of variation, and the subject 6 data is very sparse. What's the reason for this?
6. Similarly, for Figure 4, in most cases there are many fluctuations, and the achieved accuracy also varies. In addition, the graph in Figure 3 is missing.
7. Why are the numbers of epochs for different subjects different in Figures 3 and 4?
8. Figure 5 shows only one set of curves. Is this training or testing data?
Overall, although the results are good, the complete manuscript looks chaotic; a complete reorganization is needed to follow the manuscript with ease.
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.