Review History


All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.

Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.


Summary

  • The initial submission of this article was received on December 3rd, 2024 and was peer-reviewed by 3 reviewers and the Academic Editor.
  • The Academic Editor made their initial decision on March 3rd, 2025.
  • The first revision was submitted on April 2nd, 2025 and was reviewed by 2 reviewers and the Academic Editor.
  • The article was Accepted by the Academic Editor on April 10th, 2025.

Version 0.2 (accepted)

· Apr 10, 2025 · Academic Editor

Accept

Thank you for your contribution to PeerJ Computer Science and for addressing all the reviewers' suggestions. We are satisfied with the revised version of your manuscript and it is now ready to be accepted. Congratulations!

[# PeerJ Staff Note - this decision was reviewed and approved by Vicente Alarcon-Aquino, a PeerJ Section Editor covering this Section #]

Reviewer 2 ·

Basic reporting

no comment

Experimental design

no comment

Validity of the findings

no comment

Additional comments

no comment


Reviewer 3 ·

Basic reporting

Thank you for making the effort to address the raised comments.
It is suggested that the response to the following comment also be included in the methodology section of the revised manuscript, as it provides context and clarification for potential readers:
"R3.5: Why are only traditional models used? Why not neural networks or time-series-based models? What is the rationale behind this? Does the data have timestamps? If yes, then why is LSTM not used?"

Experimental design

no comment

Validity of the findings

no comment.


Version 0.1 (original submission)

· Mar 3, 2025 · Academic Editor

Major Revisions

We have completed the evaluation of your manuscript and will reconsider it following major revisions. We invite you to resubmit the manuscript after addressing the reviewers' comments, with particular attention to the following points:

1. Strengthen the discussion on privacy preservation and interoperability, explaining how the proposed FL platform uniquely addresses these aspects.

2. Provide a clearer methodology for the FL platform, focusing on privacy and interoperability improvements, and justify the local simulation setup.

3. Justify the choice of evaluation metrics, discuss data handling (e.g., SMOTE, ADASYN), and consider adding statistical significance tests. Explore the use of alternative models like neural networks or LSTM.

4. Enhance figure quality, clarify table highlights, and fix any visual issues.

5. Update the literature review with recent studies and rewrite the abstract to clearly state the paper’s gap and key findings.

I hope you can incorporate the recommended changes in your revised article.

**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.

·

Basic reporting

The text is well written with respect to the use of the English language. My main issue with the current version of the paper is the focus of the reported work, or, in other words, the discrepancy between what is promised and the presented results. Based on the title, abstract and introduction, one expects a paper presenting an approach for improved privacy and interoperability of healthcare data using Federated Learning (FL). Additionally, the authors compared the performance of FL algorithms against Centralized Learning (CL). Although the authors delivered on the promise regarding the performance comparison between FL and CL, the privacy-preserving and interoperability-improving aspects of the platform require more convincing arguments and results. For instance, regarding privacy preservation, throughout the text the authors claim that this property is provided by using FL. If FL inherently guarantees data privacy, what is special about the proposed platform in this respect that could not be found in other FL platforms? The argument that using FHIR improves data interoperability because it adopts a common data schema seems understandable and sensible. However, the authors did not provide information about how the data was previously structured and how the translation from local data to FHIR data occurs. Is it done manually, case by case, is there a mapping specification, or is it done automatically, and how?
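[Editorial sketch, for illustration of the kind of translation asked about here; this is not the authors' implementation, and the local field names (patient_id, heart_rate, recorded_at) are assumptions. A rule-based mapping from one local wearable record to a FHIR Observation resource could look like the following.]

```python
# Hypothetical sketch: mapping one local wearable record to a FHIR Observation.
# The local field names (patient_id, heart_rate, recorded_at) are assumptions,
# not names taken from the manuscript.
def to_fhir_observation(record: dict) -> dict:
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [{
                "system": "http://loinc.org",
                "code": "8867-4",          # LOINC code for heart rate
                "display": "Heart rate",
            }]
        },
        "subject": {"reference": f"Patient/{record['patient_id']}"},
        "effectiveDateTime": record["recorded_at"],
        "valueQuantity": {
            "value": record["heart_rate"],
            "unit": "beats/minute",
            "system": "http://unitsofmeasure.org",
            "code": "/min",
        },
    }

example = to_fhir_observation(
    {"patient_id": "123", "heart_rate": 72, "recorded_at": "2024-12-03T10:15:00Z"}
)
print(example)
```

Whether such a mapping is performed manually, via a mapping specification, or automatically is precisely what the review asks the authors to clarify.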

Some other comments related to the presentation and text:
- In many parts of the paper the authors refer to some properties of the proposed FL platform such as "user-friendly tool...",
- Line 229. Isn't EHR already digitized, hence the electronic part of the acronym? If so, why "...digitizes EHR...";
- On lines 267 and 268, the sentence reads as if there had been a previous discussion of the types of ML problems addressed, which is not the case. It may be useful to justify why the focus is on classification and regression problems.
- Line 294. "...distance, and run distance". What is the difference between distance and run distance? Is it walking distance?
- Figures 4 and 5 seem to be in low resolution with some visual artifacts in the cyan area. In figure 4 it is not straightforward to differentiate between synthetic and real areas. Maybe changing the color scheme would improve the possibility of differentiation.
- Tables 1 and 4. The first sentence refers to Table 1, which presents the evaluation metrics for classification. On the premise that the paper should be self-contained, i.e., that it should provide the reader with the information pertinent to its understanding, readers who are not very familiar with ML algorithms cannot tell whether the presented evaluation metrics are defined by the authors or taken from other sources, which is the case. It would therefore be expected that the authors either explain the metrics and the elements within the formulae (e.g., what are TP, TN, FP and FN?) or provide a bibliographic reference for further information; a sketch of the standard definitions is given after this list. The same comment applies to Table 4, the evaluation metrics for regression.
- Tables 2, 3, 5, 6, 7 and 8. It is not clear what the yellow highlighting in a number of cells of these tables means. Does it represent the best value between FL and CL for a given metric and algorithm? If so, there are several algorithms and metrics for which the best value is not highlighted.
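[Editorial sketch, for reference: the standard confusion-matrix definitions alluded to in the comment on Tables 1 and 4. TP, TN, FP and FN denote true positives, true negatives, false positives and false negatives; the example counts are arbitrary.]

```python
# Standard classification metrics from confusion-matrix counts (editorial sketch).
# TP/TN/FP/FN = true/false positives/negatives.
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Arbitrary example counts, for illustration only.
print(classification_metrics(tp=40, tn=45, fp=5, fn=10))
```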

Experimental design

Although the experiment regarding the performance of ML algorithms using Federated and Centralized Learning techniques has been explained and motivated, in the section titled Methodology the authors only presented the system design and explained the implementation choices. If one of the aims of the paper is the proposition of an FL platform, one would expect the methodology for designing such a platform. Or, if the purpose is to highlight the privacy-preserving and interoperability improvements offered by the proposed platform, the methodology of how to use the platform to achieve such properties should have been presented and discussed.

From the current text, there are also some points that should be clarified. For instance, the authors indicate that Federated Learning is inherently secure in terms of privacy preservation because data does not need to be centralized. Leaving aside the point that one can question whether that is indeed the case, given the lack of evidence in the paper, why then did the authors design a FHIR Server in the Aggregator (the central component of the platform) to which data can be moved from the edges? Would this move nullify the "inherent" privacy preservation of the FL platform? As already mentioned, the Methodology section should either report a methodology or be renamed to describe the proposed platform.

Validity of the findings

For this paper, the validity of the findings to be assessed can be divided into two parts: (1) the findings related to the performance of classification and regression ML algorithms using federated or centralized learning, and (2) the findings related to the proposed FL platform to improve interoperability using the FHIR protocol.
The results related to part 1 are very interesting and relevant. The authors properly explained the findings and discussed them.
The findings related to part 2, the proposed platform, are hard to assess. Throughout the text, it seems that the major goals of the platform are to ensure interoperability of wearable devices' data and privacy preservation. The interoperability part has been addressed by adopting FHIR, whereas privacy preservation, according to the authors, is addressed by adopting Federated Learning. However, the paper lacks a proper discussion of how effectively this has been achieved. Sentences like "This approach prevents the centralization of sensitive data, thus enhancing privacy and interoperability" are not enough to demonstrate that these properties have been effectively achieved. The premise, mentioned several times in the paper, that Federated Learning inherently preserves privacy requires evidence. Otherwise, why should one choose the proposed platform instead of any other FL platform?

Additional comments

In summary, I consider that the paper should either focus on the performance comparison between FL and CL, for which the authors provided significant results, or better present the proposed FL platform, reporting its key features in more detail and evaluating them against related work.


Reviewer 2 ·

Basic reporting

The source document is a paper undergoing peer review that discusses enhancing healthcare data privacy and interoperability with federated learning. The document includes an AI detection score of 29%, which suggests areas needing revision to ensure the content is original and adheres to academic standards.

Experimental design

The paper acknowledges that real data might be non-Independent and Identically Distributed (non-IID), which can significantly affect the results of CL and FL. To address this, the authors used the Synthetic Minority Over-sampling Technique (SMOTE) to synthesize data for classification tasks and Adaptive Synthetic Sampling (ADASYN) for regression problems. While using SMOTE and ADASYN is a good start, the authors should provide a more in-depth discussion on how these techniques mitigate the challenges posed by non-IID data in their specific context. They could also explore and compare other data synthesis techniques or partitioning strategies that are more robust to non-IID data (a usage sketch of both techniques is given after this block).
The paper uses standard evaluation metrics such as Accuracy, F1-score, Kappa, and MCC for classification, and RMSE, MAE, R-squared, and MAPE for regression. The choice of evaluation metrics is appropriate, but the authors should justify their selection more explicitly. Additionally, they could consider including other metrics that are relevant to the specific healthcare applications they are addressing.
While comparing FL and CL is valuable, the authors could enhance their analysis by including additional baseline models or comparing against other federated learning algorithms. This would provide a more comprehensive understanding of the strengths and weaknesses of their proposed approach.
The experiments were simulated locally using only one dataset, and the data was divided into pairs of 60%/40%, 70%/30%, 80%/20%, and 90%/10% for training and testing. The authors should address the limitations of simulating the experiments locally. Running experiments on a distributed system with multiple devices would provide a more realistic evaluation of the performance of FL. They should also justify their choice of partition proportions and the number of clients used in the FL experiments.
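[Editorial sketch, for reference: how the two oversampling techniques and the four partition proportions discussed above could be applied with imbalanced-learn and scikit-learn. The synthetic dataset, class weights and random seeds are assumptions for illustration, not details taken from the manuscript.]

```python
# Editorial sketch (not the authors' code): SMOTE/ADASYN oversampling and the
# four train/test partitions mentioned in the review. All values are placeholders.
from collections import Counter

from imblearn.over_sampling import ADASYN, SMOTE
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Imbalanced toy dataset standing in for the real (non-IID) data.
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.9, 0.1], random_state=42
)
print("original class counts:", Counter(y))

# Oversampling: SMOTE interpolates new minority samples between neighbours;
# ADASYN additionally focuses on minority samples that are harder to learn.
# (Both imbalanced-learn implementations operate on classification labels.)
X_sm, y_sm = SMOTE(random_state=42).fit_resample(X, y)
X_ad, y_ad = ADASYN(random_state=42).fit_resample(X, y)
print("after SMOTE:", Counter(y_sm), "after ADASYN:", Counter(y_ad))

# The four train/test partitions (60/40, 70/30, 80/20, 90/10).
for test_size in (0.40, 0.30, 0.20, 0.10):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_sm, y_sm, test_size=test_size, stratify=y_sm, random_state=42
    )
    print(f"train/test = {1 - test_size:.0%}/{test_size:.0%}: "
          f"{len(X_tr)}/{len(X_te)} samples")
```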

Validity of the findings

Address Local Simulation Limitations: Acknowledge and discuss the limitations of simulating experiments locally. Running experiments on a distributed system would provide a more realistic evaluation of FL performance.
The paper often compares the performance of FL and CL models without explicitly testing for statistical significance; one way such a test could be run is sketched after this block.
The authors acknowledge the potential of integrating classification and regression models but lack a suitable dataset to test this integration.
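[Editorial sketch, for illustration only: a paired, non-parametric comparison of FL and CL scores across splits or folds could be run with SciPy's Wilcoxon signed-rank test. The per-split accuracy values below are made up, not results from the paper.]

```python
# Editorial sketch: paired significance test between FL and CL results.
# The per-split accuracy values are illustrative placeholders only.
from scipy.stats import wilcoxon

fl_acc = [0.91, 0.89, 0.92, 0.90, 0.88]  # FL accuracy per split (placeholder)
cl_acc = [0.93, 0.90, 0.94, 0.92, 0.91]  # CL accuracy per split (placeholder)

stat, p_value = wilcoxon(fl_acc, cl_acc)
print(f"Wilcoxon statistic = {stat:.3f}, p-value = {p_value:.4f}")
```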

Additional comments

None


Reviewer 3 ·

Basic reporting

The idea presented in this manuscript, titled "Enhancing healthcare data privacy and interoperability with federated learning", is good. The manuscript is well written; however, the following suggestions can be considered while revising it.

1: The abstract needs to be rewritten; currently it does not convey exactly what was done. It remains at too abstract a level; specifically highlight the gap and then report the key findings.

2: Extend the literature review and add more recent studies from the last 2 to 3 years, such as "Distributed intelligence for IoT-based smart cities: a survey".

Experimental design

1: Add more details for both tasks, classification and regression, and also include descriptive statistics for both datasets and the sample.

2: Why and how are both tasks combined? What was the rationale?

3: Why are only traditional models used and not neural networks or time-series-based models? What is the rationale behind this?

4: Does the data have timestamps? If yes, then why is LSTM not used?
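[Editorial sketch, for context on points 3 and 4: a minimal LSTM baseline for timestamped data, assuming Keras. The shapes, layer sizes, synthetic data and training settings are placeholders, not values from the manuscript.]

```python
# Editorial sketch: a minimal LSTM baseline for timestamped (sequential) data.
# Shapes, layer sizes and the synthetic data are placeholders.
import numpy as np
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.models import Sequential

timesteps, n_features = 24, 8                   # e.g. 24 readings of 8 signals
X = np.random.rand(500, timesteps, n_features)  # synthetic sequences
y = np.random.randint(0, 2, size=500)           # synthetic binary labels

model = Sequential([
    LSTM(32, input_shape=(timesteps, n_features)),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2, verbose=0)
```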

Validity of the findings

1: It is suggested to evaluate the findings against some benchmark or, if similar work has been done in the literature using similar datasets, against those results.

2: Rewrite the conclusion and highlight limitations and potential future work.


All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.