Review History


All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.

Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.


Summary

  • The initial submission of this article was received on August 8th, 2024 and was peer-reviewed by 3 reviewers and the Academic Editor.
  • The Academic Editor made their initial decision on January 9th, 2025.
  • The first revision was submitted on April 22nd, 2025 and was reviewed by 2 reviewers and the Academic Editor.
  • A further revision was submitted on August 7th, 2025 and was reviewed by 1 reviewer and the Academic Editor.
  • The article was Accepted by the Academic Editor on August 7th, 2025.

Version 0.3 (accepted)

· Aug 7, 2025 · Academic Editor

Accept

Thank you for addressing the reviewers' comments.

Best wishes,

[# PeerJ Staff Note - this decision was reviewed and approved by Xiangjie Kong, a PeerJ Section Editor covering this Section #]

Reviewer 1 ·

Basic reporting

No comment

Experimental design

No comment

Validity of the findings

No comment

Additional comments

The authors have carefully addressed all reviewer comments from the previous round. In my view, the manuscript is now ready for publication.

Version 0.2

· May 19, 2025 · Academic Editor

Minor Revisions

Dear Authors,

One of the previous reviewers did not respond to the invitation to review the revised paper. Of the two who did, one accepts your paper and the other suggests minor revisions. We encourage you to address the concerns and criticisms of Reviewer 1 and resubmit your paper once you have updated it accordingly.

Best wishes,

Reviewer 1 ·

Basic reporting

The revised manuscript shows a notable improvement in clarity, logical flow, and professional language use, especially in the abstract and introduction, which were criticized as unclear in the first round. The abstract has been restructured into a more coherent form, eliminating redundancies and emphasizing the motivation, methods, and results clearly.

The authors added missing references and substantially improved the contextualization of their work by:
- Clarifying how their method extends prior work on Chinese character feature embeddings.
- Better distinguishing their contribution within the specific domain of rehabilitation medicine, which was previously underdeveloped.

The figures (e.g., RMCCKG visualizations) and tables (performance comparisons) are appropriate, clear, and well-integrated into the discussion. Formatting issues identified previously (e.g., minor typographical errors) have been corrected.

Dataset limitations are now discussed (e.g., the Rehab corpus being skewed toward orthopedic rehab), and the data construction process is better documented, addressing earlier concerns about bias and transparency.

Experimental design

The research question is now clearly articulated in the introduction and abstract. The domain-specific challenge of applying NER to Chinese rehabilitation medicine—where annotated data is scarce—is convincingly framed as a meaningful gap in the literature.

The revised paper significantly improves the description of RMCCKG construction, with more details on:
- Data sources
- Rule-based mapping of features to characters
- Triplet creation in Neo4j

Validity of the findings

The results (notably a +3.96% F1 improvement) are robust, statistically validated, and directly support the research question. The authors correctly limit their conclusions to the demonstrated gains in NER performance and improvements in low-frequency entity recognition.

The authors addressed reviewer concerns on dataset bias by:
- Acknowledging orthopedic skew
- Justifying domain-agnostic potential of the RMCCKG
- Outlining how the approach can be extended to other subdomains

The authors clarified that their contribution lies in the integration and application of RMCCKG in a domain-specific context.

Additional comments

Strengths:
- Thorough and respectful engagement with reviewer feedback.
- Expanded technical clarity on knowledge graph construction and feature embeddings.
- Stronger articulation of domain-specific innovation.

Minor Weaknesses or Suggestions:
A few areas could still benefit from more illustrative examples in the main text. Although the authors justify their statistical approach, adding 1–2 short annotated sentences in a figure/table could help readers unfamiliar with Chinese NER.

Consider open-sourcing RMCCKG and the Rehab dataset if possible, to foster replication and reuse.

Reviewer 3 ·

Basic reporting

The author has made revisions according to the comments, and I suggest accepting the publication of this article.

Experimental design

The author has made revisions according to the comments, and I suggest accepting the publication of this article.

Validity of the findings

The author has made revisions according to the comments, and I suggest accepting the publication of this article.

Additional comments

The author has made revisions according to the comments, and I suggest accepting the publication of this article.

Version 0.1 (original submission)

· Jan 9, 2025 · Academic Editor

Major Revisions

Dear authors,

Thank you for your submission. Feedback from the reviewers is now available. Your article has not been recommended for publication in its current form. However, we do encourage you to address the concerns and criticisms of the reviewers and resubmit your article once you have updated it accordingly. It is also recommended that some paragraphs be divided into two or more sections to make them easier to understand.

Best wishes,

Reviewer 1 ·

Basic reporting

- The language used throughout the paper is professional and clear, with a strong focus on technical terms relevant to named entity recognition (NER) and Chinese character knowledge graphs. The text is comprehensible and maintains a formal tone appropriate for an academic setting.
- The paper provides a solid context for the research by citing relevant literature, including previous methods for Chinese NER and the use of knowledge graphs in language modeling (e.g., references to BERT, CRF, BiLSTM, and existing NER models like those used in clinical texts). The introduction to the problem and the background in rehabilitation medicine NER are sufficiently covered. References are ample and touch on key developments in both NER and related fields, demonstrating a good grounding in existing literature.
- The article is well-structured, conforming to standards of academic publication. It includes sections such as "Introduction," "Methods," "Results," and "Discussion," which follow a logical flow. Figures and tables are well-integrated into the text, especially the diagrams illustrating the RMCCKG model (Figures 1-5) and tables that present results of different models (Tables 2-7). These elements are clearly labeled and described, adding value to the explanation of the methods and results.
- The manuscript mentions the use of public datasets such as the CMeEE dataset and their self-constructed "Rehab" corpus. Specific details on the data, such as the size and variety of entities, are provided, and the results of various model comparisons are presented in tables. The raw data, such as entity frequencies in the test and training sets, is presented transparently.
- The paper is self-contained, providing relevant context, detailed methods, and a clear explanation of the hypotheses. The results directly relate to the research questions regarding the performance of their proposed RMCCKG+BERT-BiLSTM-CRF model. The data provided supports the conclusion that their model improves on baseline NER methods in recognizing medical entities.
- Results are presented with clarity, including well-defined terms such as Precision, Recall, and F1 score. The methods section clearly outlines the modeling approach and its components (BERT, BiLSTM, CRF, etc.). There is no ambiguity in how terms are used, and theorems or proofs are not necessary for this type of applied research.

Experimental design

- The research is original and focuses on a novel application of named entity recognition (NER) in rehabilitation medicine, which is a specialized and under-researched area, particularly in the context of Chinese medical texts. The paper falls within the broader scope of natural language processing (NLP) and medical informatics, which is likely to align with the journal's aims and scope, particularly if the journal deals with topics in AI, machine learning, or applied medicine.
- The research question is clearly defined: the authors aim to enhance the recognition of medical entities in Chinese rehabilitation texts using a knowledge graph of Chinese characters (RMCCKG) combined with a BERT-BiLSTM-CRF model. This addresses a specific challenge in medical NLP—how to accurately identify and categorize medical entities in a language with complex character-based semantics (Chinese). The paper identifies and attempts to fill a knowledge gap, namely the underutilization of Chinese character features and their relationships in medical NER tasks.
- The investigation is technically sound, employing state-of-the-art techniques such as the BERT model, BiLSTM, and CRF, which are well-regarded in NLP tasks. The authors build their own rehabilitation medicine corpus and augment it with publicly available data (CMeEE), which demonstrates rigor in data collection and preparation. The ethical standard is also upheld by utilizing public datasets and building a specialized dataset for rehabilitation medicine, without apparent ethical concerns related to patient privacy or data handling.
- The methodology is detailed, with clear explanations of the model architecture (BERT-BiLSTM-CRF), the construction of the Rehabilitation Medicine Chinese Character Knowledge Graph (RMCCKG), and the process for embedding these features into the NER task. Specific parameters, such as batch sizes, learning rates, and optimizer choices, are also provided (e.g., Table 2 on model performance, details on embedding types). The construction of the RMCCKG is explained with sufficient granularity (radical, stroke, part of speech, morphology, etc.), allowing other researchers to potentially replicate or adapt the approach. Additionally, the use of both baseline models and comparative experiments strengthens the methodological transparency.
Recommendation:
- It may be helpful to include more explicit discussion of any potential biases in the training data, or mention if steps were taken to mitigate overfitting or data leakage.

Validity of the findings

- The paper does not explicitly focus on assessing the broader impact or novelty of the findings, which is appropriate for this criterion. The focus is more on presenting the results of the research and demonstrating how the proposed approach performs compared to existing models. This aligns with the expected approach of presenting research in a neutral, objective manner, leaving the assessment of novelty and impact to peer reviewers or readers.
- The authors clearly articulate the rationale for their research, emphasizing the limitations of existing methods in Chinese medical NER, particularly in the domain of rehabilitation medicine. They highlight that few studies address the complexities of Chinese character features in this specific medical domain. Their work builds on existing literature, but also advances the field by proposing the integration of a Chinese character knowledge graph (RMCCKG), which can be used to enrich model input.
- The benefit to the literature is well-stated, especially in terms of the contribution their approach makes to improving named entity recognition in niche fields like rehabilitation medicine. Replication is encouraged through the transparent description of their datasets and model configurations. The authors provide sufficient methodological detail for other researchers to replicate their approach, both in terms of the RMCCKG construction and the model architecture.
- The authors provide detailed results, including precision, recall, and F1 scores for their models on both the CMeEE public dataset and their own Rehab dataset (Table 2 and Table 3). These results are robust and statistically sound, using standard metrics widely accepted in the field of NER.
- The data is well-controlled, as the authors compare their model with baseline models (e.g., BERT-Softmax) under the same experimental conditions. They also conduct various experiments to validate the performance of their RMCCKG model, ensuring that the results are not isolated or arbitrary. Furthermore, the use of a public dataset (CMeEE) adds credibility, as it is a widely recognized benchmark in the field.
- The conclusions are closely tied to the original research question, which aimed to improve Chinese medical NER through the introduction of a knowledge graph. The authors conclude that their RMCCKG+BERT-BiLSTM-CRF model outperforms baseline models, particularly in recognizing entities in Chinese rehabilitation medicine texts. This conclusion is backed by quantitative results (improvements in F1 scores), which are directly tied to their experiments.
- The paper avoids overextending its claims, focusing only on the results that directly support the hypothesis. For instance, the conclusion that their model improves on baseline models is strictly based on the presented performance metrics, without speculative claims about broader implications beyond the scope of the research.

Recommendation:
- While the authors have been transparent about the data used, they could provide more insight into the potential limitations of their self-constructed Rehab dataset, such as any biases or coverage limitations.

Reviewer 2 ·

Basic reporting

The paper is well-written and well-structured. The use of state-of-the-art references is adequate and provides sufficient context for the research. However, several minor adjustments could further improve the clarity and overall quality of the article:
Lines 239-241: A reference should be added to support the statement made in this section.
Line 264: A reference for the cited book should be included.
Line 319: Please add a brief explanation of the BIO labels for readers unfamiliar with Named Entity Recognition (NER).
Line 400: A reference or link to the CMeEE dataset should be provided.
Lines 406-417: Consider reducing the explanation of precision, recall, and F1 score, as these are well-known concepts.
Additionally, a thorough check for typos is recommended. Some examples include:
Line 336: "}output" should be corrected to "} output".
Line 402: "types. we" should be capitalised as "types. We".

Experimental design

The authors clearly state their research objectives. The proposed approach addresses current limitations in Chinese NER within the sub-domain of medical rehabilitation. The methodology is robust and could easily be applied to other domains. This is a point the authors could emphasise more, as it strengthens the broader applicability of their work.
The experiment is rigorous and well-motivated, and the workflow is clearly outlined, making replication feasible. However, I have a minor suggestion regarding the statement in Line 285 concerning the feature observation for entities. More details should be provided on how the final list of 9,702 entity pieces and 50,767 triplet pieces (including 49,867 relationship triplets) was obtained. Further clarification on the process would enhance the transparency of the methodology.

Validity of the findings

The results are robust and reliable, with a deep analysis presented by the authors. The conclusions are well-stated and directly linked to the research question, supported by the results.
One minor suggestion that could improve the paper is the inclusion of more examples throughout the sections. This would help illustrate the findings and enhance the clarity of the discussion.

Reviewer 3 ·

Basic reporting

1. The abstract is unclear, and its logic and readability are poor. It is recommended to reorganize the abstract.
2. The paper lacks innovation: the proposed RMCCKG, BERT, BiLSTM, and CRF are widely used technologies in this field, and the proposed framework lacks novelty.
3. The research motivation is unclear and the analysis is insufficient, which makes it difficult to judge the innovation of the paper.
4. The introduction and its formatting (e.g., alignment) are not sufficiently standardized, and the description of the contributions does not highlight the key points.
5. The related work section lacks an overview of the technologies used in this paper.
6. Please add a flowchart describing the data processing workflow.
7. The document has a light gray background on pages 15-18.
8. The paper lacks mathematical formulations of key processes.
9. The paper lacks theoretical analysis explaining why the method performs well.
10. The meaning of all the figures is unclear, and their quality needs further improvement.
11. The paper contains a large number of grammatical errors.

Experimental design

As above

Validity of the findings

As above

All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.