Review History


All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.

Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.


Summary

  • The initial submission of this article was received on July 16th, 2025 and was peer-reviewed by 3 reviewers and the Academic Editor.
  • The Academic Editor made their initial decision on September 4th, 2025.
  • The first revision was submitted on October 15th, 2025 and was reviewed by 1 reviewer and the Academic Editor.
  • A further revision was submitted on December 2nd, 2025 and was reviewed by the Academic Editor.
  • The article was Accepted by the Academic Editor on December 5th, 2025.

Version 0.3 (accepted)

Academic Editor

Accept

This manuscript has undergone three rounds of revision, and the authors have satisfactorily addressed all comments except the concern, raised by both the reviewers and me, regarding the use of a larger dataset. The authors have explained that incorporating an additional dataset was not feasible within the given timeframe and due to ethical clearance constraints. On a positive note, they have clearly stated this as a limitation and supported their position by citing peer-reviewed studies that employed similar or even smaller datasets.

Given this justification, and considering that the manuscript does not exhibit methodological shortcomings, I find it acceptable for publication.

[# PeerJ Staff Note - this decision was reviewed and approved by Jyotismita Chaki, a PeerJ Section Editor covering this Section #]

Version 0.2

Academic Editor

Minor Revisions

The authors have addressed most of the reviewers' comments, and the revisions generally reflect a sincere effort to improve the manuscript. The comparative analysis using RT-DETR, Detectron2, and the YOLOv8 baseline is appropriate and sufficiently strong for the context of meniscal tear detection. However, a major concern raised by the reviewer remains unresolved. The dataset used in the study is notably small, which increases the likelihood of underfitting, overfitting, and dataset bias. The possibility of class imbalance further weakens the robustness of the preprocessing pipeline. To strengthen the scientific validity and to align with PeerJ's publication standards, the authors are strongly advised to incorporate additional datasets, preferably larger and more diverse, to enhance model reliability, statistical rigor, and the overall quality of the work.

Reviewer 2

Basic reporting

no comment

Experimental design

no comment

Validity of the findings

no comment

Additional comments

Thank you for taking into consideration all my comments and making the required changes.


Version 0.1 (original submission)

Academic Editor

Major Revisions

**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.

**PeerJ Staff Note:** It is PeerJ policy that additional references suggested during the peer-review process should only be included if the authors agree that they are relevant and useful.

**Language Note:** When preparing your next revision, please ensure that your manuscript is reviewed either by a colleague who is proficient in English and familiar with the subject matter, or by a professional editing service. PeerJ offers language editing services; if you are interested, you may contact us at [email protected] for pricing details. Kindly include your manuscript number and title in your inquiry. – PeerJ Staff

Reviewer 1

Basic reporting

1. Report results to a maximum of four decimal places and use a consistent reporting format throughout the manuscript.
2. The language requires significant improvement. The abstract should be rewritten for clarity and conciseness, in a professional academic tone.
3. Correct all typographical errors, for example:
Line 58: meniscis
Line 304: HiReCAM
Line 388: flod
Also check Figure 5.
4. The Materials and Methods section requires a clear rewrite: remove redundancy and make sure the information given is clear and not confusing.
Examples of ambiguous or confusing statements include:
Line 150: "Gender information was not available for 44.58% (n=210) female, 42.02% (male) and 13.38% (other) of the selected patients. Right knee MRI images were used for 53.08% (n=250) and 46.92% (n=221) of the patients. "
Line 262-270 : I suggest this paragraph be shifted to the discussion section and propose the use of newer models as future works.
Line 393 : "The 5-fold cross-validation results showed the overall accuracy and consistency of the model. This shows that the model is reliable and similar performance can be achieved on real-world data."
5. The Discussion section should provide critical insights, interpret the findings, and compare results to previous studies. Lines 415–433 appear redundant, restating results already mentioned.

Experimental design

1. Add more technical details on the HiResCAM method. How does it differ from other techniques?
2. The MOD-YOLOv8 architecture figure is too brief. Add detailed descriptions for each module.
3. Clarify how hyperparameters were fine-tuned. What is meant by “repeated experiments”? Were hyperparameter optimization techniques used? If so, describe them.
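To make this point concrete: if the authors did use a systematic search, it could be reported along the lines of the following minimal sketch. The `toy_score` function here is a hypothetical stand-in for a full training run returning a validation metric (e.g., mAP50); the parameter names and grid values are illustrative assumptions, not the authors' actual settings.

```python
# Minimal grid-search sketch: exhaustively evaluate every combination of
# hyperparameters and keep the best-scoring one. In practice, train_and_score
# would launch a training run and return a validation metric such as mAP50.
from itertools import product

def grid_search(param_grid, train_and_score):
    """Return the best hyperparameter combination and its score."""
    best_score, best_params = float("-inf"), None
    keys = sorted(param_grid)
    for values in product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = train_and_score(params)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score

# Toy scoring function standing in for an actual training run; it peaks at
# lr=0.01 and momentum=0.937.
def toy_score(p):
    return -abs(p["lr"] - 0.01) - abs(p["momentum"] - 0.937)

grid = {"lr": [0.001, 0.01, 0.1], "momentum": [0.9, 0.937]}
best, score = grid_search(grid, toy_score)
```

Reporting the grid, the metric optimized, and the number of runs would answer the question above far more precisely than "repeated experiments".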

Validity of the findings

1. Lines 316–330: The explanation of results is vague. Provide clear interpretation of what these results mean in the context of model performance. For example: Why does P fluctuate while other metrics remain stable?
2. Lines 397–405: The statement “PR curves have an unbalanced distribution between classes…” needs quantification. How significant is the performance drop for the “tear” class? Provide numbers or statistical comparison.
3. Line 459: The authors' claim that this is the "first algorithm for the detection of meniscal tears in the literature" is factually inaccurate. There are existing studies on meniscal tear detection. Cite relevant prior work, explain how your approach differs, and justify this claim.
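As an illustration of the quantification requested in point 2, per-class precision and recall can be derived directly from confusion counts, and the between-class gap reported as a number. The counts below are invented for demonstration only; they are not taken from the manuscript.

```python
# Illustrative sketch: quantify a per-class performance gap from confusion
# counts (true positives, false positives, false negatives). The counts are
# hypothetical, not the authors' data.
def precision_recall(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

counts = {"healthy": (140, 5, 7), "tear": (600, 60, 81)}  # (tp, fp, fn)
metrics = {cls: precision_recall(*c) for cls, c in counts.items()}
recall_gap = metrics["healthy"][1] - metrics["tear"][1]  # between-class recall gap
```

Stating such a gap explicitly (ideally with a confidence interval or statistical test) would substantiate the "unbalanced distribution between classes" claim.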

Additional comments

The manuscript requires major rewriting to achieve an appropriate academic tone and to clarify vague statements. Overall, major revision is needed before the manuscript can be considered for acceptance.


Reviewer 2

Basic reporting

Overall, the language quality is good, and the suggested improvements are minor. A careful review with attention to sentence structure, word choice, and consistency will further enhance the manuscript's clarity and readability.
The manuscript is within the scope of the special issue on Artificial Intelligence (AI) applications in medicine, biomedical signal and image processing, and clinical decision support.

It is well structured; however, it needs a significant revision of its main chapters. The introduction is too broad and general in its discussion of DL models and their performance. The authors should focus more on state-of-the-art deep learning models applied to the study of menisci in MRIs.

The number of cited works in the review appears to be limited. To strengthen the academic rigor and comprehensiveness of the manuscript, consider including a broader range of relevant literature. This will ensure a more robust foundation for the manuscript and provide readers with a broader perspective on the subject.
For example:
Güngör E, Vehbi H, Cansın A, Ertan MB. Achieving high accuracy in meniscus tear detection using advanced deep learning models with a relatively small data set. Knee Surg Sports Traumatol Arthrosc. 2025 Feb;33(2):450-456. doi: 10.1002/ksa.12369. Epub 2024 Jul 17. PMID: 39015056; PMCID: PMC11792105. and
https://github.com/simranjainds/Meniscus-Tear-Diagnosis-using-YoloV8-and-Mask-RCNN

Roboflow is not referenced.

The raw dataset used for training and validation has not been shared.
Figures' resolution must be increased.

Experimental design

The manuscript fails to clearly define the research gap, leaving the reader without a distinct understanding of why this article contributes significantly to the existing body of literature. To strengthen the manuscript, explicitly highlight how this work advances beyond previous studies in the field and articulate the unique contributions this manuscript brings to the domain of deep learning applications on MRI Knee Scans.

The manuscript lacks a detailed and comprehensive comparison with the results presented in the various SOTA studies, for example, how YOLO models can improve generalization in detection tasks. The article should compare YOLO-based models with other SOTA models, such as U-Net, Vision Transformer, SAM, and EfficientNetV2, to name a few.

Expand on the challenges faced in setting up the best parameters. Provide more details on the considerations for fine-tuning and running iterations.

Validity of the findings

A more thorough analysis and synthesis of findings across different works is needed to provide an understanding of the trends, patterns, and inconsistencies in the applications of deep learning in the domain of MRI knee diagnosis and detection and classification of meniscal tears.

Discuss the limitations of the manuscript in terms of sample size, study design, and potential biases.

The conclusion could summarize the key findings more explicitly and highlight the implications of the article for future research or clinical practice.

Consider improving the section on potential future research directions or areas that need further exploration in the field of deep learning for meniscal tear detection.

Additional comments

Addressing these points will significantly enhance the overall quality and impact of the manuscript, providing a more valuable resource for researchers and practitioners in the field.


Reviewer 3

Basic reporting

The manuscript proposes a new YOLOv8-based deep learning framework (MOD-YOLOv8) for the detection of meniscal tears in knee MRI images. The study is well structured, the methodology is thorough, and the results are satisfactory and presented in line with the literature. The 5-fold cross-validation, the reporting of statistical confidence intervals, and the interpretability analysis using heatmaps are particularly notable. However, the dataset imbalance, low diversity, and the use of sagittal-plane images only are the most significant weaknesses of the study.

Strengths
1. Novelty: Technical innovation is provided by the new modules introduced in YOLOv8.
2. Methodology: 828 annotated images from 471 patients; the preprocessing, annotation, and modeling process are clearly described.
3. Evaluation: Precision, Recall, mAP50, F1 score + CI values are given; comparisons with different versions of YOLO are provided.
4. Interpretability: HiResCAM visualization has increased clinician confidence.
Weaknesses/Criticisms
1. Dataset imbalance: 681 tear-affected and 147 tear-free samples; model generalizability could be poor. Add more healthy cases.
2. Sagittal plane only: Validated the model on sagittal images alone; adding coronal and axial images would provide greater clinical validity.
3. No tear type classification: The model merely distinguishes between the presence and absence of a tear. Classifying types of tears or severity grades would have added value to the model.
4. Clinical significance of findings: Technical performance is extensively discussed, but the clinical significance (accuracy, speed, error types) must be more clearly emphasized.
5. Risk of overfitting: There is a gap between Best and Last in accuracy (0.915 → 0.845). It is recommended to stabilize the model.
6. Outdated references: newer YOLO versions (YOLOv10 and YOLOv11) have been released; the paper refers to them, but only briefly. More up-to-date comparisons could be added.

Actions:
• Class imbalance in the dataset must be more clearly discussed in the discussion, and, if possible, balancing methods (SMOTE, oversampling, augmentation) must be experimented with.
• Validation of the model on coronal and axial images has to be confirmed, or at least a plan for this must be discussed.
• Classification of meniscus tear type must be more strongly proposed as future work.
• Precision differences must be discussed; the causes and proposed solutions must be given.
• Examples of heatmaps should be shown more systematically (with false positive/false negative examples).
• Clinical implications (impact on time to diagnosis, risk of error, potential usage scenarios) should be more clearly debated in the discussion.
• Data augmentation methods and their implementation should be detailed.
• YOLOv10–11 updates should be compared more clearly, even briefly.
• Reliability would be increased if the labeling process used within the methodology (inter-observer reliability, Cohen's kappa, etc.) was provided.
• The addition of the ROC curve and AUC values would provide a clearer picture of the performance of the model.
• Loss plots for training/validation would be helpful.
• Quantitative comparison with other YOLOv4/v5 studies in the literature should be described more in the discussion.
• The limitations of the dataset (age range, single center, lack of diversity) should be more explicitly stated.
• The English is a bit non-fluent; language editing would benefit.
• The conclusion needs to describe a scenario (e.g., radiologist assistant, emergency screening) for how the method would be incorporated into a real clinical workflow.
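On the first action above, random oversampling is one of the simplest balancing methods mentioned (alongside SMOTE and augmentation). The sketch below duplicates minority-class samples until the classes match; the image IDs are placeholders, while the class counts (681 tear vs. 147 tear-free) mirror those stated in this review.

```python
# Minimal random-oversampling sketch: duplicate randomly chosen minority-class
# samples until both classes have the same count. A fixed seed keeps the
# example reproducible.
import random

def oversample(samples, labels, minority_label, seed=0):
    rng = random.Random(seed)
    minority = [s for s, l in zip(samples, labels) if l == minority_label]
    majority = [s for s, l in zip(samples, labels) if l != minority_label]
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return samples + extra, labels + [minority_label] * len(extra)

images = list(range(681 + 147))               # stand-ins for image IDs
labels = ["tear"] * 681 + ["healthy"] * 147   # counts from the review
bal_imgs, bal_labels = oversample(images, labels, "healthy")
```

For image data, the duplicated samples would normally be passed through augmentation (flips, rotations, intensity shifts) rather than copied verbatim, which also addresses the augmentation-detail action above.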

Experimental design

-

Validity of the findings

-

Additional comments

-


All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.