All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
Thank you very much for your efforts in preparing the paper.
[# PeerJ Staff Note - this decision was reviewed and approved by Xiangjie Kong, a PeerJ Section Editor covering this Section #]
All the reviewers' comments have been successfully taken into account. However, some minor issues must be addressed:
- Check the whole text carefully for typos and misspellings. For example: "Authorcontributions" written as one word.
- Spaces are missing before some citations. For example: algorithm(Du 2019), spikes(Du 2018)...
- Missing or misspelled citations. For example: XU 2024, Chen 2023...
Please read the whole manuscript carefully, check all the citations, and correct them.
All prior comments have been fully addressed by the authors.
Please take the reviewers' comments carefully into account in order to improve the manuscript.
Figure 1 contains four images, and each should include a clear title or label indicating whether it was taken under in-situ or ex-situ conditions.
The authors still have not explained the criteria used to select the 500 high-quality images from the original set of 700. If all images were captured using the same camera, the change in terminology from “high-quality” to “high-resolution wheat spike images” is unnecessary. The authors need to provide a specific explanation for how the 500 images were selected and why the remaining 200 were excluded. If the excluded images were removed due to blurriness, the authors should clarify how blurriness was caused and determined, and specify the criteria or standard used for that assessment.
Minor grammatical revisions are needed for clarity (e.g., “the image segmentation effect is still good” → “the segmentation quality remains acceptable”).
Improve caption clarity (e.g., “Figure 1: Images of Wheat Spikes” could elaborate on conditions or variations shown).
Consider formatting all acronyms upon first use (e.g., clearly define MIoU, MPA).
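To make the request concrete, the two acronyms presumably carry their standard semantic-segmentation meanings: MIoU (Mean Intersection over Union, averaged over classes) and MPA (Mean Pixel Accuracy, per-class accuracy averaged over classes). A minimal sketch of those standard definitions (an assumption about the manuscript's usage, and the toy masks below are illustrative only):

```python
def miou_mpa(pred, gt, num_classes):
    """Compute MIoU and MPA from flat lists of per-pixel class labels."""
    ious, accs = [], []
    for c in range(num_classes):
        tp = sum(1 for p, g in zip(pred, gt) if p == c and g == c)
        fp = sum(1 for p, g in zip(pred, gt) if p == c and g != c)
        fn = sum(1 for p, g in zip(pred, gt) if p != c and g == c)
        union = tp + fp + fn
        if union:                      # class present in prediction or ground truth
            ious.append(tp / union)
        if tp + fn:                    # class present in ground truth
            accs.append(tp / (tp + fn))
    return sum(ious) / len(ious), sum(accs) / len(accs)

# Toy example: binary spike-vs-background masks flattened to 1-D.
gt   = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]
miou, mpa = miou_mpa(pred, gt, num_classes=2)  # -> 0.5, 0.666...
```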
Include a high-level architecture flow diagram illustrating how SPB, MSDC, and CBAM integrate within the ResNet50‑based U‑Net encoder–decoder pipeline.
Clearly label module locations (encoder stage, skip connections, decoder) to improve reader comprehension.
Provide inference time benchmarks on realistic hardware (e.g., mobile, edge devices) alongside the reported FLOPs and parameter counts.
Discuss practical deployment considerations, such as memory footprint, latency, and compatibility with field-use devices or embedded systems.
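A simple way to satisfy both points is a warm-up-then-measure timing harness run on each target device; the sketch below is a generic template (the `fake_inference` stand-in is hypothetical, since the authors' model is not available here), reporting the median to suppress scheduler jitter:

```python
import time
import statistics

def benchmark(fn, n_warmup=10, n_runs=50):
    """Median wall-clock latency (ms) of fn(), with warm-up runs discarded."""
    for _ in range(n_warmup):   # warm caches / JIT before timing
        fn()
    times = []
    for _ in range(n_runs):
        t0 = time.perf_counter()
        fn()
        times.append((time.perf_counter() - t0) * 1000.0)
    return statistics.median(times)

# Stand-in workload for a real forward pass on the target hardware.
def fake_inference():
    sum(i * i for i in range(10_000))

latency_ms = benchmark(fake_inference)
```

The same harness, run on a workstation GPU, a mobile CPU, and an edge accelerator, would give the per-device latency table the comment asks for.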
Expand on the notable performance drop on the GWHD dataset (e.g., MIoU 74.7%). Analyze causes of domain shift—like variations in lighting, spike shape, or cultivar—and their effects on segmentation.
Consider adding experiments or discussion on domain adaptation techniques (e.g., style transfer, multi-domain training, adversarial alignment).
Incorporate statistical significance tests (e.g., paired t-test, bootstrap confidence intervals) comparing your model against baselines to confirm that observed improvements aren't due to chance.
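One of the suggested tests, a paired bootstrap confidence interval over per-image score differences, can be sketched as follows (the per-image IoU values below are hypothetical placeholders, not results from the manuscript):

```python
import random

def paired_bootstrap_ci(scores_a, scores_b, n_boot=10_000, alpha=0.05, seed=0):
    """Bootstrap CI for the mean per-image score difference (A - B).

    scores_a / scores_b are paired per-image metrics (e.g. IoU) for the
    proposed model and a baseline evaluated on the same test images.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    boot_means = sorted(
        sum(rng.choice(diffs) for _ in range(n)) / n for _ in range(n_boot)
    )
    lo = boot_means[int(alpha / 2 * n_boot)]
    hi = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical per-image IoU scores for two models on six images.
model = [0.82, 0.79, 0.88, 0.75, 0.81, 0.84]
base  = [0.78, 0.77, 0.83, 0.74, 0.76, 0.80]
lo, hi = paired_bootstrap_ci(model, base)
# If the interval excludes 0, the improvement is unlikely to be chance.
```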
Release or describe precisely the train/validation/test splits, preprocessing pipeline, and any randomization protocols.
Discuss possible biases—such as over-representation of specific wheat varietals, growth stages, or environmental conditions—and how they might affect model performance.
The authors present a wheat spike image segmentation method that achieves SOTA results on a reasonably large and varied set of RGB images captured under different illumination conditions (the method appears robust to variations in those factors).
The structure and information given in the paper seem sufficient to reproduce the results. There is a good number of figures exemplifying the segmentation capability of the method.
Section 1 is well written. The different subsections covering the parts of the semantic segmentation strategy are correctly explained, and the explanatory figures are quite helpful.
Very interesting and useful results with a wide and varied range of applications.
The paper is worth publishing as is.
**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.
Introduction Detail (Lines 28–31): The introduction is overly broad at this point. The authors should provide a more detailed discussion of what wheat spike segmentation entails and its specific applications or benefits.
Organization of Related Work (Lines 32–61): The authors should focus on studies directly related to wheat spike segmentation or similar areas rather than discussing general image segmentation methods.
Figure Quality: The quality of Figure 3 requires improvement, with enhanced clarity, resolution, and labeling.
Clarification in Table 1: It is unclear what “CL” represents in Table 1. The authors should clarify whether it refers to CE Loss.
Figure 9 Backbone Consistency: The authors should indicate whether the models being compared in Figure 9 share the same backbone to ensure a fair comparison.
Explanation of Technical Components (Lines 62–72): The introduction of components is insufficiently detailed. The authors need to explain how these components address the specific challenges associated with wheat spike segmentation.
In section 1.1, the authors should provide a detailed description of the image collection protocol. The attached example suggests that some images were collected with wheat spikes detached, while others show the spikes attached.
Camera Settings and Image Selection: The authors should provide more details on the camera settings and justify the use of a smartphone camera.
The authors should describe the criteria used to select the 500 high-quality images.
CBAM Module Explanation: The authors need to provide more details on why the CBAM module improves the overall performance for wheat spike segmentation, including its role in feature extraction and model robustness.
Class Imbalance Issue: The authors should explain the problem of class imbalance in greater detail.
In Section 3.2, it is insufficient to conclude that combining these modules can reduce the impact of noises or complex backgrounds, as the section does not provide direct evidence or experiments isolating these effects.
Subsection Division in Section 3.1: It is recommended that Section 3.1 be divided into subsections to improve clarity and enhance the overall organization of the discussion.
The authors should include a figure presenting multiple examples that effectively illustrate the dataset.
Some minor grammatical issues exist (e.g., awkward phrasing such as “the model performance shows the most significant improvement” could be rephrased for clarity).
There are a few inconsistencies in formatting (e.g., misplaced punctuation in tables or equations).
The authors use terms like “wheat ears” and “wheat spikes” interchangeably. Standardizing terminology would enhance clarity.
While the use of ResNet50 is justified, the choice of dilation rates (2, 4, 8) in MSDC could be further explained or validated against other dilation settings.
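One way to frame that validation: each dilation rate enlarges the receptive field of a 3x3 kernel at no extra parameter cost, following the standard formula k_eff = k + (k - 1)(d - 1) (assuming MSDC uses ordinary dilated convolutions, as is conventional; the exact design is in the manuscript):

```python
def effective_kernel(k, d):
    """Effective receptive field of a single k x k convolution with dilation d."""
    return k + (k - 1) * (d - 1)

# For 3x3 kernels at the dilation rates used in MSDC (2, 4, 8):
fields = {d: effective_kernel(3, d) for d in (2, 4, 8)}
# -> 5x5, 9x9, 17x17 effective fields from the same 3x3 kernel cost
```

An ablation over alternative rate sets (e.g. 1/2/4 or 3/6/12) would show whether these particular field sizes match the scale range of wheat spikes.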
The use of CBAM is limited to the decoder—discussion on whether it was tried in the encoder would strengthen the experimental rationale.
While the model's performance is high, evaluation on a public benchmark dataset or cross-validation would improve generalizability.
The authors mention robustness in complex backgrounds, but quantitative results on different environmental subsets (e.g., lighting, occlusion) are not reported.
Add computational complexity comparisons (e.g., inference time, parameter count) across models for practical deployment insights.
Include qualitative failure cases to understand where the model may underperform.
Consider evaluating the model on multi-class segmentation or disease detection tasks as a downstream application.
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.