All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
Thank you for the revision of the manuscript and for the detailed rebuttal letter explaining your actions in reply to the suggestions of the reviewers. I consider that the manuscript is now ready for publication.
[# PeerJ Staff Note - this decision was reviewed and approved by Anastazia Banaszak, a PeerJ Section Editor covering this Section #]
All the reviewers recognize the merits of the study and make several good suggestions for improving the manuscript. I encourage you to follow them as far as possible, although it is perhaps not necessary to simplify the Methods section as proposed by Reviewer 1.
**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.
The manuscript is well written and could be considered for acceptance after revision, especially regarding taxonomic validation.
-The research question is well-defined and addresses a clear methodological gap in biodiversity monitoring by combining DNA megabarcoding and CNN-based imaging.
-The Materials and Methods section is highly detailed and ensures reproducibility. However, some subsections (e.g., DNA extraction, Lines 170–193) are overly detailed and should be simplified.
-Taxonomic performance at genus and species levels should be treated carefully. The image-based classifier correctly assigns ~86.0% at the species level, which is not yet sufficient for decision-making, especially among closely related EPT taxa.
-How would your approach handle damaged or fragmented specimens, which are common in biomonitoring samples? This limitation should be explained or acknowledged.
Species names should be italicized throughout the manuscript (e.g., Line 433, Baetis rhodani and Baetis vernus).
The manuscript is easy to read, uses technically correct English, and is appropriate for an international audience. The introduction is also very strong and provides adequate background information on the topics (DNA barcoding, metabarcoding, and imaging). References are also comprehensive and appropriate. All other criteria have also been met.
The manuscript meets all standards; all methods are clearly outlined and can easily be replicated. A knowledge gap has been identified, and the authors have provided adequate information on how their research addresses it. No issues here.
Findings are valid, and the data are robust. Conclusions are well stated, and limitations of the research are acknowledged.
Overall, there are no red flags. There are issues with the success rates, although the authors have been transparent about everything.
All standards have been met.
Minor suggestions:
1. Address the weighing limitations/errors by suggesting the use of more sensitive scales in future studies.
2. The species-level image classification is a slight limitation for closely related taxa - this should be emphasized more clearly in the conclusions.
Minor comments/suggestions:
Introduction:
1. The authors are encouraged to provide a more detailed explanation of DNA megabarcoding in the introduction. It would be helpful to briefly outline what the technique entails and its key strengths and limitations, as this would improve clarity and contextual understanding for readers who may be less familiar with the technique.
2. The authors highlight the lack of comprehensive freshwater invertebrate datasets as a major bottleneck in ecological research; emphasizing how the dataset provided in this study helps address this gap would strengthen the manuscript.
Methods:
1. The manuscript could benefit from a flow diagram of the lab workflow to help visualize the pipeline from specimen to sequence.
2. Target gene amplification: Why was DreamTaq used and not a high-fidelity Taq for high-throughput sequencing?
3. Library preparation: The workflow indicates pooling after individual tagging, but more detail is needed on the pooling process and how index hopping or tag collisions were addressed.
4. When multiple OTUs were recovered per specimen, the most abundant OTU was used. Please provide a short justification and indicate how often secondary OTUs were detected (contaminants vs parasites/gut content).
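To illustrate the kind of summary requested in point 4, the following is a minimal sketch, not the authors' pipeline. It assumes a per-specimen table of OTU read counts; the specimen names, OTU labels, read counts, and the helper `summarize_specimen` are all hypothetical and are not taken from the manuscript. It selects the most abundant OTU per specimen and counts how many specimens carry secondary OTUs.

```python
from collections import Counter

# Hypothetical toy input: read counts per OTU for each specimen.
# These values are illustrative only and are not taken from the manuscript.
reads_per_specimen = {
    "specimen_A": {"OTU_1": 950, "OTU_7": 30, "OTU_12": 20},
    "specimen_B": {"OTU_3": 400},
    "specimen_C": {"OTU_2": 600, "OTU_9": 400},
}

def summarize_specimen(otu_reads):
    """Return the most abundant OTU, its share of reads, and the secondary OTUs."""
    total = sum(otu_reads.values())
    majority_otu, majority_reads = Counter(otu_reads).most_common(1)[0]
    secondary = sorted(o for o in otu_reads if o != majority_otu)
    return majority_otu, majority_reads / total, secondary

# One summary tuple per specimen: (majority OTU, read share, secondary OTUs).
summary = {name: summarize_specimen(r) for name, r in reads_per_specimen.items()}

# Number of specimens in which at least one secondary OTU was detected.
n_with_secondary = sum(1 for _, _, sec in summary.values() if sec)

for name, (otu, share, sec) in summary.items():
    print(f"{name}: majority={otu} share={share:.2f} secondary={sec}")
print(f"specimens with secondary OTUs: {n_with_secondary}/{len(summary)}")
```

Reporting the read share of the majority OTU alongside the list of secondary OTUs would make it easier to judge whether a secondary OTU is likely low-level contamination or a substantial co-amplified signal such as gut content or a parasite.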
Results:
1. The contamination issue on plate 6 is important. I encourage the authors to discuss the possible causes, how reprocessing affected results, and whether any specimens were excluded.
I have found nothing that needs to be reported here. The manuscript is well-prepared, well-written, and a consistent piece of work.
I am afraid I did not have the time to check the raw data, but from scanning other works, I found that this group has a very good strategy of making raw data and the associated metadata available, and I trust that this will be the case here as well. All that needs to be done is to provide the relevant accession numbers once the manuscript has passed the review stage.
The study design is sound and adequate. In fact, it builds on established and well-documented methodological approaches to address hypotheses of practical relevance in the context of aquatic ecosystem assessment.
Also, the methods are described in sufficient depth for other working groups to repeat these exercises should the necessary equipment be available.
Based on my assessment of study design and methodology and the results detailed and discussed, the findings are – within the framework of this specific study – valid. Given the diversity of benthic invertebrates, I remain sceptical of their generalisability to the whole of this group of animals, where various molluscs, crustaceans, insects, worms, and other taxa must be expected.
But this is something for future studies and does not detract from the self-contained validity of this work.
I have several more general comments that I would like the authors to address/consider.
1. OTU assignment for this very limited set of taxa appears disastrous. The 14 species included here are (with the exception of Nemoura cinerea and Amphinemura standfussi) readily identifiable in their larval stages and should be properly covered in the reference libraries. The discussion the authors offer on their outcomes (also regarding the absurdly high numbers of OTUs inferred on single specimens!) leaves me unsatisfied. It is clear that once the reference libraries are properly curated and up to the job, taxonomic assignments will be much, much better, but I would like to learn the authors' opinion (and, ideally, see it in written form in the discussion) about methodological(?) and infrastructure-related challenges in the context of megabarcoding. Inferring on average ~13 OTUs (with many non-majority OTUs) in single specimens feels very strange; if that were possible, I would also be interested in learning about the taxonomic assignment of these non-majority OTUs. This is very briefly addressed in the section «Added value of the combination/transferability», but should be discussed in greater depth, as you have assignments that make use of them beyond suspecting parasites and/or gut content.
I also wonder if the method of OTU delineation makes a difference here, as the authors work with a threshold approach. Other approaches will likely produce different OTUs, and this might affect how efficiently specimens can be identified. I also find myself wondering if OTUs of individual specimens should actually be considered here: you could (if it were not for the contamination issue reported) likely build the consensus sequence of each specimen and map that against a reference library. As the aim is identification (not producing reference data), this could be something to explore.
2. Regarding inhibition: I am sure the authors are aware of the fact that too high a DNA concentration also inhibits PCR, and I did not read that DNA concentrations were normalized. Maybe also something worth considering, instead of or in addition to inhibitor removal.
3. I am not too happy with the section on «Future steps of megabarcoding and imaging», for two reasons: (i) Taxonomic expertise – as required for manual identification – is becoming a highly sought-after asset because this is what is needed to make these reference libraries, databases, and so on work properly. Also, it is what makes biodiversity data useful because it allows us to connect with people. I would also argue that, as taxonomic expertise is in decline anyway, writing “expertise-reliant manual identification” down as time-consuming, etc., is not warranted: in providing relevant and useful data for political/societal decision-making, every hand is needed. Yet, I agree that some of the approaches we currently use should be improved. (ii) The most challenging aspect in processing a benthic invertebrate sample is not identification – a trained operator will have no problem identifying 12 out of the 14 species you have in your samples and will place the Plecoptera at genus level, returning 14 taxa – it is preparing the sample for identification by picking out the specimens from the debris – and that, too, requires taxonomic expertise (because you can't look for something you don't know). So, taxonomic expertise is a key asset for using the outlined workflow, and this should be reflected in the manuscript. I cannot guess if automated approaches will ever be able to clean a standard benthic invertebrate sample.
Also, if I have to remove every caddisfly from its case for imaging, I'm adding quite a substantial workload to a project. However, I note the value of estimating biomass through imaging+CNN, which is indeed a big step forward (and for which decased caddis are needed).
That said, I would ask the authors to consider the total workload of this project and make an honest assessment of the time and money invested to process these 890 specimens, and ask a trained operator/specialist for their estimate on the effort of getting these identified.
Use italics for scientific binomina throughout the manuscript and figures.
Generally, I am in favour of this manuscript. I have my professional qualms about parts of the discussion, but these can be resolved.
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.