All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
Dear Authors, I am glad to inform you that your paper can be accepted for publication in PeerJ.
Please correct this reference: Doorenspleet, K., Jansen, L., Oosterbroek, S., Kamermans, P., Bos, O., Wurz, E., ... & Nijland, R. (2025). The Long and the Short of It: Nanopore‐Based eDNA Metabarcoding of Marine Vertebrates Works; Sensitivity and Species‐Level Assignment Depend on Amplicon Lengths. Molecular Ecology Resources, e14079.
[# PeerJ Staff Note - this decision was reviewed and approved by Anastazia Banaszak, a PeerJ Section Editor covering this Section #]
Dear Authors, the Reviewers have commented on the paper. They found it highly improved compared to the previous version. Nevertheless, they still have concerns about the comparison between amplicon (COI-based) and metagenomics approaches. In my opinion, this point needs to be addressed more thoroughly by extracting some numbers from the datasets to obtain a more comprehensive understanding of the relative performance of the sequencing methods. Finally, they also found some inconsistencies with reference data of their morphological assessments that need to be checked.
I apologise to the authors for the delay; I received the review days before I went on leave over the holiday break.
Lines 326-340: I appreciate the revisions made in response to my previous comments. While the clarity of the manuscript has improved, it still lacks a direct comparison between metabarcoding and metagenomics. For instance, how does amplicon-based COI sequencing (e.g., identifying 39 species) compare to metagenomics approaches (e.g., 44 species with Nanopore, 39 using NovaSeq metagenomics with the COI database, and 36 using the nt-NCBI database)? Further, please clarify how many COI sequences were extracted from the Nanopore data and how many species were identified solely by non-COI fragments, specifying which fragments were used. This comparison is critical and should be explicitly presented, perhaps in Table S3 or S7, so that readers can clearly see instrument and application performance differences.
Line 357-360: mention the number of unique species identified by both methods and sum the total number of unique species.
Line 438-439: authors should check this statement as 7/13 species highlighted in red appear to have valid COI/COX1 sequences on NCBI: Aonides oxycephala, Eumida sanguinea, Harmothoe glabra, Kurtiella bidentata, Protodriloides chaetifer, Pseudopolydora pulchra, Tryphosa nana.
Line 443: "4th bar, figure 6a" is not a clear description. The same applies to the descriptions that follow. The authors need to revise these so that readers can locate the data more easily.
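The counts requested for Lines 326-340 and 357-360 (species shared by all methods, species unique to each method, and the overall total) can be derived with plain set operations. The following is a minimal sketch only: the function name `compare_detections`, the method labels and the species placeholders (`sp_A`, `sp_B`, ...) are hypothetical and do not come from the manuscript's data.

```python
from typing import Dict, List, Set


def compare_detections(detections: Dict[str, Set[str]]) -> Dict[str, object]:
    """Summarise species detections per method.

    `detections` maps a method name to the set of species it detected.
    Returns the total number of distinct species, the species shared by
    all methods, and the species unique to each method.
    """
    if not detections:
        raise ValueError("at least one method is required")

    all_species: Set[str] = set().union(*detections.values())
    shared: Set[str] = set.intersection(*detections.values())

    unique: Dict[str, List[str]] = {}
    for method, species in detections.items():
        # Species seen by any *other* method.
        others = set().union(
            *(s for name, s in detections.items() if name != method)
        )
        unique[method] = sorted(species - others)

    return {"total": len(all_species), "shared": sorted(shared), "unique": unique}


# Hypothetical detection lists, for illustration only.
example = {
    "amplicon_miseq": {"sp_A", "sp_B", "sp_C"},
    "amplicon_nanopore": {"sp_A", "sp_B", "sp_D"},
    "metagenomics_novaseq": {"sp_A", "sp_E"},
}
summary = compare_detections(example)
```

A table built from such a summary would let readers see at a glance how much the methods overlap and where they differ.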
NA
NA
NA
Dear authors
Two reviewers carefully read the revised manuscript. Both found the manuscript much improved over the first version. You have addressed many of the initial concerns, improving methodological clarity and expanding on some analyses. However, one of the two reviewers still has some comments that should be addressed before acceptance. The comments regard the bioinformatic pipeline, database reliance, and interpretation of sequencing platform comparability. Specifically, you should expand the taxonomic analysis beyond COI since other mitochondrial or nuclear markers available in metagenomic data can be useful in particular for species not detected with COI-based methods. Moreover, the reviewer has some concerns about the comparability between Illumina and Nanopore platforms, related mainly to bioinformatic processing and sequencing accuracy. So, the reviewer suggests providing a more balanced view rather than implying interchangeability. I hope these comments will help you to improve the manuscript.
The authors have made thorough revisions with significant adjustments made to improve the clarity and depth of their manuscript. While many key points have been addressed effectively, there remain some critical issues, particularly around database reliance and platform comparability, that would benefit from further refinement.
The revised manuscript demonstrates substantial progress following the initial feedback, addressing key areas and enhancing methodological clarity. However, some critical bioinformatic and interpretative aspects remain inadequately justified or lack depth. This iteration continues to rely heavily on COI as the primary marker for taxonomic assignment within the metagenomics dataset, without fully leveraging the broader capabilities of metagenomic sequencing to capture a more diverse set of mitochondrial and nuclear markers. The emphasis on COI alone potentially limits the discovery of species and diversity within the metagenomic dataset, particularly for the 22 morphologically identified species absent in DNA-based methods. Expanding the analysis to include additional mitochondrial and nuclear markers could yield a more comprehensive understanding of the biodiversity present and could potentially identify species undetected by COI-based metabarcoding alone.
Additionally, the manuscript could benefit from tempered language regarding the comparability between Illumina and Nanopore sequencing platforms. While they exhibit similar outputs, the distinct bioinformatics workflows and differences in sequencing error profiles require careful consideration when drawing equivalence conclusions. Adjusting the tone to focus on observed similarities and specific distinctions may improve accuracy and readability.
Overall, while the authors have implemented many suggested revisions, further enhancements in analytical depth and interpretative balance would elevate the rigor and clarity of the study. Below are suggested comments for improvement.
Major Comments
1. Bioinformatic pipeline and taxonomic assignment (Lines 261–271):
Relying on the top five hits and using the best match for taxonomic assignment in metabarcoding, especially in metagenomic analyses, can introduce inaccuracies. The authors are encouraged to perform lowest common ancestor taxonomic assignments and increase the number of blast hits (e.g., BLCA). This will help identify where discrepancies or ambiguities arise and improve the reliability of their taxonomic conclusions.
2. COI limitations and metagenomic analysis
The manuscript leans heavily on COI data despite the metagenomic approach's potential to explore broader taxonomic signals. The authors might consider mining for additional mitochondrial or nuclear markers within the metagenomic dataset. This could provide a more holistic view of biodiversity, especially in identifying the 22 species not detected using DNA-based methods.
3. Comparability between Illumina and Nanopore platforms:
For the section on the comparability between Illumina and Nanopore platforms, I suggest the authors revise the language to focus on the specific similarities and differences between the two methods rather than asserting an overall comparability. The text should highlight that, while both platforms can produce relevant biodiversity data, their bioinformatics pipelines, sequencing error rates, and data outputs have unique characteristics that may impact certain aspects of species detection and abundance estimations.
For instance, the manuscript could benefit from noting that Illumina generally provides high-accuracy, short-read data, which is effective for taxa identification at a fine scale. In contrast, Nanopore offers the advantage of real-time, long-read sequencing, albeit with a typically higher raw error rate that necessitates more intensive error-correction processes in downstream analyses.
By focusing on these platform-specific attributes, the manuscript can provide readers with a nuanced perspective on how each technology contributes to biodiversity assessments, rather than suggesting that they are interchangeable.
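To illustrate the lowest common ancestor (LCA) assignment suggested in Major Comment 1, a minimal sketch follows. It is a plain LCA over the best-scoring hits, not BLCA (which is a Bayesian method) and not the authors' pipeline. The hit structure, the `top_fraction` threshold, the bitscores and the "sp. A/B/C" names are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical hit structure: (bitscore, lineage ordered from highest to lowest rank).
Hit = Tuple[float, Sequence[str]]


def lowest_common_ancestor(hits: List[Hit], top_fraction: float = 0.02) -> List[str]:
    """Return the lineage shared by all hits scoring within `top_fraction`
    of the best bitscore (an assumed, adjustable threshold)."""
    if not hits:
        return []

    best = max(score for score, _ in hits)
    kept = [lineage for score, lineage in hits if score >= best * (1 - top_fraction)]

    lca: List[str] = []
    # Walk rank by rank and stop at the first rank where the kept hits disagree.
    for names_at_rank in zip(*kept):
        if len(set(names_at_rank)) != 1:
            break
        lca.append(names_at_rank[0])
    return lca


# Invented scores and placeholder species names, for illustration only.
hits = [
    (500.0, ["Annelida", "Polychaeta", "Nephtyidae", "Nephtys", "Nephtys sp. A"]),
    (498.0, ["Annelida", "Polychaeta", "Nephtyidae", "Nephtys", "Nephtys sp. B"]),
    (300.0, ["Annelida", "Polychaeta", "Spionidae", "Aonides", "Aonides sp. C"]),
]
assigned = lowest_common_ancestor(hits)
```

In this example the two near-equal top hits disagree at species level, so the read is assigned to the genus rather than to whichever species happens to be listed first. That is the kind of ambiguity a best-hit rule would hide.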
Minor comments
Methods
Line 185: 57°C is high for the annealing for this primer set. Is this correct?
Line 284: typo invertabrates – invertebrates
Line 335-336: refer to taxonomic assignment rather than sequence overlap, unless the Nanopore and Illumina sequences were merged to generate the taxonomic assignments, in which case the term would be appropriate.
Line 340: mention the read numbers alongside the percentages.
Results
Lines 328-332: mention the number of species assigned with the reads for both databases.
Discussion
Lines 430-431: mention amplicon-based sequencing and shotgun sequencing.
Line 437: mention 22/56 morphological species identified. It shows that more than half of the species were identified.
Line 479-480: I would recommend that the authors avoid the word “newest”, as flow cells are improved and changed frequently. I suggest referring to the version of the flow cell and mentioning which specific improvements were made to quality.
Lines 505-510: this interpretation is unsubstantiated. It is unclear what the authors are referring to in terms of contamination, how they are making comparisons or how they qualified low diversity samples (based on what?). Authors will need to re-evaluate this as the metagenomics method will show a broad taxonomic range from prokaryotes to eukaryotes.
Lines 568-572: how did the 22 species not found with the DNA-based methods compare to those that were found? For example, for specimens found only once that were not identified using DNA, are there examples among the other 34 species that were found only once but were identified using DNA-based methods?
Line 580-582: authors should simply check for primer mismatches if the reference is available.
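The primer mismatch check suggested for Lines 580-582 can be sketched as follows. This is an assumption-laden illustration, not the authors' method: the function names are hypothetical, the primer is the degenerate sequence quoted elsewhere in this review history, and only the forward strand is scanned. A reverse primer would first need to be reverse-complemented.

```python
from typing import Tuple

# Standard IUPAC nucleotide codes and the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Degenerate primer sequence quoted in the review comments.
PRIMER = "GRTTYTTYGGHCAYCCHGARGTHTA"


def count_mismatches(primer: str, site: str) -> int:
    """Count positions where the reference base is not allowed by the primer code.

    Ambiguous bases in the reference (e.g. N) are counted as mismatches.
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return sum(
        base not in IUPAC[code] for code, base in zip(primer.upper(), site.upper())
    )


def best_binding_site(primer: str, reference: str) -> Tuple[int, int]:
    """Slide the primer along the reference (forward strand only) and return
    (offset, mismatches) for the first window with the fewest mismatches."""
    n = len(primer)
    if len(reference) < n:
        raise ValueError("reference is shorter than the primer")
    windows = (
        (i, count_mismatches(primer, reference[i:i + n]))
        for i in range(len(reference) - n + 1)
    )
    return min(windows, key=lambda pair: pair[1])
```

Running `best_binding_site(PRIMER, reference_sequence)` on a COI reference for each undetected species would show how many mismatches the primer has at its best binding site. Mismatches near the 3' end are usually the most relevant for amplification failure, which this simple count does not weight.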
see above
see above
no comment
no comment
no comment
The article has been significantly improved from the previous version. It is now a truly enjoyable piece of robust science of the very best quality, and a pleasure to read. The presented results and conclusions will be very helpful to the growing community of scientists interested in marine molecular assessment. I strongly recommend acceptance for publication in PeerJ.
Congratulations to the authors.
Owen S. Wangensteen
Dear Authors, we received the comments of three referees. All of them agree that the paper is of interest for many ecological researchers who are currently wondering which molecular method is best for biodiversity assessment. Moreover, they provide useful suggestions and highlight several pitfalls that need to be addressed. As suggested by Referee 1, I agree that the manuscript requires substantial edits for clarity and completeness, particularly in detailing the methods, improving the reference database, and justifying the metagenomic data analysis. In my opinion, the Introduction also needs further modification for clarity, and it contains numerous typos and inconsistencies that need correction. Below are some comments to be addressed in the revision. Finally, please double-check the references in the text, because some are not well formatted.
Line 30: “imminent” I don’t think it is the correct word here.
Line 31: eliminate “bulk”
Line 33: the “diversity” not biodiversity
Line 47: “standardized monitoring method” is present twice.
Line 54: you can delete “North Sea”. The statement is valid for every basin.
Line 92 and following: be consistent all across the manuscript when you write “Nanopore sequencing”, “Oxford Nanopore sequencing”
Line 99: why do you specify “several bacteria taxa” since the paper is dealing with macrobenthos diversity?
Line 109: genetic diversity not biodiversity
Lines 214-125: you need a reference for that or you have to specify why exactly 200 K reads
Line 349: Crepidula fornicata
Lines 354-356: the results do not appear to be as stated, judging from the figure.
Line 368: what do you mean?
Lines 460-462: this sentence is not clear
Lines 480-481: rephrase
Based on these comments my decision is Major Revision.
First, I must apologize as I submitted my review late. I hope the authors find my comments useful.
Doorenspleet, Aikatarini Mailli, and colleagues examine the performance of next-generation sequencing methods against a morphological assessment using macrobenthos samples representing high, medium, and low diversity collected in the North Sea. They used amplicon-based assays targeting the mitochondrial COI for Illumina MiSeq and Nanopore sequencing, and further this effort with metagenomic sequencing of extracted DNA using the Illumina NovaSeq. They found that data generated by the MiSeq and Nanopore were generally comparable, but this differed from the metagenomic analyses which showed lower diversity and recovery of known species in comparison to the morphological assessment. They conclude that Nanopore amplicon sequencing for metabarcoding diversity analysis is a viable option, which is often cheaper and provides real-time sequencing compared to the Illumina MiSeq. They mention that the metagenomic approach requires more work in terms of generating better reference databases and is not justified in terms of cost and sequencing output.
I applaud the authors for their vision in providing readers with comparisons of alternative biodiversity assessment methods, from standard methods such as the MiSeq to portable real-time amplicon sequencing and metagenomic comparison, which is a less available option to many. The treatment of amplicon data through Nanopore and MiSeq analysis is interesting and showcases that Nanopore can be readily achievable even for those without access to other NGS platforms. That said, I found the manuscript rushed and far from polished with quite a lot of editing needed in terms of typos and errors scattered throughout the manuscript. Far beyond that, I found the treatment of data and analysis not justified based on the usage of the reference database and restricting the metagenomic analysis. I point out suggestions for improvement:
1. Reference Database Limitations: The use of the bespoke North Sea reference database across all sequencing methods is too limiting and introduces bias. Authors should cast the taxonomic net wider. This is evident by the 25 species identified morphologically that were not recognized using the DNA-based methods. It is also unclear in the paper which 25 species these are. Did the authors attempt to include these barcodes in the database with Sanger sequencing? By including a more substantial dataset, authors could at least identify the ASVs/sequences to family, order, genera, etc. This would provide a greater assessment of the community structure rather than returning unclassified sequences.
2. Weakness of Metagenomic Data: The metagenomic data is weak and unjustified in any practical sense. The authors need to provide more information on the proportion of reads assigned to mtDNA and nuclear DNA. This will provide an understanding of the distribution of reads and how the sequencing of the COI performed. Aside from the minimum request above, the authors should re-attempt a full assessment of metazoan reads and provide taxonomic profiles. Carrying out a like-for-like comparison based on the 1992 COI sequences is inherently flawed as it doesn’t make any practical sense to go in this direction for any biodiversity assessment.
3. The methods require more clarity: The authors need to avoid referring to the COI fragment as the Leray fragment, as this is unclear and may confuse readers. It is essential to specify which primers were used in the study. The popular COI primers used for metabarcoding studies are mlCOIintF (Leray et al., 2013) and jgHCO2198 (Geller et al., 2013). The authors should clarify this basic information explicitly in the methods section. It appears that the authors used a primer with the sequence GRTTYTTYGGHCAYCCHGARGTHTA, but it is unclear where this primer sequence originates. Providing detailed information about the primers used, including their sequences and sources, will be helpful for readers and avoid the need to consult other papers for this information.
Specific Comments and Suggestions:
Abstract
• Lines 34-36: Focus more on the method, e.g., amplicon sequencing using Illumina MiSeq and portable real-time sequencing of Nanopore versus shotgun sequencing using NovaSeq.
• Line 36: Remove the terminology "Leray region", it’s the mitochondrial COI metabarcoding fragment and typically amplifies a 313bp region of the COI (as stated in Figure 1), not 303bp.
• Lines 46-47: Duplicate phrasing.
Introduction
• Lines 106-108: Reference needed for this statement.
• Line 155: Difficult to understand Table S2 and how it relates to the three institutes.
• Line 156: Do the authors know how the DNA was extracted by Naturalis Biodiversity Centre? This would inform readers on how to handle such sample types.
Methods
• Line 153: How many grams were homogenized on average per sample?
• Line 169: Still unclear what primer combination was used. Detail primer sequences in text and expected amplicon length.
• Line 181: Describe any modifications of the primer sequences, if any, used for bridge amplification of the indexes.
• Line 185: Detail which MiSeq reagents (v2, v3, etc.) and the percentage of PhiX. Change 2*300bp to 2x300bp.
• Line 193: References for chimera detection software.
• Line 204: Include the fmol concentration range for the amplicon input.
• Line 216: Add “Nanopore” sequence read processing.
• Line 218: Bracket only the year for Doorenspleet et al., 2023.
• Line 225-226: Where does this primer come from: GRTTYTTYGGHCAYCCHGARGTHTA?
• Lines 252-254: Detail the TapeStation reagents used and sequencing length, e.g., 2x100/150/250 bp.
Results
• Line 284: What is the reason for the low number of reads? MiSeq v2 generally ranges between 13-15 million reads and v3 between 22-25 million reads.
• Line 382: Missing a closed bracket.
• Lines 369-372: Figures Fig. 5a and Fig. S4 do not show this data described in the results.
• Lines 383-384: Figures do not reflect the collection analysis of the metagenomics method.
Discussion
• Lines 405-408: Can the authors elaborate on the previous work that showed Nanopore misses certain taxonomic groups? What is the context of this statement with regard to markers used, library preparation kits, and comparisons made? This is a misleading statement without context.
• Line 408: Remove open bracket.
• Line 413: What does 9 vs 2 respectively refer to? Fig. S5 is unclear in this context.
• Lines 415-417: Were the amplicons duplicated for MiSeq and Nanopore – how did the primer design differ between amplifications?
• Lines 421-422: Reference needed to qualify this statement.
• Lines 428-429: Personal communication with whom?
• Line 466: COI is not an rRNA gene.
• Lines 471-472: Without understanding the proportion of COI reads in comparison to other mtDNA genes and nuclear genes, this statement is vague. The proportion of COI reads can differ, and it may not be necessarily correct to advise that more sequencing depth is required if getting 200M reads per sample.
• Lines 480-481: Statement is not clear - requires revision.
• Lines 488-490: This statement is too optimistic. It reads as a throwaway statement that attempts to qualify the revolution of using metagenomics, but the percentage of usable reads, cost, and infancy of work (as the authors state) to generate mitogenomes is beyond practical applications.
• Line 494: What are the counts of these 25 species? Did they make up a substantial proportion of the identified species?
Figures
• Figure 1: Impossible to figure out where the samples are collected based on that map and caption – needs revision. It may be more appealing to show the images of the sequencing platforms rather than names of the instruments used.
• Figure 4: Difficult to read the plot in my version of the manuscript.
no comment
no comment
The manuscript is fine. See additional comments for minor suggestions.
Everything is rigorous. See additional comments for minor corrections.
The manuscript explains the relevance of the findings pretty well.
This manuscript by Doorenspleet et al. is a timely comparison among three molecular methods and the morphological approach for molecular assessment of marine benthic metazoan biodiversity. Their methods are sound and their results are clear and conclusive, and they will be of interest for many ecological researchers that are currently wondering which molecular method is best for biodiversity assessment of marine eukaryotic communities, and possibly other ecosystems, since their results are probably translatable to other eukaryotic communities such as freshwater or terrestrial soils.
The manuscript is, in general, well written, and provided that some minor corrections are addressed, which I detail below, I think that it can be published in PeerJ.
Minor corrections:
-Abstract:
L36: Correct the length of the fragment to "The 313 bp COI Leray region"
L47: Remove duplicated words: "standardized monitoring method."
-Introduction:
L87 and L89: Correct "Illumina MiSeq" to "Illumina technologies". I do not think that Illumina MiSeq is currently the standard platform, since many laboratories have moved to Illumina NovaSeq and are no longer using the MiSeq. Suggested wording: "Illumina technologies are currently the standard platform" and "In comparison to Illumina technologies, the Oxford Nanopore…"
L116: Correct to: "paired-end Illumina MiSeq metabarcoding"
-Materials and methods
L137: What was the full volume (size) of the Van Veen grab? Please specify.
L144: Anthozoa is currently considered a subphylum (McFadden et al., 2022), with two classes: Hexacorallia and Octocorallia. If anthozoans were not further classified in this study, then you should correct the word "class" to "subphylum". If they were classified either as Hexacorallia or Octocorallia, then it is fine to keep the rank "class", even though it can be a little misleading in this context. [McFadden, C. S., van Ofwegen, L. P., & Quattrini, A. M. (2022). Revisionary systematics of Octocorallia (Cnidaria: Anthozoa) guided by phylogenomics. Bulletin of the Society of Systematic Biologists, 1(3), 1–79.]
L162: Correct the details of the incubation instructions: "were incubated overnight at 56 °C in the power-beads tube of the DNeasy PowerSoil Kit (QIAGEN), supplemented with 10 µL of proteinase K (20 mg/ml)."
L169: With the current description, it is ambiguous and difficult to know which primers were used for the amplification. I can see from Van den Bulcke et al. (2021) that the original Leray primer set was not used, but a modified version, produced by replacing the deoxy-inosines (I) with fully degenerate bases (N). These are not the primers from Leray et al. and you should state this clearly here, especially since you used the KAPA HiFi Taq polymerase, which does not work properly with deoxy-inosines. So, please rewrite these sentences.
L268: Change "normalized" to "transformed" in "Therefore, the data were not rarefied but transformed using a log10 transformation".
-Results:
L290: "3,060,417,120 data" is ambiguous. Change to "shotgun reads"
L349: Correct the name: "Crepidula fornicata"
L354: Correct the name: "Nephtys cirrosa"
L369: Delete the first "the" from "the both the"
L375: Correct the name "Scolelepis bonnieri"
-Discussion:
L466: Remove "rRNA" from "contained only COI sequences" or explain it more clearly, in the case that the database contained sequences from COI and other rRNA markers.
L486: Correct "the mitogenomes of mock community were available"
L522: Correct the name: "the Anthozoa Cylista"
-Data Availability:
L567-569: Improve the syntax of the data availability statement. Remove the first "are available", remove the s from "Custom"
Raw data will be shared but cannot be accessed at the moment. The site that will host it offers raw data plus necessary metadata so it should be possible to replicate the findings.
There are a few aspects that need further clarification. If the paper aims to compare different DNA methods amongst themselves and with morphological identification, should the authors have tried to give each method the possibility to perform at its best? One aspect is database completeness. If the main factor explaining the differences in performance is lack of matches, then are the findings only valid until the next dump of North Atlantic sequences on GenBank? Should the authors have generated the reference sequences from the specimens they found with morphological sorting? Otherwise, should the comparison between MiSeq and Nanopore also be performed at the OTU level? In terms of statistical analysis, the authors performed a PERMANOVA analysis (amongst others) with platform (MiSeq, Nanopore and NovaSeq metagenomics) and Site as the factors. This might not be the right analysis; at least, it does not seem right to have technical and biological sources of variation at the same level. Should it be a nested analysis, in which we test the ability of each technology to cluster together within the same site?
The script shared in the GitHub repository is not well annotated, so it is hard for me to know which input data it uses, and how. It would be nice to have a script with all the analyses performed.
As they are now, the findings of the metagenomics aspect are not convincing: the authors use only the COI slice of the metagenomics experiment, arguing a large rate of false positives. Is this a problem with the reference database? If it is, could the authors have expanded their search to other curated databases? A more troubling explanation would be contamination levels that don't show in your curated database because it is locus- and regionally restricted.
The MiSeq data was annotated as well using a regionally constricted database,
There are more comments and a few typos noted in the attached revised Word document.
Besides that, I would like to know if the consensus sequences obtained from the Nanopore clusters generated by decona were the same as those that mapped to the same taxa in MiSeq. Also, are OTUs shared between both technologies, even though they did not match any sequence from your database?
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.