Thanks for answering all my comments.
# PeerJ Staff Note - this decision was reviewed and approved by Vladimir Uversky, a PeerJ Section Editor covering this Section #
Dear Drs Ankur and Green,
As both reviewers have accepted your revision, I am almost ready to recommend publication. I have some last modifications, in large part to improve readability.
I also noticed something that both reviewers missed, and that you might want to check to make sure your manuscript does not have a major error:
There is something strange. The largest bar is Stramenopiles, which are eukaryotes; perhaps those are chloroplasts. To be perfectly fair, I would say you should have removed those from the analysis, since we very rarely include them in analyses of prokaryotic diversity. Perhaps you should try a subset of your analysis with them removed to show that you reach the same conclusions, and mention it in the text.
Please just upload a version with the editorial modifications (or without them if you don't agree, in which case I would appreciate a comment). I believe there is an auto-generated deadline, but those changes should be easy to make.
They have suitably responded to my review.
They have suitably responded to my review.
They have suitably responded to my review.
The authors have suitably responded to my original review.
I agree with reviewer 1 that rarefaction or normalisation of the data is necessary prior to the alpha or beta diversity measures, and that is the reason I consider that you need major revisions. Both reviewers consider that the conclusion that the differences in Tms of degenerate primers do not appear to greatly influence microbial community analysis is important to be shared. I ask you to answer all reviewers' comments in a revised version of the MS.
Raw data are present and available in the SRA as specified.
See below.
See below.
The authors hypothesized that variable melting temperatures within a pool of degenerate primers could lead to unnoticed PCR biases. To test this, they adjusted the melting temperature of each possible primer sequence from a degenerate pool by truncating the end of the primer sequence. The authors also suggested that their primer design could be used in real sequencing studies to increase base-pair diversity, by introducing offsets into the 16S rRNA gene amplicons.
I appreciated the main premise of the paper, as I’ve thought a lot about 16S rRNA gene primer biases but never really considered the possible effects of melting temperature from the highly degenerate primers currently in use. It’s reassuring, if not terribly exciting, that the authors did not find a substantial effect, and it would be beneficial for the field to know that this isn’t a cause for major concern.
I’m not personally sold on the utility or appropriateness of the truncated primer set for actual sequencing studies – if I were going to go that route, I’d opt for the Lundberg et al. approach of adding nucleotides to the linker region that the authors discuss. I don’t feel that this manuscript sufficiently tests the effects of the truncated primers on environmental samples to put this forth as a viable option (compare with the level of verification needed by Apprill et al. and Parada et al. for context), and I’d encourage the authors to downplay or outright drop this aspect of the manuscript.
There are a number of aspects of this manuscript that would benefit from clarification, specified below.
One major concern: are your samples rarefied or randomly subsampled to an equal number of sequences? If not, the results need to be reanalyzed with that correction applied. Varying sequence numbers between samples are common with Illumina sequencing but confounding. Further, as far as I can tell, the authors might be working with absolute sequence abundance (based on Tables S6 and S7), which is highly problematic if the authors haven't rarefied their sequences. Essentially every environmental microbiology paper in the last decade has worked with relative abundance of OTUs because we don't trust the sequencers to be absolutely quantitative.
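For reference, the correction the reviewer is asking for amounts to subsampling each sample's OTU counts without replacement down to a common depth. A minimal sketch, using made-up illustration counts (not data from the manuscript) and NumPy only:

```python
# Hypothetical sketch of rarefying an OTU count vector to an even depth
# (random subsampling without replacement). The counts below are
# illustration data, not values from the manuscript.
import numpy as np

def rarefy(counts, depth, seed=None):
    """Subsample a vector of OTU counts to `depth` reads without replacement."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    # Build a read-level pool: one entry per read, labeled by its OTU index
    pool = np.repeat(np.arange(counts.size), counts)
    picked = rng.choice(pool, size=depth, replace=False)
    return np.bincount(picked, minlength=counts.size)

sample = [500, 300, 150, 50]            # 1000 reads across 4 OTUs
rarefied = rarefy(sample, depth=400, seed=0)
print(rarefied, rarefied.sum())          # always totals exactly 400 reads
```

Applying the same depth to every sample before computing alpha or beta diversity removes the library-size confound the reviewer describes; dedicated implementations exist in QIIME 2 and scikit-bio.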
Specific comments
Lines 27-28: As noted above, I would discourage this line of discussion in general. But if you want to keep it, rephrase this bit to something like “with minimal addition of exogenous templates.” I read this and was immediately confused about how you’d manage cluster generation, error assessment, etc., without the PhiX spike that the sequencer uses to calculate these – realized in the manuscript proper you’re suggesting less rather than no PhiX, but this reads like the latter. Also Line 70.
Line 75: I don’t feel Caporaso et al. is a useful citation here, as they used the older rather than newer primers, and you didn’t use the EMP barcoding scheme.
Line 120: Please clarify here how the libraries were divided between the two sequencers, because this seems like a pretty significant confounding factor for a study that's this focused on the nitty-gritty of sequence analysis. I know it's in one of the SI tables, but you can head off a lot of concern by providing more detail here.
Line 172: Was there a target range of Tm similarity (e.g., within 2 degrees)? Was this the most you could adjust the Tm while meeting the requirements outlined in Lines 164-165 (in which case, help yourselves out and specify that)? Otherwise it doesn’t seem like a very large adjustment in overall range.
Lines 218-220: I don’t feel this is a major caveat – many of the users of the EMP primers purchase individually barcoded primers anyway, as this is the approach the EMP recommends. Your method will still be less expensive than that.
Line 237: To me, the fold-changes reported in Table S6 are at least as interesting as the MDS plots in Figure 3, and I’d like to see them in the manuscript proper. A column chart of the fold-change in relative abundance for select OTUs from standard EMP to shortEMP would be a nice, simple addition.
Line 247: I care less about the within-primer set repeatability and would rather see the Bray-Curtis similarity between primer sets reported.
Lines 241-261: This section is confusing. It sounds as though you selected an annealing temp and moved on, but then in Line 253 you're back to talking about the effects of annealing temp. Can you better separate the results of the annealing temp test series and the tests conducted at the optimal annealing temp in this part of the results?
Lines 260-261: I don't buy that; community composition is vastly more important than alpha diversity when it comes to assessing complex microbial communities. If you just want to say that it made sense to continue with your study, fine.
Line 271: What advantage does the ideal score or expect score (be consistent with the terminology) offer over a fold-change or percent difference value that the reader could understand without having to look up a reference to try to figure out the math?
Line 313: And 3’. (Which is potentially more problematic, don’t omit mentioning it.)
Line 318-320: This is a nice hypothesis. This manuscript doesn’t test it. You don’t talk at all about, for example, whether the shortEMP primers pick up novel taxa relative to the EMP primers in the environmental samples.
Line 330: Lake Michigan isn’t a marine system.
Figure 1a: Label 5’ and 3’ ends to make it easier to understand.
Table 1: You should specify somewhere what the ANOVAs are testing (I can guess but can’t verify if I’m guessing correctly).
Figure S1: This is problematic. As I understand it from the legend, the significance test is for within-group variability, but the way it’s presented on the figure looks like it’s testing the median Bray-Curtis Dissimilarity between the two primer sets and finding them to be highly significantly different. Please revise.
Tables S6 and S7: Means of what? I can’t find this in the text or the table.
Please see review below.
Please see review below.
Please see review below.
The manuscript by Naqib et al. aims to characterize the influence of differential annealing temperatures of common primers in 16S rRNA gene studies. The research question, while very technical in nature, is an important one: the applicability, if significant, is extremely far-reaching, as this is such a common tool of microbiology and microbial ecology. The paper argues that an overlooked aspect of such studies is the variability of the annealing temperature required for the different variants of the degenerate PCR primers. The researchers experimentally examine the importance of this variability by trimming the primers of, theoretically, mostly inconsequential bases (in terms of which taxa are targeted). Additionally, they present an idea for how to increase the sequence base-calling diversity to improve sequence quality in important positions.
Ultimately the annealing temperature analysis suggests that these modifications did not make a substantial difference in the sequencing output. Therefore, I think that the results will be interesting but not fully adopted, since many labs will not modify their approach for a small gain; it is also ultimately ambiguous which approach is "better", as suggested by the abstract. In fact, the results of a mock community analysis show that the modifications perhaps made the PCRs less accurate. I find all of the approaches to be of satisfactory technical rigor, but I have some suggestions.
1.) The authors should consider how the predicted annealing temperatures influence which taxa are over- or under-represented in the mock communities relative to the input. E.g., are those with lower annealing temperatures amplified more, or vice versa, or is there no relationship? Such analyses would get at the mechanism and allow the highly technical study to be applied and understood more broadly.
2.) The authors should provide data that shows that PhiX is actually needed to increase diversity on state-of-the-art MiSeq/HiSeq runs (i.e., 2018 software, hardware, chemistry). Is a full 20% PhiX still commonly used? I thought this problem was not currently as severe as the authors state, and they do not provide a “control” to show that the sequencing quality is poor when the steps they take are not implemented and/or PhiX is not used.
3.) It appears that all the annealing temperature calculations are for just the 16S primers, but in fact the initial amplifications are done with Illumina sequencing primers attached to the 16S-specific primer. How do the common sequences that are added to the 16S primers change their annealing temperatures? It seems that they would influence them strongly, but I don't think this is mentioned. It might be argued that the annealing temperature of the 3' end of the primers (i.e., the 16S-specific portion) is what matters for amplification. With that logic, the 3' end of the 16S-specific primer is also probably more relevant than the 5' end of the 16S-specific primer. How does that vary between the primers? Can a moving average of the annealing temperature of the primers be calculated? If so, it might help understand the mechanisms for the differences that are observed.
4.) It would be great if the authors double checked the first time the respective primers were used in the literature. For 515F, this should be cited: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-12-38. And for historical purposes for the 515F, might also consider citing https://www.pnas.org/content/pnas/82/20/6955.full.pdf and making a historical note (use of an only slightly different 515F in the mid-1980s).