All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
Thank you for carefully addressing the reviewers' comments.
[# PeerJ Staff Note - this decision was reviewed and approved by Valeria Souza, a PeerJ Section Editor covering this Section #]
Authors have addressed my comments.
No comments.
No comments.
No comments.
Nothing further to add
Nothing further to add
Nothing further to add
Thank you for your efforts in trying to address the many comments that were previously provided by the Reviewers. They have all commented that your revised version is markedly improved compared to your initial submission. Nonetheless, the Reviewers have highlighted a few lingering issues with your data analyses and interpretation that need to be addressed.
-
-
-
Authors have addressed my previous concerns and the manuscript is improved greatly. I have one final comment on the revised manuscript:
lines 255-259: Authors are over-interpreting the increasing trend in the variance explained by principal coordinate 1. This increase only reflects the ability of the PCoA graph to display more of the variation in the data; i.e. the data are better represented by a 2D graph. It does not necessarily say anything about the separation between the groups. The separation between the groups can be measured by the PERMANOVA (adonis test) R-squared (variance explained by the grouping) or other similar tests. Note that the adonis test is not able to account for repeated sampling.
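To illustrate the distinction, the following is a minimal sketch of how a PERMANOVA-style R-squared can be computed from a distance matrix as the between-group sum of squares divided by the total sum of squares. The function name, the toy positions and the group labels are assumptions made for this example, not data from the manuscript. The sketch omits the permutation test and, like adonis, does not account for repeated sampling.

```python
from itertools import combinations

def permanova_r2(dist, groups):
    """Return SS_between / SS_total for a square distance matrix."""
    n = len(groups)
    # Total sum of squares: squared pairwise distances divided by N.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares: same formula applied per group.
    ss_within = 0.0
    for g in set(groups):
        idx = [i for i, label in enumerate(groups) if label == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    return (ss_total - ss_within) / ss_total

# Hypothetical toy data: four samples on a line, two per group.
positions = [0.0, 1.0, 10.0, 11.0]
groups = ["case", "case", "control", "control"]
dist = [[abs(a - b) for b in positions] for a in positions]
r2 = permanova_r2(dist, groups)
print(round(r2, 4))
```

In this toy case the grouping explains almost all of the variation. The value depends on the group labels, whereas the variance explained by a PCoA axis does not use the labels at all.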
I believe this is an impressive level of re-do on the resubmission. I have no further issues with this article.
no comment
no comment
Congratulations on the vast improvement of this paper.
The English is much improved.
I am confused by the new analysis where the authors have divided the sampling span into seven time intervals. It is not clear whether repeated measures (i.e., multiple samples from the same individual) appear in this analysis.
It is not clear whether the LME analysis was adjusted for multiple comparisons.
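As an illustration of what an adjustment for multiple comparisons could look like, the sketch below applies the Benjamini-Hochberg procedure to a list of P values. The choice of Benjamini-Hochberg is an assumption for this example (the review does not name a method), and the P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw P values, e.g. one per time interval tested.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
print([round(p, 4) for p in adj])
```

With these hypothetical inputs, two raw P values that sit below 0.05 end up above 0.05 after adjustment, which is why the reporting of any correction matters.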
no comment
The issue of repeated measures appears to be outstanding and will skew/inflate the P values.
Figs 2 and 3 compare changes over time within each group, but not between groups. Therefore, I am unable to see how the authors can make claims about NEC and LOS showing differences to controls when this direct comparison was not performed. This is partially addressed in Fig 4, and it is interesting that the Shannon diversity is higher in controls in early disease, but one wonders if this simply reflects the use of antibiotics in the disease groups (which are known to reduce diversity).
My earlier point about adding centroids to the PCoA plots has not been addressed. Additionally, the authors should add the P values to these plots and, in line with the above comments, need to ensure they do not have repeated measures in these plots.
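For reference, a group centroid on an ordination plot is simply the mean of the sample coordinates in each group. A minimal sketch follows; the function name and the coordinates are hypothetical and are not taken from the manuscript.

```python
from collections import defaultdict

def group_centroids(coords, groups):
    """Return the mean (PC1, PC2) position for each group label."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for (x, y), g in zip(coords, groups):
        s = sums[g]
        s[0] += x
        s[1] += y
        s[2] += 1
    return {g: (sx / n, sy / n) for g, (sx, sy, n) in sums.items()}

# Hypothetical PCoA coordinates for four samples.
coords = [(0.0, 0.0), (2.0, 2.0), (4.0, 0.0), (6.0, 2.0)]
labels = ["NEC", "NEC", "control", "control"]
centroids = group_centroids(coords, labels)
print(centroids)
```

Plotting these points (for example with a larger marker per group) makes any shift between groups easier to see than the raw scatter alone.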
The authors have not stated which rarefaction level they used - this needs to be added to the main text.
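To make clear what the rarefaction level refers to, here is a minimal sketch of rarefying one sample: reads are subsampled without replacement to a fixed depth, and samples with fewer reads than that depth are dropped. The function name, the taxon names, the counts and the depth of 40 are all hypothetical.

```python
import random

def rarefy(counts, depth, seed=0):
    """Subsample a {taxon: reads} dict to `depth` reads without replacement.

    Returns None when the sample has fewer reads than `depth`.
    """
    if sum(counts.values()) < depth:
        return None
    # Expand counts into one entry per read, then draw `depth` of them.
    pool = [taxon for taxon, c in counts.items() for _ in range(c)]
    picked = random.Random(seed).sample(pool, depth)
    rarefied = {}
    for taxon in picked:
        rarefied[taxon] = rarefied.get(taxon, 0) + 1
    return rarefied

# Hypothetical sample with 100 reads, rarefied to a depth of 40.
sample = {"taxon_a": 50, "taxon_b": 30, "taxon_c": 20}
rarefied = rarefy(sample, 40)
print(rarefied)
```

The chosen depth determines both how many samples are retained and how much data is discarded per sample, so it needs to be reported alongside the diversity results.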
I wonder if Fig 6 could be improved by making it clear which genera were significant and whether they were higher in NEC, LOS, or controls.
The reviewers have raised major concerns about your study. While one reviewer reported that the manuscript should be rejected, in light of the other comments put forward, we are offering you the opportunity to address their many concerns.
Please note that if you intend to re-submit the manuscript, it will be necessary to carefully address the many issues raised, including data re-analyses. Further, the quality of the language in your manuscript is poor, and will require considerable re-writing, and major editorial revision by someone fluent in English.
Authors present a cohort of preterm babies followed by longitudinal stool sampling in the NICU. A small subset of babies developed NEC (n=4) or LOS (n=3), which are then compared to healthy controls. The presentation of the data, the bioinformatic and statistical methods used and the English language are currently inadequate for publishing the manuscript.
The introduction jumps between different topics, and many citations are old and appeared in low impact journals (e.g. Matamoros et al.). There is an abundance of recent work on the infant gut microbiome which should be cited to introduce readers to the field. The importance of the sentence on lines 97-100 is unclear. I suggest starting with a couple of sentences on the post-birth gut microbiome, briefly defining the conditions in focus (NEC and LOS), describing previous gut microbiome work on these diseases and leading up to the motivations and goals of the current study.
Lines 326-328: Authors are overselling their study.
Authors use an outdated method for processing 16S rRNA gene sequencing data. Independent comparisons using mock community data have shown that UPARSE is prone to false positive OTUs (see e.g. PMID: 27822515). Authors should process the 16S amplicon data using an up-to-date method such as DADA2 (PMID: 27214047) or Deblur (PMID: 28289731).
The Figure 2 legend is missing, and no information is given on the statistical test used even though p-values are reported on panel B. Panel B is also mostly unreadable since the relative abundance of most taxa is low and the bars are not scaled accordingly. In Methods, explain the statistical methods used and how they account for the longitudinal structure of the data.
Authors compare bacterial OTU counts between groups, but this comparison is not valid since the groups differ in size and no statistical assessment is provided. I suggest measuring bacterial richness (number of OTUs) or other measures of alpha-diversity for each stool sample and conducting statistical testing between groups. The longitudinal nature of the data (repeated measures from the same subjects) needs to be taken into account in selecting the statistical test. Similarly, comparing mean relative abundances between groups is not adequate since it doesn't account for repeated measures. I suggest using mixed effects linear models or something similar to conduct statistical testing for relative abundances.
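As a sketch of the per-sample measures suggested above, richness and the Shannon index can be computed from one sample's OTU counts as follows. The function names and the counts are hypothetical. The between-group test that accounts for repeated measures (e.g. a mixed effects model) is not shown here.

```python
import math

def richness(counts):
    """Number of OTUs observed (count > 0) in one sample."""
    return sum(1 for c in counts if c > 0)

def shannon(counts):
    """Shannon diversity index (natural log) for one sample."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Hypothetical OTU counts for two stool samples.
even_sample = [10, 10, 10, 10]
dominated_sample = [40, 0, 0, 0]
print(richness(even_sample), shannon(even_sample))
print(richness(dominated_sample), shannon(dominated_sample))
```

Because these values are computed per sample rather than per group, they are not affected by the groups containing different numbers of infants.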
BioProject PRJNA470548 was not found in NCBI SRA.
A small scale study of NEC and LOS in preterm infants, showing the difference in basic metagenomes between controls and NEC/LOS. Clear and unambiguous, professional English used throughout. Sufficient field background/context provided. Professional article structure, figs, tables. 
However, some novel data is first presented only in the discussion section.
Raw data (link to SRA) has been shared privately. However, a link to the SRA does not appear to be with the paper text. There needs to be a mention of the SRA ID in the text. Apologies if I missed it.
Very well done experimental design and research question. Methods sufficiently described. Sequencing depth (data generation) and analysis pipeline are appropriate to achieve the aims of the analysis.
However, results reveal that the design was underpowered for full 16S analysis across the two treatment and one control group. Thus, few significant results. Appropriately powered to generate a hypothesis, seems appropriate for a small journal such as PeerJ.
Largely negative/inconclusive results. Small (but significant) differences seen between case and control groups. Conclusions from the data are not overstated and are reasonable, given the results.
Since NEC average onset was 16 (within the 14-21d range) and LOS was 12 (within the 7-14d range), why does LOS not show a significant trend in the 7-14d range, in the same way that NEC shows significant differences in the 14-21d range?
Lines 285-287: Introducing new data in the discussion section. Why are these figures not reported in the results section?
Line 39: "dysbiotis" should be "dysbiosis"
Line 46: "rest17" should be "the rest (17)"
Line 67: delete "For" and start sentence with Preterm infants, who are prone...
Line 255: "less significant" -- the p value is > 0.05 so it is not a significant finding. Please correct the text.
Line 269: change "little" to "few"
Line 270: change "that" to "those"
Line 279: I believe you mean "7,472,400"
Line 317: "<Fig. S2" remove "<"
Figures:
S1a-c: Order of the legend is confusing. Please re-order the legend so that items in the legend from top to bottom are in order of days (i.e. 0-7 on top, and >28 on bottom)
Fig 2: Since PC1 and PC2 are such a high proportion, there is no need for S2b and S2c. S2a is sufficient to show the trend. Other figures are distracting.
The English is not acceptable (see general comments for some additional information)
No P values are provided in the abstract and so there is no way of knowing if the reported diversity of taxa %s are important.  The same largely applies for the main results.
Sometimes the authors report 3 NEC and 4 LOS, and other times they report 4 NEC and 3 LOS.
With only 24 infants and a total of 192 stool samples, this study falls well short of most studies published in this area over the past years.  As an aside, many of these recent larger studies are not cited in the manuscript.   Additionally, with only 4 NEC and 3 LOS infants, the study is greatly underpowered and fails to add substantially to the current literature. 
It is not clear which samples were from the diaper and which were from the perianal skin surface. Furthermore, the latter represents a unique and non-validated sample type. Is this representative of gut or stool? I also wonder about the ethics of collecting perianal samples using a spatula from extremely preterm infants.
With read lengths of ~500bp there is likely to be a high error rate in the data. Pat Schloss explains this perfectly in his excellent blog post - http://blog.mothur.org/2014/09/11/Why-such-a-large-distance-matrix/
How were the samples normalised? There is no mention of rarefaction or any other normalisation method.
No P values are reported for the alpha diversity or taxonomic comparisons, with the exception of the random P value on lines 234 and 235.  For the second P value the authors report “weakly significant”, but this P value was 0.11 and is therefore not significant.
The English is generally poor throughout making the manuscript challenging to read.  While I cannot go through the entire text line-by-line and improve the English, from the abstract alone here is a selection of some errors:
-	line 14 I think the authors mean “the remaining 17 were..”, not “the rest17”.
-	Sometimes abbreviations are used and other times terms are written in full, e.g., on line 50 "Late-onset sepsis" should be LOS as it has already been abbreviated; the same applies in the conclusion for both NEC and LOS.
-	Line 50-51 “…held the least diversified gut microbiota” is poor English, similarly line 52 “with the control group held the most diversified one” is also poor English.
-	Line 54 “Both two groups”
It is not clear why sometimes rRNA is used and other times rDNA is used.  The authors should be clear this is 16S rRNA gene sequencing and use this phrase throughout (e.g., in place of rDNA). 
It is not stated what organism was cultured in the third LOS infant.
Centroids should be added to Figure 4, otherwise it is impossible to see what is going on.
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.