Review History


All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.

Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.


Summary

  • The initial submission of this article was received on September 25th, 2024 and was peer-reviewed by 3 reviewers and the Academic Editor.
  • The Academic Editor made their initial decision on December 5th, 2024.
  • The first revision was submitted on February 13th, 2025 and was reviewed by 2 reviewers and the Academic Editor.
  • A further revision was submitted on March 19th, 2025 and was reviewed by the Academic Editor.
  • The article was Accepted by the Academic Editor on March 19th, 2025.

Version 0.3 (accepted)

· Mar 19, 2025 · Academic Editor

Accept

All remaining issues were adequately addressed, and the revised manuscript is now acceptable.

[# PeerJ Staff Note - this decision was reviewed and approved by Valeria Souza, a PeerJ Section Editor covering this Section #]

Version 0.2

· Mar 12, 2025 · Academic Editor

Minor Revisions

Please address the remaining concerns of reviewer #2 and revise the manuscript accordingly.

Reviewer 1 ·

Basic reporting

The authors did a great job of satisfactorily addressing each of my concerns. I have no additional comments to make.

Experimental design

The authors did a great job of satisfactorily addressing each of my concerns. I have no additional comments to make.

Validity of the findings

The authors did a great job of satisfactorily addressing each of my concerns. I have no additional comments to make.

Reviewer 2 ·

Basic reporting

1. Throughout this round of responses to the reviewer, the authors note that the ML models are supplements to GSMM approaches, particularly when dealing with complex communities where combinatorial explosion would be an issue (highlighted by Table 1). This was to address a reviewer misunderstanding of the intent of the work. While clarified in the rebuttal, this is not clarified in the text until lines 244 to 248. To prevent other readers from arriving at the same erroneous conclusion, a similar note should be included in the introduction.
2. In Figure 1, the linear regression measurements and their p-values are identical to those of the Pearson correlation and its corresponding p-values (see the sketch after this list). It appears that this column is unnecessary.
3. What is the statistical significance threshold used? P-values are often reported, but the threshold for significance is not easily identifiable.
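To illustrate the redundancy noted in point 2: for a single-predictor linear regression, the regression r value and its p-value coincide with Pearson's r and its p-value, so the two columns carry the same information. A minimal sketch with made-up numbers (not the manuscript's data):

```python
# Minimal illustration (synthetic data): for one predictor, the linear regression
# r-value and p-value are identical to Pearson's r and its p-value.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
predicted = rng.normal(size=30)                               # e.g. model-predicted butyrate
observed = 0.8 * predicted + rng.normal(scale=0.5, size=30)   # e.g. measured butyrate

pearson_r, pearson_p = stats.pearsonr(predicted, observed)
reg = stats.linregress(predicted, observed)

print(pearson_r, pearson_p)    # matches ...
print(reg.rvalue, reg.pvalue)  # ... the regression r-value and p-value
```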

Experimental design

Given the intent to supplement GSMM approaches because of the lower computational cost/time of the resulting ML model, a comparison of the computational time or resources required would be particularly valuable. This could be a strong addition to the argument for the purpose of the work, without significant revision to the paper or significantly more experiments.

Validity of the findings

The rebuttal and rewrite are satisfactory in this regard.

Additional comments

Overall, rewrites, where they occurred, contribute meaningfully to reader understanding.

Version 0.1 (original submission)

· Dec 5, 2024 · Academic Editor

Major Revisions

Please address the concerns of all reviewers and amend the manuscript accordingly.

Reviewer 1 ·

Basic reporting

1. The second paragraph in the introduction is misleading (lines 51-58). This paragraph seems to highlight the complexity of designing and engineering microbial consortia in vitro, while alluding to the possibility of using their computational method or GSMMs+CBM to design them computationally. I think this could be misleading because either their method or GSMMs+CBM can only be used to predict the metabolic fluxes of a species and infer possible interactions between pairs of species; neither method captures the community assembly process required for engineering microbial consortia. Therefore, I believe the authors should avoid emphasizing the possibility of engineering microbial consortia and community assemblies, and instead focus only on capturing metabolic fluxes and pairwise interactions.
2. Although I understand how the number of possible combinations was computed in Table 1, I worry that other readers might be confused. Please highlight in the Methods section that these numbers are derived by picking a fixed number of species out of the 19 selected bacteria (see the sketch below).
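For context on the combinatorics behind point 2 (the exact totals in Table 1 are the authors'; this sketch only shows how counts of that scale arise when picking a fixed number of species out of 19):

```python
# Illustrative only: number of consortia of size k drawn from a pool of 19 species.
from math import comb

for k in (2, 3, 13):
    print(k, comb(19, k))   # C(19, 2) = 171, C(19, 3) = 969, C(19, 13) = 27132

# Summing over all consortium sizes from 2 to 13 gives the scale of the search space.
print(sum(comb(19, k) for k in range(2, 14)))   # 507604
```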

Experimental design

1. On lines 110-112, the authors wrote: “consortium. Additionally, we added to this encoding the predicted probability of a cross-feeding for each pair of bacteria in the consortia using the methodology described by Silva-Andrade et al [40].” It is not clear to me what is meant by encoding the predicted probability of cross-feeding for each pair of bacteria. I would recommend adding more detailed descriptions to explain (1) how the probability is calculated, (2) how the probability is encoded, and (3) what the size of this probability vector is (a hypothetical sketch of one possible encoding appears after this list).
2. What is the mZMB medium? What is the nutrient composition? Why is this medium used in the simulation rather than the typical Western diet that is often used in flux balance analysis?
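As a point of reference for question 1, here is one hypothetical way such an encoding could look: a presence/absence vector over the 19 species concatenated with one slot per species pair holding the predicted cross-feeding probability. The layout and all names are assumptions for illustration only, not the authors' implementation.

```python
# Hypothetical feature encoding (NOT the authors' code): presence/absence of each of
# the 19 species plus one predicted cross-feeding probability per unordered pair.
from itertools import combinations
import numpy as np

N_SPECIES = 19

def encode_consortium(members, pair_probs):
    """members: set of species indices present in the consortium.
    pair_probs: dict mapping (i, j) with i < j to a cross-feeding probability."""
    presence = np.zeros(N_SPECIES)
    presence[list(members)] = 1.0

    # One slot per unordered species pair; zero when either species is absent.
    pair_vec = np.zeros(N_SPECIES * (N_SPECIES - 1) // 2)
    for slot, (i, j) in enumerate(combinations(range(N_SPECIES), 2)):
        if i in members and j in members:
            pair_vec[slot] = pair_probs.get((i, j), 0.0)

    return np.concatenate([presence, pair_vec])   # length 19 + 171 = 190

# Example: a two-member consortium with one predicted cross-feeding probability.
features = encode_consortium({0, 5}, {(0, 5): 0.87})
```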

Validity of the findings

1. It was not clear to me whether and how including the probability of cross-feeding helps predict the butyrate concentration. I suggest an ablation analysis, removing the probability of cross-feeding and comparing the resulting predictive performance with the presented results that include the interaction probability, so that its contribution can be determined (see the sketch below).
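A minimal sketch of the suggested ablation, using an arbitrary regressor and synthetic data in place of the authors' pipeline: train the same model with and without the pairwise-probability columns and compare predictive performance.

```python
# Hypothetical ablation sketch (synthetic data, arbitrary regressor): compare
# performance with and without the cross-feeding probability features.
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

def evaluate(X, y):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)
    r, _ = pearsonr(model.predict(X_te), y_te)
    return r

rng = np.random.default_rng(0)
X_full = rng.random((200, 190))     # 19 presence columns + 171 pairwise-probability columns
y = X_full[:, :19].sum(axis=1) + rng.normal(scale=0.1, size=200)   # toy target

r_full = evaluate(X_full, y)              # with cross-feeding probabilities
r_ablated = evaluate(X_full[:, :19], y)   # presence/absence only
print(r_full, r_ablated)                  # the gap indicates the features' contribution
```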

Additional comments

There are some typos and grammatical errors (as listed below). Please double-check.
1. Line 23: “Genome-scale network reconstructions enable computation” -> “Genome-scale network reconstructions enable the computation”
2. Line 50: “production, its a promising research” -> “production is a promising research”
3. Line 102: “was employed using as reference its closest” -> “was employed using to refer to its closest”
4. Line 201: “least expensive approach available” -> “the least expensive approach available”

Reviewer 2 ·

Basic reporting

On "clear and unambiguous, professional english used throughout":
1. Introduction: Acronyms used are somewhat non-standard. Examples include:
a. GENREs, usually this is discussed as Genome-Scale Models of Metabolism (abbreviated as GEM, GSM, or GSMM, depending on the source).
b. CBMs – Usually, these are called constraint-based reconstruction and analysis (COBRA) methods.
2. Overall: there is an occasional casual, non-scientific feel to pieces of the writing. For instance, calling the experimental/in vivo data the “real” data.

On "literature references, sufficient field background/context provided":
1. Introduction, lines 69 to 77: This appears to be a rather limited survey of microbial community modeling techniques, listing only MICOM, COMETS, and the Microbiome Modeling Toolbox, then critiquing the computational expense of using them. There are several relatively low-compute-cost methods, such as OptCom and SteadyCom. A recent review, “Advances in Genome-Scale Metabolic Modeling toward Microbial Community Analysis of the Human Microbiome”, published in ACS Synthetic Biology, provides a table of optimization tools for community modeling. The computational difficulty of predicting butyrate production from the members of a microbial community using GSMMs seems to be the crux of the need for an ML model to do the same. This point does not seem to be effectively demonstrated, calling into question the need for such an ML model.
2. Methods, lines 101 to 103: This is not the most recent AGORA database; a new AGORA publication, entitled “Genome-scale metabolic reconstructions of 7,302 human microorganisms for personalized medicine”, was released in 2023. This database is an order of magnitude larger than the previous one. It is recommended to check the model reconstructions against this new database.

Experimental design

On "original primary research within the aims and scope of the journal": yes

On "research question is well-defined, relevant and meaningful, it is stated how research fills an identified knowledge gap"
1. It appears no knowledge gap is filled per se. This research replaces one type of model (a GSMM) with a group of ML models.

On "rigorous investigations performed to a high technical and ethical standard":
1. Methods: line 113: The use of MICOM is confusing here. It is discussed as if it is a framework for simulating microbial community dynamics; however, in the reference, the sentence appears: “here, we introduce MICOM, a customizable metabolic model of the human gut microbiome”. Therefore, it seems that MICOM is a community model, rather than a community modeling technique. Further, the MICOM paper states “the described constraints are identical to the ones employed in SteadyCom. However, SteadyCom aims to predict microbial abundances from a list of taxa present in a sample, whereas MICOM predicts growth rates and fluxes and requires abundance as input”. Therefore, it seems more likely that the analysis here is performed with SteadyCom, because no mention is made of the necessary input to MICOM, which is taxa abundances. Could the authors please elaborate on how MICOM is used as a modeling technique, rather than as a model, and how the method employed is not SteadyCom? Further, please elaborate on how the taxa abundances were determined as an input to MICOM.
2. Figure 1 and lines 165 to 174: Figure 1 and lines 165 to 174 purport to discuss the same data, “correlation in butyrate production (mmol/l) of real vs predicted data” and “blind test using actual experimental data from consortia…”, respectively. However, the numbers reported are different. For instance, line 167 states that the XGBoost model has a Pearson correlation of 0.738 for two-bacterium communities, yet this number is 0.84 on the “testing” line in Figure 1. Similarly, for a consortium of 13 bacteria, a Pearson correlation of 0.422 is given on line 172 (XGBoost model), but a Pearson correlation of 1 is shown in Figure 1. It seems clear that the figure is likely mislabeled, because what is shown appears to be the correlation of ML model predictions with the training and holdout datasets, not “real” data. Table 3 should instead be labeled as the comparison of the models to the experimental data.
3. Quality of the training data: here, an ML model is trained on the results of more than 400,000 GSMM predictions. This inherently limits the quality of the ML models to that of the GSMMs and how they are implemented. Key missing factors include how relative microbial abundances are determined, how GAM and NGAM values are set, the carbon sources, what the limiting nutrient is, the relative community member abundances (as an input to MICOM), and the rates at which these limiting nutrients are allowed to be taken up by the models. The last point can be particularly crucial, as not all bacteria “eat” at the same speed. This calls into question the validity of the training data, which might also limit the models' predictiveness.
4. Structure of the ML models used: additionally, key aspects of the developed ML models are missing, including the number of hidden layers used.

On "methods described with sufficient detail & information to replicate": no, see some of the above comments.

Validity of the findings

Validity of the findings:
1. Inconsistencies in the results: In Figure 1, it seems to be argued that there is a perfect correlation between the MICOM-based training data and the predictions from the ML models (many Pearson correlations of 1, particularly in larger communities). However, if there is a perfect linear correlation between the MICOM-generated training data and many of the ML models, particularly for communities of 13 bacteria, then why is the linear relationship between the experimental data and MICOM not the same as that between the experimental data and the ML models?
2. Table 3: The potential problem with the training dataset could explain the poor behavior of the ML models. Indeed, there does not seem to be a very strong relationship between ML model behavior and experimental butyrate production, particularly for three-member communities. This is partially obscured in the text by highlighting MICOM (the GSMM-based model from which the training data came) as the most predictive for the three-member community, despite the fact that it is not one of the ML models, which are the desired product of this work. Indeed, according to Table 3, the best Pearson correlation for three-member community butyrate production prediction among the ML models is 0.015.
3. Despite only testing against three sets of data (two-, three-, and thirteen-member communities), there appears to be no consensus on which ML model performs best. This rather undercuts the idea of using an ML model rather than a community metabolic model if multiple ML models need to be consulted.
4. One of the key advantages of GSMMs is not just to predict yields, but to trace metabolic flow throughout a metabolic network. Is the increased computational efficiency for yield predictions alone meaningful?
5. Units: How is butyrate production in mmol/L being calculated from the GSMMs? General analysis techniques for these models produce results in mmol/gDW/h. How are the gDW and hour units converted into L? (A hypothetical worked conversion follows this list.)
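For reference, one plausible conversion requires a biomass density and a simulation (or batch) time, neither of which is reported; whether the authors applied such factors is exactly what this question asks.

```latex
% Hypothetical conversion, assuming a biomass density X (gDW/L) and a duration t (h):
\[
  C_{\mathrm{butyrate}}\,[\mathrm{mmol/L}]
  \;=\;
  v_{\mathrm{butyrate}}\,\Bigl[\tfrac{\mathrm{mmol}}{\mathrm{gDW}\cdot\mathrm{h}}\Bigr]
  \;\times\;
  X\,\Bigl[\tfrac{\mathrm{gDW}}{\mathrm{L}}\Bigr]
  \;\times\;
  t\,[\mathrm{h}]
\]
```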

Additional comments

Summary of the work:
All butyrate-producing community combinations from 2 to 13 community members of a 19-member pool were analyzed using two existing methods. The first method is from a previous work by the same authors, which predicts the interaction type between pairs of bacteria (reference 40), and the second method is the MICOM metabolic model of the human gut microbiome (reference 12). Similarly, most of the input training data was created using GSMMs reconstructed with the AuReMe tool (reference 1). Therefore, the innovation of this work is the generation of ML models of different kinds (ElasticNet, KNN, Random Forest, SVM, XGBoost) linking the membership of a microbial community to that community's butyrate production. These models were trained using the results of more than 400,000 GSMMs. It becomes clear when comparing against experimental data that no single ML model performs best, even though only three conditions are compared.

Summary of reviewer comments:
Overall, the manuscript, in its present form, is missing quite a bit of crucial detail that is necessary to evaluate the training data and the developed ML tools. It seems to do little other than replace one type of modeling with another (e.g., GSMM with ML), without a particular increase in model predictiveness. Addressing this missing detail, the occasionally confusing results, and the lack of a clear research question would be key to turning this into a quality manuscript.

Reviewer 3 ·

Basic reporting

In this manuscript, the authors propose a machine learning method for predicting if a microbial community will produce butyrate. This approach enables the prediction of butyrate production without the use of computationally complex and intensive community modeling approaches. Arguably, if broad experimentally measured production data was available, the method could be used to fit this data as well. Given that the community models are not that computationally intensive, this somewhat limits the utility of the proposed work, although this work may become far more important should we develop high-throughput methods for experimentally measuring butyrate production. Also, if such models were constructed for numerous metabolites, the computational benefits would add up. As such, this work should be of general interest.

Overall, the paper is well-written and clearly explains all analyses. Literature references are good. The paper is well-structured.

Experimental design

I have only a few minor comments about the experimental design:
1.) The authors chose 19 strains as the basis for all their analyses, but the criteria for selecting these 19 strains are never explained. The authors should describe why they chose these specific strains.

2.) Regarding the suite of methods used to train the machine learning models, do the authors have any criteria or statistics to show that the specific set of algorithms and parameters applied yielded the best results? Could the authors explain more about why these methods were selected? Why did some methods outperform others? How were the methods' parameters tuned? (A sketch of one standard tuning approach follows.)
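For context on the last question, cross-validated grid search is one standard way such parameters are chosen and reported; the sketch below uses assumed parameter ranges and a stand-in regressor, not the authors' actual pipeline.

```python
# Hypothetical tuning sketch (parameter grid and regressor are assumptions):
# cross-validated grid search over a gradient-boosted regressor.
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

param_grid = {
    "n_estimators": [100, 300],
    "max_depth": [3, 5],
    "learning_rate": [0.05, 0.1],
}
search = GridSearchCV(GradientBoostingRegressor(random_state=0),
                      param_grid, cv=5, scoring="r2")
# search.fit(X_train, y_train)    # X_train / y_train: consortium features and butyrate targets
# print(search.best_params_, search.best_score_)
```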

Validity of the findings

I have only two questions about the validity of the findings:
1.) It appears that larger consortia performed better, which is a surprising result, given that larger consortia are more complex. Can the authors explain why larger consortia perform better?

2.) How does this approach deal with the scenario where butyrate is produced by one member of the consortium and consumed by another? Is this considered positive production?

All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.