All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
I agree with Reviewer 4 that the manuscript has improved and gladly recommend that it be accepted. I also encourage the authors to share the code used for Bayesian calibration, together with the model output or the code needed to recreate the full set of simulations, as Reviewer 4 also suggests.
[# PeerJ Staff Note - this decision was reviewed and approved by Dezene Huber, a PeerJ Section Editor covering this Section, who added 'This is a paper for which publication of the reviews would be particularly helpful, so please encourage the authors to do so.' #]
The style of writing in the manuscript has been improved and the issues with clarity and English use have been mostly addressed. Literature references and background provide good context for the modeling study. The article structure and organization are good, and figures and tables are clear and work well.
In terms of raw data sharing: The model code is available on Github and the meteorological drivers are included in supplementary material. However, the code used for Bayesian calibration is not posted, and the model output or code to recreate the actual full set of simulations is not made available. For complete data sharing, I would suggest adding the code used for calibration to the Github repository and posting the model output and/or a script for running the actual simulations that were analyzed in the results.
The design of the model is sound. The calibration procedure is not very robust due to lack of data, as I noted in my previous review. I do think that the revised framing of the calibration procedure and interpretation of the results has mitigated most of these issues. The simulations are now explained more clearly as an initial test of the model, and the results are not interpreted as being quantitatively predictive. I think the Bayesian calibration procedure does make sense when viewed as a strategy for balancing the different model parameters to produce a simulation that is relatively close to steady state for the measured pools, for the purpose of investigating the model’s behavior, and the paper now acknowledges that this calibration procedure should not be viewed as a strong constraint on actual model parameters or quantitative predictions. Overall I think this framing addresses the main issues with the calibration procedure that I brought up in my previous review.
I think the manuscript has been improved in terms of the framing of the conclusions and results. In terms of underlying data, as I noted above I would suggest providing scripts and/or model output data for the calibration procedure and the actual simulations that were analyzed in the manuscript.
Overall I think the revised manuscript gives a good overview of the model and demonstration of how it works, including both strengths and weaknesses of the model. There are certainly areas where the model and its application could be improved in the future, but I think the revised manuscript does a better job of acknowledging the weaker areas of this initial model application and sets a baseline for future improvements.
The resubmission has now been assessed by three reviewers, one of whom is new (Reviewer 4), since only two of the three original reviewers agreed to a second review. Reviewer 4 raises considerable concerns regarding the parameterization and the interpretation of the calibration. Based on the comments by Reviewer 4, I recommend a new major revision of the manuscript.
Overall, the authors' updates really improved the manuscript.
There is only one thing, which I missed in my previous assessment.
"... soil structure as ecosystem engineers are predated...", this is the first mention of the term ecosystem engineer.
In ecology, ecosystem engineers are organisms that create, significantly modify or maintain an ecosystem.
Sure earthworms are doing that, but so are beavers, sphagnum mosses, kelp and some bacteria.
It would be much better to replace ecosystem engineer with earthworms throughout the document, since that is what you are referring to, while of course acknowledging that earthworms can be considered ecosystem engineers.
One reason I assume this is that you allow pH to directly affect the group; I am not sure that ants would be as affected by pH as earthworms are.
If not, then it has to be very clear how you define ecosystem engineers in this study.
Discussion needs more flesh.
I thank Flores et al. for revising their manuscript and addressing my concerns. The revised manuscript has improved, and it complements their other review manuscript (Deckmyn et al.) well. My main suggestions are to improve the discussion a bit. For instance, there are only three citations in the entire discussion. The authors should discuss their results with a little more context before the manuscript is accepted. For example, two recent review papers highlight the importance of soil predators in driving the soil microbiome (Thakur and Geisen 2019, Trends in Microbiology and Erktan et al. 2020, Soil Biology & Biochemistry), and these could easily be integrated in the discussion (lines 668-682). Another minor issue is the way the authors describe KEYLINK as “one of the first and most ambitious attempts ….”. I would be careful and tone it down, particularly “the most ambitious attempts”. The authors should leave that to the readers to judge!
Language use throughout the article could use improvement. There are numerous grammatical issues throughout the text and the writing style at some points is less formal than typical for academic articles. I would recommend thorough proofreading of the text for grammar and style. In addition, at some points the text reads less as a scientific description and evaluation of the model and more as an advertisement for others to use the model.
Literature references are adequate, taking into account that the scientific context for the model, including a more extensive literature review, was mostly presented in a separate companion paper. There are a few specific model structural choices that could use more explanation or citation support.
The general structure of the model makes good sense in terms of process representation and the goals of the model. The integration of biological factors and their impacts on soil porosity and structure are interesting and well based in soil science. There are some issues with equations and organization of the model description, but overall I think the model design is sound.
The application of the model and calibration of parameters, however, have a lot of problems and I think the model simulations overall fail to reach accepted technical standards. The model has a large number of pools (13) and a lot of parameters that are poorly constrained by site measurements. Parameter estimation was conducted using a Bayesian Markov Chain Monte Carlo approach using only 9 data points, of which several were literature-based estimates rather than site estimates. The study treats these 9 data points (representing a biomass number for each of 9 pools) as 99 points by assuming steady state across years, which I think is unjustifiable pseudo-replication. Overall, I think a Bayesian calibration of a model with 13 C pools using only a total of 9 data points cannot be justified. This is borne out in the results, such as Tables 5 and 6 which show huge ranges in parameter value distributions. In addition, mean pool values (Table 5) are not at all consistent with the data values (Table 2), suggesting that the calibration was unsuccessful. For example, fungal biomass was calibrated to a value of 15 g C m-3 (Table 2) but in model simulations has a mean value of 200 (Table 5).
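The pseudo-replication concern can be made concrete with a small sketch. It assumes a deliberately simple setting that is not the manuscript's model: a single pool mean estimated under a flat prior with a known Gaussian observation error. The function name `posterior_mean_sd` and all numbers are hypothetical and chosen only for illustration. Copying one observation across 11 years, as the 9-to-99 expansion implies, leaves the posterior mean unchanged but shrinks the posterior standard deviation by the square root of 11, although no new information has been added.

```python
import math


def posterior_mean_sd(observations, sigma):
    """Posterior mean and sd of a Gaussian mean under a flat prior with known sigma."""
    n = len(observations)
    mean = sum(observations) / n
    return mean, sigma / math.sqrt(n)


# Hypothetical values, for illustration only (not taken from the manuscript).
observed = [15.0]  # one biomass estimate for one pool
sigma = 5.0        # assumed observation error
years = 11         # 9 points treated as 99 implies 11 copies of each point

mean_1, sd_1 = posterior_mean_sd(observed, sigma)
mean_k, sd_k = posterior_mean_sd(observed * years, sigma)

print(f"single observation: mean={mean_1:.1f}, sd={sd_1:.2f}")
print(f"copied {years} times: mean={mean_k:.1f}, sd={sd_k:.2f}")
print(f"apparent shrinkage factor: {sd_1 / sd_k:.2f}")
```

In this simplified setting the apparent gain in precision comes entirely from the copying, which is the sense in which the replicated points should not be counted as independent data.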
It does not seem that there is enough data available for the study’s field site to actually constrain the parameters of the model, so it’s hard for me to say what the best path forward might be. I think that for this study to be successful the calibration procedure needs to be completely re-conceived and rewritten in a way that is consistent with the availability of data to constrain the model’s parameters. It is extremely important to evaluate the calibration procedure itself, which clearly failed in the results presented. Careful examination of posterior parameter distributions and the pattern of the MCMC results is important for establishing whether the parameterization procedure actually converged in a useful way.
Perhaps a more theoretical analysis of the model behavior, using reasonable parameter values and focusing on understanding interactions among different pools in more detail, would be one path forward. Overall I think this model is a good example both of the process insights that can be gained by increasing model complexity and an example of the difficulty of getting meaningful results from an increasingly complex model with parameters that are difficult to constrain.
One of the most promising aspects of the model is the connection between faunal engineers and soil structural changes. It seems like a missed opportunity to run the model only at a site with very low faunal engineer biomass, where this effect cannot be investigated fully.
The results section of the manuscript is very problematic. The poor design of the model calibration and the complexity of the model itself produced results that are difficult to interpret and inconsistent with reality. The interpretations of the results in the manuscript take an optimistic view that is not consistent with the actual results. Model simulations produced a very wide range of outcomes as evident in Table 5. Biomass time series (Figure 3) look very unstable and are characterized by short-term spikes in biomass of some pools, including huge variability in bacterial biomass, that is not realistic for an ecosystem-scale soil model. Because only the mean of multiple simulations is shown, it is difficult to tell how these time series varied among simulations (within the same scenario) but I expect the variability is very high, calling into question many of the interpretations of the results. It would help to see the variability in simulations behind Fig 3. I would not be surprised if individual simulations within each scenario were extremely variable due to parameter uncertainty (consistent with Table 5 and 6).
The analysis of the results is very short and quite shallow and does not address the key aspects of the different scenarios that were simulated. There is not really a meaningful analysis of the model simulations beyond a cursory, qualitative description. The results and discussion seem to start from the assumption that the model will be useful because of the processes it includes, and do not reflect an actual objective analysis of the simulation results. There is a lot of complexity in the model results which is mostly overlooked. I think to meet scientific standards the results and discussion would need to be completely rewritten to reflect an objective assessment of the model results.
Overall, the results and discussion are more an argument trying to justify the value of the model’s structure than a real analysis of the model outcomes. The analysis of the calibration procedure does not take a serious critical view of the possibility that the Bayesian approach might not give strong constraints on the model parameters. Part of the value of a Bayesian analysis is that it provides estimates of parameter uncertainties, and insights about whether parameters could be well constrained by available data, and it seems that the analysis did not take advantage of this but rather started with the assumption that the model could be well constrained despite the limited measurements that were used.
Line 31: I suggest starting the abstract with a sentence about the scientific context or knowledge gap that motivate the model before jumping into the model description
Lines 34-37: This reads as more of an advertisement for the model than a statement about the science. In this study, KEYLINK was not coupled to another ecosystem model so it does not seem relevant to the study to say how it could be coupled.
Line 50: The model was not actually compared with a first-order model, so there isn’t evidence that it was actually a more successful approach.
Line 59: Century and RothC are not the oldest soil models that exist. It would be more accurate to just say that they are widely used
Line 102: I would start the methodology section with an overview of the whole model and how the pieces relate to each other as shown in Fig 1.
Line 114: If earthworms eat all soil, do they have access to all pore sizes, or only the bacterial pores? It is not clear from the description
Line 127-129: The statement about macroporosity should have a citation to literature supporting it. And this statement does not seem to fit with this section since none of the other pore size classes are described in terms of laboratory measurements.
Line 139: Equation 1 should have a + between the terms in the denominator, not a -
Line 153-155: Is evaporation assumed to be equal to potential evapotranspiration? This does not seem realistic since it ignores source limitation of evaporation as well as boundary layer and conductance effects that limit actual evapotranspiration
Line 199: Equation 9 should include terms for modifiers to gmax (temperature, pH, etc). The sum notation should indicate the index that is being summed over. The role of excreted faeces also needs to be in this equation, otherwise it states that all of the substrate is being converted to biomass growth which is untrue.
Line 205-207: An equation should be shown or referenced for fa
Line 208: Physical recalcitrance is far from a novel concept, and has been included in conceptual and numerical models prior to KEYLINK (e.g., MIMICS, and the passive pool in CENTURY)
Line 218: The model simulates increases in biomass of pools, not in population number
Line 225: It’s confusing to refer to temperature sensitivity in growth when gmax as presented so far does not include temperature dependence (Eq. 9). Adding these factors to Eq. 9 would make this clearer.
Line 235: This equation states that predation rate depends only on the predator’s total growth rate from all substrates, and not on the biomass of the prey. This does not make sense. If a predator feeds on multiple types of prey, then this suggests that all are predated at the same rate regardless of their different biomass amounts. This equation would make sense if it refers to the fraction of a predator’s growth from a single prey type, but that is not what the equation actually shows.
Line 248: availability was f_a in Eq. 9, but is noted as (a) here
Line 254-312: This description would be easier to follow if it was moved to be right after Eq. 9 (which describes gmax), or if there were a statement after Eq. 9 stating that modifiers to gmax were described below. Either way, Eq. 9 should show that gmax is modified by additional factors.
Line 259: Density-dependent microbial turnover should be a modifier on microbial death rate, not on growth rate
Line 271: This equation has a discontinuity in it at T=Tmax, where mT goes directly from 1 to 0. This doesn’t make much biological sense
Line 280-281: The text should reference literature supporting the pH effects on bacteria and fungi
Line 283-284: These equations do not seem correct. gmax would approach infinity as pH approaches the threshold in each case. For example, at pH=8.01, mpH for fungi would be 10. An exponential function would work better. Also, Eq. 18 defines a declining fungal growth at high pH and constant fungal growth at lower pH (< 8), while the text above refers to an increasing fungal growth at low pH.
Line 287: It is not clear why there is a linear response of engineer saprotrophs to pH but a 1/pH response for bacteria and fungi. Is there some literature support for this choice?
Line 291-293: This sentence does not make grammatical sense and needs to be rewritten.
Line 294: mrec should be defined when it is introduced
Line 303: Why was a linear equation chosen here and a power law above?
Line 314: This section is not really about closing the C budget, it is about the fraction of decomposed material that is converted to faeces (which also needs to be included in Eq. 9)
Line 325 and 328: Recalcitrance is not a conserved mass quantity and does not have a budget
Line 335: Based on the text, N limitation should be modifying r rather than gmax
Line 336: This equation states that growth rate is positively correlated with C:N of SOM, the opposite of the statement here that growth slows with lower N
Line 348-349: What specific parameter is “twice as recalcitrant” referring to?
Line 360-361: These units do not make sense.
Line 363-364: Eq. 31 defines burrow volume as being directly proportional to engineer biomass, in an instantaneous way. It does not make sense to pair this with a turnover rate. If rates are defined, then burrow volume needs to have both a formation and a turnover rate.
Line 369: Should these units be m3/m3?
Line 395: These units are also incorrect
Line 398: Can layer thickness change in the model? Based on other equations, it does not seem like it, so this statement is confusing
Line 423-446: This section does not actually specify how much DOM is leached
Line 472-474: Since KEYLINK has not actually been modified or calibrated for different ecosystems or coupled to any other ecosystem models, this statement is unsupported. I think this whole paragraph (except for the Github link) could be removed since it is mostly an advertisement of the model and not a scientific statement.
Line 480-481: This text should specify why multiple runs are needed (to explore parameter space). Generally, I think this would be specific to the uncertainty in model parameters rather than general to the model itself, so I’m not sure it makes sense as a general recommendation.
Lines 506-531: I think these lines could be removed. It is not necessary to describe the basics of how Bayesian parameter estimation works. Previous literature could be cited instead.
Line 524-525: This is the only mention of a drylands version of the model, and seems irrelevant to the rest of the manuscript.
Line 536: Assuming that 9 data points are equivalent to 99 data points by assuming pool values are constant is unjustifiable. I think this is a fatal flaw of the parameterization approach.
Line 550-552: This is not specific to this study and could be removed.
Line 575: R is respiration rate predicted by the model. It is not a parameter. Should this be r?
Line 578: This calibration procedure used 9 data points to constrain 9 model parameters. There is much too little data for this to be a workable approach.
Line 582: There are not 81 data points. There are 9 data points.
Line 584: “no measurements were available” — measurements of what exactly?
Lines 614-617: Clay, litter recalcitrance, and litter C:N are not included in Table 1
Line 621: What is LHS?
Line 606-624: Was the model calibrated to steady state for each scenario? Or did it use the baseline parameters for all scenarios? If the second is true, then the model was likely out of steady state for other scenarios, making the results unreliable
Line 630-634: Based on Table 5, it looks like the model parameters were not well constrained at all. The +/- values don't make much sense as they include negative biomass for most pools. I imagine the distributions are skewed, so it would make more sense to show a figure with actual distributions rather than report a standard deviation that is not a good measure of actual variance. Showing posterior parameter distributions would provide better information about whether the MCMC approach was actually effective at constraining parameters.
Line 635: What is meant by “relatively uncoupled?” They were poorly correlated with each other?
Line 637-638: The behavior of bacteria in the model is clearly unrealistic, with a huge biomass spike at the beginning of the simulation followed by death of basically the entire bacterial community.
Line 642: “bacteria would profit most from a rapidly changing environment” This doesn’t make much sense either in the model or in real life. Typical soils have large bacterial populations whether they are rapidly changing or not
Line 645: Looking at Figure 3, the opposite of this statement is true. C pools seem very unstable and are characterized by spikes and large fluctuations which suggests to me that the model is poorly balanced. One scenario lost 75% of SOM in the first year!
Line 651: Different soil layers should have the same long-term average temperature, although shallower soil layers would be expected to have wider fluctuations
Line 658-659: The study included six different model scenarios, but only one is discussed and only in one sentence here.
Figure 2: The diagram should show which pools are external to the model (tree shoots, litter, CO2) and which are actually model pools. From the text, it seems like the SOM pools shown (different POM sizes, DOC) are not actually in the model, so it is misleading to show these as model components in the diagram. The diagram should be consistent with the model that was actually used in the study.
The colors of the lines in Figures 3 and 4 are very difficult to tell apart.
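The comment on Lines 630-634 above, that a mean with a symmetric +/- standard deviation is a poor summary for skewed, strictly positive quantities, can be illustrated with a short sketch. The samples below are synthetic log-normal draws standing in for posterior biomass samples; they are an assumption for illustration and are not the manuscript's output.

```python
import random
import statistics

random.seed(0)

# Synthetic, strictly positive, right-skewed "biomass" samples (log-normal).
# These stand in for posterior samples; they are not the manuscript's results.
samples = [random.lognormvariate(3.0, 1.5) for _ in range(5000)]

mean = statistics.fmean(samples)
sd = statistics.stdev(samples)

# statistics.quantiles with n=40 returns 39 cut points in 2.5% steps,
# so the first, 20th and last entries are the 2.5%, 50% and 97.5% quantiles.
cuts = statistics.quantiles(samples, n=40)
lower, median, upper = cuts[0], cuts[19], cuts[-1]

print(f"mean - sd = {mean - sd:.1f}")
print(f"median = {median:.1f}, 95% interval = [{lower:.1f}, {upper:.1f}]")
```

For samples like these, the lower end of a mean +/- sd range falls below zero even though every sample is positive, whereas a quantile-based interval stays within the range of values the pool can actually take.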
Three independent reviews have been received, of which two are positive and one is negative. I believe that the authors deserve a chance to respond to these questions, and therefore propose major revisions. In addition to the many questions raised by the reviewers, I think the authors need to carefully consider Reviewer 1's comments on the existing literature on the dynamics of soil macro-fauna and on dimension errors. Reviewer 2 also highlights an important issue with the dependence on the accompanying manuscript “review and model concept”, which will be carefully monitored. Regardless of the final fate of the review and model concept manuscript, this manuscript needs an improved background section to have a chance to stand alone.
I review lots of manuscripts with models, but when I meet dimension errors, I cease further reviewing. Unfortunately this applies to the current manuscript.
Acceptable, but not great. A lot of literature exists on the dynamics of soil macro-fauna and the authors only mention that they found little information.
General: What I really like in KEYLINK is the attempt to integrate physical, chemical and biological aspects of the soil.
I do think that the result can potentially be valuable for management purposes and assessments.
When I see the term "model" as a theoretical biologist, I think of a list of motivated assumptions from which a mathematical model follows,
whose properties can be studied and whose realism can be tested. To be honest, it was a bit of a disappointment to see that the paper is
in fact a description of computer code, and the result even depends on the sequence in which the various steps are calculated.
The many if-then statements make it impossible to study the mathematical properties. So it will be of little help to gain more
(theoretical) insight into soil dynamics, but can, nonetheless, still be useful in practical applications.
I do understand that the type of mathematical model that I had in mind can easily be too simple to match reality closely enough,
but the implication is that validation will be very hard, which the authors in fact admit.
Living in a country where N-deposition is a key issue for soil pH, I was missing how this is incorporated in the model.
What was not really clear to me is the time-scale the authors had in mind in the background of seasonal forcing.
Apart from cycles in temperature and precipitation, leaf litter is of course dominating food availability for soil macro-fauna.
Earthworms easily live for 7 years (depending on temperature); the simulated period 1999-2008 in fact matches this life span.
My complaint boils down to the notion that the memory of the various players of the game, from bacteria to macro-fauna, differs by
orders of magnitude, which I do not see reflected in the structure of the model.
The only memory that the model accounts for is the existing pools.
Line 216 introduces growth G, but G is not specified. Previously g_max was introduced, with dim(g_max) = #/time, which follows from
d/dt N = g_max.
From (10) it follows that the units of R are g C/d.m^3 and not g C/m^3, as the line above (1) suggests.
In (27), however, g_max has the units of R, which amounts to a dimension error.
In (28) and (29), g_max has the units of CN_eng, and even time disappeared from the units of g_max.
There are more issues with units and dimensions.
In (39), DOM is in g/m^3, f_DOM is a fraction, but respiration R is a rate.
When I notice dimension issues, I cease further reviewing. Sorry for this.
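The kind of dimension check described above can be automated in a few lines. The sketch below is a minimal illustration under stated assumptions: the class `Dim`, the helper `check_sum` and the constants are hypothetical names, and the example only assumes that a pool (g C m^-3) and a fraction times a rate (g C m^-3 d^-1) appear as terms of the same sum. It does not reproduce the actual form of Eq. 39 or any other equation in the manuscript.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dim:
    """Exponents of (mass of C, length, time) for a quantity."""
    mass: int = 0
    length: int = 0
    time: int = 0

    def __mul__(self, other):
        # Multiplying quantities adds their dimension exponents.
        return Dim(self.mass + other.mass,
                   self.length + other.length,
                   self.time + other.time)


def check_sum(*terms):
    """All terms of a sum must share one dimension; raise otherwise."""
    first = terms[0]
    for term in terms[1:]:
        if term != first:
            raise ValueError(f"dimension mismatch: {first} vs {term}")
    return first


POOL = Dim(mass=1, length=-3)           # g C m^-3
RATE = Dim(mass=1, length=-3, time=-1)  # g C m^-3 d^-1
FRACTION = Dim()                        # dimensionless
TIME_STEP = Dim(time=1)                 # d

# Adding a pool and (fraction x rate) is dimensionally inconsistent ...
try:
    check_sum(POOL, FRACTION * RATE)
    consistent = True
except ValueError:
    consistent = False

# ... but becomes consistent once the rate is multiplied by a time step.
fixed = check_sum(POOL, FRACTION * RATE * TIME_STEP)
print(consistent, fixed)
```

Running such a check over each term of each model equation would flag the places where a rate is combined with a pool without an explicit time step.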
As was stated by the authors in their letter to the editor, this is a follow-up paper on a model that is described in an unpublished paper. I have no problem with this, only with how it is referenced:
(G Deckmyn et al., 2020, unpublished data)
I think it should be:
(G Deckmyn et al., in review)
There are two problems with the reference: you have put a year on it, and you have chosen the word "data". To me, that indicates that there is data that supports a statement, but that it hasn't been published yet. But that's not the case here.
In any case, this manuscript is not publishable without the first paper, so I guess that will sort itself out.
Apart from this, I think (given the circumstances I cannot be sure) that the background is sufficient. In all sections but the model description this was the case. And this is maybe something that needs to be changed: even though you have the accompanying paper, there are no references to the logic/science behind using certain formulations. The current setup makes it impossible to follow without having the other paper.
A comment on the notation CN ratio, I think it should either be C-N ratio or C-N ratio (C:N) the first time, and then C:N throughout the text. And also on the same, C is defined in the Abstract. The Abstract is a stand alone part of the manuscript, any definitions made there have to be redefined in the main text. N is never defined.
Apart from the comments I have made above, I think it is a good manuscript. It was easy to read, and how the assumptions made in both the model and the experimental design affected the results was clear and easy to follow.
Even with accompanying paper, I think some background is needed in the model description part of the paper.
I read the manuscript by Flores et al. about an ecosystem model (KEYLINK) to integrate soil biological and physical processes to predict soil processes like carbon dynamics. In general, I like the modeling approach, and the details provided by the authors. I, however, have little background on soil hydrology and I am not an ecosystem modeler myself. My background is more soil biology and particularly soil food webs. The authors will realize this when reading some of my concerns below, and probably also some of my naivety on the topic.
Line 51: remove “shaping both”.
Line 57: perhaps use Carbon (C) for the first time, and then use C throughout the manuscript.
Line 63: soil organic matter (SOM).
Line 64: this sentence reads odd. Do you mean “the relevance of other processes in the physical protection…”
Line 67: do these models have full names? If so, please provide them before abbreviating!
Line 68: ecosystem engineers.
Line 72: what are detailed soil models? Not clear!
Line 75: I again wonder if there is a full form of KEYLINK? If not, why is it called KEYLINK?
Line 77: such as? Provide some examples what those parameters are in extant models?
Line 87: please use ecosystem engineers throughout the manuscript.
Lines 81-88: were there also different thermal and moisture sensitivity modeled for soil organisms? For example, large body sized soil organisms could be more sensitive to higher temperature.
Lines 89-90: I understand that authors have another paper with more scientific background about their approach. It, however, is unpublished. So as reviewer, I have no access to it to understand the scientific background behind the model. It will be therefore a favor to the readers if authors provide at least with few sentences about the scientific background of the model.
Line 118: which ecosystem engineers do the authors have in mind here? Earthworms? Adult earthworms? The size differences between adult and juvenile ecosystem engineers may lead to differences in accessibility to soil pores.
Lines 129-130: are these densities general? Won’t they differ with soil type?
Line 198: I did not understand why the authors put “found” in inverted commas.
Line 200: do completely dry pores offer no access to a consumer?
Line 633: not sure why soil bacteria would profit most from a rapidly changing environment. Please clarify.
Lines 651-656: there could be some other small sized ecosystem engineers, such as enchytraeids. Did authors consider “other ecosystem engineers” of the soil than earthworms?
Lines 659-660: which engineers increased? Do engineers then compensate for decay reduction? Also, some ecosystem engineers stimulate bacterial growth. How does that balance out?
Line 666: was the loss of predator effects exclusively owing to large predators, or to predators of all body sizes? Please make this clear as this is so far the most interesting model outcome.
Line 687: are authors developing any software to make KEYLINK accessible to ecosystem modelers?
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.