All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
The authors have addressed all the reviewers' comments and this manuscript is now ready for publication.
The author has addressed my comments.
**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.
The submission has been significantly revised by the authors. Some new references, a sampling flow chart, and a calibration curve diagram have been added to the article.
The main requests from the reviewers for the first version of the submission were: a clear statement of the hypothesis, clarification of sample selection and exclusions, handling of missing data, a clear description of the BMA-based logistic regression, and clarification of internal/external validation and limitations.
In the revised manuscript, the abstract provides the sample size (n=952) and the main validation indicators, which are in line with the review request. The generalizability limitations are also briefly stated in the abstract.
The research hypothesis is clearly stated at the end of the “Introduction” section (fibrosis can be predicted with simple clinical variables).
The “Study design and population” chapter explains the methodological details of the sampling at the two sites, including the target population and the inclusion/exclusion criteria, and thereby addresses the relevant review comments. The authors clearly state that there were no missing data. The methodology chapter now explicitly includes the wording “BMA based on binary logistic regression”, so the regression framework is clearly identified. The authors have clarified and confirmed the internal validation (2,000 bootstrap resamples, AUC with 95% CI, calibration curve, and Brier score); these elements are essential for assessing reproducibility and the risk of overfitting. The detailed specification of the statistical environment and the R packages used (R 4.5.0; BMA, MASS, pROC, boot, rms, ggplot2) improves reproducibility and will help others carry out similar analyses.
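For readers less familiar with these validation metrics, the following is a minimal illustrative sketch in Python, not the authors' R code. It computes an AUC, a Brier score, and a percentile bootstrap 95% CI for the AUC on fixed predictions. The toy outcomes `y` and predicted probabilities `p` are invented for illustration, and the review does not say which bootstrap variant the authors used (for example, an optimism-corrected procedure that refits the model in each resample), so the percentile approach here is an assumption.

```python
import random

def auc(y_true, y_score):
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((q - y) ** 2 for y, q in zip(y_true, y_prob)) / len(y_true)

def bootstrap_auc_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one outcome class
            stats.append(auc(ys, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical toy data: 1 = fibrosis present, 0 = absent.
y = [0, 0, 1, 1, 0, 1, 0, 1, 1, 0]
p = [0.1, 0.3, 0.8, 0.6, 0.4, 0.9, 0.2, 0.35, 0.7, 0.5]

print("AUC:", auc(y, p))
print("Brier score:", brier_score(y, p))
print("Bootstrap 95% CI for AUC:", bootstrap_auc_ci(y, p, n_boot=2000))
```

A wide interval on a small sample, or a Brier score close to that of a constant prediction, would be the kind of signal that points to overfitting or poor calibration.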
The “Discussion” chapter, similarly to the previous version, amply supports the added value of the research with references: it compares the model with several reference models and highlights the advantages of simple, routinely available variables and calibration. The authors clearly indicate that external validation is necessary to confirm generalizability. Limitations are highlighted, including the lack of biopsy, the cross-sectional design, and the limitations of sample representativeness. Future research directions include the examination of interactions between predictors.
Overall, the main methodological and interpretative requests and comments of the review are largely met in the revised submission.
In the discussion section, detailed numerical results from other studies (e.g., ORs and 95% CIs) should be de-emphasized. Their inclusion in excess creates redundancy and detracts from the focus on the current study’s findings. The authors should streamline this section to maintain clarity and relevance.
**Language Note:** When preparing your next revision, please ensure that your manuscript is reviewed either by a colleague who is proficient in English and familiar with the subject matter, or by a professional editing service. PeerJ offers language editing services; if you are interested, you may contact PeerJ for pricing details. Kindly include your manuscript number and title in your inquiry. – PeerJ Staff
The language of the article is mostly appropriate. Some sentences are too long, which makes them difficult to read. The bibliography contains 69 items, showing a thorough literature review. The structure of the article is appropriate: introduction, methods, results, discussion, etc. are all included. Raw data are provided by the authors. The aim of the study is to develop a simple, easy-to-apply prediction model to predict the risk of liver fibrosis in Vietnamese male adults. The hypothesis of the paper is therefore that simple clinical variables can be used (with reasonable accuracy) to predict the presence of liver fibrosis. The results largely support this hypothesis. The research objective/aim and the research hypothesis should be formulated more clearly and separately.
The study is based on primary, retrospective data collection (ethical approval verified, target population justified) and aims to develop a simple liver fibrosis prediction model based on clinical predictors in Vietnamese men. The concept of a “simple” model is relevant, especially in low-resource settings. However, the strength of the novelty is not entirely clear, as several similar predictive models already exist (this is partially acknowledged by the authors in the Discussion section). The research question can be deduced as: “can an effective predictive model for fibrosis be developed from simple clinical variables?” The article does not formally state a research hypothesis and/or research question. The variables (FibroScan, laboratory chemistry values, etc.) are relevant and standardized.
Deficiency: There is no mention of how the model was validated (e.g., internal or external validation) or to what extent potential confounding factors were controlled.
The variables and analyses are partially specified. The model was developed using Bayesian Model Averaging (BMA) analysis combined with stepwise variable inclusion. The effectiveness was validated using ROC (Receiver Operating Characteristic) and AUC (Area Under the Curve). The modeling steps are not fully reproducible (e.g., the handling of missing data and the exact form of the regression are not specified). The description of the sample selection, inclusion, and exclusion criteria needs to be clarified.
The article does not analyze its own novelty and scientific impact compared to previous models. The idea of a “simple multivariable model” could be interpreted as novel, but this is not demonstrated by the authors in the form of a comparison or analysis. Further testing of the model would be useful, especially in other geographical or demographic groups, but this is not highlighted as a future research direction. As a suggestion, it would be worthwhile to justify how the model offers advantages over existing solutions and why it would be important to test the model in other populations. The statistical methods used (ROC, AUC; note that logistic regression is indicated only indirectly in the publication) are appropriate. However, it is not entirely clear how missing data were handled, and the internal validation of the model is lacking. Conclusions are consistent with the results and generally do not go beyond the range of conclusions that can be drawn from the data. The authors do not claim that the model is universal – this is correctly restricted to the Vietnamese male population. Note, however, that the conclusions are too general and do not address the limitations of the model (e.g., population bias, lack of external validation).
Recommendation: The weaknesses of the model should be discussed in more detail, indicating the conditions under which the model could be generalized.
Remarks and further suggestions - per chapter:
Abstract: The sample size should also be given in the abstract. The limitations of the model can be briefly indicated in the abstract.
Introduction: The Introduction chapter is sufficiently detailed, presenting both global and Vietnamese data on the topic. The objective, together with a clearly defined and mutually consistent research question and hypothesis, should be stated in a separate paragraph.
Material and methods: The methodology in the chapter is adequately detailed, partially replicable, and ethically sound (permit number provided). The combination of BMA and stepwise methods is justified and state-of-the-art. The applied BMA procedure and the related settings (priors, calculation method, implementation) should be documented in more detail to ensure reusability. The authors used the Bayesian Model Averaging (BMA) method for the weighted evaluation of the predictor variables, but did not provide clear information about the type of regression model. Given that the outcome variable is binary, it is likely that logistic regression (binary?) was used. It is recommended that this be clearly stated in the methodology section (e.g., ‘BMA based on logistic regression’). More detailed information is needed on the handling of missing data, if any were present.
Results: The authors clearly present the main results (e.g., OR values, AUC). More attention could be paid to the explanation of the AUC confidence interval and the possible risk of overfitting.
Discussion: The Discussion section is very detailed. The authors compare the results with other international data. The structure of the Discussion chapter follows the structure of the Results chapter. However, the Discussion is too long, and the abundance of data reduces readability, sometimes involving more distant areas.
Conclusions: The Conclusions section provides a concise and clear conclusion; it summarizes the significance of the results well. It would be recommended to highlight the limitations (lack of validation, sample representativeness) more prominently. Recommendation: concrete suggestions for future research. For example, several of the predictors examined frequently interact biologically and clinically in the development of liver fibrosis. The current model works exclusively with main effects. In the future, explicit examination of interaction effects is recommended, especially to increase the predictive power, interpretability, and translational utility of the model.
References: The bibliography is relevant, with recent sources (69 cited sources, many references after 2020).
In summary, the publication is of adequate quality and methodologically sound, although minor structural and stylistic refinements could greatly improve readability. Its great strength is the combination of BMA and stepwise methods, but the lack of external validation limits the generalizability of the results. The paper is recommended for publication with minor revisions, after clarifying the details outlined above.
The study employed a cross-sectional design, using both Bayesian Model Averaging and stepwise methods to identify potential predictors of liver fibrosis. However, I have several concerns regarding the methodology and presentation:
1. In the methods, the authors could include a flowchart illustrating data collection, exclusion, and inclusion procedures. This can help readers gain a clearer understanding of the overall process.
2. The definitions of key variables also require clarification, such as the criteria used to define obesity and overweight.
3. Regarding variable selection, while BMA is designed to average over model uncertainty, the authors appear to have selected only the model with the highest posterior probability and then applied stepwise procedures to it. This undermines the core strength of BMA. If only one model is ultimately selected, using AIC or WAIC to determine the best-fitting model may be more appropriate. Additionally, the phrase “maximum number of statistically significant variables” is statistically inappropriate and may raise concerns of p-hacking. The authors should justify the choice of method. It is also necessary to clarify whether the final model was a logistic regression or another type of model.
4. The methods should explicitly list which R packages were used.
5. The discussion section should be more focused and clearly structured around the main findings of the study. Currently, it devotes disproportionate attention to the prevalence of liver fibrosis, which is not the central research question and may not be meaningful given the study’s limited sample and representativeness for the whole of Vietnam. In addition, many external studies are cited, without a deeper interpretation of how they relate to the present findings. For example, considerable space is given to the study by Kariyama et al., yet its relevance is not clearly explained. Referencing prior work should serve to contextualise and support the study’s findings, not to simply summarise their results. The authors should refocus the discussion on their own results and provide more targeted and interpretive analysis.
6. The study identified HBV, HCV, age, and other factors as being associated with liver fibrosis. However, given the cross-sectional nature of the study, only associations (not causal relationships) can be inferred. As such, the use of terms like "risk factors" should be applied with caution. Furthermore, these findings are not particularly novel, as many studies with stronger causal inference frameworks have already established such relationships. In my view, the greatest strength of this study lies in its clinical utility, namely, the ability to assess the potential risk of liver fibrosis using simple, easily obtainable variables. This could serve as a cost-effective basis for deciding whether a more detailed diagnostic evaluation is warranted. I recommend that the authors elaborate further on this point in the discussion.
7. There are also some minor writing issues, such as abbreviations not being defined upon first use (e.g., OR, CI) and inconsistent use of statistical terminology. Additionally, there are some typographical errors; for example, “19,9%” in line 155 should be written as “19.9%”.
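To illustrate the concern in point 3, the sketch below contrasts a prediction from the single highest-probability model with a model-averaged prediction. It is a minimal Python illustration under stated assumptions, not the authors' procedure: posterior model probabilities are approximated from BIC values (a common approximation), and the three BIC values and per-model predicted probabilities are hypothetical.

```python
import math

def bic_model_weights(bics):
    """Approximate posterior model probabilities: w_k proportional to exp(-BIC_k / 2)."""
    best = min(bics)  # subtract the minimum for numerical stability
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]

def averaged_probability(weights, probs):
    """Model-averaged prediction: weighted mean of the per-model predicted probabilities."""
    return sum(w * q for w, q in zip(weights, probs))

# Hypothetical candidate models: BIC of each model and its predicted
# probability of fibrosis for one illustrative patient.
bics = [1000.0, 1001.0, 1004.0]
probs = [0.30, 0.40, 0.60]

weights = bic_model_weights(bics)
top_model_prediction = probs[weights.index(max(weights))]
averaged_prediction = averaged_probability(weights, probs)

print("Approximate posterior model probabilities:", [round(w, 3) for w in weights])
print("Prediction from the top model only:", top_model_prediction)
print("Model-averaged prediction:", round(averaged_prediction, 3))
```

When no single model dominates, the averaged prediction differs from that of the top model alone. That difference is the model uncertainty that is discarded if only the highest-probability model is carried forward into a stepwise procedure.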
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.