Background: Accurate assessment of hepatic steatosis is crucial for selecting suitable liver donors and predicting post-transplant outcomes. This study aimed to develop interpretable machine learning models for predicting hepatic steatosis using comprehensive clinical and biochemical features.
Methods: We retrospectively analyzed 306 adult donors. Twenty-six prespecified predictors (21 laboratory measures and 5 demographic/physiologic variables) were processed within cross-validated pipelines (median/mode imputation, IQR-based winsorization, collinearity screening). Five algorithms (decision tree, random forest, support vector machine, logistic regression, and gradient boosting) were tuned via five-fold cross-validation in the training set of a stratified 70/30 training/test split. Discrimination was evaluated on the hold-out test set using the area under the receiver operating characteristic curve (ROC AUC). Model explanations used SHapley Additive exPlanations (SHAP).
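As an illustration only, two of the steps named in the Methods (IQR-based winsorization and ROC AUC) can be sketched in plain Python. This is not the study's code: the function names, the Tukey multiplier `k=1.5`, and the "inclusive" quartile method are assumptions, since the abstract does not specify them.

```python
import statistics


def iqr_bounds(train_values, k=1.5):
    """Return (lower, upper) clipping fences from training-fold values.

    k=1.5 is an assumed multiplier; the abstract does not state one.
    """
    q1, _, q3 = statistics.quantiles(train_values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def winsorize(values, lower, upper):
    """Clip each value into [lower, upper] instead of dropping it."""
    return [min(max(v, lower), upper) for v in values]


def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one case of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

In a cross-validated pipeline, the fences would be computed on each training fold with `iqr_bounds` and then applied unchanged to the matching validation fold or test set with `winsorize`, so that no information from held-out data leaks into preprocessing.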
Results: Random forest achieved the highest ROC AUC (0.8148), followed by gradient boosting (0.7620). The support vector machine and logistic regression yielded ROC AUCs of 0.6069 and 0.5136, respectively; the decision tree achieved 0.4721, below the chance level of 0.5. Global SHAP analysis ranked uric acid (UA), high-density lipoprotein cholesterol (HDL-C; lower values), low-density lipoprotein cholesterol (LDL-C), and body mass index (BMI) as the most influential predictors.
Conclusions: Interpretable machine learning models can predict hepatic steatosis from readily available clinical and biochemical features. The random forest model showed the best discrimination (ROC AUC = 0.8148), and SHAP identified UA, HDL-C (lower values), LDL-C, and BMI as key predictors.