Study on the mortality risk and predictive model for COVID-19 inpatients with pneumonia manifestations

Zhi Li; Jiamin Liang; Katie Lu; Shuyu Tang; Jinyi Huang; Jinrong Zhang; Jianjun Zou; Dongsheng Huang; Chenli Xie; Linglong Zeng; Zhiwei Wang; Yibin Deng; Jiachun Lu

doi:10.7717/peerj.20795

Zhi Li¹, Jiamin Liang¹, Katie Lu², Shuyu Tang³, Jinyi Huang¹, Jinrong Zhang¹, Jianjun Zou¹, Dongsheng Huang⁴, Chenli Xie⁵, Linglong Zeng¹, Zhiwei Wang⁶, Yibin Deng⁷, Jiachun Lu¹

¹ The Key Laboratory of Advanced Interdisciplinary Studies, The First Affiliated Hospital of Guangzhou Medical University; The Institute for Chemical Carcinogenesis, School of Public Health, Guangzhou Medical University, Guangzhou, China

² School of Medicine, University of Arizona, Tucson, AZ, United States of America

³ Guangzhou Women and Children’s Medical Center, Guangzhou, China

⁴ Department of Respiratory and Critical Care Medicine, Shenzhen Longhua District Central Hospital, Shenzhen, China

⁵ Department of Respiratory and Critical Care Medicine, Dongguan Binhaiwan Central Hospital, Dongguan, China

⁶ Department of 12320 Health Hotline, Guangzhou Center for Disease Control and Prevention, Guangzhou, China

⁷ Key Laboratory of Research on Clinical Molecular Diagnosis for High Incidence Diseases in Western Guangxi; Centre for Medical Laboratory Science, the Affiliated Hospital of Youjiang Medical University for Nationalities, Baise, China

Published: 2026-02-10
Accepted: 2025-12-23
Received: 2025-04-25

Academic Editor: Faiza Farhan

Subject Areas: Epidemiology, Public Health, COVID-19
Keywords: COVID-19, In-patient, Mortality risk, Machine learning, Prediction model

Copyright: © 2026 Li et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits using, remixing, and building upon the work non-commercially, as long as it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.

Cite this article: Li Z, Liang J, Lu K, Tang S, Huang J, Zhang J, Zou J, Huang D, Xie C, Zeng L, Wang Z, Deng Y, Lu J. 2026. Study on the mortality risk and predictive model for COVID-19 inpatients with pneumonia manifestations. PeerJ 14:e20795 https://doi.org/10.7717/peerj.20795

The authors have chosen to make the review history of this article public.

Abstract

Background

In 2020, COVID-19 posed a major threat to global public health in a remarkably short period. Although the WHO declared an end to the emergency phase in May 2023, a considerable proportion of recovered cases experience medium- and long-term effects, which pose ongoing health challenges to society. Therefore, it remains necessary to conduct relevant research in the post-epidemic era to explore the risk factors for death in COVID-19 inpatients.

Methods

We determined the mortality of COVID-19 inpatients with pneumonia manifestations through one-year follow-up, utilizing real-world data from three medical centers. Clinical characteristics associated with mortality risk were analyzed by logistic regression. Then, the dataset was randomly partitioned into three sets at a ratio of 4:2:4. Three machine learning algorithms were employed to develop and validate a mortality risk predictive model for COVID-19 inpatients, and a web-based visualization tool was created.

Results

There were 100 fatalities among the 1,693 samples included in this study. Meanwhile, we identified 37 factors correlated with increased mortality risk in COVID-19 inpatients with pneumonia manifestations. Ultimately, we developed a mortality risk predictive model using the random forest algorithm, which demonstrated superior predictive performance (AUC=0.907, 95% CI=0.849-0.957).

Conclusions

This study reports a mortality rate of 5.9% for COVID-19 inpatients with pneumonia manifestations. The high-performance mortality risk prediction model obtained in this study provides important practical guidance for monitoring the mortality risks of COVID-19 inpatients with pneumonia manifestations.

Introduction

A novel coronavirus named SARS-CoV-2 (which causes COVID-19) swept the world in 2020, posing a significant global public health threat in a short period of time due to its strong transmissibility (Wang et al., 2020; Wu, Leung & Leung, 2020; Deng & Peng, 2020). As of February 4, 2024, there have been over 774 million confirmed cases of COVID-19 and more than 7 million deaths reported worldwide (WHO, 2024). In certain countries, the fatality rate may exceed 20% (Sorci, Faivre & Morand, 2020).

Epidemiological data suggest that SARS-CoV-2 is primarily transmitted through short-range airborne aerosols, respiratory droplets, and both direct and indirect contact with surfaces contaminated by infectious respiratory droplets (Chan et al., 2020; Ong et al., 2020; Lednicky et al., 2020; Toubiana et al., 2020; Cheng et al., 2020; Riddell et al., 2020). Infection can produce a range of clinical presentations, from asymptomatic infection to pneumonia, an exaggerated inflammatory response, respiratory failure, and acute respiratory distress syndrome (ARDS) (Yang et al., 2020; Guan et al., 2020; Grasselli et al., 2020; Mehta et al., 2020). Among these, pneumonia is the most prevalent clinical manifestation observed in hospitalized patients (George et al., 2020). In the early stages of the epidemic, COVID-19 manifested mainly as an acute respiratory illness, presenting with varying degrees of acute upper or lower respiratory tract syndromes. Research has indicated that certain patients with COVID-19, particularly elderly and immunocompromised patients, are at risk of developing severe pneumonia (Jartti et al., 2011). Pneumonia is one of the more serious complications of SARS-CoV-2 infection; it can result in impaired oxygen exchange, respiratory failure, and potentially multiple organ failure, ultimately leading to death (Chen et al., 2020).

Although the World Health Organization officially declared the end of the COVID-19 emergency phase in May 2023, an estimated 31% to 69% of individuals who have recovered from the initial infection may experience persistent or new symptoms in the medium to long term (Tenforde et al., 2020; Su et al., 2022; Ballering et al., 2022). These effects can severely reduce patients’ quality of life long after they are free of the virus and pose global health challenges to society (Koc et al., 2022; Briggs & Vassall, 2021). COVID-19 remains a challenging disease: multiple organ failure and death are rare in mild, self-limiting respiratory presentations, but the infection can also present as severe progressive pneumonia. Therefore, in the post-pandemic era, it is still necessary to study previous cases and explore how related risk factors affect the mortality risk of COVID-19 pneumonia cases.

In view of the serious consequences that pneumonia can have for COVID-19 patients, and the lack of research focusing on SARS-CoV-2 infection presenting with pneumonia manifestations, this paper focuses on cases of SARS-CoV-2 infection with lung inflammatory manifestations (also known as COVID-19 pneumonia). This study aims to explore the risk factors for death in these cases, in order to provide a scientific basis for the clinical treatment of relevant populations in the post-epidemic era and thereby reduce the mortality rate of COVID-19 pneumonia.

Materials & Methods

Study objects

Through an electronic medical record system, we collected 2,283 cases of COVID-19 (caused by the SARS-CoV-2 variant B.1.1.529) presenting with pneumonia manifestations among patients hospitalized in three hospitals in Guangdong Province between October 2022 and February 2023 (1,007 cases from Guangzhou Chest Hospital, 764 cases from Shenzhen Longhua District Central Hospital and 512 cases from Dongguan Binhaiwan Central Hospital), after obtaining their informed consent. Through the same system, we also collected basic information, symptoms (including fever, cough, shortness of breath, and dyspnea), complications (including severe pneumonia and respiratory failure), and blood test results. Participants were followed up for one year to confirm their survival status.

After excluding patients who were lost to follow-up or had incomplete blood test results (see Fig. 1), 1,693 cases were included in the analysis. All cases were confirmed by nucleic acid testing and chest computed tomography (CT) examination.

Figure 1: Flowchart of the study.
The dataset was randomly split into a training set, validation set, and test set at a ratio of 4:2:4.

DOI: 10.7717/peerj.20795/fig-1

All methods of this study were carried out in accordance with the Declaration of Helsinki. Written informed consent was obtained from all study subjects, and the study was approved by the institutional review boards of Guangzhou Medical University (No. 202404023).

Sample size

The sample size was calculated according to the following formula: $N = \frac{Z_{\alpha/2}^{2} \times p \times (1 - p)}{\delta^{2}}$

Here, p is the estimated mortality rate of hospitalized patients with COVID-19; α is the probability of a type I error, with $Z_{\alpha/2}$ = 1.96 when α is set to 0.05; and δ is the allowable error.

According to research reports, the global mortality rate of inpatients with COVID-19 (p) was 16% (Baptista et al., 2023), and δ was set to 0.2p. Therefore, a sample size of 505 was required for the study. Allowing for a 10% loss to follow-up, the minimum sample size required was 556. A total of 1,693 samples were included in this study, which meets this requirement.
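The arithmetic behind these figures can be checked with a short Python sketch. It uses only the values stated above (p = 0.16, δ = 0.2p, Z = 1.96). The function name is illustrative, and treating the 10% allowance as a multiplication by 1.10 is an assumption, chosen because it reproduces the reported 556.

```python
import math

def required_sample_size(p, relative_error, z=1.96):
    """Sample size for estimating a proportion p with allowable error delta = relative_error * p."""
    delta = relative_error * p
    return math.ceil(z ** 2 * p * (1 - p) / delta ** 2)

# Values stated in the text: p = 16%, delta = 0.2p, Z = 1.96.
n_required = required_sample_size(p=0.16, relative_error=0.2)

# Assumption: the 10% allowance for loss to follow-up is applied as n * 1.10.
n_with_loss = math.ceil(n_required * 1.10)

print(n_required, n_with_loss)
```

Under these assumptions the two printed values agree with the 505 and 556 reported in the text.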

Modeling by machine learning

We constructed a mortality risk prediction model using machine learning (ML) algorithms (see Fig. 1). First, multivariate logistic regression analysis was used for a preliminary screening of the model’s candidate features. Second, the dataset was randomly partitioned into training, validation and test sets at a ratio of 4:2:4 using the caret package (Version: 7.0-1) (Zou et al., 2023; Li et al., 2025). Third, the Synthetic Minority Over-sampling Technique (SMOTE) (Yang et al., 2022; Hao, Wang & Bryant, 2014) was applied to the training set for resampling, implemented via the smotefamily package (Version 1.4.0). The oversampling multiple was set to 15, which brings the ratio of the two classes close to 1:1; this can effectively balance recall and precision and aligns with the practical needs of clinical decision-making. The number of nearest neighbors (k) was set to 3, a commonly used and robust value, and no additional parameter adjustments were made. Fourth, the h2o package (Version: 3.44.0.3) was used to construct and evaluate the machine learning models: three ML algorithms, namely random forest (RF), gradient boosting machine (GBM), and elastic net (EN), were used to develop the mortality risk prediction model on this training set (Hu & Szymczak, 2023; Natekin & Knoll, 2013; Hong, Chen & Harris, 2013; López-Blanco & Chacón, 2016). The auto-validation function of the h2o package was then used to perform five-fold cross-validation on the validation set to tune the parameters of the machine learning models (the hyperparameter tuning protocol is shown in Table S5) and thereby improve their performance. Finally, the predictive performance of the models was assessed on the test set, and the randomForest package (Version: 4.7-1.2) was used to output variable importance.
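To make the partitioning and resampling steps concrete, the following Python sketch illustrates a 4:2:4 split and SMOTE-style oversampling. It is not the R code used in the study. It assumes the split is stratified by outcome, and it uses randomly generated two-dimensional points as hypothetical stand-ins for the real features. The function names are illustrative.

```python
import random

def stratified_split(labels, ratios=(0.4, 0.2, 0.4), seed=0):
    """Split indices into (train, valid, test), shuffling each outcome class separately."""
    rng = random.Random(seed)
    train, valid, test = [], [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_train = round(ratios[0] * len(idx))
        n_valid = round(ratios[1] * len(idx))
        train.extend(idx[:n_train])
        valid.extend(idx[n_train:n_train + n_valid])
        test.extend(idx[n_train + n_valid:])
    return train, valid, test

def smote(minority, n_new_per_point, k=3, seed=0):
    """For each minority point, create synthetic points on the segment to one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for i, x in enumerate(minority):
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)))
        neighbours = others[:k]
        for _ in range(n_new_per_point):
            nb = rng.choice(neighbours)
            gap = rng.random()  # position along the segment from x to nb
            synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic

# Outcome labels with the reported class sizes: 100 deaths, 1,593 survivors.
labels = [1] * 100 + [0] * 1593
train, valid, test = stratified_split(labels)
n_dead_train = sum(labels[i] for i in train)
n_alive_train = len(train) - n_dead_train

# Hypothetical 2-D features for the training-set deaths (illustration only).
rng = random.Random(1)
minority_points = [[rng.random(), rng.random()] for _ in range(n_dead_train)]
synthetic = smote(minority_points, n_new_per_point=15, k=3)

print(n_dead_train, n_alive_train, n_dead_train + len(synthetic))
```

Under the stratification assumption, the training set holds 40 deaths and 637 survivors. Adding 15 synthetic cases per death gives 640 minority cases, which is consistent with the near 1:1 ratio described above.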

Statistical analysis

Statistical methods for variable filtering and model construction are described in the preceding section. The quantitative data that conformed to a normal distribution or approximate normality were presented as mean ± standard deviation ( $\bar{X}$ ± S), with independent sample t-tests employed for comparison between two groups. Otherwise, results were expressed as median (M) and percentiles (P₂₅, P₇₅), and analyzed using the rank sum test between groups. The qualitative data were represented as percentages (%) and compared between groups utilizing the Chi-square test or Fisher’s exact probability method. Receiver operating characteristic (ROC) curves were plotted using the h2o package (Version: 3.44.0.3) and pROC package (Version: 1.19.0.1), and the area under the curve (AUC), sensitivity, specificity, and their respective 95% confidence intervals (95% CIs) were calculated to evaluate the performance of the models. The higher these indicators’ values, the better the model performance. A significance level of 0.05 was established, and differences were deemed statistically significant when P < 0.05.
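As a reminder of what these metrics measure, the sketch below computes AUC, sensitivity, and specificity in plain Python. The scores and labels are hypothetical toy values, not study data, and the 0.5 threshold is an arbitrary choice for illustration. The AUC is computed as the probability that a randomly chosen death receives a higher predicted risk than a randomly chosen survivor, with ties counted as one half.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(scores, labels, threshold):
    """Classify score >= threshold as predicted death, then compare with the true labels."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical predicted risks and outcomes (1 = died, 0 = survived).
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]

toy_auc = auc(toy_scores, toy_labels)
toy_sens, toy_spec = sensitivity_specificity(toy_scores, toy_labels, threshold=0.5)
print(round(toy_auc, 3), round(toy_sens, 3), round(toy_spec, 3))
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds. This is why the three metrics are reported together.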

All data analyses were performed in R Project for Statistical Computing (R software, Version 4.4.1; R Core Team, 2024).

Results

Demographic characteristics

A total of 1,693 subjects were included in this study (see the flow chart in Fig. 1). As shown in Table 1, there were 100 fatalities among the included cases, giving a mortality rate of 5.9% for COVID-19 inpatients with pneumonia manifestations. The mean age of the participants was 59.1 ± 22.0 years, and just over half (53.0%) were aged 60 years or older. Males outnumbered females (58.8% vs. 41.2%), and the majority of subjects presented with non-severe pneumonia (89.0%).

Table 1:

Demographic characteristics of the subjects.

	Total (N = 1693)	G Hospital (N = 845)	S Hospital (N = 531)	D Hospital (N = 317)	P*
Age (year)	59.1 ± 22.0	57.1 ± 21.7	57.7 ± 24.1	66.9 ± 16.8	<0.001
Age (year)					<0.001
<60	796 (47.0%)	447 (52.9%)	254 (47.8%)	95 (30.0%)
≥60	897 (53.0%)	398 (47.1%)	277 (52.2%)	222 (70.0%)
Sex					0.995
Female	698 (41.2%)	349 (41.3%)	218 (41.1%)	131 (41.3%)
Male	995 (58.8%)	496 (58.7%)	313 (58.9%)	186 (58.7%)
Severe pneumonia					<0.001
No	1506 (89.0%)	790 (93.5%)	464 (87.4%)	252 (79.5%)
Yes	187 (11.0%)	55 (6.5%)	67 (12.6%)	65 (20.5%)
Status					0.452
Alive	1593 (94.1%)	801 (94.8%)	497 (93.6%)	295 (93.1%)
Dead	100 (5.9%)	44 (5.2%)	34 (6.4%)	22 (6.9%)

DOI: 10.7717/peerj.20795/table-1

Notes:

*t-test or Chi-square test.

G Hospital: Guangzhou Chest Hospital
S Hospital: Shenzhen Longhua District Central Hospital
D Hospital: Dongguan Binhaiwan Central Hospital
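The between-hospital comparison of survival status in Table 1 can be reproduced from the printed counts. The Python sketch below runs a Pearson Chi-square test on the 3 × 2 table of alive and dead counts by hospital. It is an illustrative check rather than the authors' code. The closed-form P value used here is valid only because this table has exactly 2 degrees of freedom.

```python
import math

# Observed counts from Table 1 (Status by hospital): (alive, dead)
observed = [
    (801, 44),  # G Hospital
    (497, 34),  # S Hospital
    (295, 22),  # D Hospital
]

def chi_square_statistic(rows):
    """Pearson Chi-square statistic for an r x c contingency table."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand_total = sum(row_totals)
    stat = 0.0
    for r, row in enumerate(rows):
        for c, obs in enumerate(row):
            expected = row_totals[r] * col_totals[c] / grand_total
            stat += (obs - expected) ** 2 / expected
    return stat

chi2 = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# With exactly 2 degrees of freedom, the Chi-square upper-tail probability is exp(-x / 2).
p_value = math.exp(-chi2 / 2)
print(f"chi2 = {chi2:.3f}, df = {df}, P = {p_value:.3f}")
```

The resulting P value rounds to the 0.452 shown in the Status row, indicating no significant difference in mortality across the three hospitals.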

Symptoms and comorbidities of COVID-19 such as dyspnea and severe pneumonia are associated with mortality risk

We analyzed the risk factors associated with mortality in COVID-19 inpatients with pneumonia manifestations after adjusting for age and sex, exploring the relationship between symptoms, comorbidities, and blood test indicators and patients’ mortality risk. As detailed in Tables S1–S4, we identified 37 factors significantly associated with increased mortality risk in COVID-19 inpatients (P < 0.05). These include age ≥60 years (adjusted OR = 7.64, 95% CI [4.23–15.26]), anhelation (adjusted OR = 2.03, 95% CI [1.33–3.11]), dyspnea (adjusted OR = 3.70, 95% CI [2.21–6.05]), severe pneumonia (adjusted OR = 20.04, 95% CI [12.45–32.96]), respiratory failure (adjusted OR = 11.60, 95% CI [7.36–18.49]), coagulation dysfunction (adjusted OR = 5.62, 95% CI [1.87–15.37]), platelet abnormalities (PLT, adjusted OR = 1.87, 95% CI [1.21–2.87]), D-dimer abnormalities (adjusted OR = 7.72, 95% CI [4.62–13.52]), abnormal procalcitonin levels (PCT, adjusted OR = 4.75, 95% CI [2.96–7.88]), and elevated lactate dehydrogenase levels (LDH, adjusted OR = 4.65, 95% CI [2.96–7.49]).

Construction and validation of mortality risk prediction model

Using the 37 factors identified above as influencing the risk of mortality in COVID-19 inpatients with pneumonia manifestations, we employed three machine learning (ML) algorithms (RF, GBM, and EN) to construct a predictive model for mortality risk. The parameter values for these models are detailed in Table S5. As shown in Fig. 2 and Table 2, the AUC, sensitivity, specificity, and corresponding 95% confidence intervals (CIs) on the test set were as follows: RF: AUC = 0.907 (0.849–0.957), sensitivity = 0.875 (0.773–0.978), specificity = 0.870 (0.844–0.896); GBM: AUC = 0.898 (0.840–0.956), sensitivity = 0.850 (0.739–0.961), specificity = 0.856 (0.828–0.883); and EN: AUC = 0.896 (0.825–0.967), sensitivity = 0.725 (0.587–0.863), specificity = 0.911 (0.888–0.933).

The larger the area under the curve (AUC), sensitivity, and specificity, the better the predictive performance of the model. Based on a comprehensive assessment of these three metrics, the RF model demonstrated superior performance, with the highest AUC value and relatively high sensitivity and specificity. Consequently, it was selected as the primary mortality risk prediction model for this study. The key features contributing to this model are depicted in Fig. 3, which shows that characteristics such as severe pneumonia were of great importance to the mortality risk in COVID-19 inpatients.

Figure 2: (A–C) ROC curves of three machine learning mortality risk prediction models on the test set.

DOI: 10.7717/peerj.20795/fig-2

Table 2:

Performance of the machine learning mortality risk prediction models on the test set.

Model	Sensitivity (95% CI)	Specificity (95% CI)	AUC (95% CI)
RF	0.875 (0.773–0.978)	0.870 (0.844–0.896)	0.907 (0.849–0.957)
GBM	0.850 (0.739–0.961)	0.856 (0.828–0.883)	0.898 (0.840–0.956)
EN	0.725 (0.587–0.863)	0.911 (0.888–0.933)	0.896 (0.825–0.967)

DOI: 10.7717/peerj.20795/table-2

Notes:

RF: Random Forest
GBM: Gradient Boosting Machine
EN: Elastic Net
AUC: Area under the curve
95% CI: 95% confidence interval
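The text does not state how the sensitivity and specificity intervals in Table 2 were obtained. The Python sketch below is a consistency check under explicit assumptions, not a description of the authors' method. It assumes a normal-approximation (Wald) interval and a test set containing 40 deaths and 637 survivors, which is roughly 40% of each outcome class.

```python
import math

def wald_ci(p, n, z=1.96):
    """Normal-approximation 95% confidence interval for a proportion p observed in n cases."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

N_DEAD_TEST = 40    # assumption: 40% of the 100 deaths
N_ALIVE_TEST = 637  # assumption: about 40% of the 1,593 survivors

# (point estimate, reported lower bound, reported upper bound) from Table 2
reported = {
    "RF":  {"sens": (0.875, 0.773, 0.978), "spec": (0.870, 0.844, 0.896)},
    "GBM": {"sens": (0.850, 0.739, 0.961), "spec": (0.856, 0.828, 0.883)},
    "EN":  {"sens": (0.725, 0.587, 0.863), "spec": (0.911, 0.888, 0.933)},
}

max_gap = 0.0
for model, metrics in reported.items():
    for name, (p, lo, hi) in metrics.items():
        n = N_DEAD_TEST if name == "sens" else N_ALIVE_TEST
        calc_lo, calc_hi = wald_ci(p, n)
        max_gap = max(max_gap, abs(calc_lo - lo), abs(calc_hi - hi))
        print(f"{model} {name}: {calc_lo:.3f} to {calc_hi:.3f} (reported {lo:.3f} to {hi:.3f})")
```

Under these assumptions the recomputed bounds agree with the reported ones to within about 0.001. The check also shows why the sensitivity intervals are wide: they rest on only about 40 deaths, which is the stability concern raised in the limitations.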

Figure 3: Top 15 variables by importance in the random forest model.

DOI: 10.7717/peerj.20795/fig-3

Web-based model visualization

To enhance the practical application of the model, we developed a web-based visual interaction tool for the mortality risk prediction model using the Shiny package in R software, based on the key features illustrated in Fig. 3. As depicted in Fig. 4, users can access and utilize this tool for risk prediction via the website (https://outch-lee.shinyapps.io/Lung_Health/), by following the corresponding operational prompts.

Figure 4: Screenshot of the web-based mortality risk model.

DOI: 10.7717/peerj.20795/fig-4

Discussion

In this study, we used ML methods to analyze real-world data from three centers, reported a mortality rate of 5.9% among COVID-19 inpatients with pneumonia manifestations, identified 37 significant factors related to the mortality risk of these patients, and also built and validated a predictive model for mortality risk with superior performance.

Our study reported a mortality rate of 5.9% among COVID-19 inpatients with pneumonia manifestations, which is comparable to the 5% mortality rate documented by Li et al. (2020a). The mortality rate of these patients is influenced by various factors, including domestic socioeconomic status, development levels, income, and health care conditions (Abou Ghayda et al., 2022). Current literature indicates that the case mortality rates of COVID-19 inpatients range from 3.67% to 10% (Baptista et al., 2023; Li et al., 2020a; Verity et al., 2020; Onder, Rezza & Brusaferro, 2020; Chowdhury, Rathod & Gernsheimer, 2020), suggesting that our findings are within a moderate range. According to data from the WHO website (https://data.who.int/dashboards/covid19/cases), over 760 million cases of COVID-19 have been recorded globally since December 2019, and approximately 10–20% of these cases transition into a prolonged state in the post-emergency phase. This indicates that even in the late post-pandemic period, we still need to focus on the health risks of patients with COVID-19.

In addition to compromising the pulmonary function of patients, leading to symptoms such as cough and potentially respiratory failure, the novel coronavirus may also inflict damage on other organs and tissues, including the heart and kidneys, thereby manifesting as a systemic disease (Docherty et al., 2020; Giacca & Shah, 2022; Qi & Yu, 2020). Although the virulence of the virus has diminished with ongoing mutations—resulting in a reduced incidence of multiple organ failure and mortality—there remain instances where severe progressive pneumonia can occur. This study identifies severe pneumonia as a significant risk factor for mortality among COVID-19 inpatients. This is likely attributable to its association with various comorbidities that exacerbate the effects of pulmonary infection on overall health (Guan et al., 2020). Furthermore, patients exhibit impaired adaptive immune responses alongside abnormal elevations in inflammatory cells and cytokine storms, which may contribute to both local and systemic organ damage (Li et al., 2020b). Research indicates that abnormalities in blood markers such as lymphocytes, platelets, D-dimer, creatine kinase, and procalcitonin are closely correlated with COVID-19 severity and mortality risk (Huang et al., 2020; Gao et al., 2021; Mao et al., 2020). Our study corroborates these findings through multivariate variable screening analysis, and it suggests that excessive inflammation induced by COVID-19 significantly impacts patient prognosis. Additionally, consideration should be given to whether hospitalized individuals who have previously contracted COVID-19 will experience prolonged effects from this inflammatory response during the sequelae of COVID-19.

The medium- and long-term consequences of COVID-19 are imposing a substantial burden on the quality of life for survivors, while also generating significant health and economic challenges globally (Koc et al., 2022; Davis et al., 2023). Nevertheless, there seems to be a notable lack of awareness regarding the health impacts of COVID-19, particularly in low- and middle-income countries (The Lancet, 2023). Generally, the symptoms of sequelae associated with COVID-19 may include fatigue, myalgia, palpitations, cognitive impairment, dyspnea, anxiety, chest pain, and others. Additionally, factors such as age, comorbidities, blood coagulation abnormalities, and diabetes can predispose individuals to the development of sequelae (Tenforde et al., 2020; Ballering et al., 2022; Raveendran, Jayadevan & Sashidharan, 2021; Pretorius et al., 2022; Turner et al., 2023). Moreover, our study indicates that respiratory complications and coagulopathy significantly elevate the mortality risk of COVID-19 inpatients. This underscores the necessity for vigilance concerning this population’s health status in the post-pandemic era. It is imperative to closely monitor potential abnormalities in respiratory function and coagulation parameters while implementing timely interventions to mitigate mortality risks.

In comparison to traditional models such as logistic regression, ML algorithms are adept at managing nonlinear relationships between independent and dependent variables, demonstrating superior performance relative to conventional predictive models (Boulesteix & Schmid, 2014). Furthermore, these algorithms excel in processing high-dimensional datasets and can effectively operate even when the number of features exceeds that of samples, whereas this scenario may induce overfitting issues in logistic regression (Deo & Nallamothu, 2016). This study evaluates three distinct ML models: RF, GBM, and EN. The findings indicate that the RF model outperformed the other models in predicting patient mortality risk, particularly regarding sensitivity. Additionally, compared with the mortality risk prediction models developed in existing studies—specifically, the logistic regression model constructed by Murri et al. (2021) using blood test indicators of COVID-19 inpatients from a single hospital (AUC = 0.87, sensitivity = 0.840, specificity = 0.774), and the model built by Chen et al. (2021) via the random forest algorithm using blood test and other indicators of 6,415 patients (AUC = 0.90, sensitivity = 0.872, specificity = 0.800)—our study not only focused on blood test indicators but also prioritized the assessment of patients’ comorbidities. This inclusion of comorbidities better reflects patients’ risk of disease progression, and the resulting model demonstrates superior predictive accuracy, along with higher sensitivity and specificity (AUC = 0.907, sensitivity = 0.875, specificity = 0.870).

Nevertheless, this study is subject to certain limitations due to objective constraints. Firstly, the participants in this research are exclusively from the Guangdong region of China, which may not adequately reflect the circumstances of other regions or diverse populations. Secondly, a limited number of fatalities were recorded in this study, potentially impacting the stability of the evaluation metrics. Thirdly, due to resource limitations, important predictive factors or confounding factors, such as COVID-19 vaccination status, prior SARS-CoV-2 infection history, socioeconomic status, and residential area, were not included in the analysis, which may limit the comprehensiveness of the model and may lead to residual confounding. Finally, owing to the limitation of the number of outcome events, reserving one center for validation or conducting stratified sensitivity analysis would be susceptible to random errors, potentially leading to statistically insignificant results. Therefore, we did not consider conducting sensitivity analysis or reserving one of the hospitals for external validation. However, we have fully verified the stability and generalizability of the model through alternative validation analyses such as feature importance assessment and cross-validation.

Conclusions

In conclusion, through an analysis of multicenter hospitalized cases of COVID-19 with pneumonia manifestations, we found that the mortality rate among these patients was 5.9%. The clinical characteristics influencing mortality in COVID-19 inpatients were identified, and a novel predictive tool was developed and validated based on real-world multicenter data. This tool can accurately assess the mortality risk among COVID-19 inpatients with commendable predictive performance, which holds significant practical implications for monitoring mortality risks in these individuals.

Supplemental Information

Raw data

DOI: 10.7717/peerj.20795/supp-1

Demographic characteristics, pneumonia symptoms, and blood markers of complications are associated with the risk of death from COVID-19

DOI: 10.7717/peerj.20795/supp-2

[1] Abou Ghayda R, Lee KH, Han YJ, Ryu S, Hong SH, Yoon S, Jeong GH, Yang JW, Lee HJ, Lee J, Lee JY, Effenberger M, Eisenhut M, Kronbichler A, Solmi M, Li H, Jacob L, Koyanagi A, Radua J, Park MB, Aghayeva S, Ahmed MLCB, Al Serouri A, Al-Shamsi HO, Amir-Behghadami M, Baatarkhuu O, Bashour H, Bondarenko A, Camacho-Ortiz A, Castro F, Cox H, Davtyan H, Douglas K, Dragioti E, Ebrahim S, Ferioli M, Harapan H, Mallah SI, Ikram A, Inoue S, Jankovic S, Jayarajah U, Jesenak M, Kakodkar P, Kebede Y, Kifle M, Koh D, Males VK, Kotfis K, Lakoh S, Ling L, Llibre-Guerra J, Machida M, Makurumidze R, Mamun MA, Masic I, Van Minh H, Moiseev S, Nadasdy T, Nahshon C, Ñamendys Silva SA, Yongsi BN, Nielsen HB, Nodjikouambaye ZA, Ohnmar O, Oksanen A, Owopetu O, Parperis K, Perez GE, Pongpirul K, Rademaker M, Rosa S, Sah R, Sallam D, Schober P, Singhal T, Tafaj S, Torres I, Torres-Roman JS, Tsartsalis D, Tsolmon J, Tuychiev L, Vukcevic B, Wanghi G, Wollina U, Xu R-H, Yang L, Zaidi Z, Smith L, Shin JI. 2022. The global case fatality rate of coronavirus disease 2019 by continents and national income: a meta-analysis. Journal of Medical Virology 94(6):2402-2413

[2] Ballering AV, Van Zon SKR, Olde Hartman TC, Rosmalen JGM. 2022. Persistence of somatic symptoms after COVID-19 in the Netherlands: an observational cohort study. Lancet 400(10350):452-461

[3] Baptista A, Vieira AM, Capela E, Julião P, Macedo A. 2023. COVID-19 fatality rates in hospitalized patients: a new systematic review and meta-analysis. Journal of Infection and Public Health 16(10):1606-1612

[4] Boulesteix A-L, Schmid M. 2014. Machine learning versus statistical modeling. Biometrical Journal 56(4):588-593

[5] Briggs A, Vassall A. 2021. Count the cost of disability caused by COVID-19. Nature 593(7860):502-505

[6] Chan JFW, Zhang A, Yuan SF, Poon VKM, Chan CCS, Lee ACY, Chan WM, Fan ZM, Tsoi HW, Wen L, Liang RH, Cao JL, Chen YX, Tang KM, Luo CT, Cai JP, Kok KH, Chu H, Chan KH, Sridhar S, Chen ZW, Chen HL, To KKW, Yuen KY. 2020. Simulation of the clinical and pathological manifestations of coronavirus disease 2019 (COVID-19) in a golden syrian hamster model: implications for disease pathogenesis and transmissibility. Clinical Infectious Diseases 71(9):2428-2446

[7] Chen Z, Chen J, Zhou J, Lei F, Zhou F, Qin J-J, Zhang X-J, Zhu L, Liu Y-M, Wang H, Chen M-M, Zhao Y-C, Xie J, Shen L, Song X, Zhang X, Yang C, Liu W, Zhang X, Guo D, Yan Y, Liu M, Mao W, Liu L, Ye P, Xiao B, Luo P, Zhang Z, Lu Z, Wang J, Lu H, Xia X, Wang D, Liao X, Peng G, Liang L, Yang J, Chen G, Azzolini E, Aghemo A, Ciccarelli M, Condorelli G, Stefanini GG, Wei X, Zhang B-H, Huang X, Xia J, Yuan Y, She Z-G, Guo J, Wang Y, Zhang P, Li H. 2021. A risk score based on baseline risk factors for predicting mortality in COVID-19 patients. Current Medical Research and Opinion 37(6):917-927

[8] Chen N, Zhou M, Dong X, Qu J, Gong F, Han Y, Qiu Y, Wang J, Liu Y, Wei Y, Xia Ja, Yu T, Zhang X, Zhang L. 2020. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study. Lancet 395(10223):507-513

[9] Cheng VCC, Wong SC, Chan VWM, So SYC, Chen JHK, Yip CCY, Chan KH, Chu H, Chung TWH, Sridhar S, To KKW, Chan JFW, Hung IFN, Ho PL, Yuen KY. 2020. Air and environmental sampling for SARS-CoV-2 around hospitalized patients with coronavirus disease 2019 (COVID-19). Infection Control and Hospital Epidemiology 41(11):1258-1265

[10] Chowdhury MS, Rathod J, Gernsheimer J. 2020. A rapid systematic review of clinical trials utilizing chloroquine and hydroxychloroquine as a treatment for COVID-19. Academic Emergency Medicine 27(6):493-504

[11] Davis HE, McCorkell L, Vogel JM, Topol EJ. 2023. Long COVID: major findings, mechanisms and recommendations. Nature Reviews. Microbiology 21(3):133-146

[12] Deng SQ, Peng HJ. 2020. Characteristics of and public health responses to the coronavirus disease 2019 outbreak in China. Journal of Clinical Medicine 9(2):575

[13] Deo RC, Nallamothu BK. 2016. Learning about machine learning: the promise and pitfalls of big data and the electronic health record. Circulation Cardiovascular Quality and Outcomes 9(6):618-620

[14] Docherty AB, Harrison EM, Green CA, Hardwick HE, Pius R, Norman L, Holden KA, Read JM, Dondelinger F, Carson G, Merson L, Lee J, Plotkin D, Sigfrid L, Halpin S, Jackson C, Gamble C, Horby PW, Nguyen-Van-Tam JS, Ho A, Russell CD, Dunning J, Openshaw PJ, Baillie JK, Semple MG, ISARIC4C investigators. 2020. Features of 20 133 UK patients in hospital with covid-19 using the ISARIC WHO clinical characterisation protocol: prospective observational cohort study. BMJ 369:m1985

[15] Gao Y-D, Ding M, Dong X, Zhang J-J, Kursat Azkur A, Azkur D, Gan H, Sun Y-L, Fu W, Li W, Liang H-L, Cao Y-Y, Yan Q, Cao C, Gao H-Y, Brüggen M-C, Van de Veen W, Sokolowska M, Akdis M, Akdis CA. 2021. Risk factors for severe and critically ill COVID-19 patients: a review. Allergy 76(2):428-455

[16] George PM, Barratt SL, Condliffe R, Desai SR, Devaraj A, Forrest I, Gibbons MA, Hart N, Jenkins RG, McAuley DF, Patel BV, Thwaite E, Spencer LG. 2020. Respiratory follow-up of patients with COVID-19 pneumonia. Thorax 75(11):1009-1016

[17] Giacca M, Shah AM. 2022. The pathological maelstrom of COVID-19 and cardiovascular disease. Nature Cardiovascular Research 1(3):200-210

[18] Grasselli G, Zangrillo A, Zanella A, Antonelli M, Cabrini L, Castelli A, Cereda D, Coluccello A, Foti G, Fumagalli R, Iotti G, Latronico N, Lorini L, Merler S, Natalini G, Piatti A, Ranieri MV, Scandroglio AM, Storti E, Cecconi M, Pesenti A. 2020. Baseline characteristics and outcomes of 1,591 patients infected with SARS-CoV-2 admitted to ICUs of the Lombardy Region, Italy. Journal of the American Medical Association 323(16):1574-1581

[19] Guan W-J, Ni Z-Y, Hu Y, Liang W-H, Ou C-Q, He J-X, Liu L, Shan H, Lei C-L, Hui DSC, Du B, Li L-J, Zeng G, Yuen K-Y, Chen R-C, Tang C-L, Wang T, Chen P-Y, Xiang J, Li S-Y, Wang J-L, Liang Z-J, Peng Y-X, Wei L, Liu Y, Hu Y-H, Peng P, Wang J-M, Liu J-Y, Chen Z, Li G, Zheng Z-J, Qiu S-Q, Luo J, Ye C-J, Zhu S-Y, Zhong N-S. 2020. Clinical characteristics of coronavirus disease 2019 in China. New England Journal of Medicine 382(18):1708-1720

[20] Hao M, Wang Y, Bryant SH. 2014. An efficient algorithm coupled with synthetic minority over-sampling technique to classify imbalanced PubChem BioAssay data. Analytica Chimica Acta 806:117-127

[21] Hong X, Chen S, Harris CJ. 2013. Elastic-net prefiltering for two-class classification. IEEE Transactions on Cybernetics 43(1):286-295

[22] Hu J, Szymczak S. 2023. A review on longitudinal data analysis with random forest. Briefings in Bioinformatics 24(2):bbad002

[23] Huang C, Wang Y, Li X, Ren L, Zhao J, Hu Y, Zhang L, Fan G, Xu J, Gu X, Cheng Z, Yu T, Xia J, Wei Y, Wu W, Xie X, Yin W, Li H, Liu M, Xiao Y, Gao H, Guo L, Xie J, Wang G, Jiang R, Gao Z, Jin Q, Wang J, Cao B. 2020. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 395(10223):497-506

[24] Jartti L, Langen H, Söderlund-Venermo M, Vuorinen T, Ruuskanen O, Jartti T. 2011. New respiratory viruses and the elderly. The Open Respiratory Medicine Journal 5:61-69

[25] Koc HC, Xiao J, Liu W, Li Y, Chen G. 2022. Long COVID and its management. International Journal of Biological Sciences 18(12):4768-4780

[26] Lednicky JA, Lauzardo M, Fan ZH, Jutla A, Tilly TB, Gangwar M, Usmani M, Shankar SN, Mohamed K, Eiguren-Fernandez A, Stephenson CJ, Alam MM, Elbadry MA, Loeb JC, Subramaniam K, Waltzek TB, Cherabuddi K, Morris JG, Wu CY. 2020. Viable SARS-CoV-2 in the air of a hospital room with COVID-19 patients. International Journal of Infectious Diseases 100:476-482

[27] Li L-Q, Huang T, Wang Y-Q, Wang Z-P, Liang Y, Huang T-B, Zhang H-Y, Sun W, Wang Y. 2020a. COVID-19 patients’ clinical characteristics, discharge rate, and fatality rate of meta-analysis. Journal of Medical Virology 92(6):577-583

[28] Li S, Jiang L, Li X, Lin F, Wang Y, Li B, Jiang T, An W, Liu S, Liu H, Xu P, Zhao L, Zhang L, Mu J, Wang H, Kang J, Li Y, Huang L, Zhu C, Zhao S, Lu J, Ji J, Zhao J. 2020b. Clinical and pathological investigation of patients with severe COVID-19. JCI Insight 5(12):e138070

[29] Li Z, Zhang W, Huang J, Lu L, Xie D, Zhang J, Liang J, Sui Y, Liu L, Zou J, Lin A, Yang L, Qiu F, Hu Z, Wu M, Deng Y, Zhang X, Lu J. 2025. Machine learning and discriminant analysis model for predicting benign and malignant pulmonary nodules. BMC Medical Informatics and Decision Making 25(1):272

[30] López-Blanco JR, Chacón P. 2016. New generation of elastic network models. Current Opinion in Structural Biology 37:46-53

[31] Mao R, Qiu Y, He J-S, Tan J-Y, Li X-H, Liang J, Shen J, Zhu L-R, Chen Y, Iacucci M, Ng SC, Ghosh S, Chen M-H. 2020. Manifestations and prognosis of gastrointestinal and liver involvement in patients with COVID-19: a systematic review and meta-analysis. The Lancet Gastroenterology & Hepatology 5(7):667-678

[32] Mehta P, McAuley DF, Brown M, Sanchez E, Tattersall RS, Manson JJ. 2020. COVID-19: consider cytokine storm syndromes and immunosuppression. Lancet 395(10229):1033-1034

[33] Murri R, Lenkowicz J, Masciocchi C, Iacomini C, Fantoni M, Damiani A, Marchetti A, Sergi PDA, Arcuri G, Cesario A, Patarnello S, Antonelli M, Bellantone R, Bernabei R, Boccia S, Calabresi P, Cambieri A, Cauda R, Colosimo C, Crea F, De Maria R, De Stefano V, Franceschi F, Gasbarrini A, Parolini O, Richeldi L, Sanguinetti M, Urbani A, Zega M, Scambia G, Valentini V. 2021. A machine-learning parsimonious multivariable predictive model of mortality risk in patients with Covid-19. Scientific Reports 11(1):21136

[34] Natekin A, Knoll A. 2013. Gradient boosting machines, and a tutorial. Frontiers in Neurorobotics 7:21

[35] Onder G, Rezza G, Brusaferro S. 2020. Case-fatality rate and characteristics of patients dying in relation to COVID-19 in Italy. Journal of the American Medical Association 323(18):1775-1776

[36] Ong SWX, Tan YK, Chia PY, Lee TH, Ng OT, Wong MSY, Marimuthu K. 2020. Air, surface environmental, and personal protective equipment contamination by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) from a symptomatic patient. Journal of the American Medical Association 323(16):1610-1612

[37] Pretorius E, Venter C, Laubscher GJ, Kotze MJ, Oladejo SO, Watson LR, Rajaratnam K, Watson BW, Kell DB. 2022. Prevalence of symptoms, comorbidities, fibrin amyloid microclots and platelet pathology in individuals with long COVID/post-acute sequelae of COVID-19 (PASC). Cardiovascular Diabetology 21(1):148

[38] Qi Z, Yu Y. 2020. Epidemiological features of the 2019 novel coronavirus outbreak in China. Current Topics in Medicinal Chemistry 20(13):1137-1140

[39] R Core Team. 2024. R: a language and environment for statistical computing. Version 4.4.1. Vienna: R Foundation for Statistical Computing.

[40] Raveendran AV, Jayadevan R, Sashidharan S. 2021. Long COVID: an overview. Diabetes & Metabolic Syndrome: Clinical Research & Reviews 15(3):869-875

[41] Riddell S, Goldie S, Hill A, Eagles D, Drew TW. 2020. The effect of temperature on persistence of SARS-CoV-2 on common surfaces. Virology Journal 17(1):145

[42] Sorci G, Faivre B, Morand S. 2020. Explaining among-country variation in COVID-19 case fatality rate. Scientific Reports 10(1):18909

[43] Su Y, Yuan D, Chen DG, Ng RH, Wang K, Choi J, Li S, Hong S, Zhang R, Xie J, Kornilov SA, Scherler K, Pavlovitch-Bedzyk AJ, Dong S, Lausted C, Lee I, Fallen S, Dai CL, Baloni P, Smith B, Duvvuri VR, Anderson KG, Li J, Yang F, Duncombe CJ, McCulloch DJ, Rostomily C, Troisch P, Zhou J, Mackay S, De Gottardi Q, May DH, Taniguchi R, Gittelman RM, Klinger M, Snyder TM, Roper R, Wojciechowska G, Murray K, Edmark R, Evans S, Jones L, Zhou Y, Rowen L, Liu R, Chour W, Algren HA, Berrington WR, Wallick JA, Cochran RA, Micikas ME, Wrin T, Petropoulos CJ, Cole HR, Fischer TD, Wei W, Hoon DSB, Price ND, Subramanian N, Hill JA, Hadlock J, Magis AT, Ribas A, Lanier LL, Boyd SD, Bluestone JA, Chu H, Hood L, Gottardo R, Greenberg PD, Davis MM, Goldman JD, Heath JR. 2022. Multiple early factors anticipate post-acute COVID-19 sequelae. Cell 185(5):881-895.e20

[44] Tenforde MW, Kim SS, Lindsell CJ, Billig Rose E, Shapiro NI, Files DC, Gibbs KW, Erickson HL, Steingrub JS, Smithline HA, Gong MN, Aboodi MS, Exline MC, Henning DJ, Wilson JG, Khan A, Qadir N, Brown SM, Peltan ID, Rice TW, Hager DN, Ginde AA, Stubblefield WB, Patel MM, Self WH, Feldstein LR. 2020. Symptom duration and risk factors for delayed return to usual health among outpatients with COVID-19 in a multistate health care systems network - United States, March-June 2020. MMWR. Morbidity and Mortality Weekly Report 69(30):993-998

[45] The Lancet. 2023. Long COVID: 3 years in. Lancet 401(10379):795

[46] Toubiana J, Poirault C, Corsia A, Bajolle F, Fourgeaud J, Angoulvant F, Debray A, Basmaci R, Salvador E, Biscardi S, Frange P, Chalumeau M, Casanova JL, Cohen JF, Allali S. 2020. Kawasaki-like multisystem inflammatory syndrome in children during the COVID-19 pandemic in Paris, France: prospective observational study. BMJ 369:m2094

[47] Turner S, Khan MA, Putrino D, Woodcock A, Kell DB, Pretorius E. 2023. Long COVID: pathophysiological factors and abnormalities of coagulation. Trends in Endocrinology & Metabolism 34(6):321-344

[48] Verity R, Okell LC, Dorigatti I, Winskill P, Whittaker C, Imai N, Cuomo-Dannenburg G, Thompson H, Walker PGT, Fu H, Dighe A, Griffin JT, Baguelin M, Bhatia S, Boonyasiri A, Cori A, Cucunubá Z, FitzJohn R, Gaythorpe K, Green W, Hamlet A, Hinsley W, Laydon D, Nedjati-Gilani G, Riley S, Van Elsland S, Volz E, Wang H, Wang Y, Xi X, Donnelly CA, Ghani AC, Ferguson NM. 2020. Estimates of the severity of coronavirus disease 2019: a model-based analysis. The Lancet Infectious Diseases 20(6):669-677

[49] Wang C, Horby PW, Hayden FG, Gao GF. 2020. A novel coronavirus outbreak of global health concern. Lancet 395(10223):470-473

[50] World Health Organization (WHO). 2024. COVID-19 epidemiological update 2024.