Integrating cerebrovascular morphology and radiomics features for predicting stroke prognosis: a retrospective study
- Published
- Accepted
- Received
- Academic Editor
- Luigi Di Biasi
- Subject Areas
- Cardiology, Neurology, Radiology and Medical Imaging, Computational Science, Data Mining and Machine Learning
- Keywords
- Stroke, Vascular structure, Radiomics, Machine learning, 90-day mRS prediction
- Copyright
- © 2026 Pu et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
- Cite this article
- 2026. Integrating cerebrovascular morphology and radiomics features for predicting stroke prognosis: a retrospective study. PeerJ 14:e20588 https://doi.org/10.7717/peerj.20588
Abstract
Accurately predicting 90-day Modified Rankin Scale (mRS) scores for acute ischemic stroke (AIS) patients is crucial for guiding treatment strategies. However, many existing mRS prediction methods rely on clinicians to manually evaluate relevant features, and both the accuracy of feature quantification and the reproducibility of the models need further improvement. This study proposes a machine learning framework that combines multimodal imaging features to predict 90-day mRS outcomes. A retrospective analysis was conducted on 86 AIS cases. Morphological features of the intracranial arterial and venous system were extracted from computed tomography angiography (CTA) images. Additionally, radiomics features were obtained from the ischemic lesion on diffusion-weighted imaging (DWI). Recognizing the significance of the peri-infarct penumbra in stroke prognosis, radiomics features were also extracted from the annular region surrounding the ischemic lesion. Redundant features were eliminated using a sparse representation method, and a sparse representation-based classifier was developed to predict mRS outcomes. Model performance was validated using cross-validation and an independent test set. A total of 1,066 features, including 40 vascular morphological features and 1,026 radiomics features, were extracted. Both feature types included features that differed significantly between outcome groups (P < 0.05). Ultimately, 26 features were selected to construct the classification model. The proposed model achieved robust performance on the independent test set, with a classification accuracy of 0.828, an area under the curve (AUC) of 0.942, sensitivity of 0.789, specificity of 0.900, positive predictive value of 0.937, and negative predictive value of 0.692.
By integrating vascular morphological features with radiomics features from the ischemic lesion and peri-ischemic lesion regions in DWI, the proposed machine learning model provides accurate predictions of 90-day clinical outcomes for AIS, offering valuable insights for personalized stroke management.
Introduction
Stroke is one of the leading causes of morbidity and mortality worldwide, resulting in significant long-term consequences for both patients and healthcare systems (Mistry et al., 2021). The Modified Rankin Scale (mRS) is widely used to assess the functional outcome of stroke patients, and predicting the 90-day mRS score is critical for guiding clinical decision-making and rehabilitation strategies (Xu et al., 2021). Accurate prediction of post-stroke functional outcomes can help identify patients at high risk of poor recovery, allowing for targeted interventions and resource allocation (Huo et al., 2023). Moreover, this prediction has the potential to enhance prognostic models, aid in the evaluation of new therapies, and improve patient counseling and planning. Recent advances in neuroimaging and machine learning techniques have shown promising results in improving the accuracy of 90-day mRS outcome predictions. Imaging biomarkers, such as lesion volume, location, and the presence of collateral circulation, have been identified as critical predictors of functional recovery. However, challenges remain in optimizing these models for clinical use, and a better understanding of the underlying factors contributing to functional outcomes is essential for improving stroke rehabilitation and personalized treatment strategies (Xu et al., 2021; Mistry et al., 2021; Huo et al., 2023).
Collateral circulation, an alternative blood flow pathway that sustains and protects brain tissue post-stroke, plays a pivotal role in determining stroke prognosis (Wang et al., 2013; Chen et al., 2023). The structural morphology of cerebral blood vessels plays a crucial role in collateral circulation, and numerous studies have investigated the predictive value of vascular morphology in stroke outcomes. For example, Lei et al. (2024) explored the association between enlarged perivascular spaces (PVS) and outcomes of intravenous thrombolysis in acute ischemic stroke (AIS) patients, while Zhang et al. (2017) examined the relationship between the total magnetic resonance imaging (MRI) burden of cerebral small vessel disease (SVD) and post-stroke depression in patients with acute lacunar stroke. Van Der Hoeven et al. (2016) demonstrated that superior collateral circulation was correlated with better 90-day mRS outcomes in patients undergoing endovascular therapy, underscoring its importance in prognosis. Despite these achievements, limitations such as manual assessment bias and limited quantitative analysis continue to restrict clinical translation. Many existing methods rely on manual vascular assessments performed by radiologists, which introduces subjectivity and variability due to differences in expertise and experience, thereby reducing reproducibility. Furthermore, these studies often lack a comprehensive and quantitative vascular assessment, which restricts their clinical applicability and predictive performance (Weng et al., 2023).
Simultaneously, the ischemic lesion on diffusion-weighted imaging (DWI) has also emerged as a key focus for stroke prognosis research. Li et al. (2025) extracted 851 features of the ischemic lesion to predict 90-day mRS outcomes, while Wei et al. (2024) analyzed imaging features from DWI and apparent diffusion coefficient (ADC) maps to identify AIS patients at high risk of poor recovery. Wang et al. (2022) utilized DWI-based radiomics to predict 1-year ischemic stroke recurrence. However, these methods often focus exclusively on the ischemic lesion and neglect the surrounding penumbra, which is a critical target for therapeutic intervention (Vagal et al., 2018). Although both the ischemic lesion and penumbra are closely linked to stroke prognosis (Tang et al., 2020), their characterization fundamentally depends on the global morphology of the intracranial arterial and venous system. A combined assessment that integrates these dimensions may provide a more comprehensive understanding of stroke pathology and improve outcome predictions.
In this study, we propose a novel machine learning framework that integrates whole-brain vascular morphological features with radiomics features of both the ischemic lesion and peri-infarct region to predict 90-day mRS outcomes. Specifically, our objectives are: (1) to systematically quantify cerebral vascular morphology, (2) to extract radiomics features from the ischemic lesion and peri-ischemic lesion regions, and (3) to construct and validate a sparse representation-based multimodal predictive model.
Materials & Methods
Materials
A total of 207 patients with acute cerebral infarction hospitalized in the Department of Neurology of Xuhui District Dahua Hospital and Minhang Hospital, Fudan University between April 2022 and April 2024 were initially considered for this study. All cases met the diagnostic criteria for acute cerebral infarction, were diagnosed using imaging such as cranial computed tomography (CT) or cranial MRI together with neurological physical examination, and had National Institutes of Health Stroke Scale (NIHSS) scores ranging from 0 to 25. Inclusion criteria were: (1) diagnosis of AIS; (2) pre-treatment computed tomography angiography (CTA) imaging; (3) DWI performed within 24 h of admission; and (4) complete clinical data. Exclusion criteria were hemorrhagic stroke, brain tumors, poor image quality, or missing data. Thrombolysis or thrombectomy status (TICI score) was not used as an inclusion criterion, as our aim was to construct a model applicable across the entire AIS population. Figure 1 provides the patient selection flowchart. Eighty-six cases were ultimately included in our experiment. These data were randomly divided into a cross-validation set and an independent testing set at a ratio of 2:1. Based on existing studies of 90-day mRS prediction (Seker et al., 2020), we defined a 90-day mRS score of less than 3 as a good prognosis and a score of 3 or higher as a poor prognosis, and then built two-class classification models to predict these outcomes.
In this retrospective study, all patients underwent non-contrast CT and CTA at admission for emergency diagnosis and treatment decision-making in accordance with standard AIS protocols. MRI examinations including DWI and ADC sequences were performed within two days after hospitalization for post-treatment evaluation and radiomic analysis (Li et al., 2025). No CT perfusion (CTP) imaging was used in this study.
Figure 1: The patient selection flowchart.
The CTA images were acquired on a Siemens Sensation 64 scanner with the following parameters: tube voltage of 80 kV, tube current of 50 mAs, and slice thickness of 10 mm. The voxel resolution and size of the CTA images are 0.486 mm × 0.486 mm × 1 mm and 512 × 512 × 319, respectively. The DWI images were acquired on a Magnetom Trio 3T GE scanner with the following parameters: TR/TE = 2,800/75.4 ms; FA = 90°; slice thickness/gap = 5/1.5 mm; FOV = 230 mm × 220 mm. The voxel resolution and size of the DWI images are 0.9375 mm × 0.9375 mm × 6 mm and 256 × 256 × 16, respectively.
Methods
The overall framework of our proposed method is shown in Fig. 2. For the input DWI images, we extracted radiomic features from both the ischemic lesion and peri-ischemic lesion regions. For the input CTA images, we first automatically segmented the whole-brain vessels and then extracted their morphological features. After the radiomic and vascular features were fused, they were filtered using the sparse representation method, a commonly used feature selection approach. It uses the features of the training samples to linearly represent the class labels and selects the optimal feature combination by sparsifying the representation coefficients. Sparse representation can select a set of features that are highly relevant to the sample class and have low inter-feature correlation, thereby improving classification performance and reducing overfitting (for the specific mathematical formulation of sparse representation feature selection, see Wu et al. (2019b)). Finally, the selected features were passed to the classifier to predict whether the 90-day mRS outcome was good or poor. Each step is described in detail in the following subsections. This retrospective study was approved by the Institutional Ethics Committees of Xuhui District Dahua Hospital (Approval No. 20241008) and Minhang Hospital, Fudan University (Approval No. 2021-Batch-008-01K). Written informed consent was obtained from all subjects (patients) in this study.
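The selection idea described above (represent the class labels linearly from the training features, then sparsify the coefficients) can be illustrated with a minimal sketch. It is not the formulation of Wu et al. (2019b): it assumes a plain L1-regularized least-squares objective solved by iterative soft-thresholding (ISTA), and the function names, the penalty `lam`, and the toy data are all hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal step of the L1 norm)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_feature_selection(X, y, lam=0.2, n_iter=500):
    """Assumed objective: min_w (1/2n)||y - Xw||^2 + lam * ||w||_1.

    X: (n_samples, n_features) training features, y: (n_samples,) labels.
    Features whose coefficient stays non-zero are kept.
    """
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)  # standardize features
    y = y - y.mean()                                     # center labels
    n = X.shape[0]
    step_inv = np.linalg.norm(X, 2) ** 2 / n             # Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = soft_threshold(w - grad / step_inv, lam / step_inv)
    return np.flatnonzero(w), w

# Toy data (not study data): only features 3 and 17 carry label information.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 30))
y = np.where(X[:, 3] - X[:, 17] > 0, 1.0, -1.0)
selected, w = sparse_feature_selection(X, y)
print("selected feature indices:", selected)
```

A larger `lam` keeps fewer features; in practice the penalty would be tuned on the cross-validation set, as the number of retained features was in this study.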
Feature extraction
We defined the DWI lesion as the ischemic lesion. Because CT perfusion data were not available for all patients, the peri-ischemic penumbra was approximated by expanding the ischemic lesion mask on DWI outward by 10 pixels, following previous studies. This approximated penumbra was not used for direct clinical diagnosis, but only as a region of interest for subsequent radiomics analysis. Radiomics transforms imaging data into high-throughput features, enabling the discovery of imaging biomarkers associated with stroke prognosis. Figure 3 illustrates the original DWI image, the manually delineated ischemic lesion, and the calculated peri-ischemic region. The ischemic lesion region was annotated by two senior radiologists: one performed the annotation and the other verified its accuracy. We extracted three types of radiomics features from the ischemic lesion region and the peri-ischemic region. Intensity features: 18 features quantifying the statistical distribution of voxel intensities within the region. Texture features: 39 features evaluating the spatial arrangement of voxel intensities, categorized into four subgroups (gray-level co-occurrence matrix, gray-level run-length matrix, gray-level size zone matrix, and neighborhood gray-tone difference matrix). Wavelet features: the image was decomposed into eight wavelet frequency sub-bands and the intensity and texture features were recomputed on each sub-band, resulting in 456 (i.e., (18+39)×8) wavelet features. In total, 513 (i.e., 18+39+456) features were extracted from each region; the detailed calculations are given in Vallières et al. (2015) and Wu et al. (2019a).
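As a rough illustration of the two steps above, the sketch below builds a peri-lesion ring by growing a binary mask by 10 pixels and then checks the feature-count arithmetic. The text does not specify how the expansion was implemented, so the 2D slice-wise dilation with a 3×3 square neighbourhood is an assumption, and the function names and toy mask are hypothetical.

```python
import numpy as np

def dilate_once(mask):
    """One binary dilation step with a 3x3 square structuring element (assumed)."""
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    h, w = mask.shape
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def peri_lesion_ring(lesion_mask, n_pixels=10):
    """Grow the lesion by n_pixels and keep only the newly added annulus."""
    lesion = lesion_mask.astype(bool)
    grown = lesion.copy()
    for _ in range(n_pixels):
        grown = dilate_once(grown)
    return grown & ~lesion

# Toy 8x4-pixel lesion on a 64x64 slice (not study data).
lesion = np.zeros((64, 64), dtype=bool)
lesion[28:36, 30:34] = True
ring = peri_lesion_ring(lesion, n_pixels=10)

# Feature counts as stated in the text.
n_intensity, n_texture, n_subbands = 18, 39, 8
per_region = n_intensity + n_texture + (n_intensity + n_texture) * n_subbands
n_radiomics = 2 * per_region      # ischemic lesion + peri-ischemic region
n_total = n_radiomics + 40        # plus 40 vascular morphological features
print(per_region, n_radiomics, n_total)
```

The ring never overlaps the lesion itself, so the two regions contribute separate feature sets of 513 features each.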
Figure 2: The workflow of the proposed framework.
Figure 3: Region of interest on the DWI image.
(1) The original DWI image (left), (2) the annotated ischemic lesion (middle), and (3) the delineated peri-ischemic region (right) used for radiomics analysis.
To analyze vascular morphology, a U-Net (Lv et al., 2023) model was employed to automatically segment cerebral vessels from CTA images. The 3D vessel centerline was then extracted to quantify vascular structural information. As shown in the second row of the feature extraction part of Fig. 2, based on the centerline, 10 vascular features were calculated: vessel volume, number of vessel branches, total vessel length, average vessel length, average distance factor (DF) (Ronneberger, Fischer & Brox, 2015), average sum of angles metric (SOAM) (Fu et al., 2020), number of edge nodes, number of network nodes, clustering coefficient, and structure entropy. To enhance detail, 2D vascular features were also extracted. First, the 3D vessels were projected onto the 2D axial plane. Blood vessel enhancement, 2D threshold segmentation, and centerline extraction were then performed to calculate 10 additional features. Stroke-induced asymmetry in the intracranial arterial and venous system was further analyzed by computing morphological differences between the affected and healthy hemispheres (affected side minus healthy side). In total, 40 morphological features of cerebral vascular structures were extracted, including both 3D and 2D features.
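Two of the centerline features named above, the distance factor (DF) and the sum of angles metric (SOAM), can be sketched for a single vessel segment. The text gives no formulas, so the definitions below are assumptions based on commonly used tortuosity measures: DF as path length divided by the straight-line distance between the endpoints, and a simplified SOAM that sums only the turning angles between consecutive segments and divides by path length. The function names and sample points are hypothetical.

```python
import math

def path_length(points):
    """Total length of the polyline through the ordered centerline points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def distance_factor(points):
    """Assumed DF: path length over the endpoint-to-endpoint distance (>= 1)."""
    return path_length(points) / math.dist(points[0], points[-1])

def soam(points):
    """Simplified SOAM (assumed): summed turning angles in radians per unit length."""
    total_angle = 0.0
    for a, b, c in zip(points, points[1:], points[2:]):
        v1 = [q - p for p, q in zip(a, b)]
        v2 = [q - p for p, q in zip(b, c)]
        dot = sum(x * y for x, y in zip(v1, v2))
        cos_t = dot / (math.hypot(*v1) * math.hypot(*v2))
        total_angle += math.acos(max(-1.0, min(1.0, cos_t)))  # clamp for rounding
    return total_angle / path_length(points)

straight = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
bent = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
print(distance_factor(straight), soam(straight))
print(distance_factor(bent), soam(bent))
```

A straight segment gives DF = 1 and SOAM = 0; both values grow as the vessel becomes more tortuous. Averaging over all segments would give the "average DF" and "average SOAM" features, and the affected-minus-healthy difference would be taken per hemisphere.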
All vascular feature extraction steps were fully automated. No manual correction was required. Vascular features were analyzed independently from DWI radiomics, and integration was performed at the feature level rather than through spatial registration.
Feature selection and classification model
The 1,066 extracted features contained redundant information that could increase computational complexity and the risk of overfitting. To address this, a sparse representation-based feature selection method was applied. This approach retains features with the highest discriminative ability by sparsely representing each test sample with training features from different classes. Ultimately, 26 features were selected, as they achieved the best classification performance in cross-validation while maintaining model simplicity and interpretability. A classification model leveraging sparse representation was then built. During classification, the features of each test sample were sparsely represented using the training set features from the different classes, and a residual was calculated for each class. The test sample was assigned to the class with the smallest residual (Wu et al., 2018). The dataset of 86 cases was randomly split into a 10-fold cross-validation set and an independent testing set at a 2:1 ratio.
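The classification rule described above (sparsely code the test sample over the training samples, then pick the class with the smallest reconstruction residual) can be sketched as follows. This is a generic sparse representation classifier, not the exact model of Wu et al. (2018): the L1 solver (ISTA), the penalty `lam`, the unit-norm scaling, and the toy data are assumptions, and all names are hypothetical.

```python
import numpy as np

def ista(D, x, lam=0.01, n_iter=500):
    """Approximately solve min_a 0.5*||x - D a||^2 + lam*||a||_1 by soft-thresholding."""
    step_inv = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - D.T @ (D @ a - x) / step_inv
        a = np.sign(z) * np.maximum(np.abs(z) - lam / step_inv, 0.0)
    return a

def src_predict(X_train, y_train, x, lam=0.01):
    """Return the class whose training samples best reconstruct x, plus all residuals."""
    D = X_train.T / np.linalg.norm(X_train, axis=1)  # columns: unit-norm training samples
    x = x / np.linalg.norm(x)
    a = ista(D, x, lam)
    residuals = {}
    for c in np.unique(y_train):
        mask = y_train == c
        residuals[int(c)] = float(np.linalg.norm(x - D[:, mask] @ a[mask]))
    return min(residuals, key=residuals.get), residuals

# Toy two-class data in 5 dimensions (not study data).
rng = np.random.default_rng(0)
center0 = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
center1 = np.array([0.0, 1.0, 0.0, 0.0, 0.0])
X_train = np.vstack([center0 + 0.1 * rng.normal(size=(20, 5)),
                     center1 + 0.1 * rng.normal(size=(20, 5))])
y_train = np.array([0] * 20 + [1] * 20)
x_test = center1 + 0.1 * rng.normal(size=5)

label, residuals = src_predict(X_train, y_train, x_test)
print(label, residuals)
```

Because the classifier stores only the training samples and solves a small optimization per test case, it has no trained weights; a continuous prediction score for ROC analysis could be derived from the difference between the class residuals, although the text does not state how scores were obtained.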
Statistical analyses
Prior to model development, independent sample t-tests were performed to evaluate differences in extracted imaging features between the good- and poor-prognosis groups (P < 0.05 was considered statistically significant). Model performance was assessed using accuracy (ACC), sensitivity (SEN), specificity (SPE), positive predictive value (PPV), negative predictive value (NPV), and the area under the receiver operating characteristic curve (AUC). Differences in AUC between models were compared using the DeLong test.
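For readers less familiar with the AUC, the short sketch below computes it directly as the probability that a randomly chosen positive case receives a higher prediction score than a randomly chosen negative case, with ties counted as one half. This pairwise form is also the statistic on which the DeLong comparison is built. The function name and the scores are hypothetical and are not study data.

```python
def auc_from_scores(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy prediction scores: 4 positive and 3 negative cases.
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 1, 0, 0, 0]
auc = auc_from_scores(scores, labels)
print(round(auc, 3))
```

Here 11 of the 12 positive/negative pairs are ordered correctly, so the AUC is 11/12.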
Results
The clinical characteristics of the 86 enrolled patients are shown in Table 1. There were 39 cases in the poor-prognosis group (mRS > 2) and 47 cases in the good-prognosis group (mRS < 3). Overall, patients in the poor-prognosis group were older on average and had a higher prevalence of each recorded risk factor than those in the good-prognosis group. Among the patients who underwent thrombectomy, 82% achieved a Thrombolysis in Cerebral Infarction (TICI) score ≥ 2b, reflecting successful reperfusion in the majority of endovascular thrombectomy (EVT) cases.
| Variables | All (86) | mRS > 2 (39) | mRS < 3 (47) |
|---|---|---|---|
| Age (years) | 64.16 ± 11.75 | 66.15 ± 10.25 | 62.51 ± 12.49 |
| Sex, male (%) | 64 (74.42) | 27 (69.23) | 37 (78.72) |
| Location of occlusion (%) | | | |
| Internal carotid artery | 17 (19.77) | 7 (17.95) | 10 (21.28) |
| Middle cerebral artery | 69 (80.23) | 32 (82.05) | 37 (78.72) |
| Risk factors (%) | | | |
| Hypertension | 36 (41.86) | 26 (66.67) | 10 (21.28) |
| Diabetes | 21 (24.42) | 12 (30.77) | 9 (19.15) |
| Smoking | 54 (62.79) | 28 (71.79) | 26 (55.32) |
| Drinking | 11 (12.79) | 6 (15.38) | 5 (10.64) |
Among the extracted features, 41 showed statistically significant differences between the good- and poor-prognosis groups (P < 0.05). Figure 4 shows the box plots of the five features with P ≤ 0.01. These five features included one ischemic lesion region radiomics feature (wavelet5.textures.gray-level size zone matrix.small zone low gray-level emphasis, WT5TGLSZMSZLGE), one vascular morphology feature (clustering coefficient), and three penumbra radiomics features (image.textures.gray-level co-occurrence matrix.homogeneity, ITGLCMH; image.textures.gray-level co-occurrence matrix.dissimilarity, ITGLCMD; image.textures.gray-level co-occurrence matrix.contrast, ITGLCMCONT).
Figure 4: The statistical box plots of the five features with P ≤ 0.01.
Table 2 summarizes the overall prediction performance of the models on both the cross-validation and independent test sets. Of the compared models, the “Vessel” model relied solely on vascular morphology features, the “DWI” model used only DWI radiomics features, and the “Vessel+DWI” model integrated both feature types. These models achieved prediction accuracies of 0.684, 0.789, and 0.842 on the cross-validation set, and 0.655, 0.793, and 0.828 on the independent test set, respectively. The similar performance on the cross-validation and independent test sets suggests that the proposed models are robust.
| Datasets | Methods | ACC | SEN | SPE | PPV | NPV |
|---|---|---|---|---|---|---|
| Cross validation | Vessel | 0.684 | 0.500 | 0.784 | 0.556 | 0.744 |
| Cross validation | DWI | 0.789 | 0.700 | 0.838 | 0.700 | 0.838 |
| Cross validation | Vessel+DWI | 0.842 | 0.750 | 0.892 | 0.789 | 0.868 |
| Independent test | Vessel | 0.655 | 0.632 | 0.700 | 0.800 | 0.500 |
| Independent test | DWI | 0.793 | 0.684 | 1.000 | 1.000 | 0.625 |
| Independent test | Vessel+DWI | 0.828 | 0.789 | 0.900 | 0.937 | 0.692 |
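The relationship between the five reported metrics can be made concrete with a small worked example. The confusion-matrix counts below are not stated in the text: they are back-calculated here from the independent-test row of the “Vessel+DWI” model under the assumptions that the test set holds 29 cases and that poor prognosis is the positive class. They should be read as an illustration that is consistent with the reported rates, not as the published confusion matrix.

```python
def binary_metrics(tp, fn, tn, fp):
    """Standard metrics from the four cells of a binary confusion matrix."""
    return {
        "ACC": (tp + tn) / (tp + fn + tn + fp),
        "SEN": tp / (tp + fn),   # sensitivity: recall of the positive class
        "SPE": tn / (tn + fp),   # specificity: recall of the negative class
        "PPV": tp / (tp + fp),   # positive predictive value
        "NPV": tn / (tn + fn),   # negative predictive value
    }

# Assumed counts (inferred, see lead-in): 19 positive and 10 negative test cases.
m = binary_metrics(tp=15, fn=4, tn=9, fp=1)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these counts the PPV is exactly 0.9375, which matches the tabulated 0.937 only if the table truncates rather than rounds; the other four values agree with the table to three decimals.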
Additionally, the gap between sensitivity and specificity for the “Vessel+DWI” model was within 0.15 on both the cross-validation and independent test sets, indicating a relatively balanced classification of positive and negative samples. Figures 5 and 6 show the receiver operating characteristic (ROC) curves and confusion matrices for the three models on the cross-validation and independent test sets, respectively. Notably, the combined “Vessel+DWI” model achieved AUC values of 0.876 and 0.942 on the cross-validation and independent test sets, respectively, highlighting its superior predictive capability.
Figure 5: The ROC curves and confusion matrices of the three groups of models on the cross-validation set.
Figure 6: The ROC curves and confusion matrices of the three groups of models on the independent test set.
Figure 7 presents the decision curves for the three models on the independent test set, demonstrating that the combined “Vessel+DWI” model delivered higher net benefits compared to the other models. To further emphasize the model’s clinical applicability, a diagnostic nomogram was constructed by integrating the prediction scores of the “Vessel+DWI” model with key clinical indicators, as shown in Fig. 8. The calculated prediction probabilities from the nomogram offer valuable guidance for the clinical diagnosis and treatment of stroke.
Figure 7: The decision curves of the three models on the independent test set.
Figure 8: Diagnostic nomogram combining the combined model with clinical characteristics.
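The net benefit plotted in a decision curve can be computed from prediction scores as sketched below. The sketch uses the standard decision-curve definition, net benefit = TP/n − (FP/n) × pt/(1 − pt) at threshold probability pt, and compares a model against the "treat all" strategy. The text does not describe how Fig. 7 was generated, so the function names and the toy scores are hypothetical.

```python
def net_benefit(scores, labels, threshold):
    """Net benefit of acting on every case whose score is >= threshold."""
    n = len(labels)
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(labels, threshold):
    """Net benefit of treating every case as positive, whatever its score."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Toy prediction scores (not study data).
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 1, 0, 0, 0]
nb_model = net_benefit(scores, labels, threshold=0.5)
nb_all = net_benefit_treat_all(labels, threshold=0.5)
print(round(nb_model, 4), round(nb_all, 4))
```

Evaluating both functions over a grid of thresholds and plotting the results gives the decision curve; a model is useful at a given threshold when its net benefit exceeds both the "treat all" line and zero (the "treat none" strategy).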
Discussion
Predicting 90-day mRS scores for stroke patients holds significant value for guiding treatment strategies. While numerous methods exist for predicting stroke prognosis based on brain CT or MRI images (Tanriverdi et al., 2016; Cao et al., 2025), challenges remain in the comprehensive quantification of whole-brain features. For instance, cerebrovascular-based approaches typically focus on the morphology of a few specific blood vessels but neglect the holistic evaluation of the entire cerebral vasculature (Brugnara et al., 2020). Furthermore, many evaluation methods rely on manual assessments by clinicians, which compromises the robustness and reproducibility of predictive models.
On the other hand, numerous studies have utilized radiomic features from the ischemic lesion region in DWI to predict stroke prognosis, but they often overlooked the impact of features derived from the penumbra region surrounding the ischemic lesion region. This limitation restricts the predictive performance of such models. To address these gaps, this study integrated morphological features of the whole-brain vasculature, radiomic features of the ischemic lesion region, and radiomic features of the penumbra region to predict 90-day mRS scores for stroke patients.
Previous research suggests that structural changes in the cerebrovascular system can distinguish healthy brains from diseased ones (Liu et al., 2018). Arterial anatomical features, such as curvature and bifurcation, influence hemodynamics and contribute to plaque formation and progression, which may trigger ischemic events (Leng, Wong & Liebeskind, 2014). Thus, changes in vascular structure are thought to reflect potential functional and pathophysiological alterations in the brain (Hu et al., 2023). In this study, the selected vascular morphological features effectively highlighted differences in the number of blood vessels between the infarct and healthy sides of the brain. Additionally, the findings suggest that minor reductions in small blood vessels may reliably predict infarction trends, providing valuable insights for clinical intervention.
In this study, a total of 1,066 imaging features were extracted. While high-throughput features can provide a comprehensive characterization of lesion conditions, such a large number of variables may increase computational burden and the risk of model overfitting. To address these challenges, we applied a sparse representation method for feature selection, followed by the construction of a non-parametric sparse representation classifier. Through this process, only 26 features were ultimately retained, which demonstrated the greatest discriminative power and yielded stable cross-validation performance. These results suggest that a compact and informative set of imaging biomarkers is sufficient to achieve accurate prognosis prediction while minimizing overfitting risk.
The experimental results showed that each of the three feature types contained features that differed significantly between the prognosis groups. Notably, the penumbra region contributed three of the five features with P ≤ 0.01 (as shown in Fig. 4). The combined “Vessel + DWI” model achieved a prediction accuracy exceeding 0.82 on both the cross-validation and independent test sets, underscoring the effectiveness of the proposed approach. Furthermore, the combined model outperformed the single-modality models, reinforcing the importance of the multi-modal feature integration proposed in this study.
Figure 9 illustrates the features ultimately used for model building, ranked by importance, with different colors representing various feature categories. Overall, brain parenchymal features from DWI images were found to play a more critical role than cerebrovascular features. This may be because the characteristics of the ischemic lesion region and peri-ischemic region directly determine stroke prognosis.
Figure 9: Modeling feature categories and importance ranking.
This study has several limitations. First, we did not perform detailed morphological analyses of individual major vessels such as the MCA or ICA, which could provide additional prognostic insights. Second, delineation of the penumbra was approximated by outward expansion of the ischemic lesion on DWI rather than derived from perfusion imaging, due to the absence of consistent CTP data. Although sufficient for radiomics feature extraction, this approximation cannot fully replace perfusion-based characterization. Moreover, our current framework did not incorporate the TICI score. This was because our study cohort included both intravenous thrombolysis (IVT) and endovascular thrombectomy (EVT) patients. As the TICI grading system is applicable only to patients who undergo mechanical thrombectomy with angiographic evaluation, this parameter was unavailable for the entire population. While many previous studies on mRS prediction have focused exclusively on EVT cohorts (e.g., Brugnara et al., 2020; Mistry et al., 2021), several recent radiomics and machine learning studies (e.g., Li et al., 2025; Wei et al., 2024; Weng et al., 2023) have adopted a mixed-patient design that includes both IVT and EVT cases. Such a design enhances the generalizability and real-world applicability of prognostic models, as it reflects the true heterogeneity of acute ischemic stroke (AIS) patients encountered in clinical settings. In line with these studies, our aim was to develop a broadly applicable imaging-based prediction framework rather than a treatment-specific prognostic model. Therefore, the TICI grade was analyzed descriptively but was not incorporated as a predictive feature in the model. 
We acknowledge that future large-scale multicenter studies with homogeneous EVT data will enable the systematic integration of detailed reperfusion metrics—such as TICI grade and occlusion site—into model development, which is expected to further improve predictive accuracy, robustness, and interpretability.
Conclusion
This study proposes a machine learning framework that integrates whole-brain vascular morphological features and radiomics features from both the ischemic lesion region and peri-ischemic region to predict the 90-day mRS scores for AIS patients. By extracting 1,066 features and employing sparse representation for feature selection, the final model achieved an accuracy of 82.8% and AUC of 0.942. The results highlight the importance of both vascular morphology and DWI-based radiomics features, especially from the penumbra region, in predicting stroke outcomes. This multimodal feature integration outperformed single-feature models, offering promising support for personalized treatment and early intervention, with strong potential for clinical application.
Supplemental Information
Code for experimental results display
Run "results_show.m" to get the result data and figures in the paper.
Extracted original feature data and model prediction result data
Vascular morphological features and radiomics features of 86 cases. Multiple sets of comparison model prediction result data, including true labels, predicted labels and predicted scores.








