BTSANet: Integrating structural prior and temporal dynamics with Mamba for accurate cerebrovascular segmentation in DSA sequences
Abstract
Digital Subtraction Angiography (DSA) is widely regarded as the clinical gold standard for diagnosing cerebrovascular disease, but cerebrovascular segmentation in DSA sequences remains challenging because of complex vascular morphology and strong spatiotemporal dependencies. Existing methods based on Convolutional Neural Networks (CNNs) and Transformers struggle to model spatiotemporal coupling and are computationally inefficient; Transformers in particular incur quadratic complexity (O(N²)) in sequence length when processing long sequences. To address these issues, we propose BTSANet, a novel spatiotemporal segmentation network based on the Mamba architecture that models long-range dependencies efficiently with linear complexity (O(N)). The network features a dual-branch encoder: one branch processes raw DSA sequences to capture hemodynamics, while the other takes Minimum Intensity Projection (MinIP) images to extract enhanced vascular structural priors. At the bottleneck, we introduce a Bidirectional Temporal-Spatial Attention (BTSA) module with an asymmetric cross-attention strategy. This module prioritizes spatial structural information and adaptively aligns temporal dynamics with spatial features, improving the delineation of vascular boundaries, particularly for fine branches. Extensive experiments on the public DIAS dataset demonstrate that BTSANet outperforms state-of-the-art CNN-, Transformer-, and Mamba-based models across multiple metrics, achieving a Dice score of 94.39±1.40% and an Intersection over Union (IoU) of 89.41±2.51%. Systematic ablation studies confirm the synergy between structural priors and temporal dynamics and show that each core component contributes to performance. This work offers a robust technical solution for intelligent DSA sequence analysis, with promising potential for clinical translation.
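For readers unfamiliar with the quantities named above, the following is a minimal illustrative sketch, not the BTSANet implementation. It assumes a DSA sequence stored as a NumPy array of shape (T, H, W) in which contrast-filled vessels appear darker than the background, so that a per-pixel minimum over time accumulates the vessel tree into a single MinIP image. It also shows the standard definitions of Dice and IoU on binary masks. The function names `min_intensity_projection` and `dice_and_iou`, and the smoothing constant `eps`, are hypothetical choices made for this example.

```python
import numpy as np


def min_intensity_projection(sequence):
    """Collapse a (T, H, W) sequence to (H, W) via the per-pixel minimum over time.

    Assumption: vessels filled with contrast agent are darker than background,
    so the temporal minimum keeps every vessel that was opacified in any frame.
    """
    sequence = np.asarray(sequence)
    if sequence.ndim != 3:
        raise ValueError("expected an array of shape (T, H, W)")
    return sequence.min(axis=0)


def dice_and_iou(pred, target, eps=1e-7):
    """Return (Dice, IoU) for two binary masks of identical shape.

    eps is a hypothetical smoothing term that avoids division by zero
    when both masks are empty.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
    iou = (intersection + eps) / (union + eps)
    return float(dice), float(iou)


if __name__ == "__main__":
    # Toy sequence: three 2x2 frames; a low value marks contrast in that frame.
    frames = np.array([
        [[9, 2], [9, 9]],
        [[9, 9], [3, 9]],
        [[1, 9], [9, 9]],
    ])
    print(min_intensity_projection(frames))

    pred = np.array([[1, 1], [0, 0]])
    truth = np.array([[1, 0], [0, 0]])
    print(dice_and_iou(pred, truth))
```

In a dual-branch design of the kind described in the abstract, an image like the one returned by `min_intensity_projection` would feed the structural branch while the full (T, H, W) sequence feeds the temporal branch; how BTSANet actually preprocesses either input is not specified here.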