Abstract
Efficient and highly accurate road crack detection algorithms are essential to road inspection systems. However, as object detection research has advanced, the limitations of existing methods have become increasingly apparent. Existing road object detection algorithms struggle to capture long-range dependencies, which limits feature expressiveness and leads to high miss rates in small-target detection scenarios (e.g., identifying fine road cracks). In this paper, we therefore propose YOLO-SW, an improved model based on YOLO11n. First, building on the structure of the C2f module, we introduce our self-developed SP module, which is responsible for multi-scale feature aggregation and an attention mechanism, and propose the SP_C2f module to strengthen small-target detection. The method replaces the C3k2 module of YOLO11n with SP_C2f, which enhances feature expression by combining multi-scale feature aggregation with an attention mechanism and thereby improves the accuracy of feature expression for small targets. In addition, a CGAFusion module is added to further enhance feature expression by combining spatial attention, channel attention, and pixel attention. Experiments on the pavement crack detection Computer Vision Project dataset show that the mean average precision (mAP@0.5) of the improved YOLO-SW model reaches 58.2%, an improvement of 8.7 percentage points over the baseline YOLO11n. The experimental results demonstrate the advantages of YOLO-SW for crack detection in complex road scenarios.