Anomaly synthesis and detection in accounting data via a generative adversarial network
Abstract
To address the challenge of anomaly detection in high-dimensional, heterogeneous accounting data, we propose MF-GAN, a multimodal fusion framework that integrates anomaly synthesis and detection within a unified architecture. MF-GAN employs a dual-branch spatio-temporal generator: a temporal branch with regional residual learning extracts sequence dynamics, and a spatial branch based on convolutional networks captures unstructured features from voucher images and associated text. A generative sample-augmentation mechanism, which combines bootstrap resampling with noise injection, produces realistic anomalous samples and thereby mitigates severe class imbalance. On a corpus of 180,000 real accounting records and on the EAAD dataset, which contains 3,200 labeled anomalies, this mechanism improves recall of rare anomaly types by 18.6%. We further introduce a dynamic-weight joint optimization scheme that unifies the generator and discriminator losses for collaborative training; ablating this component reduces performance by 21.4%, underscoring the importance of gradient co-training for generalization in unsupervised settings. On EAAD, MF-GAN achieves 78.48% precision and 96.67% recall, with a 19% F1 improvement over mainstream baselines. On the CBFAD dataset, MF-GAN attains 76.48% precision, 86.67% recall, and an 86.05% F1 score. The framework overcomes feature-extraction bottlenecks in high-dimensional time series and, through generative–discriminative co-optimization, provides an interpretable and practical pathway for financial anti-fraud supervision.
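The sample-augmentation idea mentioned above (bootstrap resampling combined with noise injection) can be illustrated with a minimal sketch. This is not the MF-GAN implementation: it assumes purely numeric tabular features, Gaussian noise scaled by each feature's standard deviation, and a hypothetical function name `augment_anomalies` with a hypothetical `noise_scale` parameter, none of which are specified in the abstract.

```python
import numpy as np


def augment_anomalies(anomalies, n_new, noise_scale=0.05, seed=0):
    """Create synthetic anomalous rows by bootstrap resampling plus noise.

    anomalies   : (n, d) array of numeric features of known anomalies
    n_new       : number of synthetic rows to generate
    noise_scale : noise std as a fraction of each feature's std
                  (assumed value, not taken from the paper)
    """
    rng = np.random.default_rng(seed)
    anomalies = np.asarray(anomalies, dtype=float)

    # Bootstrap step: draw row indices uniformly with replacement.
    idx = rng.integers(0, len(anomalies), size=n_new)
    resampled = anomalies[idx]

    # Noise-injection step: perturb each feature in proportion to its spread,
    # so features on different scales receive comparable relative jitter.
    feature_std = anomalies.std(axis=0)
    noise = rng.normal(0.0, 1.0, size=resampled.shape) * noise_scale * feature_std
    return resampled + noise


if __name__ == "__main__":
    # Toy stand-in for a handful of labeled anomalies (amount, hour of posting).
    known = np.array([[9800.0, 23.0], [15000.0, 2.0], [7200.0, 3.0]])
    synthetic = augment_anomalies(known, n_new=5)
    print(synthetic.shape)
```

In this sketch the resampling preserves the empirical distribution of the rare class while the noise keeps the synthetic rows from being exact duplicates; how MF-GAN couples this step with its generator is described by the paper itself and is not reproduced here.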