Metaheuristic-driven optimization of ensemble deep learning model for image manipulation classification
Abstract
Detecting image manipulations, particularly in double Joint Photographic Experts Group (JPEG)-compressed images, remains a critical challenge in digital forensics because social media platforms recompress images repeatedly. This recompression obscures manipulation traces and complicates detection. This study addresses the problem by proposing a metaheuristic-optimized ensemble deep learning (DL) model that combines a Squeeze-and-Excitation Residual Network (SE-ResNet50; fine-grained spatial detail extraction), a Squeeze-and-Excitation Dense Convolutional Network (SE-DenseNet121; efficient feature reuse), a Squeeze-and-Excitation Inception architecture (SE-InceptionV3; multi-scale feature extraction), and a Vision Transformer (ViT-Base/32; long-range dependency and global relationship analysis). Ensemble weights were optimized using five metaheuristic techniques: Artificial Bee Colony (ABC), Ant Colony Optimization (ACO), Bayesian Optimization (BO), Differential Evolution (DE), and Genetic Algorithm (GA), with GA achieving the best optimization results. Experiments on three benchmark datasets, namely Break Our Steganographic System Base (BOSSBase), 10-class Mini-ImageNet (ImageNet-Mini10K), and Uncompressed Colour Image Database (UCID), across multiple JPEG quality factors (75, 85, 95) showed that the proposed ensemble model significantly outperformed both the individual DL models and traditional equal-weighted ensembles in terms of accuracy and F1-score. This study advances manipulation classification by demonstrating the effectiveness of combining ensemble learning with metaheuristic optimization, providing a scalable and robust solution for real-world forensic applications.
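To make the idea of metaheuristically weighted ensembling concrete, the following is a minimal sketch, not the authors' implementation. It assumes the four base models each emit class probabilities, that the ensemble is a weighted soft vote, and that GA fitness is plain accuracy on a held-out set; the abstract does not specify the encoding, operators, or fitness function. All names (`make_synthetic_probs`, `ga_optimize_weights`, and so on) are hypothetical, and the probabilities are synthetic stand-ins for real model outputs.

```python
import numpy as np


def make_synthetic_probs(n_samples=500, n_classes=5,
                         noise=(1.0, 1.5, 2.0, 3.0), seed=1):
    """Synthetic stand-in for the softmax outputs of four base models.

    Each noise level plays the role of one model of a different quality.
    """
    rng = np.random.default_rng(seed)
    labels = rng.integers(n_classes, size=n_samples)
    one_hot = np.eye(n_classes)[labels]
    probs = []
    for sigma in noise:
        logits = 2.0 * one_hot + rng.normal(0.0, sigma, one_hot.shape)
        e = np.exp(logits - logits.max(axis=1, keepdims=True))
        probs.append(e / e.sum(axis=1, keepdims=True))
    return np.stack(probs), labels  # shape (n_models, n_samples, n_classes)


def ensemble_predict(probs, weights):
    """Weighted soft vote over the model axis."""
    fused = np.tensordot(weights, probs, axes=1)  # (n_samples, n_classes)
    return fused.argmax(axis=1)


def accuracy(probs, weights, labels):
    return float((ensemble_predict(probs, weights) == labels).mean())


def normalize(w):
    """Project weights onto the simplex: strictly positive, summing to 1."""
    w = np.clip(w, 1e-9, None)
    return w / w.sum(axis=-1, keepdims=True)


def ga_optimize_weights(probs, labels, pop_size=30, generations=40,
                        mutation_scale=0.1, seed=0):
    """Simple real-coded GA (assumed design): truncation selection with
    elitism, blend crossover, Gaussian mutation."""
    rng = np.random.default_rng(seed)
    n_models = probs.shape[0]
    pop = normalize(rng.random((pop_size, n_models)))
    pop[0] = np.full(n_models, 1.0 / n_models)  # include equal-weight baseline
    n_elite = pop_size // 2
    for _ in range(generations):
        fitness = np.array([accuracy(probs, w, labels) for w in pop])
        elite = pop[np.argsort(-fitness)[:n_elite]]  # best half survives as-is
        n_children = pop_size - n_elite
        pa = elite[rng.integers(n_elite, size=n_children)]
        pb = elite[rng.integers(n_elite, size=n_children)]
        alpha = rng.random((n_children, 1))
        children = alpha * pa + (1.0 - alpha) * pb           # blend crossover
        children += rng.normal(0.0, mutation_scale, children.shape)  # mutation
        pop = np.vstack([elite, normalize(children)])
    fitness = np.array([accuracy(probs, w, labels) for w in pop])
    best = int(fitness.argmax())
    return pop[best], float(fitness[best])


probs, labels = make_synthetic_probs()
weights, ga_acc = ga_optimize_weights(probs, labels)
equal_acc = accuracy(probs, np.full(probs.shape[0], 0.25), labels)
print("GA weights:", np.round(weights, 3))
print("GA accuracy:", ga_acc, "| equal-weight accuracy:", equal_acc)
```

Because the equal-weight vector is placed in the initial population and the best individuals are carried over unchanged, the GA result in this sketch can never score below the equal-weighted ensemble on the data used for fitness. In practice the weights would be fitted on a validation split and reported on a separate test split, which this sketch omits for brevity.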