Optimizing stealthiness in universal adversarial perturbations via class-selective attacks and perceptual similarity metrics
Abstract
Universal Adversarial Perturbations (UAPs) pose a substantial threat to deep learning models: a single perturbation can cause widespread misclassification across many inputs. Traditional methods primarily optimize for perturbation strength and often overlook stealthiness, making the resulting attacks easy to detect. This paper introduces Stealthy-UAP, a method that enhances the stealthiness of UAPs through two key mechanisms: a class-selective attack strategy and perceptual similarity metrics. The class-selective strategy improves both stealthiness and effectiveness by restricting the attack to a chosen set of target classes, causing inputs from those classes to be misclassified while preserving the model's original predictions on non-target classes. The perceptual similarity metrics leverage high-level semantic features from pre-trained convolutional neural networks to produce perturbations that align with human visual perception, keeping them imperceptible to human observers while maintaining high attack success rates. Experimental results show that Stealthy-UAP substantially improves both the stealthiness and the effectiveness of UAPs compared with existing methods, providing a robust framework for generating imperceptible yet effective adversarial perturbations.
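To make the described objective concrete, the following is a minimal sketch, assuming a PyTorch implementation, of how a class-selective attack term, a prediction-preservation term on non-target classes, and a VGG-feature perceptual penalty might be combined into a single optimization step for a universal perturbation. The function name `stealthy_uap_step`, the loss weights `lam` and `mu`, the chosen VGG layers, and the epsilon budget are illustrative assumptions, not the paper's actual implementation.

```python
# Illustrative sketch only (not the authors' released code): one optimization step
# combining (i) a class-selective attack objective, (ii) prediction preservation on
# non-target classes, and (iii) a perceptual similarity penalty over pre-trained
# CNN features. All hyperparameters and helper names are assumptions.
import torch
import torch.nn.functional as F
import torchvision.models as models

# Frozen feature extractor used as a stand-in perceptual similarity metric
# (an LPIPS-style distance over high-level VGG features).
vgg_features = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features[:16].eval()
for p in vgg_features.parameters():
    p.requires_grad_(False)

def perceptual_distance(x, x_adv):
    """Feature-space distance between clean and perturbed images."""
    return F.mse_loss(vgg_features(x_adv), vgg_features(x))

def stealthy_uap_step(model, delta, x, y, target_classes, lam=1.0, mu=10.0):
    """One gradient step on a universal perturbation `delta` (requires_grad=True).

    Samples from target classes are pushed toward misclassification; non-target
    samples are pushed to keep their clean predictions; the perceptual term keeps
    x + delta visually close to x.
    """
    x_adv = torch.clamp(x + delta, 0.0, 1.0)
    logits_adv = model(x_adv)
    with torch.no_grad():
        preds_clean = model(x).argmax(dim=1)

    is_target = torch.isin(y, target_classes)

    # (i) Attack loss: maximize cross-entropy on target-class samples.
    attack_loss = torch.tensor(0.0, device=x.device)
    if is_target.any():
        attack_loss = -F.cross_entropy(logits_adv[is_target], y[is_target])

    # (ii) Preservation loss: keep clean predictions on non-target samples.
    preserve_loss = torch.tensor(0.0, device=x.device)
    if (~is_target).any():
        preserve_loss = F.cross_entropy(logits_adv[~is_target], preds_clean[~is_target])

    # (iii) Perceptual similarity penalty for stealthiness.
    percept_loss = perceptual_distance(x, x_adv)

    loss = attack_loss + lam * preserve_loss + mu * percept_loss
    loss.backward()
    with torch.no_grad():
        delta -= 0.01 * delta.grad.sign()   # small signed-gradient update
        delta.clamp_(-8 / 255, 8 / 255)     # epsilon-ball constraint (assumed)
        delta.grad.zero_()
    return loss.item()
```

In this sketch, a caller would initialize `delta = torch.zeros(1, 3, H, W, requires_grad=True)` and iterate `stealthy_uap_step` over batches from the training set; `lam` and `mu` trade off prediction preservation and imperceptibility against attack strength.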