Abstract
Reversible adversarial examples (RAEs) have been widely studied for their ability to ensure authorized access while preventing unauthorized recognition. Existing RAE schemes focus on reversible data hiding techniques and white-box attacks. However, white-box attacks may be impractical because the parameters of the target model are unknown. Moreover, these methods suffer substantial loss when embedding perturbations, which degrades the quality of the RAE. In this paper, we propose I-RAE, a scheme based on an invertible neural network (INN) that generates black-box RAEs with minimal loss. Specifically, we introduce Black-box Attack Flow (BAFlow) to generate perturbations on a Gaussian distribution that are easier to embed. Furthermore, to enhance the embedding capability of the RAE, we treat the embedding of the perturbation as an image hiding task and propose the Perturbation Hiding Network (PHN) to reversibly hide the entire perturbation in the adversarial example. We also apply wavelet high-frequency hiding to reduce the degradation in the visual quality of the RAE. Experimental results on the ImageNet and CIFAR-10 datasets demonstrate that I-RAE achieves state-of-the-art black-box attack ability and visual quality.
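The reversibility that an invertible neural network provides can be illustrated with a minimal sketch of an additive coupling step, a common generic INN building block. This is not the I-RAE, BAFlow, or PHN architecture, which the abstract does not specify; the function names `shift`, `coupling_forward`, and `coupling_inverse`, and the toy `tanh` transform, are assumptions made purely for illustration.

```python
import math

def shift(values):
    # Hypothetical stand-in for a learned sub-network; any function works here,
    # because the coupling step never needs to invert it.
    return [0.5 * math.tanh(v) for v in values]

def coupling_forward(x1, x2):
    # First half passes through unchanged; second half is shifted by a
    # function of the first half.
    return list(x1), [b + s for b, s in zip(x2, shift(x1))]

def coupling_inverse(y1, y2):
    # Because y1 equals x1, the same shift can be recomputed and subtracted,
    # recovering x2 up to floating-point rounding.
    return list(y1), [b - s for b, s in zip(y2, shift(y1))]

def max_abs_diff(a, b):
    # Largest element-wise difference between two equal-length lists.
    return max(abs(p - q) for p, q in zip(a, b))

# Toy stand-ins for two signal branches (for example, an image and a perturbation).
x1 = [0.2, -1.0, 0.7]
x2 = [1.5, 0.0, -0.3]

y1, y2 = coupling_forward(x1, x2)
r1, r2 = coupling_inverse(y1, y2)
print("reconstruction error:", max_abs_diff(r2, x2))
```

The point of the sketch is that the inverse reuses the forward transform rather than approximating it, which is why schemes built on such steps can, in principle, recover hidden content with very little loss.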
Field | Value |
---|---|
Original language | English |
Article number | 105094 |
Journal | Image and Vision Computing |
Volume | 147 |
DOIs | |
Publication status | Published - Jul 2024 |
Keywords
- Adversarial attack
- Image restoration
- Invertible neural network