Black-box reversible adversarial examples with invertible neural network

Jielun Huang, Guoheng Huang, Xuhui Zhang, Xiaochen Yuan, Fenfang Xie, Chi Man Pun, Guo Zhong

Research output: Contribution to journal › Article › peer-review

Abstract

Reversible Adversarial Examples (RAEs) have been widely researched for their ability to ensure authorized access while preventing unauthorized recognition. Existing RAE schemes rely on Reversible Data Hiding techniques and white-box attacks. However, white-box attacks may be impractical because the parameters of the target model are unknown. Moreover, these methods suffer substantial information loss when embedding perturbations, degrading the quality of the RAE. In this paper, we propose the I-RAE scheme, which generates black-box RAEs with minimal loss based on an Invertible Neural Network (INN). Specifically, we introduce the Black-box Attack Flow (BAFlow) to generate perturbations on a Gaussian distribution, which are more easily embeddable. Furthermore, to enhance the embedding capability of the RAE, we treat perturbation embedding as an image-hiding problem and propose the Perturbation Hiding Network (PHN) to reversibly hide the entire perturbation within the adversarial example. We also apply wavelet high-frequency hiding to reduce the degradation in the visual quality of the RAE. Experimental results on the ImageNet and CIFAR-10 datasets demonstrate that I-RAE achieves state-of-the-art black-box attack ability and visual quality.
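The reversible-hiding idea in the abstract can be illustrated with a minimal sketch. The Python snippet below is not the paper's PHN or BAFlow; it is an assumed stand-in that hides a perturbation's detail coefficients in an image's high-frequency Haar wavelet subbands via additive coupling, the basic invertible building block of INNs. The conditioning function `phi`, the random image, and the perturbation are hypothetical placeholders.

```python
# Minimal sketch, assuming additive coupling over Haar wavelet
# subbands (not the authors' PHN/BAFlow architecture): the forward
# pass embeds a perturbation into the high-frequency subbands of an
# image; the inverse pass recovers both exactly, which is what makes
# the adversarial example reversible for an authorized party.
import numpy as np
import pywt

def phi(x):
    # Toy conditioning function standing in for a learned subnetwork;
    # additive coupling is exactly invertible for any choice of phi.
    return 0.5 * np.tanh(x)

def couple(cover_hf, secret_hf):
    # Forward coupling: mix the secret's detail coefficients into the
    # cover's detail coefficients.
    y1 = cover_hf + phi(secret_hf)
    y2 = secret_hf + phi(y1)
    return y1, y2

def uncouple(y1, y2):
    # Exact inverse: undo the two additive steps in reverse order.
    secret_hf = y2 - phi(y1)
    cover_hf = y1 - phi(secret_hf)
    return cover_hf, secret_hf

rng = np.random.default_rng(0)
image = rng.random((64, 64))                          # stand-in clean image
perturbation = 0.05 * rng.standard_normal((64, 64))   # stand-in perturbation

# Hide only in the high-frequency subbands; the low-frequency band cA
# is left untouched, which limits visible degradation.
cA, (cH, cV, cD) = pywt.dwt2(image, "haar")
_, (pH, pV, pD) = pywt.dwt2(perturbation, "haar")

hidden = [couple(c, p) for c, p in ((cH, pH), (cV, pV), (cD, pD))]
stego = pywt.idwt2((cA, tuple(y1 for y1, _ in hidden)), "haar")

# Authorized recovery: invert each coupling, then the wavelet transform.
restored = [uncouple(y1, y2) for y1, y2 in hidden]
clean = pywt.idwt2((cA, tuple(c for c, _ in restored)), "haar")
assert np.allclose(clean, image)                      # lossless restoration
```

In this sketch the second coupling output is stored explicitly so that exact recovery is obvious; INN-based hiding schemes typically instead train that branch toward a simple known distribution (e.g., Gaussian) so it can be re-sampled at recovery time and only the stego image needs to be kept.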

Original language: English
Article number: 105094
Journal: Image and Vision Computing
Volume: 147
DOIs
Publication status: Published - Jul 2024

Keywords

  • Adversarial attack
  • Image restoration
  • Invertible neural network
