Diffbias: Harnessing diffusion models’ prediction bias for adversarial patch defense

Research output: Contribution to journal › Article › peer-review

Abstract

Adversarial patches pose a significant, real-world threat to deep neural networks, capable of inducing misclassification in realistic physical scenarios. Developing reliable and robust defenses against such attacks is therefore critical, yet current research remains unsatisfactory. In this paper, we propose a novel framework that exploits the fact that the unnatural perturbations introduced by adversarial patches produce prediction biases during denoising that differ markedly from those of clean images. In the localization stage, our method focuses on the critical denoising steps through an adaptive temporal sampling strategy and introduces an energy metric that fuses kinetic and potential energy to quantify the degree of anomaly in the denoising trajectory. Combined with an adaptive similarity weighting mechanism and a striding trajectory consistency analysis, the method effectively suppresses interference from background noise and accurately localizes the patch region. In the restoration stage, the same diffusion model is applied to the patch region to restore the original visual content and image integrity. The two stages share a unified diffusion model, allowing the localization and inpainting processes to reinforce each other through complementary information and thereby improve overall defense performance. Extensive experiments on the INRIA, COCO2017, and APRICOT datasets show that our approach achieves state-of-the-art detection performance under both digital and physical attacks without compromising recognition accuracy on clean images.
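To make the trajectory-energy idea concrete, the sketch below scores a denoising trajectory by fusing a "kinetic" term (step-to-step change) and a "potential" term (deviation from the final estimate). These definitions are illustrative stand-ins, not the paper's exact formulas; the function name, the `alpha` weighting parameter, and the flat-list region representation are all assumptions for the example.

```python
def trajectory_energy(traj, alpha=0.5):
    """Score the anomaly of one region's denoising trajectory.

    traj: list of per-step denoised estimates, each a flat list of pixel
    values for the same image region (earliest step first).
    Returns a scalar: higher values suggest a more anomalous trajectory,
    as adversarial-patch regions are expected to produce.

    NOTE: the kinetic/potential definitions here are hypothetical
    illustrations of the general idea, not the published metric.
    """
    T = len(traj)       # number of denoising steps sampled
    n = len(traj[0])    # number of pixels in the region

    # "Kinetic" term: mean squared step-to-step change along the trajectory.
    kinetic = sum(
        (traj[t + 1][i] - traj[t][i]) ** 2
        for t in range(T - 1) for i in range(n)
    ) / ((T - 1) * n)

    # "Potential" term: mean squared deviation from the final (cleanest)
    # estimate, treating the last step as the reference state.
    potential = sum(
        (traj[t][i] - traj[-1][i]) ** 2
        for t in range(T) for i in range(n)
    ) / (T * n)

    return alpha * kinetic + (1 - alpha) * potential
```

A perfectly stable trajectory (every step identical) scores zero, while a region whose estimates fluctuate between steps, as a patch region would under the paper's hypothesis, scores strictly higher.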

Original language: English
Article number: 133009
Journal: Neurocomputing
Volume: 676
DOIs
Publication status: Published - 1 May 2026

Keywords

  • Adversarial attack
  • Adversarial defense
  • Diffusion model
  • Object detection
