Abstract
This study improves existing protection strategies for image processing models by embedding invisible watermarks into model outputs so that the source of an image can be verified. Most current methods rely on CNN-based architectures, whose local perception limits their ability to capture global information. To address this, we introduce Swin-UNet, originally designed for medical image segmentation, into the watermark embedding process. The Swin Transformer’s ability to capture global information improves the visual quality of the watermarked image compared with CNN-based approaches. To defend against surrogate attacks, data augmentation is incorporated into training, which makes the watermark extractor more robust to such attacks. Experimental results show that the proposed approach reduces the impact of watermark embedding on visual quality: on a deraining task with color images the average PSNR reaches 45.85 dB, and on a denoising task with grayscale images it reaches 56.60 dB. Additionally, watermarks extracted after surrogate attacks closely match those extracted in the original framework, with an accuracy of 99% to 100%. These results confirm the Swin Transformer’s effectiveness in preserving visual quality.
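For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how PSNR and watermark accuracy are commonly computed. It is not the authors' evaluation code. The function names `psnr` and `bit_accuracy` are hypothetical, 8-bit pixel values (peak 255) are assumed, and accuracy is assumed to mean the fraction of matching watermark bits, which the abstract does not specify.

```python
import math


def psnr(reference, watermarked, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two flat pixel sequences.

    Assumes both sequences have the same length and use the same value
    range (0..max_value); 8-bit images are an assumption of this sketch.
    """
    if len(reference) != len(watermarked) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, watermarked)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def bit_accuracy(embedded_bits, extracted_bits):
    """Fraction of watermark bits recovered correctly (assumed definition)."""
    if len(embedded_bits) != len(extracted_bits) or len(embedded_bits) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(embedded_bits, extracted_bits) if a == b)
    return matches / len(embedded_bits)


if __name__ == "__main__":
    clean = [120, 121, 119, 118]
    marked = [121, 122, 120, 119]  # every pixel shifted by 1
    print(f"PSNR: {psnr(clean, marked):.2f} dB")
    print(f"Accuracy: {bit_accuracy([1, 0, 1, 1], [1, 0, 0, 1]):.2f}")
```

Higher PSNR means the watermarked output differs less from the unwatermarked one. Under these assumptions, a uniform error of one gray level per pixel already corresponds to roughly 48 dB.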
Original language | English |
---|---|
Article number | 5250 |
Journal | Applied Sciences (Switzerland) |
Volume | 15 |
Issue number | 10 |
Publication status | Published - May 2025 |
Keywords
- deep learning model
- image processing
- Swin transformer
- watermark