TY - JOUR
T1 - RNPM
T2 - Neural-Guided Embedding Region Selection and Error Correction for Robust Audio Multi-Watermarking
AU - Li, Qiutong
AU - Liu, Tong
AU - Yuan, Xiaochen
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Robust audio watermarking plays a crucial role in copyright protection; however, existing techniques suffer from low embedding capacity and limited robustness under severe signal distortions. To address these limitations, this paper proposes a Robust Neural-Guided Parallel Multi-Watermarking (RNPM) scheme. RNPM introduces a U-Net-Based Embedding Region Selection (ERSU-Net) module that accurately locates multiple embedding regions based on their robustness characteristics. To better exploit the intrinsic frequency and energy distribution of audio signals, the ERSU-Net module is enhanced with dual-attention modules, thereby improving robustness. Once the embedding regions are determined, they are segmented into multiple overlapping frames to facilitate embedding. To further increase embedding capacity without compromising robustness, RNPM combines Discrete Cosine Transform (DCT) and inter-frame difference-based embedding with Gram–Schmidt orthogonalization, enabling parallel multi-watermark embedding. Furthermore, to mitigate extraction errors caused by signal distortion, an error correction mechanism is integrated with the localized embedding regions, improving overall extraction reliability. Experimental results demonstrate that the proposed RNPM achieves superior robustness and inaudibility. In particular, RNPM achieves a Bit Error Rate (BER) of 0 under 20% cropping, MP3 compression at 64 kbps, and 22.5 kHz resampling attacks, surpassing existing state-of-the-art methods.
AB - Robust audio watermarking plays a crucial role in copyright protection; however, existing techniques suffer from low embedding capacity and limited robustness under severe signal distortions. To address these limitations, this paper proposes a Robust Neural-Guided Parallel Multi-Watermarking (RNPM) scheme. RNPM introduces a U-Net-Based Embedding Region Selection (ERSU-Net) module that accurately locates multiple embedding regions based on their robustness characteristics. To better exploit the intrinsic frequency and energy distribution of audio signals, the ERSU-Net module is enhanced with dual-attention modules, thereby improving robustness. Once the embedding regions are determined, they are segmented into multiple overlapping frames to facilitate embedding. To further increase embedding capacity without compromising robustness, RNPM combines Discrete Cosine Transform (DCT) and inter-frame difference-based embedding with Gram–Schmidt orthogonalization, enabling parallel multi-watermark embedding. Furthermore, to mitigate extraction errors caused by signal distortion, an error correction mechanism is integrated with the localized embedding regions, improving overall extraction reliability. Experimental results demonstrate that the proposed RNPM achieves superior robustness and inaudibility. In particular, RNPM achieves a Bit Error Rate (BER) of 0 under 20% cropping, MP3 compression at 64 kbps, and 22.5 kHz resampling attacks, surpassing existing state-of-the-art methods.
KW - Gram–Schmidt orthogonalization
KW - Robust audio watermarking
KW - discrete cosine transform (DCT)
KW - error correction
KW - inter-frame projection
UR - https://www.scopus.com/pages/publications/105019930976
U2 - 10.1109/TASLPRO.2025.3624964
DO - 10.1109/TASLPRO.2025.3624964
M3 - Article
AN - SCOPUS:105019930976
SN - 1558-7916
VL - 33
SP - 4552
EP - 4562
JO - IEEE Transactions on Audio, Speech and Language Processing
JF - IEEE Transactions on Audio, Speech and Language Processing
ER -