Abstract
The authenticity of audio-visual content is increasingly challenged by advanced multimedia editing technologies driven by Artificial Intelligence-Generated Content (AIGC). Temporal forgery localization aims to detect suspicious content by locating the forged segments within it. Most existing methods are based on Convolutional Neural Networks (CNNs) or Transformers, yet neither fully considers the complex relationships within forged audio-visual content. To address this issue, we propose a novel method, named TransHFC, which introduces hypergraphs to model group relationships among segments while capturing point-to-point relationships through Transformers. Through its dual hypergraph filtering convolution branches, TransHFC captures group relationships at both the temporal and spatial levels, enhancing the representation of forged-segment features. Furthermore, we propose a new hypergraph filtering convolution Auto-Encoder that uses a multi-frequency filter bank for adaptive signal capture, compensating for the limitations of a single hypergraph filter. Extensive experiments on the Lav-DF, TVIL, Psynd, and HAD datasets demonstrate that TransHFC achieves state-of-the-art performance.
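The abstract's core idea is modeling group relationships among segments with hypergraph convolution, in which one hyperedge can connect several segments at once. The paper's TransHFC layer and multi-frequency filter bank are not reproduced here; the following is a minimal sketch of a standard single-filter spectral hypergraph convolution step, with the incidence matrix, hyperedge weights, and toy features all being illustrative assumptions:

```python
import numpy as np

def hypergraph_conv(X, H, W=None):
    """One standard spectral hypergraph convolution step (not the paper's layer):
    X' = Dv^{-1/2} H We De^{-1} H^T Dv^{-1/2} X.

    X: (n_nodes, d)  node (segment) features
    H: (n_nodes, n_edges)  incidence matrix, H[v, e] = 1 if segment v is in group e
    W: optional (n_edges,)  hyperedge weights (defaults to all ones)

    Assumes every node belongs to at least one hyperedge (nonzero degrees).
    """
    n_nodes, n_edges = H.shape
    if W is None:
        W = np.ones(n_edges)
    Dv = H @ W                              # node degrees (weighted)
    De = H.sum(axis=0)                      # hyperedge degrees
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(Dv))
    De_inv = np.diag(1.0 / De)
    # Propagation operator: smooth each node's feature over the groups it joins.
    theta = Dv_inv_sqrt @ H @ np.diag(W) @ De_inv @ H.T @ Dv_inv_sqrt
    return theta @ X

# Toy example: 4 segments, 2 hyperedges grouping {0, 1, 2} and {2, 3}.
H = np.array([[1, 0],
              [1, 0],
              [1, 1],
              [0, 1]], dtype=float)
X = np.array([[1.0], [2.0], [3.0], [4.0]])
out = hypergraph_conv(X, H)
```

Segments 0 and 1 share exactly the same group membership and degree, so the convolution assigns them identical smoothed features; in the paper's setting such grouping is what lets forged segments reinforce each other's representations beyond pairwise Transformer attention.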
| Original language | English |
|---|---|
| Pages (from-to) | 9261-9275 |
| Number of pages | 15 |
| Journal | IEEE Transactions on Circuits and Systems for Video Technology |
| Volume | 35 |
| Issue number | 9 |
| DOIs | |
| Publication status | Published - 2025 |
Keywords
- Temporal forgery localization
- hypergraph
- hypergraph convolution
- transformer
Fingerprint
Dive into the research topics of 'TransHFC: Joints Hypergraph Filtering Convolution and Transformer Framework for Temporal Forgery Localization'.