Multi-Granularity Query Network With Adaptive Category Feature Embedding for Behavior Recognition

  • Nuoer Long
  • Yonghao Dang
  • Kaiwen Yang
  • Chengpeng Xiong
  • Shaobin Chen
  • Tao Tan
  • Wei Ke
  • Chan Tong Lam
  • Jianqin Yin
  • Peter H.N. de With
  • Yue Sun

Research output: Contribution to journal › Article › peer-review

Abstract

Behavior recognition is a highly challenging task, particularly in scenarios requiring unified recognition across both human and animal subjects. Most existing approaches primarily focus on single-species datasets or rely heavily on prior information such as species labels, positional annotations, or skeletal keypoints, which limits their applicability in real-world scenarios where species labels may be ambiguous or annotations are insufficient. To address these limitations, we propose a query-based Multi-Granularity Behavior Recognition Network that directly mines cross-species shared spatiotemporal behavior patterns from raw video inputs. Specifically, we design a Multi-Granularity Query module to effectively fuse fine-grained and coarse-grained features, thereby enhancing the model's capability in capturing spatiotemporal dynamics at different granularities. Additionally, we introduce a Category Query Decoder that leverages learnable category query vectors to achieve explicit behavior category modeling and mapping. Without relying on any extra annotations, the proposed method achieves unified recognition of multi-species and multi-category behaviors, setting a new state-of-the-art on the Animal Kingdom dataset and demonstrating strong generalization ability on the Charades dataset.
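
The abstract does not give implementation details, so the following is only an illustrative sketch of the two ideas it names: fusing fine- and coarse-grained temporal features, and decoding with one learnable query vector per behavior category. Everything here is an assumption made for illustration, not the authors' architecture: the function names, the average-pooling stride, the single attention step, the per-category linear read-out, and the random arrays standing in for learned weights and video features.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_granularity_tokens(frames, stride=4):
    """Hypothetical fusion: keep per-frame (fine) features and append
    temporally average-pooled (coarse) features. Assumes T >= stride."""
    T, D = frames.shape
    usable = (T // stride) * stride
    coarse = frames[:usable].reshape(-1, stride, D).mean(axis=1)
    return np.concatenate([frames, coarse], axis=0)  # (T + T // stride, D)

def category_query_decode(tokens, queries, w_out, b_out):
    """Hypothetical decoder step: each category query attends over all
    tokens, and its attended feature is mapped to one probability."""
    d = queries.shape[1]
    attn = softmax(queries @ tokens.T / np.sqrt(d), axis=-1)  # (C, N)
    cat_feats = attn @ tokens                                 # (C, D)
    logits = (cat_feats * w_out).sum(axis=1) + b_out          # (C,)
    return 1.0 / (1.0 + np.exp(-logits))  # independent sigmoid per category

# Random stand-ins for video features and learned parameters.
rng = np.random.default_rng(0)
T, D, C = 16, 32, 5
frames = rng.standard_normal((T, D))
queries = rng.standard_normal((C, D)) * 0.1
w_out = rng.standard_normal((C, D)) * 0.1
b_out = np.zeros(C)

tokens = multi_granularity_tokens(frames, stride=4)
probs = category_query_decode(tokens, queries, w_out, b_out)
print(tokens.shape, probs.shape)
```

A per-category sigmoid is used in this sketch because a single clip can show several behaviors at once; whether the published model uses the same read-out is not stated in the abstract.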

Original language: English
Journal: IEEE Transactions on Multimedia
Publication status: Accepted/In press - 2025

Keywords

  • Behavior recognition
  • Category Query
  • Cross-Species
  • Multi-Granularity
