TY - GEN
T1 - A Multimodal Behavior Recognition Network with Interconnected Architectures
AU - Long, Nuoer
AU - Un, Kin Seong
AU - Xiong, Chengpeng
AU - Li, Zhuolin
AU - Chen, Shaobin
AU - Tan, Tao
AU - Lam, Chan Tong
AU - Sun, Yue
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - The feature extraction component of a behavior recognition network plays a crucial role in the recognition results. Different feature extraction networks yield different accuracies, and for efficiency, a recognition network usually adopts only the single best-performing one. In response, we propose a network architecture that combines the advantages of different feature extraction networks, referred to as the connecting feature network (CFN). CFN uses a two-stage method: in the first stage, ResNet serves as the feature extraction network; in the second stage, a behavior-aware network based on the vision transformer performs feature extraction. The phased training is intended to preserve the advantages of each feature extraction network. CFN can also be applied flexibly to other tasks involving multiple network architectures, integrating their diverse feature extraction capabilities. By integrating these components, we aim to improve the overall performance of behavior recognition systems across different domains. The effectiveness of CFN was validated on the Animal Kingdom dataset.
KW - Behavior Recognition
KW - Connecting Feature Network (CFN)
KW - Diverse Feature Extraction
KW - Two-Stage Approach
UR - http://www.scopus.com/inward/record.url?scp=85203831608&partnerID=8YFLogxK
U2 - 10.1109/ICMEW63481.2024.10645380
DO - 10.1109/ICMEW63481.2024.10645380
M3 - Conference contribution
AN - SCOPUS:85203831608
T3 - 2024 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2024
BT - 2024 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2024
Y2 - 15 July 2024 through 19 July 2024
ER -