TY - GEN
T1 - Cross-Modality Disentangled Information Bottleneck Strategy for Multimodal Sentiment Analysis
AU - Deng, Zhengnan
AU - Huang, Guoheng
AU - Zhong, Guo
AU - Yuan, Xiaochen
AU - Huang, Lian
AU - Pun, Chi Man
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Multimodal Sentiment Analysis (MSA) is a pivotal research area that uses diverse information carriers, such as videos containing multiple modalities, to understand a user's sentiment. With the success of multimodal fusion techniques, many fusion strategies have been proposed to obtain a favorable multimodal joint representation for MSA. However, existing studies rarely consider the problem of redundant information within individual modalities, so the joint representation may contain much redundant information from different modalities, which limits the accuracy of sentiment prediction. In this work, we propose a Cross-Modality Disentangled Information Bottleneck Strategy (CMDIBS), which consists of a Cross-Modality Knowledge Awareness (CMKA) module and a Multimodal Disentangled Information Bottleneck (MDIB) mechanism. Specifically, the CMKA module encourages interactions among different modalities to learn the sentiment embedding relevant to the prediction goal. The MDIB mechanism aims to maximize the mutual information (MI) between the multimodal joint representation and the predicted label, and to maximize the MI between the style embedding with the label and the input data, while constraining the MI between the multimodal joint representation and the style embedding, in order to obtain a succinct and efficient multimodal joint representation. Experimental results on the benchmark datasets CMU-MOSI and CMU-MOSEI indicate that the proposed method surpasses existing approaches and attains state-of-the-art (SOTA) performance.
AB - Multimodal Sentiment Analysis (MSA) is a pivotal research area that uses diverse information carriers, such as videos containing multiple modalities, to understand a user's sentiment. With the success of multimodal fusion techniques, many fusion strategies have been proposed to obtain a favorable multimodal joint representation for MSA. However, existing studies rarely consider the problem of redundant information within individual modalities, so the joint representation may contain much redundant information from different modalities, which limits the accuracy of sentiment prediction. In this work, we propose a Cross-Modality Disentangled Information Bottleneck Strategy (CMDIBS), which consists of a Cross-Modality Knowledge Awareness (CMKA) module and a Multimodal Disentangled Information Bottleneck (MDIB) mechanism. Specifically, the CMKA module encourages interactions among different modalities to learn the sentiment embedding relevant to the prediction goal. The MDIB mechanism aims to maximize the mutual information (MI) between the multimodal joint representation and the predicted label, and to maximize the MI between the style embedding with the label and the input data, while constraining the MI between the multimodal joint representation and the style embedding, in order to obtain a succinct and efficient multimodal joint representation. Experimental results on the benchmark datasets CMU-MOSI and CMU-MOSEI indicate that the proposed method surpasses existing approaches and attains state-of-the-art (SOTA) performance.
KW - Cross-modality
KW - Disentangled information bottleneck
KW - Multimodal sentiment analysis
UR - https://www.scopus.com/pages/publications/85217837223
U2 - 10.1109/SMC54092.2024.10831481
DO - 10.1109/SMC54092.2024.10831481
M3 - Conference contribution
AN - SCOPUS:85217837223
T3 - Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
SP - 2267
EP - 2274
BT - 2024 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2024 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2024
Y2 - 6 October 2024 through 10 October 2024
ER -