
A multi-modal speech emotion recognition method based on graph neural networks

Research output: Article › peer-review

Abstract

For emotion recognition tasks, combining text and speech provides better modal interactivity and less feature redundancy, which can effectively improve performance. However, existing methods easily neglect the potential relationships and differences between modalities, making it difficult to fully understand and exploit multi-modal, multi-level emotional information. Therefore, this article proposes a novel multi-modal speech emotion recognition method based on graph neural networks. Specifically, a reconstructed-graph fusion mechanism is designed to achieve cross-modal interaction and enhance the interpretability of the fused features, and a gating update mechanism is designed to eliminate modal redundancy while preserving emotional characteristics. A weighted accuracy of 78.09% and an unweighted accuracy of 78.44% are achieved on the IEMOCAP dataset, comparable to or better than the baseline methods.
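The abstract describes two components: a reconstructed-graph fusion step that learns cross-modal edges between speech and text, and a gating update that filters redundant information from the fused representation. The sketch below is only a minimal illustration of that general idea, not the paper's actual architecture; the module name `GatedGraphFusion`, the feature dimensions, and the specific layer choices are all assumptions for demonstration.

```python
import torch
import torch.nn as nn

class GatedGraphFusion(nn.Module):
    """Illustrative sketch: cross-modal graph fusion with a gating update.

    Speech and text utterance features are treated as paired graph nodes;
    a learned edge weight (a stand-in for the reconstructed graph) controls
    cross-modal message passing, and a gate decides how much of the fused
    message to keep versus the original modality feature.
    """

    def __init__(self, dim: int = 256, num_classes: int = 4):
        super().__init__()
        self.edge_proj = nn.Linear(2 * dim, 1)   # scores a speech-text node pair
        self.msg_proj = nn.Linear(dim, dim)      # transforms the neighbor message
        self.gate = nn.Linear(2 * dim, dim)      # gating update over fused features
        self.classifier = nn.Linear(2 * dim, num_classes)

    def forward(self, speech: torch.Tensor, text: torch.Tensor) -> torch.Tensor:
        # speech, text: (batch, dim) utterance-level features from each modality.
        # Reconstructed-graph step: learn a cross-modal edge weight per pair.
        pair = torch.cat([speech, text], dim=-1)
        edge_w = torch.sigmoid(self.edge_proj(pair))          # (batch, 1)

        # Cross-modal message passing along the learned edges.
        msg_to_speech = edge_w * torch.tanh(self.msg_proj(text))
        msg_to_text = edge_w * torch.tanh(self.msg_proj(speech))

        # Gating update: keep emotional content, suppress redundant features.
        g_s = torch.sigmoid(self.gate(torch.cat([speech, msg_to_speech], dim=-1)))
        g_t = torch.sigmoid(self.gate(torch.cat([text, msg_to_text], dim=-1)))
        speech_new = g_s * msg_to_speech + (1 - g_s) * speech
        text_new = g_t * msg_to_text + (1 - g_t) * text

        fused = torch.cat([speech_new, text_new], dim=-1)
        return self.classifier(fused)                          # emotion logits

if __name__ == "__main__":
    model = GatedGraphFusion()
    speech_feat = torch.randn(8, 256)   # e.g. pooled acoustic features
    text_feat = torch.randn(8, 256)     # e.g. pooled text-encoder features
    print(model(speech_feat, text_feat).shape)  # torch.Size([8, 4])
```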

Original language: English
Article number: 1051
Journal: Applied Intelligence
Volume: 55
Issue number: 16
DOIs
Publication status: Published - Nov 2025
