TY - JOUR
T1 - Paralinguistic and spectral feature extraction for speech emotion classification using machine learning techniques
AU - Liu, Tong
AU - Yuan, Xiaochen
N1 - Publisher Copyright:
© 2023, The Author(s).
PY - 2023/12
Y1 - 2023/12
N2 - Emotion plays a dominant role in speech. The same utterance with different emotions can lead to a completely different meaning. The ability to perform various of emotion during speaking is also one of the typical characters of human. In this case, technology trends to develop advanced speech emotion classification algorithms in the demand of enhancing the interaction between computer and human beings. This paper proposes a speech emotion classification approach based on the paralinguistic and spectral features extraction. The Mel-frequency cepstral coefficients (MFCC) are extracted as spectral feature, and openSMILE is employed to extract the paralinguistic feature. The machine learning techniques multi-layer perceptron classifier and support vector machines are respectively applied into the extracted features for the classification of the speech emotions. We have conducted experiments on the Berlin database to evaluate the performance of the proposed approach. Experimental results show that the proposed approach achieves satisfied performances. Comparisons are conducted in clean condition and noisy condition respectively, and the results indicate better performance of the proposed scheme.
AB - Emotion plays a dominant role in speech. The same utterance with different emotions can lead to a completely different meaning. The ability to perform various of emotion during speaking is also one of the typical characters of human. In this case, technology trends to develop advanced speech emotion classification algorithms in the demand of enhancing the interaction between computer and human beings. This paper proposes a speech emotion classification approach based on the paralinguistic and spectral features extraction. The Mel-frequency cepstral coefficients (MFCC) are extracted as spectral feature, and openSMILE is employed to extract the paralinguistic feature. The machine learning techniques multi-layer perceptron classifier and support vector machines are respectively applied into the extracted features for the classification of the speech emotions. We have conducted experiments on the Berlin database to evaluate the performance of the proposed approach. Experimental results show that the proposed approach achieves satisfied performances. Comparisons are conducted in clean condition and noisy condition respectively, and the results indicate better performance of the proposed scheme.
KW - Multi-layer perceptron classifier
KW - Paralinguistic features
KW - Spectral features
KW - Speech emotion classification
KW - Support vector machine
UR - http://www.scopus.com/inward/record.url?scp=85159854814&partnerID=8YFLogxK
U2 - 10.1186/s13636-023-00290-x
DO - 10.1186/s13636-023-00290-x
M3 - Article
AN - SCOPUS:85159854814
SN - 1687-4714
VL - 2023
JO - Eurasip Journal on Audio, Speech, and Music Processing
JF - Eurasip Journal on Audio, Speech, and Music Processing
IS - 1
M1 - 23
ER -