
Reconstructing Speech from Real-Time Articulatory MRI Using Neural Vocoders

  • Yide Yu
  • Amin Honarmandi Shandiz
  • László Tóth

Research output: Conference contribution, peer-reviewed

16 citations (Scopus)

Abstract

Several approaches exist for recording articulatory movements, such as electromagnetic and permanent magnetic articulography, ultrasound tongue imaging and surface electromyography. Although magnetic resonance imaging (MRI) is more costly than these approaches, recent developments in this area now allow real-time MRI videos of the articulators to be recorded at an acceptable resolution. Here, we experiment with reconstructing the speech signal from a real-time MRI recording using deep neural networks. Instead of estimating speech directly, our networks are trained to output a spectral vector, from which we reconstruct the speech signal using the WaveGlow neural vocoder. We compare the performance of three deep neural architectures for the estimation task, combining convolutional (CNN) and recurrence-based (LSTM) neural layers. Besides the mean absolute error (MAE) of our networks, we also evaluate our models by comparing the resulting speech signals using several objective speech quality metrics, such as the mean cepstral distortion (MCD), Short-Time Objective Intelligibility (STOI), Perceptual Evaluation of Speech Quality (PESQ) and Signal-to-Distortion Ratio (SDR). The results indicate that our approach can successfully reconstruct the gross spectral shape, but further improvements are needed to reproduce the fine spectral details.
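As a rough illustration of two of the frame-level measures named in the abstract, the sketch below computes the mean absolute error between predicted and reference spectral vectors, and a cepstral distortion in dB. The abstract does not give the exact definitions used, so the distortion formula here is an assumption: it is the commonly used mel-cepstral distortion form, (10 / ln 10) · sqrt(2 · Σ (c − ĉ)²) averaged over frames, with the 0th (energy) coefficient skipped by default. The function names and the list-of-frames input layout are hypothetical, chosen only for this example.

```python
import math


def mean_absolute_error(pred, target):
    """Mean absolute error over all frames and coefficients.

    pred, target: lists of frames, each frame a list of floats.
    """
    total = 0.0
    count = 0
    for p_frame, t_frame in zip(pred, target):
        for p, t in zip(p_frame, t_frame):
            total += abs(p - t)
            count += 1
    return total / count


def cepstral_distortion(pred, target, skip_c0=True):
    """Frame-averaged cepstral distortion in dB.

    Assumed formula (not taken from the paper):
    (10 / ln 10) * sqrt(2 * sum_d (c_d - c'_d)^2), averaged over frames.
    With skip_c0=True the 0th coefficient is excluded from the sum.
    """
    start = 1 if skip_c0 else 0
    scale = 10.0 / math.log(10.0)
    per_frame = []
    for p_frame, t_frame in zip(pred, target):
        squared = sum(
            (p - t) ** 2 for p, t in zip(p_frame[start:], t_frame[start:])
        )
        per_frame.append(scale * math.sqrt(2.0 * squared))
    return sum(per_frame) / len(per_frame)


if __name__ == "__main__":
    # Toy cepstral frames, invented purely for demonstration.
    predicted = [[1.0, 2.0, 3.0], [0.0, 1.0, 1.0]]
    reference = [[1.0, 2.5, 3.0], [0.5, 1.0, 2.0]]
    print("MAE:", mean_absolute_error(predicted, reference))
    print("Cepstral distortion (dB):", cepstral_distortion(predicted, reference))
```

STOI, PESQ and SDR operate on the reconstructed waveform rather than on spectral frames and are normally computed with dedicated implementations, so they are left out of this sketch.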

Original language: English
Title of host publication: 29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings
Publisher: European Signal Processing Conference, EUSIPCO
Pages: 945-949
Number of pages: 5
ISBN (Electronic): 9789082797060
Publication status: Published - 2021
Externally published
Event: 29th European Signal Processing Conference, EUSIPCO 2021 - Dublin, Ireland
Duration: 23 Aug 2021 to 27 Aug 2021

Publication series

Name: European Signal Processing Conference
2021-August
ISSN (Print): 2219-5491

Conference

Conference: 29th European Signal Processing Conference, EUSIPCO 2021
Country/Territory: Ireland
City: Dublin
Period: 23/08/21 to 27/08/21
