TY - JOUR
T1 - Recognition of score words in freestyle kayaking using improved DTW matching
AU - Zhang, Qiyuan
AU - Yuan, Xiaochen
AU - Lam, Chan Tong
N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
PY - 2024/9
Y1 - 2024/9
AB - Voice is the most natural information carrier for human beings, and is likely to become the main method of human–computer interaction in the future. This article focuses on the recognition of score words in freestyle kayaking, and collects words from multiple speakers, each with a specific freestyle kayak action word. In this paper, a new method using mel-scale frequency cepstral coefficients (MFCC) and improved dynamic time warping (DTW) is presented for isolated speech recognition. An endpoint detection method is proposed and implemented based on short-time energy and zero-crossing rate. After preprocessing with endpoint detection, the speech signal was analyzed and converted into speech feature parameters using MFCC. During the training phase, the signals of the training part were trained, and the labeled features were generated. During the identification phase, we improved the DTW algorithm by using multiple constraints to make path matching within the constraints more accurate. Experiments were conducted and the results showed a high recognition rate for a specific score word in freestyle kayaking. In addition, this method provides relatively good results in noisy environments with high signal-to-noise ratios.
KW - Endpoint detection
KW - Freestyle kayaking
KW - Improved dynamic time warping
KW - Score words recognition
UR - http://www.scopus.com/inward/record.url?scp=85185329759&partnerID=8YFLogxK
U2 - 10.1007/s11042-024-18383-w
DO - 10.1007/s11042-024-18383-w
M3 - Article
AN - SCOPUS:85185329759
SN - 1380-7501
VL - 83
SP - 75731
EP - 75755
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 31
ER -