Abstract
This paper focuses on seven data augmentation methods based on the Emphasized Channel Attention Propagation and Aggregation-Time Delay Neural Network (ECAPA-TDNN) model for increasing the diversity of training data to improve model accuracy and true positive rate (TPR/recall). We propose a method to improve classification performance by replacing and reducing the datasets. We also verified the effect of the number of layers on the classification performance by modifying the number of layers of the SE-Res2Block in the ECAPA-TDNN model. The proposed method is validated with the ZhVoice and VoxCeleb datasets, and the results show that the best model accuracy and classification performance can be obtained by using ZhVoice with seven data augmentations on a 3-layer SE-Res2Block. The accuracy reached 0.9477, the TPR reached 0.8945, and the EER was 0.1278. We also used the diagonal cosine algorithm to determine the similarity between two speakers, validating the classification performance of the model.
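The two scoring ideas named in the abstract, cosine similarity between two speakers and the equal error rate (EER), can be illustrated with a minimal sketch. It makes several assumptions that do not come from the paper: it uses plain cosine similarity on fixed-length speaker embeddings (the paper's "diagonal cosine algorithm" may differ), it approximates the EER by sweeping thresholds over the observed scores, and the function names `cosine_similarity` and `equal_error_rate` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (assumed non-zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def equal_error_rate(scores, labels):
    """Approximate EER for verification trials.

    scores: similarity score per trial (higher means more likely same speaker).
    labels: True for same-speaker trials, False for different-speaker trials.
    Assumes at least one trial of each kind.
    """
    targets = [s for s, same in zip(scores, labels) if same]
    nontargets = [s for s, same in zip(scores, labels) if not same]
    best_gap, eer = float("inf"), 1.0
    # Try each observed score as the accept threshold (accept if score >= t).
    for t in sorted(set(scores)):
        far = sum(s >= t for s in nontargets) / len(nontargets)  # false accepts
        frr = sum(s < t for s in targets) / len(targets)         # false rejects
        gap = abs(far - frr)
        if gap < best_gap:
            # EER is where the two error rates meet; average them at the
            # closest threshold found.
            best_gap, eer = gap, (far + frr) / 2
    return eer
```

In practice, the embeddings would come from the trained ECAPA-TDNN model, and each trial score would be `cosine_similarity(embedding_1, embedding_2)` for a pair of utterances.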
| Original language | English |
|---|---|
| Title of host publication | 12th IEEE International Conference on Renewable Energy Research and Applications, ICRERA 2023 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 414-420 |
| Number of pages | 7 |
| ISBN (electronic) | 9798350337938 |
| DOIs | |
| Publication status | Published - 2023 |
| Event | 12th IEEE International Conference on Renewable Energy Research and Applications, ICRERA 2023 - Oshawa, Canada. Duration: 29 Aug 2023 → 1 Sep 2023 |
Publication series
| Name | 12th IEEE International Conference on Renewable Energy Research and Applications, ICRERA 2023 |
|---|---|
Conference
| Conference | 12th IEEE International Conference on Renewable Energy Research and Applications, ICRERA 2023 |
|---|---|
| Country/Territory | Canada |
| City | Oshawa |
| Period | 29/08/23 → 1/09/23 |
UN SDG
This research output contributes to the following Sustainable Development Goals:
- Affordable and clean energy
Fingerprint
Dive into the research topics of "Data Augmentation with ECAPA-TDNN Architecture for Automatic Speaker Recognition". Together they form a unique fingerprint.