TY - JOUR
T1 - From manual to machine
AU - Guo, Meng
AU - Han, Lili
N1 - Publisher Copyright: © 2024 John Benjamins Publishing Company. All rights reserved.
PY - 2024/3/5
Y1 - 2024/3/5
N2 - This study introduces a groundbreaking automated methodology for measuring ear-voice span (EVS) in simultaneous interpreting (SI). Traditionally, assessing EVS - a critical temporal metric in SI - has been hampered by labour-intensive and time-consuming manual methods that are prone to inconsistency. To overcome these challenges, our research harnesses state-of-the-art natural language processing (NLP) technologies, including automatic speech recognition (ASR), sentence boundary detection (SBD) and cross-lingual alignment, to automate EVS measurement. We deployed a comprehensive array of NLP models and evaluated the automated pipelines on a 20-hour English-to-Portuguese SI corpus which featured 57 varied audio pairings. The findings are encouraging: the most effective model combination achieved a median EVS error of less than 0.1 seconds across the corpus. Moreover, the automated pipelines exhibited a high level of accuracy, strong correlation and substantial agreement with manual measurements when assessing median EVS for individual audio pairs. Despite these satisfactory results, certain challenges persist with some NLP models, indicating clear avenues for future research. This study not only introduces a groundbreaking approach to large-scale EVS measurement but also propels the automation of process analysis in Interpreting Studies.
KW - automatic speech recognition
KW - cross-lingual alignment
KW - ear-voice span
KW - sentence boundary detection
KW - simultaneous interpreting
UR - http://www.scopus.com/inward/record.url?scp=85187948926&partnerID=8YFLogxK
U2 - 10.1075/intp.00100.guo
DO - 10.1075/intp.00100.guo
M3 - Article
AN - SCOPUS:85187948926
SN - 1384-6647
VL - 26
SP - 24
EP - 54
JO - Interpreting
JF - Interpreting
IS - 1
ER -