From manual to machine

Meng Guo, Lili Han

Research output: Contribution to journal > Article > peer-review


This study introduces a groundbreaking automated methodology for measuring ear-voice span (EVS) in simultaneous interpreting (SI). Traditionally, assessing EVS - a critical temporal metric in SI - has been hampered by labour-intensive and time-consuming manual methods that are prone to inconsistency. To overcome these challenges, our research harnesses state-of-the-art natural language processing (NLP) technologies, including automatic speech recognition (ASR), sentence boundary detection (SBD) and cross-lingual alignment, to automate EVS measurement. We deployed a comprehensive array of NLP models and evaluated the automated pipelines on a 20-hour English-to-Portuguese SI corpus which featured 57 varied audio pairings. The findings are encouraging: the most effective model combination achieved a median EVS error of less than 0.1 seconds across the corpus. Moreover, the automated pipelines exhibited a high level of accuracy, strong correlation and substantial agreement with manual measurements when assessing median EVS for individual audio pairs. Despite these satisfactory results, certain challenges persist with some NLP models, indicating clear avenues for future research. This study not only introduces a groundbreaking approach to large-scale EVS measurement but also propels the automation of process analysis in Interpreting Studies.
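The final step of such a pipeline, turning aligned timestamps into a median EVS and comparing it with a manual measurement, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes EVS is taken as the onset-to-onset lag between a source sentence and its aligned interpreted sentence, and that the ASR, SBD and cross-lingual alignment stages have already produced the timestamps. The names `AlignedPair`, `ear_voice_spans`, `median_evs` and `median_evs_error` are hypothetical.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class AlignedPair:
    """One source sentence aligned to its interpreted counterpart (hypothetical type)."""
    source_onset: float  # seconds; start of the speaker's sentence (assumed to come from ASR timestamps)
    target_onset: float  # seconds; start of the interpreter's aligned sentence


def ear_voice_spans(pairs):
    """EVS per aligned pair, assumed here to be the onset-to-onset lag in seconds."""
    return [p.target_onset - p.source_onset for p in pairs]


def median_evs(pairs):
    """Median EVS over all aligned pairs of one audio pairing."""
    return median(ear_voice_spans(pairs))


def median_evs_error(auto_pairs, manual_pairs):
    """Absolute difference between automated and manual median EVS, in seconds."""
    return abs(median_evs(auto_pairs) - median_evs(manual_pairs))
```

Using the median rather than the mean keeps a single badly aligned sentence from dominating the summary for an audio pair, which is one plausible reason to report median EVS.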

Original language: English
Pages (from-to): 24-54
Number of pages: 31
Issue number: 1
Publication status: Published - 5 Mar 2024


  • automatic speech recognition
  • cross-lingual alignment
  • ear-voice span
  • sentence boundary detection
  • simultaneous interpreting

