The potential of ChatGPT in translation evaluation: A case study of the Chinese-Portuguese machine translation

Research output: Contribution to journalArticlepeer-review

4 Citations (Scopus)

Abstract

The integration of artificial intelligence (AI) in translation assessment represents a significant evolution in the field, transcending traditional human-only scoring approaches. This study specifically examines the role of ChatGPT, a multilingual, transformer-based large language model developed by OpenAI, in the automated evaluation of machine translations between Portuguese and Mandarin. Despite ChatGPT's burgeoning reputation for its advanced Natural Language Processing (NLP) capabilities, research on its application in translation evaluation, particularly for this language pair, remains unexplored. To fill this gap, our research employed three prevalent machine translation tools to translate a set of twenty sentences from Chinese into Portuguese. Translated target text versions provided by professional Chinese-Portuguese translators were also included to estimate if the machine-translated target texts have achieved a certain degree of human parity. We then assessed these translations using both GPT models (ChatGPT 3.5 and ChatGPT 4.0) and five human raters to offer a comprehensive scoring analysis. The study's findings reveal that, particularly ChatGPT 4.0, exhibits substantial promise in evaluating translations across varied text types. However, this potential is tempered by notable inconsistencies and limitations in its performance. Through both quantitative analysis and qualitative insights, this research highlights the synergy between ChatGPT's automated scoring and traditional human assessment. It uncovers some key benefits of this automated approach: (1) exploring viability of automated translation evaluation, particularly in Chinese-Portuguese language pair; (2) fostering critical supplement to human evaluation, and (3) uncovering the astonishing capability of cutting-edge machine translation tools in Chinese-Portuguese language pair. Our findings contribute to a more detailed comprehension of ChatGPT's role in translation assessment and underscore the need for a balanced approach that leverages both human expertise and AI capabilities.

Original languageEnglish
Article numbere98613
JournalCadernos de Traducao
Volume44
Issue number1
DOIs
Publication statusPublished - 31 Dec 2024

Keywords

  • ChatGPT
  • automatic scoring
  • evaluation metric
  • human assessment
  • machine translation (MT)

Fingerprint

Dive into the research topics of 'The potential of ChatGPT in translation evaluation: A case study of the Chinese-Portuguese machine translation'. Together they form a unique fingerprint.

Cite this