Evaluating translation quality: a qualitative and quantitative assessment of machine and LLM-driven Arabic–English translations

Mohammed, Tawffeek A. S.

Files

mohammed_evaluating_translation_quality_2025.pdf (15.72 MB)

Date

2025

Authors

Mohammed, Tawffeek A. S.

Publisher

Multidisciplinary Digital Publishing Institute (MDPI)

Abstract

This study investigates translation quality between Arabic and English, comparing traditional rule-based machine translation systems, modern neural machine translation tools such as Google Translate, and large language models like ChatGPT. The research adopts both qualitative and quantitative approaches to assess the efficacy, accuracy, and contextual fidelity of translations. It particularly focuses on the translation of idiomatic and colloquial expressions as well as technical texts and genres. Using well-established evaluation metrics such as bilingual evaluation understudy (BLEU), translation error rate (TER), and character n-gram F-score (chrF), alongside the qualitative translation quality assessment model proposed by Juliane House, this study investigates the linguistic and semantic nuances of translations generated by different systems. This study concludes that although metric-based evaluations like BLEU and TER are useful, they often fail to fully capture the semantic and contextual accuracy of idiomatic and expressive translations. Large language models, particularly ChatGPT, show promise in addressing this gap by offering more coherent and culturally aligned translations. However, both machine translation tools and large language models demonstrate limitations that necessitate human post-editing for high-stakes content. The findings support a hybrid approach, combining machine translation tools with human oversight for optimal translation quality, especially in languages with complex morphology and culturally embedded expressions like Arabic.

Keywords

Machine translation, Large language models, Neural-based, Rule-based, Google Translate

Citation

Mohammed, T. A. S. (2025) Evaluating Translation Quality: A Qualitative and Quantitative Assessment of Machine and LLM-Driven Arabic–English Translations. Information (Basel). [Online] 16 (6), 440.

URI

https://doi.org/10.3390/info16060440
https://hdl.handle.net/10566/22193

Collections

Research Articles (Foreign Languages)
