Google Says There’s a Better Way to Create High-Quality Training Data for AI Translation
In an October 14, 2024 paper, Google researchers highlighted the potential of AI translations refined by humans or human translations refined by large language models (LLMs) as alternatives to traditional human-only references.
Talking to Slator, Zhongtao Liu, a Software Engineer at Google, explained that their study addresses a growing challenge in the translation industry: scaling the collection of high-quality data needed for fine-tuning and testing machine translation (MT) systems.
With translation demand expanding across multiple languages, domains, and use cases, traditional methods that rely solely on human translators have become increasingly expensive, time-consuming, and hard to scale.
To address this challenge, the researchers explored more efficient approaches to collect high-quality translation data. They compared 11 different approaches — including human-only, machine-only, and hybrid methods — to determine the most effective and cost-efficient one.
Human-only workflows involved either a single human translation step or included an additional one or two human review steps. Machine-only workflows ranged from single-step AI translations using top AI systems — MT systems or LLMs — to more complex workflows, where AI translations were refined by an LLM. Hybrid workflows combined human expertise and AI efficiency; in some cases, AI translations were refined by humans (i.e., post-editors), while in others, human translations were refined by LLMs.
They found that combining human expertise and AI efficiency can achieve translation quality comparable to, or even better than, traditional human-only workflows — all while significantly reducing costs. “Our findings demonstrate that human-machine collaboration can match or even exceed human-only translation quality while being more cost-efficient,” the researchers said.
The best balance of quality and cost appears to be human post-editing of AI translations. This approach delivered top-tier quality, on par with traditional human-only methods, at only 60% of their cost.
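To make the cost figure concrete, here is a minimal Python sketch of the arithmetic. Only the 60% ratio comes from the article; the per-segment price and the budget are hypothetical numbers chosen for illustration, not figures from the paper.

```python
# Only the 60% ratio is taken from the article; every other number is assumed.
COST_RATIO_PERCENT = 60  # post-editing cost as a share of human-only cost


def segments_affordable(budget_cents: int, cost_per_segment_cents: int) -> int:
    """Return how many whole segments a budget covers at a given unit cost."""
    return budget_cents // cost_per_segment_cents


# Hypothetical price of one human-only translated segment, in cents.
HUMAN_ONLY_COST_CENTS = 100
# Post-editing price derived from the reported ratio (integer math avoids float error).
post_edit_cost_cents = HUMAN_ONLY_COST_CENTS * COST_RATIO_PERCENT // 100

# Hypothetical data-collection budget, in cents.
budget_cents = 300_000

human_only_segments = segments_affordable(budget_cents, HUMAN_ONLY_COST_CENTS)
post_edit_segments = segments_affordable(budget_cents, post_edit_cost_cents)

print(f"Human-only segments:   {human_only_segments}")
print(f"Post-edited segments:  {post_edit_segments}")
```

Under these assumed prices, the same budget covers roughly 1.67 times as many segments with post-editing, which is the sense in which the 60% figure helps with scaling data collection.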
“This indicates that human-machine collaboration can be a faster, more cost-efficient alternative to traditional collection of translations from humans, optimizing both quality and resource allocation by leveraging the strengths of both humans and machines,” they noted.
The researchers emphasized that the quality improvements stem from the complementary strengths of humans and AI working together, rather than from the superior capability of either the AI or the human post-editor alone.
They noted that LLMs were less effective than human post-editors at identifying and correcting errors in AI-generated translations. On the other hand, human reviewers tended to make fewer changes when reviewing human-generated translations, possibly overlooking certain errors. Interestingly, even additional rounds of human review did not substantially improve the quality. This observation supports the argument for human-machine collaboration, where each component helps address the other’s blind spots, according to the researchers.
“These findings highlight the complementary strengths of human and machine post-editing methods, indicating that a hybrid method is likely the most effective strategy,” they said.
Authors: Zhongtao Liu, Parker Riley, Daniel Deutsch, Alison Lui, Mengmeng Niu, Apu Shah, and Markus Freitag