Posts

Google Just Launched Translation Hub, Its Enterprise-Scale Translation Service

On October 11, 2022, at Google Cloud Next ’22, Sundar Pichai, CEO of Google and Alphabet, announced the launch of Translation Hub, the company’s enterprise-scale document translation service. Pichai described Translation Hub as “Google Cloud’s AI agent that helps companies translate content in over 135 languages.” He added, “It takes full documents including images and translates them while preserving layouts and formatting.” Among those targeted by the service, according to the Google CEO, are researchers who want to share their findings with a global audience, product and service providers that want to reach underserved markets, and governments. In a blog post marking the event, June Yang, VP of Google Cloud AI and Industry Solutions, said Translation Hub runs on machine translation and AutoML to translate content from Goog...

Can ‘Huge Amounts’ of Synthetic In-Domain Data Improve Machine Translation

With the many noteworthy advances in machine translation (MT) and natural language processing (NLP), it is no wonder that large- and small-scale users alike now expect each new MT iteration to measurably outperform its predecessor. From a functional perspective, MT does keep getting better, thanks in no small part to research and the large datasets freely available for training equally large MT engines. However, domain-specific MT (check out a recent example) remains very much a work in progress. Researchers Yasmin Moslem, Rejwanul Haque, John D. Kelleher, and Andy Way from Adapt Center, Dublin City University, National College of Ireland, and Technological University Dublin set out to tackle this domain-specific problem with an experiment using three different setups. In a paper published in August 2022, this group of NLP specialists defined the problem as “in-domain data scarcity […] common in translation settings due to the lack of specialized datasets ...