Thursday, January 9, 2025

US Government RFP Seeks Translation Into Four Native American Languages

The United States government has issued an unusual RFP for translation services: The target languages are all indigenous to the US.


The contracting agency is the Office of Indian Economic Development (OIED), which falls under the Bureau of Indian Affairs, the bureau that governs programs concerning federally recognized American Indian Tribes. OIED has allied with the Department of Agriculture, or USDA, in this contract. The contract will provide a means whereby diverse agencies can request translation into Native languages.

The RFP features a set-aside for Indian Small Business Economic Enterprises, meaning that only companies meeting certain revenue and ownership requirements may apply. OIED would prefer to award a single contractor work for all four languages.

"This is a one-year project that will respond to federal agency requests for ongoing and diverse Native Language translation that will be specific to the federal agency needs," the RFP states, noting the contract may be extended more than once, but only for an additional period of up to six months. Work covered under the contract runs from January 20, 2025 to January 19, 2026.

The ultimate goal is to make a range of content, from official documents and signage to websites, available to the "widest possible audience of the Tribal Nations."

There are 574 federally recognized Tribal Nations. Of those, 229 are located in the state of Alaska. The other 345 Tribal Nations are spread across 35 other states.

The contract would, in turn, deal with the "more prevalent native languages," most likely those that are spoken most widely.

Stats and Translation Requests

The four target languages are Yup’ik (Central dialect), Cherokee (Western dialect), Ojibwe (Western dialect), and Navajo. The contract estimates that each language will require 610 hours of translation — a somewhat uncommon way of pricing translation — for a total of 2,440 hours.

According to the American Community Survey for 2009-2013, Navajo is the most-spoken indigenous language in the US, with nearly 167,000 speakers, 35,250 of whom self-report as speaking English less than very well. The latter would be considered individuals with limited English proficiency (LEP). 

The other three languages have fewer speakers overall, and fewer individuals with LEP, including about 6,000 speakers of the Alaska Native language Yup’ik; 1,460 speakers of Cherokee; and 1,100 speakers of Ojibwe. 

With relatively small populations of people with LEP, the impetus for the RFP goes beyond numbers.

Indeed, the outgoing Biden-Harris Administration issued on December 9, 2024 a “10-year National Plan on Native Language Revitalization,” described as charting “a path to help address the United States government’s role in the loss of Native languages across the continental United States, Alaska, and Hawai’i.”

Some Tribal Nations have resources to handle (certain) translations on their own. The Cherokee Nation Translation Department, for instance, offers free translations for nonprofit uses related to education, health, and legal services. But there are limits. 

“Due to the large volume of requests, Cherokee Nation Translation does not accept unsolicited documents such as poetry, scripts, screenplays, and book manuscripts for translation,” its website states. Nor does it translate tattoos or “names in Cherokee for children, family members, [or] pets”. 



Wednesday, January 8, 2025

Sony Aims to Improve AI Translation for Indian Language Entertainment Content

In a December 29, 2024 paper, Sony Research India researchers Pratik Rakesh Singh, Mohammadi Zaki, and Pankaj Wasnik presented a framework specifically designed to "improve entertainment content translations" in Indian languages.


They "believe it is the first of its kind," combining context awareness with style adaptation to produce translations that are not only accurate but also entertaining for the target audience.

The researchers explained that traditional machine translation (MT) systems usually struggle to handle entertainment content because they mostly translate sentences in isolation. This leads to "disconnected" translations that fail to capture the emotional depth or cultural references behind the original dialogue. The effect is particularly pronounced in entertainment, where interconnected conversations and subtle narrative cues are vital.

"The challenge, in entertainment translation, lies in preserving the context, mood, and style of the original content while also including creativity and considerations of regional dialects, idioms, and other linguistic nuances," the researchers explained.

To tackle this challenge, the researchers developed CASAT (Context and Style Aware Translation), which combines the two concepts during the translation process.

The CASAT framework starts with segmenting the input text — like dialogues from movies or series — into smaller sections known as "sessions." Sessions are dialogues that are consistent in their genre or mood, such as comedy or drama. This segmentation allows CASAT to focus on the specific emotional and narrative elements of each session.

For every session, CASAT estimates two critical components: context and style. The former is the narrative framework that surrounds the dialogue, while the latter denotes the emotional tone and cultural nuances, such as seriousness, excitement, or humor. With both in hand, the framework can produce translations that resonate with the target audience.

To facilitate this, CASAT uses a context retrieval module that retrieves relevant scenes or dialogues from a vector database, so that the translation is grounded in the appropriate narrative framework. It also applies a domain adaptation module that draws on session-level and sentence-level dialogue to infer the intended emotional tone and intent.

Once the context and style are estimated, CASAT generates a customized prompt that combines these elements. The customized prompt is then passed to an LLM, which generates translations that are not only accurate but also carry the intended emotional tone and cultural nuances of the original content.
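
The flow described above (segment into sessions, estimate context and style, build a prompt, call an LLM) can be sketched roughly as follows. This is an illustrative sketch only, not the researchers' implementation: every name in it (Session, segment_into_sessions, retrieve_context, build_prompt, translate_session) is hypothetical, the word-overlap lookup is a toy stand-in for a real vector database, the mood label stands in for the style estimate, and the LLM is passed in as a plain function.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Session:
    """Consecutive dialogue lines that share one genre or mood."""
    mood: str
    lines: list[str]


def segment_into_sessions(dialogue: list[tuple[str, str]]) -> list[Session]:
    """Group consecutive (mood, line) pairs into sessions.

    Assumption: each line already carries a mood label.
    """
    sessions: list[Session] = []
    for mood, line in dialogue:
        if sessions and sessions[-1].mood == mood:
            sessions[-1].lines.append(line)
        else:
            sessions.append(Session(mood=mood, lines=[line]))
    return sessions


def retrieve_context(session: Session, plot_notes: list[str]) -> str:
    """Toy stand-in for vector-database retrieval.

    Returns the plot note that shares the most words with the session.
    """
    if not plot_notes:
        return ""
    session_words = set(" ".join(session.lines).lower().split())
    return max(
        plot_notes,
        key=lambda note: len(session_words & set(note.lower().split())),
    )


def build_prompt(session: Session, context: str, target_language: str) -> str:
    """Combine context and style into one prompt (wording is hypothetical)."""
    dialogue = "\n".join(session.lines)
    return (
        f"Translate the dialogue below into {target_language}.\n"
        f"Narrative context: {context}\n"
        f"Style and mood: {session.mood}\n"
        f"Dialogue:\n{dialogue}"
    )


def translate_session(
    session: Session,
    plot_notes: list[str],
    target_language: str,
    llm: Callable[[str], str],
) -> str:
    """Retrieve context, build the prompt, and hand it to the supplied LLM."""
    context = retrieve_context(session, plot_notes)
    return llm(build_prompt(session, context, target_language))
```

Passing the LLM in as a function keeps the sketch model-agnostic, in the spirit of the researchers' claim, and lets the pipeline be exercised with a stub before any real model is attached.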

Superior Performance

CASAT's effectiveness was tested using metrics such as COMET scores and win ratios. CASAT surpassed baseline LLMs and MT systems like IndicTrans2 and NLLB, providing much better translations in terms of content and context.

"Our method exhibits superior performance by consistently incorporating plot and style information compared to directly prompting creativity in LLMs," the researchers said.

They found that context alone substantially improves translation quality, while including style alone yields only a minimal improvement. Combining the two improves quality the most.
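
For readers unfamiliar with win ratios, one common way to compute such a figure is the share of segments on which one system scores higher than another. The sketch below assumes that definition and per-segment quality scores (for example, from a metric like COMET); the paper's exact definition may differ, and the function name and sample scores are made up for illustration.

```python
def win_ratio(scores_a: list[float], scores_b: list[float]) -> float:
    """Fraction of segments where system A scores strictly higher than system B.

    Assumption: both lists hold per-segment scores for the same segments,
    in the same order. Ties count as non-wins for A.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("Both systems must be scored on the same segments.")
    if not scores_a:
        raise ValueError("At least one segment is required.")
    wins = sum(a > b for a, b in zip(scores_a, scores_b))
    return wins / len(scores_a)


# Made-up scores for three segments: A wins on the first and third.
ratio = win_ratio([0.80, 0.70, 0.90], [0.60, 0.75, 0.85])
```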

The researchers noted that CASAT is language and model-agnostic. "Our method is both language and LLM-agnostic, making it a general-purpose tool," they concluded.

Monday, December 30, 2024

The Most Popular Language Industry Stories of 2024

As 2024 comes to a close, it is time to reflect on the most popular stories, trends, innovations, and themes that made the Slator headlines throughout the year, highlighting key developments in the language industry.

Here is a selection of stories that attracted the most attention and engagement from our readers around the world.


Will Large Language Models Edge Linguists Out of the Language Industry?

One of Slator’s most-read stories in 2024 detailed a May 2024 paper from the University of Zurich and Georgetown University that explored the role of linguists in the evolving field of machine translation (MT). The entrance of large language models (LLMs) has reduced the reliance on linguists for grammar and semantic coherence while designing a system. 

However, the authors concluded, there are a number of points in the process where linguistic expertise is still essential. These include building parallel corpora for MT; developing technology for low-resource languages; and identifying linguistic phenomena that may present challenges for a system. Linguists can be especially helpful as humans and machines interface, for example, by designing effective human evaluations and reliably assessing advancements in the field.

Google Translate Ditches Tool for Detailed Human Feedback

Google retired its longstanding human feedback tool, Contribute, which allowed users to press a button and submit an alternative translation. 

As Slator reported in April 2024, Google’s announcement acknowledged Contribute’s role in improving Google Translate and explained that since launching the tool in 2014, “our systems have significantly evolved, allowing us to phase out Contribute.” 

Users can, however, still submit feedback by rating a given translation “good” or “poor,” and, for the latter, selecting a reason from a drop-down menu — a less involved process that speakers of low-resource languages worry might halt improvement of MT for their languages. 

Live Speech-to-Speech AI Translation Goes Commercial

Just one month into 2024, an increasing number of language AI researchers — from academia to private companies — had already begun to focus on live speech-to-speech translation (S2ST). 

Thanks to LLMs, this trend, which kicked off in mid-2023 with models such as Meta’s SeamlessM4T and Google’s AudioPaLM, only accelerated the adoption of live S2ST across multiple commercial applications.

Slator’s rundown of real-world use cases included business meetings, where Microsoft Translator, integrated with the Teams meeting app, provides real-time speech translation in more than 30 languages through Azure AI services. KUDO and Interprefy specialize in real-time AI speech translation for live events and conferences.

Even the high-stakes world of healthcare presents an opportunity for expansion, especially for providers already offering voice technology for healthcare clients. Orion Labs, for instance, offers live speech translation via its Push-to-Talk 2.0 platform. 

Introducing Revamped New Translation Quality ISO Standard 5060

Published in February 2024, ISO 5060 applies not only to language services providers (LSPs), but also to in-house translation departments and individual translators. While it specifically provides guidance for human evaluation of translation output, it can be used for workflows involving human and machine translation, with or without subsequent post-editing. 

The International Organization for Standardization (ISO) established a framework based on “bilingual examination of target language content against source language content,” with the goal of standardizing evaluations so they do not differ significantly from rater to rater. 

There are seven main categories of errors, which can be classified as critical, major, or minor: terminology, accuracy, linguistic conventions, style, locale conventions, audience appropriateness, and design and markup. 
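
As a rough illustration of how an evaluator might record findings against this scheme, the sketch below models the seven categories and three severity levels named above and tallies annotated errors. The class and function names are hypothetical, and the standard itself does not prescribe this data structure or any code.

```python
from collections import Counter
from enum import Enum


class Category(Enum):
    TERMINOLOGY = "terminology"
    ACCURACY = "accuracy"
    LINGUISTIC_CONVENTIONS = "linguistic conventions"
    STYLE = "style"
    LOCALE_CONVENTIONS = "locale conventions"
    AUDIENCE_APPROPRIATENESS = "audience appropriateness"
    DESIGN_AND_MARKUP = "design and markup"


class Severity(Enum):
    CRITICAL = "critical"
    MAJOR = "major"
    MINOR = "minor"


def tally_by_category(errors: list[tuple[Category, Severity]]) -> Counter:
    """Count annotated errors per category."""
    return Counter(category for category, _ in errors)


def tally_by_severity(errors: list[tuple[Category, Severity]]) -> Counter:
    """Count annotated errors per severity level."""
    return Counter(severity for _, severity in errors)


# Made-up annotations from one evaluated text.
errors = [
    (Category.TERMINOLOGY, Severity.MAJOR),
    (Category.ACCURACY, Severity.CRITICAL),
    (Category.TERMINOLOGY, Severity.MINOR),
]
by_category = tally_by_category(errors)
by_severity = tally_by_severity(errors)
```

Keeping category and severity as separate enumerations means two raters can be compared on either dimension, which fits the stated goal of making evaluations differ less from rater to rater.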

Translation AI Agency Lengoo Files for Bankruptcy

In March 2024, Lengoo filed for bankruptcy in a Berlin court, with German news sources pegging Lengoo’s accumulated losses at between USD 8m and USD 16m.

Christopher Kränzler, Alexander Gigga, and Philipp Koch-Büttner founded Lengoo in 2014, originally as an online platform for automating project management and administrative tasks. 

Starting in 2018, investors such as Redalpine, Creathor Ventures, Piton Capital, Inkef Capital, Techstars, and Polipo Ventures expressed confidence in Lengoo’s developing proprietary translation system, with Lengoo raising USD 34m by February 2021, which made it a long way for the LSP to fall.

Amazon Flags Risks of Training LLMs on Web-Scraped MT 

Training LLMs at scale relies on massive amounts of training data scraped from the web. A January 2024 research paper from Amazon investigating the prevalence and quality of MT on the web found that a “shocking amount of the web is machine translated” into many languages. 

And oftentimes, that MT output is low-quality, raising concerns about the quality of training data for LLMs. Researchers also noted a selection bias toward “shorter and more predictable sentences,” potentially from low-quality English content machine translated into many lower-resource languages.

The pervasiveness of low-quality MT in training data, the authors warned, could lead to less fluent models with more hallucinations, particularly for low-resource languages. 

Translators by Any Other Name

Slator’s January 2024 roundup of five polls from 2023 was crowned by the most voted-on — and perhaps introspective — question: Will the term “translator” disappear in the next five years? Close to half of respondents said no, with just over 30% saying it will “definitely” or “possibly” disappear in that time period. 

Inspired by a SlatorPod interview with ASAP-translation.com CEO Jakub Absolon, another poll asked whether readers agreed with Absolon, who suggested the term “full post-editing” should not be used, and should be priced as human translation.

More than 65% of readers agree that the term should not be used, while 18.7% want to keep using it. The remaining 16% are happy to use whatever term the client prefers. 

Other polls touched on inflation, with nearly half of respondents reporting flat rates; ChatGPT, with 80% of readers reporting they do not use it for translation; and the beloved Microsoft Language Portal, used “often” by 46.5% of respondents. 

Real-Time Speech Translation Stars in Biggest OpenAI Release Since ChatGPT

OpenAI has not slowed down since being credited with unleashing accessible AI to the masses. The company’s May 2024 release of GPT-4o offered a range of new or improved capabilities. The single new model was trained end-to-end across text, vision, and audio, with all inputs processed by the same neural network, reportedly with enhanced performance in around 50 languages. 

A demo of GPT-4o featured a brief conversation with OpenAI CTO Mira Murati asking the system a question in Italian, to which GPT-4o responds in English. Cue the hot takes of ‘RIP translators’ and shares in language learning resource Duolingo dipping 3%. OpenAI planned to launch support for GPT-4o’s new audio and video capabilities to a small group of trusted partners in an API within a few weeks.

EU Parliament Issues a New 2024 Call for Tenders for Translation Services

A February 2024 notice posted for translation services would cover translation of single and multiple source language documents in 24 languages for four European institutions: the European Parliament’s Directorate-General for Translation; the European Court of Auditors; the Committee of the Regions of the European Union; and the European Economic and Social Committee. 

While the notice did not mention MT, it did specify output metrics for source and target languages, and contracts — with one lot per language, assigned to a primary contractor and up to four secondary contractors — are estimated to last up to 60 months. No specific budget was listed. Once awarded, the contract will become effective January 1, 2025.

Bankrupt Dutch LSP, WCS Group, Quickly Bought by France’s Powerling 

In a provisional January 2024 ruling, a Dutch judge suspended payments by LSP WCS Group to its creditors, appointing an administrator to negotiate until a hearing a few months later. Of 14 companies under the WCS Group, only one was listed as in “suspension of payment” status; all others were listed under “bankruptcy” status. At the time, WCS Group’s website listed 3,247 active freelancers, whose next steps were unclear. 

Just a few days later, French LSP Powerling acquired WCS Group for an undisclosed amount. Powerling, which already had a presence in the Netherlands — plus France, Hong Kong, and the US — said the move was in line with the company’s goal of clearing EUR 25m in revenues by the end of 2024 through acquisitions in Powerling’s main markets.

Thursday, December 26, 2024

Does the Machine Translation Post-Editing Activity Require a Lot of Time and Effort?

For the language industry, 2024 will go down as a year of fast-paced developments and innovation. That growth came with distinct technology trends, including translation as a feature (TaaF), the emergence of multimodal AI, retrieval-augmented generation (RAG), and applications enabled by large language models (LLMs). 

The integration of AI tools and human expertise was central to discussions among industry specialists, even as companies of different sizes had their own perspectives. Reader responses to the weekly Slator polls offer snapshots of sentiment, preferences, and outlook across the industry. 

1. Is it Time for Language Service Providers to Change Their Mindset? 

The language services sector has survived difficult times in the past, but 2024 was not business as usual: the industry started the year on the wrong foot as reports surfaced of some firms filing for bankruptcy. This is a sign that adequate funding and access to the latest generation of AI tools do not guarantee survival in the business. The German AI company Lengoo, for example, went bankrupt in March 2024, after the Dutch WCS Group went bankrupt in December 2023 (it was later purchased by Powerling).

In a weekly Slator poll, more than half of respondents (52.1%) said they believe there will be more LSP bankruptcies in 2024, while only 3.4% feel that this is not likely to happen. Respondents who expressed the feeling that recruitments would also be impressive but really insufficient accounted for 31.1%, while 13.4% stated that it was barely possible. 

2. What is Post-editing of Machine Translation in the Market? 

The phrase-matching interpretation of translation is, however, becoming less important as MT broadens into more creative areas such as literature and marketing, and it stands to reason that the volume of MTPE activity has now eclipsed human translation editing. This statement, made by the Société française des traducteurs (SFT), was retracted within hours, but not before the SFT acknowledged that 70% of its members regard post-editing as unnecessary, owing to the low pay for the job and the boring nature of the work. 

LawBuilder.ai supports this claim: in a July 2024 poll, 61.2% of participants affirmed that post-editing is, most of the time, just boring and dull work. Another 23.5% said they do it “once in a while” when they feel it is needed, 10.2% said anything related to MTPE was annoying because the tools were not up to their standard, and 5.1% showed at least some interest in carrying out the work.

 

3. Would Shakespeare Grudge the Other Bard’s Translations?

William Shakespeare’s works reflect British culture and were not translated during his lifetime. Gemini, which was initially called Google Bard, is nonetheless capable of translating all of Shakespeare's works into several foreign languages within minutes. Although research by Burns and Swerve conducted in 2021 showed a bias toward human literary translation, the progress in LLM technology as of January 2024 points to a shift in competency across translation techniques and styles.

AI translation is encroaching on the creative fields of writers, and social media users and publishers have begun to speak about its use and even promote it. However, readers were split on how quickly AI translation may be effectively deployed in creative works. To the question of whether MT use will become widespread in literary translation in 2-3 years, about one-third (31.9%) of the respondents viewed it as improbable. Around one-fifth (20.2%) held the contrary view, leaving the rest roughly evenly divided across likely (17.0%), uncertain (16.0%), and unlikely (14.9%). 

4. Is Translation Quality Evaluation a Solved Problem?

Despite the advancement in MT as well as the development and availability of automated metrics like BLEU and COMET, human evaluation is essential to determine the quality of translations. The new ISO 5060 Standard fulfils this need by indicating how translation output may be evaluated by humans regardless of its source.

The standard consists of seven quality categories, including terminology and style, and each error is also assigned a severity level. Although ISO 5060 focuses on harmonizing approaches to evaluation, as of February 2024 only 6.8% of those polled believe that this is a solved problem, with 72.7% believing the area needs further research and 20.5% believing it depends on the language and the type of text being translated.

5. Has ChatGPT Changed Google Search Behavior?

OpenAI launched the prototype of SearchGPT in July 2024, offering a direct AI-powered "answer experience" instead of traditional search results. It, however, raises several questions on the accuracy of answers and self-referencing as AI-generated content becomes a part of search results.

Has this new release changed how Slator readers search? According to our poll, nearly two-thirds of respondents (65.7%) primarily use the Google search engine. Less than a quarter (22.9%) of readers reported using Google Search a bit, and ChatGPT more often. The rest (11.4%) said they mostly use ChatGPT for questions.

6. Has AI (LLMs, etc.) altered your work life over the last 24 months?

In episode #221 of SlatorPod, Spence Green, Lilt CEO, discussed how the need for AI in localization has become increasingly imperative. This involves the potential of custom LLMs, RAG, and AI orchestration to automate tasks, customize content, process huge volumes of data, and increase ROI.

As companies like Reddit materially demonstrate their faith in AI localization, the language industry's own adoption of and adaptation to AI are being put to the test. According to a Slator poll, more than one in three localization professionals (37.5%) have not introduced AI into their daily work practices yet, while more than one in five (23.6%) have gone all-in. Two equal cohorts, at 19.4% each, use AI language tech "somewhat" or "a little," respectively. 

7. Will AI increase or decrease demand for language learning in the long term?

OpenAI introduced the multimodal GPT-4o model and demonstrated live speech-to-speech translation (S2ST) between Italian and English. Reactions on X came in a storm, pronouncing an end to translators and language learning as we have known them. Social media jitters turned into quakes and caused some investors to sell shares of the language learning platform Duolingo.

 

Reader responses to the question of whether AI would increase or decrease demand for language learning in the long term were split, with more than one-third (36.9%) predicting that demand will increase and a nearly equal share (35.4%) predicting that it will decrease. A little over a quarter (27.7%) believe demand will stay the same.

8. What role would prepare you best to lead an LSP?

The world's largest LSP, TransPerfect, has been growing continuously on the basis of acquisitions and diverse offerings, including technology. But the real engine in the company is, of course, people such as Jin Lee, appointed as co-CEO in January 2024.

At that point, Lee was a 20-year veteran of the company, having joined as a project manager and having been Senior VP for Global Production before his co-CEO appointment. In this light, we asked readers what roles best prepare them for running an LSP, and the largest cohort (46.0%) believes it is project management. Other readers selected sales and language experts (14.3% each), and finance/admin and language ops (11.1% each).

9. Are you experiencing a summer slowdown in business?

There was no traditional summer slowdown for the language industry in 2024 — at least, not in the northern hemisphere. July alone was busy with significant investment activity, from early-stage funding to major acquisitions.

 

Capital kept pouring into some sectors, including AI dubbing and captioning, plus language tech. Yet, when we asked readers if they were seeing a summer slowdown, 40.3% said they were definitely experiencing a cooling period. For the rest, the summer was either fairly stable (33.9%) or busier than ever (25.8%).

10. How has your business year gone so far?

News of bankruptcies hit the language industry at the start of 2024 and may even have shocked some Slator readers into action to avoid a similar fate. The Language Service Provider Index (LSPI) showed indications of stability for some companies and actual growth for the Super Agencies.

Although the LSPI only includes about 300 companies which volunteer their data for the survey, the February 2024 edition seemed to foreshadow the mixed bag LSPs experienced for the balance of 2024. While some companies were indeed pushed out of business, acceleration in M&A was also clear. Readers self-reported that, as of February 2024, business had thus far been great (27.6%) or good (25.9%), flat (20.7%), not great (15.5%), or bad (10.3%).

Sunday, December 22, 2024

The Year in Review and 2025 Predictions!

In SlatorPod's year-end 2024 episode, hosts Florian Faes and Esther Bond, joined by guest Anna Wyndham, discuss the key language industry trends and drivers of the past year and share their predictions for 2025.


First, language industry news of the week: LXT acquired clickworker with the goal of doubling revenues by 2025 by expanding its AI data capabilities. Esther also shares how EzDubs, a speech translation startup, raised USD 4.2m in seed funding.

Florian comments that RWS reported stable revenues for 2024, with £180m coming from AI-powered products and services. Additionally, YouTube announced the rollout of AI dubbing, enabling content creators to reach audiences in new languages, but admitted limitations at this point, including poor voice quality.

https://youtu.be/CtrVDikK7lE

In their discussion, the trio talked about the UK House of Lords inquiry into court interpreting and translation, highlighting pay issues for interpreters, quality issues, and how AI is being deployed for quality assurance.

Reflecting on 2024, Anna outlines three major trends: speech-to-speech translation, "translation as a feature," where translation capabilities are integrated into everyday software like project management tools, and the evolution of localization roles toward AI-driven skills.

Looking forward, Anna foresees rapid adoption of AI by the public sector given the cost constraints and the need for scalability, whereas Florian envisions further breakthroughs in machine translation quality estimation and, possibly, IPOs in the language tech industry. Esther predicts higher levels of M&A activity in the industry, where niche providers seek stability and scalability in a competitive market.

Friday, December 20, 2024

Stoquart Buys Peer Belgian LSP ETC Europe

Stoquart, a language services provider based in Belgium, has acquired Brussels-based ETC Europe, a translation agency accredited by the European Union and other governmental and international organizations.


The transaction closed on 24 October 2024 and follows Stoquart's takeover of French competitor Version Internationale in 2023.

The founding managing director of Stoquart Translation Services, Dimitri Stoquart, got to know ETC Europe General Manager Angelina Janssen through meetings of the Belgian Association of Translation Companies (BQTA).

He stated that Janssen suggested Stoquart form a consortium with ETC Europe and another language service provider, VerbiVis, to respond to the European Commission's TRAD23 RFP. This resulted in Stoquart achieving second place for English-French translation.

He said that in 2024, Janssen wanted to step back and suggested that Stoquart assume control of ETC Europe. Before the acquisition, shares of ETC Europe were divided among three shareholders; Stoquart has taken over all the shares.

"It was worth joining forces," Stoquart explained. "We have gained both institutional and private clients, along with an increasing number of multilingual projects."

ETC Europe thereby creates new sources of income for Stoquart. The LSP, which now operates as ETC Europe or Stoquart, has recently entered into three sizeable contracts with a number of Europe's biggest institutions.

This bodes well for Stoquart, which has faced a cumulative revenue decline of 30% across 2023 and 2024.

"With this acquisition and the revenues from the European Parliament contract, we will be able to regain our 2022 revenue levels," Stoquart stated. 

Strong In-House Resources and Powerful Brands

Stoquart now has around 50 people working for it globally. Janssen will stay until the end of 2024 and will remain available as needed in the near future. (Besides nearly 30 in-house linguists, Stoquart engages between 150 and 180 freelancers monthly.)

Similar to Version Internationale, ETC Europe holds a strong reputation in the institutional sector. The company will retain its brand identity and limit integration with Stoquart to the essentials required for seamless operations, focusing primarily on activities in the LSP's main office.

Based on Stoquart's location, a big portion of its work is with all variants of French and Dutch, but the company also handles German, Italian, and Spanish. Stoquart now finds itself branching out into other European languages for institutional work, too.

Most clients are found in the US, Ireland, Czechia, Spain, France, Belgium, the UK, Germany, and Denmark. Stoquart said the LSP specializes in fields where human expertise is required, such as IT, finance, legal, life sciences, and the defense industry.

Stoquart's technology approach combines off-the-shelf tools, such as Studio and Phrase, and proprietary tools, including an app that allows users to access several machine translation engines.
