Wednesday, February 12, 2025

Researchers Present DOLFIN, a New Test Set for AI Translation for Financial Content

On February 5, 2025, a team of researchers from Grenoble Alpes University and Lingua Custodia, a France-based company specializing in AI and natural language processing (NLP) for the finance sector, introduced DOLFIN, a new test set designed to evaluate document-level machine translation (MT) in the financial domain.


The researchers say that the financial domain presents unique challenges for MT due to its reliance on precise terminology and strict formatting rules. They describe it as “an interesting use-case for MT” since key terms often shift meaning depending on context.

For example, the French word "couverture" means "blanket" in a general setting but "hedge" in financial texts. Such nuances are difficult to capture without larger translation units.

Despite strong research interest in document-level MT, specialized test sets remain scarce, the researchers note. Most datasets focus on general topics rather than domains such as legal and financial translation.

Given that many financial documents “contain an explicit definition of terms used for the mentioned entities that must be respected throughout the document,” they argue that document-level evaluation is essential. 

DOLFIN allows researchers to assess how well MT models translate longer texts while maintaining context. 

Unlike traditional test sets that rely on sentence-level alignment, DOLFIN structures data into aligned sections, enabling the evaluation of broader linguistic challenges, such as information reorganization, terminology consistency, and formatting accuracy.

Context-Sensitive

To build the dataset, they sourced parallel documents from Fundinfo, a provider of investment fund data, and extracted and aligned financial sections rather than individual sentences. The dataset covers English-French, English-German, English-Spanish, English-Italian, and French-Spanish, with an average of 1,950 segments per language pair. 

The goal, according to the researchers, was to develop “a test set rich in context-sensitive phenomena to challenge MT models.”

To assess the usefulness of DOLFIN, the researchers evaluated large language models (LLMs) including GPT-4o, Llama-3-70b, and their smaller counterparts. They tested these models in two settings: translating sentence by sentence versus translating full document sections. 
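
To make the two settings concrete, here is a minimal sketch; the `translate` helper is a placeholder for whichever model is being evaluated, not part of DOLFIN or the paper's code.

```python
# Illustrative sketch (not from the paper) of the two evaluation settings.
# `translate` is a mock standing in for whichever LLM or MT API is under test.

def translate(text: str, src: str = "en", tgt: str = "fr") -> str:
    """Mock translation call; swap in a real model or API client here."""
    return f"[{src}->{tgt}] {text}"

section = [
    "The fund may use derivatives for hedging purposes.",
    "Such hedges are described in the prospectus.",
]

# Setting 1: sentence by sentence, no surrounding context. The model cannot
# see that "hedges" continues the financial sense established in line 1.
sentence_level = [translate(s) for s in section]

# Setting 2: the whole aligned section as one unit, letting a context-aware
# model keep terminology (e.g., "hedge" -> "couverture") consistent.
document_level = translate("\n".join(section))
```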

They found that DOLFIN effectively distinguishes between context-aware and context-agnostic models, while also exposing model weaknesses in financial translation.

Larger models benefited from more context, producing more accurate and consistent translations, while smaller models often struggled. “For some segments, the generation enters a downhill, and with every token, the model’s predictions get worse,” the researchers observed, describing how smaller LLMs failed to maintain coherence over longer passages.

DOLFIN also reveals persistent weaknesses in financial MT, particularly in formatting and terminology consistency. Many models failed to properly localize currency formats, defaulting to English-style notation instead of adapting to European conventions.

The dataset is publicly available on Hugging Face.
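
For readers who want to inspect it, the dataset should be loadable with the Hugging Face `datasets` library along these lines; the repository ID below is a placeholder, so check the Hub page for the actual one.

```python
# Minimal sketch, assuming the `datasets` library is installed.
# "org-name/DOLFIN" is a placeholder repository ID; look up the real one on the Hub.
from datasets import load_dataset

ds = load_dataset("org-name/DOLFIN")
print(ds)  # entries are aligned document sections, not single sentence pairs
```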

Authors: Mariam Nakhlé, Marco Dinarelli, Raheel Qader, Emmanuelle Esperança-Rodier, and Hervé Blanchon

Monday, February 10, 2025

Off-Screen Drama Pits AI Dubbing Against French Voice Actors

How do US actor Sylvester Stallone, France’s minister for gender equality Aurore Bergé, and a multibillion-dollar multilingual voice AI company collide in a tense drama? Since early January 2025, multiple online media sources have highlighted a clash that began with news that “Armor,” a film starring Stallone, would feature AI dubbing.

For nearly 50 years, Alain Dorval was the familiar voice of Stallone in French-dubbed films, until he passed away in February 2024. Minister Bergé happens to be Dorval’s daughter. Enter ElevenLabs, which in January 2025 reached a USD 3bn valuation and found itself at the center of a weeks-long controversy over the cloning of Dorval’s voice.

Bergé publicly opposed (article in French) the use of her father’s digitally recreated voice, despite acknowledging a prior agreement to a test. “It was just a trial run, with an agreement strictly guaranteeing that my mother and I would have final approval before any use or publication. And that nothing could be done without our consent.”

According to Variety, which has followed the story since the partnership around “Armor” between Lumiere Ventures and ElevenLabs came to light, Bergé’s move galvanized the French actors’ guild (FIA, in French). 

FIA’s representative, Jimmy Shuman, called the voice cloning attempt a “provocation” in the Variety article, as the union is in the midst of “negotiating agreements on limits for artificial intelligence and dubbing.”

The controversy over Stallone’s French voice underscores the potential for AI to displace voice actors, who are often celebrities in their own right across Europe.

ElevenLabs CEO Mati Staniszewski told Variety that “Recreating Alain Dorval’s voice is a chance to show how technology can honor tradition while creating new possibilities in film production.”

Following a few notable actions by their US counterparts, voice-over artists in several European countries are taking a proactive stance through their unions, adding AI clauses to their contracts that restrict AI voice use to specific projects, or outright refusing work for studios that do not offer adequate protections.

Per the latest Variety article on the subject, voice actor Michel Vigné will be the voice of Stallone for the French release. According to IMDb, Vigné has voiced Stallone in French before.

The larger issue remains: the film industry acknowledges that AI voice cloning technology is rapidly advancing, and the drama around the French dubbing of “Armor” serves as a symbol of things to come in Europe and beyond.

One decision that many voice actors will need to grapple with is whether they want their voices immortalized through AI or simply replaced, whether by AI or by another actor.

Thursday, January 30, 2025

Why Interpreting Remains a Growth Market with Boostlingo CEO Bryan Forrester

Bryan Forrester, Co-founder and CEO of Boostlingo, returns to SlatorPod for round 2 to talk about the company’s growth, the US interpreting market, and the evolving role of AI.

Bryan shares how the company has tripled in size since he last appeared on the pod, driven by strategic acquisitions, including VoiceBoxer and Interpreter Intelligence, and a rebranding effort to unify its product portfolio.

Bryan explains how Boostlingo balances innovation with practicality, ensuring that new features align with customer needs. He highlights the company’s three-pronged strategy: retaining existing customers, enabling growth, and making long-term bets on emerging trends.

While tools like real-time captions and transcription enhance efficiency, Bryan stresses that AI alone cannot replace human interpreters in complex industries like healthcare. He highlights privacy, compliance, and the nuanced expertise of human interpreters as critical factors, positioning AI as a supportive tool rather than a replacement.

https://youtu.be/fMNcJ5EV2zk

Bryan discusses market dynamics and regulatory changes, including how those under the new US administration could influence language access demand, particularly in areas like healthcare and public services. 

He describes Boostlingo’s strategy of leveraging third-party AI models, optimizing them with proprietary data, and rigorously testing to ensure quality and reliability. Looking ahead, Boostlingo plans to expand internationally and integrate AI ethically and effectively into its offerings, guided by its newly formed AI Advisory Board. 

Sunday, January 19, 2025

AI in Interpreting: Slator Pro Guide

Slator's Pro Guide: AI in Interpreting is an absolute must-read for all providers of interpreting services and solutions. Here, the authors give a quick snapshot of what the newest applications of AI and large language models (LLMs) look like in interpreting.

This Slator Pro Guide will bring you up to speed on what value AI can bring to your company and the new interpreting workflows, service models, and speech AI capabilities now available to you.

The guide covers 10 one-page, actionable case studies, thematically designed and presented as vibrant infographics, drawn from research and interviews with some of the leading interpreting service providers in the industry.

The ten use cases highlight new areas of growth, innovative models for service delivery, and novel workflows in interpretation made possible by the recent developments in LLMs, speech-to-text, and speech synthesis.

We illustrate how AI speech translation solutions are being leveraged to open up language access in corporate, government, and healthcare settings, across a wide variety of service delivery models.

The guide also discusses AI as an interpreter tool and co-pilot, as well as its capability to optimize operations and extract insights from interpreted interactions.

Each use case describes the underlying concept and practical implications. An adoption and value-add score is also provided to reflect the industry's current level of uptake for the application as well as the additional value that it delivers to end clients.

We explain how the technology works and offer a brief list of leading AI solution providers currently on the market.

We expand on the new opportunities and benefits that each use case presents for interpreting stakeholders and carry out an impact analysis for the interpreting sector.

We also identify key risks and limitations that need to be considered in the adoption process.

The guide provides a high-level overview of the key, impactful applications and can serve as a launching pad for stakeholders making strategic decisions about adopting AI in interpreting technology and service models.

This Pro Guide is a must-read and time-saving briefing on how AI is revolutionizing the interpretation landscape.


Monday, January 13, 2025

LinkedIn Ranks 'Interpreter' Among Fastest-Growing Jobs in the UK


In January 2025, LinkedIn News UK released its "Job trends 2025: The 25 fastest-growing jobs in the UK," and interpreters find themselves at #22. LinkedIn calls these "Jobs on the Rise": positions it considers pointers to areas of career opportunity, based on data collected over the past three years.

The list covers both spoken and sign language interpreters and gives the typical skills for a professional in the field as interpreting, translation, and consecutive interpretation. These professionals are mainly in demand in translation and localization, museums, historical sites, and zoos, and, interestingly enough, in transportation equipment manufacturing.

The LinkedIn data points to London, Manchester, and Glasgow as the top UK locations for interpreter hiring. The average experience required is 2.2 years. Most interpreters work remotely (73%) or in hybrid positions (8%); the rest presumably work on-site, although that figure is not included in the list.

Most interpreters in the UK and other countries work as public service interpreters, with a minority working as conference interpreters. Public service interpreters work at public institutions, such as the National Health Service (NHS), the Courts and Tribunals System, and Border Force and Immigration Enforcement.

Interpreting is one of the UK government's regulated professions. "Interpreter" is included in the government list under "Chartered Linguist," a general term for various language-related professions recognized by the bodies that subscribe to the standards of the Chartered Institute of Linguists (CIOL), such as the National Register of Public Service Interpreters (NRPSI).

In contrast to the upbeat LinkedIn ranking of interpreting as an area of opportunity stands the at times contentious environment in the UK's public service interpreting sector, especially over the past two years. In that time frame, for instance, the NRPSI has sent multiple official communications to the Ministry of Justice regarding its policies, as interpreters continue to protest working conditions and pay schedules in several cities.

Slator has also covered the development of UK public service interpreting over the same period in several articles, including coverage of the review of court interpreter qualifications, new credentials, legal probes into current contracts and future tenders, who pays for interpreter services, and discussions on the use of AI, or the lack thereof, during government sessions.

Thursday, January 9, 2025

US Government RFP Seeks Translation Into Four Native American Languages

The United States government has issued an unusual RFP for translation services: The target languages are all indigenous to the US.


The contracting agency is the Office of Indian Economic Development (OIED), which falls under the Bureau of Indian Affairs, the bureau that governs programs concerning federally recognized American Indian Tribes. OIED has allied with the US Department of Agriculture (USDA) on this contract, which will provide a means for diverse agencies to request translation into Native languages.

The RFP features a set-aside for Indian Small Business Economic Enterprises, meaning that only companies meeting certain revenue and ownership requirements may apply. OIED would prefer to award a single contractor work for all four languages.

"This is a one-year project that will respond to federal agency requests for ongoing and diverse Native Language translation that will be specific to the federal agency needs," the RFP states, noting the contract may be extended more than once, but only for an additional period of up to six months. Work covered under the contract is between January 20, 2025 – January 19, 2026.

The ultimate goal is to make a range of content, from official documents and signage to websites, available to the "widest possible audience of the Tribal Nations."

There are 574 federally recognized Tribal Nations. Of those, 229 are located in the state of Alaska. The other 345 Tribal Nations are spread across 35 other states.

The contract would, in turn, deal with "more prevalent native languages," most likely those with the largest numbers of speakers.

Stats and Translation Requests

The four target languages are Yup’ik (Central dialect), Cherokee (Western dialect), Ojibwe (Western dialect), and Navajo. The contract estimates that each language will require 610 hours of translation — a somewhat uncommon way of pricing translation — for a total of 2,440 hours.

According to the American Community Survey for 2009-2013, Navajo is the most-spoken indigenous language in the US, with nearly 167,000 speakers, 35,250 of whom self-report as speaking English less than very well. The latter would be considered individuals with limited English proficiency (LEP). 

The other three languages have fewer speakers overall, and fewer individuals with LEP, including about 6,000 speakers of the Alaska Native language Yup’ik; 1,460 speakers of Cherokee; and 1,100 speakers of Ojibwe. 

With relatively small populations of people with LEP, the impetus for the RFP goes beyond numbers.

Indeed, the outgoing Biden-Harris Administration issued on December 9, 2024 a “10-year National Plan on Native Language Revitalization,” described as charting “a path to help address the United States government’s role in the loss of Native languages across the continental United States, Alaska, and Hawai’i.”

Some Tribal Nations have resources to handle (certain) translations on their own. The Cherokee Nation Translation Department, for instance, offers free translations for nonprofit uses related to education, health, and legal services. But there are limits. 

“Due to the large volume of requests, Cherokee Nation Translation does not accept unsolicited documents such as poetry, scripts, screenplays, and book manuscripts for translation,” its website states. Nor does it translate tattoos or “names in Cherokee for children, family members, [or] pets”. 

For up-to-date information about language services and technology tenders, subscribe to our Growth, Pro, or Enterprise plan and get access to the RFP Center.


Wednesday, January 8, 2025

Sony Aims to Improve AI Translation for Indian Language Entertainment Content

A December 29, 2024 paper by Sony Research India researchers Pratik Rakesh Singh, Mohammadi Zaki, and Pankaj Wasnik presents a framework specifically designed to "improve entertainment content translations" in Indian languages.


They "believe it is the first of its kind," using an amalgamation of context awareness along with style adaptation to produce not only accurate translations but also entertaining for the targeted audience.

The researchers explained that traditional machine translation (MT) systems usually struggle with entertainment content because they mostly translate sentences in isolation. This leads to "disconnected" translations that fail to capture the emotional depth or cultural references behind the original dialogue. The effect is particularly pronounced in entertainment, where interconnected conversations and subtle narrative cues are vital.

"The challenge in entertainment translation lies in preserving the context, mood, and style of the original content while also including creativity and considerations of regional dialects, idioms, and other linguistic nuances," the researchers explained.

To tackle this challenge, the researchers developed CASAT, the Context and Style Aware Translation framework, which combines the two concepts during the translation process.

The CASAT framework starts with segmenting the input text — like dialogues from movies or series — into smaller sections known as "sessions." Sessions are dialogues that are consistent in their genre or mood, such as comedy or drama. This segmentation allows CASAT to focus on the specific emotional and narrative elements of each session.

For every session, CASAT estimates two critical components: context and style. Context is the narrative framework surrounding the dialogue, while style denotes the emotional tone and cultural nuances, such as seriousness, excitement, or humor. By understanding both, the framework can produce translations that resonate with the target audience.

To facilitate this, CASAT uses a context retrieval module that fetches relevant scenes or dialogues from a vector database, grounding the translation in the appropriate narrative framework, and a domain adaptation module that infers the intended emotional tone and intent from sessions and sentence-level dialogue.

Once context and style are estimated, CASAT generates a customized prompt combining these elements. The prompt is then passed to an LLM, which produces translations that are not only accurate but also carry the intended emotional tone and cultural nuances of the original content.
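
Based on that description, a compact sketch of a CASAT-style flow might look as follows. All helper names and the fixed context string are illustrative stand-ins, not the authors' code; in CASAT itself, the context comes from the vector-database retrieval module described above.

```python
# CASAT-style pipeline sketch (illustrative stand-in, not the authors' code):
# segment dialogue into mood-consistent "sessions," then build a context- and
# style-aware prompt for an LLM.
from dataclasses import dataclass

@dataclass
class Session:
    mood: str        # e.g., "comedy" or "drama"
    lines: list

def segment_sessions(dialogue):
    """Group consecutive lines that share a mood label into sessions."""
    sessions = []
    for line, mood in dialogue:
        if sessions and sessions[-1].mood == mood:
            sessions[-1].lines.append(line)
        else:
            sessions.append(Session(mood, [line]))
    return sessions

def build_prompt(session, context, target_lang):
    """Combine estimated context and style into a single translation prompt."""
    return (
        f"Narrative context: {context}\n"
        f"Mood/style: {session.mood}\n"
        f"Translate into {target_lang}, preserving mood and cultural nuance:\n"
        + "\n".join(session.lines)
    )

dialogue = [
    ("You call that a punch?", "comedy"),
    ("My grandmother hits harder!", "comedy"),
    ("The factory closes tomorrow.", "drama"),
]

for session in segment_sessions(dialogue):
    # In CASAT, `context` is retrieved from a vector database; a fixed string
    # is used here purely for illustration.
    prompt = build_prompt(session, "boxing-gym scene", "Hindi")
    # translation = llm(prompt)  # hand the prompt to any LLM client
```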

Superior Performance

The researchers tested CASAT's performance using metrics such as COMET scores and win ratios. CASAT surpassed baseline LLMs and MT systems like IndicTrans2 and NLLB, providing much better translations in terms of content and context.
"Our method exhibits superior performance by consistently incorporating plot and style information compared to directly prompting creativity in LLMs," the researchers said.

They found that context alone substantially improves translation quality, while style alone yields only minimal improvement. Combining the two improves quality the most.

The researchers noted that CASAT is language and model-agnostic. "Our method is both language and LLM-agnostic, making it a general-purpose tool," they concluded.
