Showing posts with label slatorpad. Show all posts

Wednesday, February 12, 2025

Researchers Present DOLFIN, a New Test Set for AI Translation for Financial Content

On February 5, 2025, a team of researchers from Grenoble Alpes University and Lingua Custodia, a France-based company specializing in AI and natural language processing (NLP) for the finance sector, introduced DOLFIN, a new test set designed to evaluate document-level machine translation (MT) in the financial domain.


The researchers say that the financial domain presents unique challenges for MT due to its reliance on precise terminology and strict formatting rules. They describe it as “an interesting use-case for MT” since key terms often shift meaning depending on context.

For example, the French word "couverture" means "blanket" in a general setting but "hedge" in financial texts. Such nuances are difficult to capture when a model translates one sentence at a time, without the surrounding context.

Despite strong research interest in document-level MT, specialized test sets remain scarce, the researchers note. Most datasets focus on general topics rather than domains such as legal and financial translation.

Given that many financial documents “contain an explicit definition of terms used for the mentioned entities that must be respected throughout the document,” they argue that document-level evaluation is essential. 

DOLFIN allows researchers to assess how well MT models translate longer texts while maintaining context. 

Unlike traditional test sets that rely on sentence-level alignment, DOLFIN structures data into aligned sections, enabling the evaluation of broader linguistic challenges, such as information reorganization, terminology consistency, and formatting accuracy.

Context-Sensitive

To build the dataset, they sourced parallel documents from Fundinfo, a provider of investment fund data, and extracted and aligned financial sections rather than individual sentences. The dataset covers English-French, English-German, English-Spanish, English-Italian, and French-Spanish, with an average of 1,950 segments per language pair. 

The goal, according to the researchers, was to develop “a test set rich in context-sensitive phenomena to challenge MT models.”

To assess the usefulness of DOLFIN, the researchers evaluated large language models (LLMs) including GPT-4o, Llama-3-70b, and their smaller counterparts. They tested these models in two settings: translating sentence by sentence versus translating full document sections. 
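The difference between the two settings can be illustrated with a minimal Python sketch. It is not the researchers' evaluation code: the `translate` callable stands in for whatever MT system or LLM is under test, and the helper names and the naive sentence splitter are assumptions made for illustration.

```python
import re
from typing import Callable, List


def split_sentences(section: str) -> List[str]:
    # Naive splitter (an assumption): break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", section.strip()) if s]


def translate_by_sentence(section: str, translate: Callable[[str], str]) -> str:
    # Context-agnostic setting: the model sees one sentence per call.
    return " ".join(translate(s) for s in split_sentences(section))


def translate_by_section(section: str, translate: Callable[[str], str]) -> str:
    # Context-aware setting: the model sees the whole section in one call.
    return translate(section)
```

In the first setting, a term defined early in a section is invisible to the model when it translates later sentences. In the second, the definition and every later mention arrive in the same input.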

They found that DOLFIN effectively distinguishes between context-aware and context-agnostic models, while also exposing model weaknesses in financial translation.

Larger models benefited from more context, producing more accurate and consistent translations, while smaller models often struggled. “For some segments, the generation enters a downhill, and with every token, the model’s predictions get worse,” the researchers observed, describing how smaller LLMs failed to maintain coherence over longer passages.

DOLFIN also reveals persistent weaknesses in financial MT, particularly in formatting and terminology consistency. Many models failed to properly localize currency formats, defaulting to English-style notation instead of adapting to European conventions.
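To make the currency-format issue concrete, the short Python sketch below renders the same amount in English-style and in two European-style notations. The separator table is a simplified assumption for illustration, not a rule set taken from DOLFIN, and `format_amount` is a hypothetical helper.

```python
def format_amount(value: float, style: str) -> str:
    """Format a number with two decimals in a simplified regional style."""
    # (thousands separator, decimal separator). Simplified conventions,
    # assumed here for illustration only.
    separators = {
        "en": (",", "."),
        "fr": ("\u00a0", ","),  # no-break space as thousands separator
        "de": (".", ","),
    }
    thousands, decimal = separators[style]
    # Start from English-style grouping, then swap in the target separators.
    integer_part, fraction_part = f"{value:,.2f}".split(".")
    return integer_part.replace(",", thousands) + decimal + fraction_part


amount = 1234567.891
print(format_amount(amount, "en"))  # 1,234,567.89
print(format_amount(amount, "de"))  # 1.234.567,89
```

A model that copies "1,234,567.89" unchanged into a German or French translation leaves the number in English-style notation, which is the kind of localization failure described above.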

The dataset is publicly available on Hugging Face.

Authors: Mariam Nakhlé, Marco Dinarelli, Raheel Qader, Emmanuelle Esperança-Rodier, and Hervé Blanchon

Tuesday, June 25, 2024

German Language Services Provider Landexx Files for Bankruptcy

According to a court filing reported by several German legal aggregation sites, Landexx, a language services provider (LSP) based in Germany, has filed for bankruptcy.

Information on the LSP is now difficult to find, as Landexx's website appears to be down. Additionally, because Landexx is a private company, its annual financial reports are not publicly accessible.

Landexx, led by Managing Director Christel Stemmer, provided various services, including translation, interpreting, language training, and desktop publishing.

Language professionals have informed colleagues of the news. For example, German-English translator Jill Sommer advised freelancers in a blog post to contact the bankruptcy trustee, Stephan Höltershinken, about any unpaid invoices. She also warned others not to accept translation assignments from the company.

For at least a few years before the LSP's bankruptcy filing, there had been rumblings about Landexx among freelancers online.

“They were an excellent client, always very professional, EXCEPT that I always had to chase late payments. They often paid 3 to 4 months late,” read a November 2020 complaint by one translator who said she began working with Landexx in 2009.

The freelancer explained that she had to "chase payments by email and phone" and wait a year, even hiring a lawyer, to receive payment for six outstanding invoices.

"[T]hey have told my lawyer that they will no longer be sending me any work, even though I did nothing wrong," she added. "I believe other freelance translators must be in the same position."

Another translator wrote on Reddit in May 2024 that they were pursuing legal action against Landexx through a court order for payment "[a]fter countless reminder e-mails that have been ignored."

Landexx also had a mixed reputation on the job board ProZ, where the LSP’s "Blue Board affiliation" — based on freelancers’ willingness to work with the company again — stood at two out of five stars. A December 2022 staff note indicated that "[t]his outsourcer has been banned from posting jobs at ProZ.com." Details regarding Landexx’s specific case were not disclosed.

“Use of the site dishonestly or fraudulently will result in termination of site use and associated privileges,” states ProZ’s termination policy, which also specifies that ProZ "reserves the right to refuse access to this site to any party without giving a reason."


Language Discordance Raises Risk of Hospital Readmissions, U.S. Study Finds

A June 2024 meta-analysis published in BMJ Quality & Safety was recently brought back into the spotlight by Dr. Lucy Shi, who disc...