Thursday, January 11, 2024

 Future of Translation: Crowdin’s Approach and the Role of LLM

I. Introduction

I recently came across an article by Arnaud Rinquin of Slite entitled “How we skipped conventional translation tools”. The article caught my attention because Crowdin is a translation tool. For now, let’s take just one claim from it: the article reports that LLMs can produce 95% of ready-to-publish translations in real-life UI localization projects, whereas conventional NMT managed about 30%.

II. Evolution of Translation Technology

Traditionally, translation involved manual efforts, but the advent of Machine Translation (MT) marked a significant shift. Language Models (LMs) further enhanced the translation process, paving the way for Large Language Models (LLMs) like GPT-3. The evolution of these technologies has transformed the translation landscape.

III. Crowdin's Unique Approach

Crowdin, a leading player in the translation industry, has embraced LLMs to revolutionize its approach. By integrating these powerful models into its platform, Crowdin ensures a more accurate and efficient translation process. This section delves into the key aspects of Crowdin's approach and the advantages it offers.

IV. Advantages of LLMs in Translation

The use of LLMs brings forth a plethora of benefits in the translation domain. Improved accuracy, faster turnaround times, and cost-effectiveness are just a few advantages that LLMs offer. Crowdin's strategic implementation of LLMs positions it as a frontrunner in providing top-notch translation services.

V. Challenges and Solutions

While LLMs offer remarkable advantages, challenges arise in dealing with language nuances and context-specific translations. Crowdin addresses these challenges by employing sophisticated algorithms that understand the subtleties of languages, ensuring accurate and contextually appropriate translations.

VI. User Experience with Crowdin

Businesses worldwide testify to the positive impact Crowdin's approach has on their translation needs. Testimonials and success stories highlight the efficiency and reliability of Crowdin's platform, showcasing its contribution to seamless global collaborations.

VII. Future Prospects of LLM in Translation

Looking ahead, the future of translation lies in the continuous advancements of LLM technology. As AI and Natural Language Processing (NLP) continue to evolve, the integration of these technologies into translation services will play a pivotal role in shaping global communication.

VIII. Conclusion

In conclusion, Crowdin's innovative approach, leveraging the power of LLMs, signifies a transformative era in translation. The combination of cutting-edge technology and linguistic expertise positions Crowdin at the forefront of the evolving landscape, redefining how businesses communicate on a global scale.


Tuesday, December 19, 2023

 

Argos Multilingual designates Alexander Ulichnowski as the Chief Executive Officer (CEO)

In a strategic move that has sent ripples through the language services industry, Argos Multilingual recently announced the appointment of Alexander Ulichnowski as its new Chief Executive Officer (CEO). This decision marks a significant milestone in the company's journey and is poised to shape the future trajectory of Argos Multilingual.

Introduction

Argos Multilingual, a leading player in the language services sector, has long been recognized for its commitment to excellence in translation and localization services. With a global presence and a reputation for delivering top-notch solutions to clients across diverse industries, the company's decision to bring Alexander Ulichnowski on board as CEO has generated widespread interest and speculation.



Alexander Ulichnowski's Background

Alexander Ulichnowski brings a wealth of experience to his new role as CEO of Argos Multilingual. His impressive professional journey is marked by notable achievements and a proven track record in the language services industry. Having held key positions in previous organizations, Ulichnowski's expertise aligns seamlessly with the demands of his new leadership role.

Argos Multilingual's Decision

The decision to appoint Alexander Ulichnowski as CEO was not arbitrary. Argos Multilingual carefully considered various factors, including Ulichnowski's leadership skills, industry knowledge, and strategic vision. This move reflects the company's commitment to staying at the forefront of innovation and adapting to the evolving needs of its clients.

Industry Impact

The appointment of a new CEO always has far-reaching implications for an organization and its industry. In the language services sector, where precision and cultural nuance are paramount, stakeholders are keenly observing how this change in leadership will influence Argos Multilingual's market standing and competitiveness.

Alexander Ulichnowski's Vision

As he takes the helm at Argos Multilingual, Alexander Ulichnowski has outlined his vision for the company's future. Focused on leveraging technology and fostering a collaborative environment, Ulichnowski aims to position Argos Multilingual as a pioneer in innovative language solutions.

Company Culture and Leadership

Understanding the dynamics of a company's internal culture is crucial in evaluating its potential for success under new leadership. Argos Multilingual has been known for its inclusive and dynamic culture, and Alexander Ulichnowski's leadership style is expected to align seamlessly with these values.

Challenges and Opportunities

While a change in leadership often brings about positive transformations, it also presents challenges. Argos Multilingual faces the task of ensuring a smooth transition and addressing any potential hurdles that may arise. Simultaneously, this change opens up new opportunities for the company to explore uncharted territories and expand its market reach.

Stakeholder Perspectives

The announcement of Alexander Ulichnowski as CEO has sparked reactions from various stakeholders. Employees, clients, and industry experts are expressing their views on this strategic move. The general sentiment is a mix of curiosity, optimism, and anticipation as Argos Multilingual enters a new chapter under Ulichnowski's leadership.

Future Outlook

Looking ahead, the future of Argos Multilingual under Alexander Ulichnowski's leadership seems promising. The company is expected to build on its legacy of excellence while embracing innovation and adapting to the changing landscape of the language services industry. Long-term goals include solidifying market presence, expanding service offerings, and fostering strategic partnerships.

Conclusion

In conclusion, the appointment of Alexander Ulichnowski as CEO marks a significant moment for Argos Multilingual. With a leader of Ulichnowski's caliber at the helm, the company is poised for growth, innovation, and continued success in the dynamic language services industry.

For More News : https://slator.com/news/

Wednesday, December 13, 2023

 

Intense Duel in Automated Speech Translation

Automated Speech Translation (AST) stands at the forefront of technological advancements, revolutionizing the way we communicate in a globalized world. From historical developments to the challenges faced by developers, this article explores the intense duel in automated speech translation, navigating the complexities of perplexity and burstiness.

Evolution of Automated Speech Translation

In the early days, automated speech translation was a distant dream. However, rapid technological advancements, particularly in machine learning and artificial intelligence, have propelled us into an era where seamless language translation is not just a possibility but a reality.

Key Players in the Industry

Several major companies have invested heavily in the development of automated speech translation technologies. From industry giants to innovative startups, each plays a crucial role in shaping the landscape of AST.

Challenges in Automated Speech Translation

While AST has come a long way, it is not without its challenges. Linguistic nuances, technical limitations, and the need for real-world applications pose significant hurdles. Developers grapple with finding solutions that balance precision and practicality.

Perplexity in Automated Speech Translation

Perplexity, a measure of how well a language model predicts a sample, is a critical factor in the effectiveness of automated speech translation. Understanding perplexity and its role in language models sheds light on the intricacies of AST.
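To make the definition concrete, here is a minimal sketch of how perplexity is commonly computed: the exponential of the average negative log-probability the model assigned to each token. The function name and the probability lists are made up for illustration and do not come from any particular AST system.

```python
import math

def perplexity(token_probs):
    """Perplexity of a sequence, given the probability the model gave each token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-likelihood per token (natural log).
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    # Exponentiating turns the average back into an "effective number of choices".
    return math.exp(avg_nll)

# Hypothetical per-token probabilities for two short outputs.
confident = perplexity([0.5, 0.5, 0.5, 0.5])   # model is fairly sure of each token
uncertain = perplexity([0.1, 0.1, 0.1, 0.1])   # model is guessing among many options
print(confident, uncertain)
```

A lower value means the model found the text more predictable; the first list gives a perplexity of roughly 2 and the second roughly 10.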

Burstiness in Automated Speech Translation

Burstiness, a phenomenon in language processing, adds another layer of complexity. The sporadic nature of language requires AST systems to handle bursts of information effectively, ensuring accuracy and coherence.

Balancing Perplexity and Burstiness

Achieving a delicate equilibrium between perplexity and burstiness is the key to optimizing AST performance. Developers employ sophisticated strategies, drawing from linguistic insights and technological innovations.

The Human Touch in Automated Speech Translation

In the quest for perfection, the human touch remains irreplaceable. Human input is vital in refining language models, ensuring that AST systems not only understand words but also grasp the nuances of human communication.

Real-World Applications

The impact of AST extends beyond language barriers. From facilitating smooth business transactions to enhancing healthcare communication, the real-world applications are diverse and transformative.

User Experience and Feedback

Continuous improvement is driven by user experience and feedback. Addressing common concerns and challenges faced by users ensures that AST systems evolve to meet the dynamic needs of a diverse user base.

Advancements in Neural Machine Translation (NMT)

Neural Machine Translation has emerged as a game-changer in advancing AST. The integration of NMT techniques enhances the accuracy and efficiency of language translation, pushing the boundaries of what is possible.

Future Trends and Innovations

As technology evolves, so does the landscape of AST. Predictions for the future include cutting-edge technologies and innovative approaches that promise to make language translation even more accessible and effective.

Ethical Considerations

With great power comes great responsibility. Ethical considerations in AST involve addressing concerns related to privacy, data security, and responsible development practices. Striking a balance between innovation and ethics is crucial for the sustainable growth of AST.

Conclusion

In conclusion, the intense duel in automated speech translation is a fascinating journey through the evolution of technology, linguistic challenges, and the quest for precision. AST holds the potential to bridge communication gaps on a global scale, and as it continues to evolve, the transformative impact on our interconnected world becomes increasingly evident.

Read More : https://slator.com/epic-battle-in-automatic-speech-translation/

Thursday, May 18, 2023

How Large Language Models Prove Chomsky Wrong with Steven Piantadosi

Joining SlatorPod this week is Steven Piantadosi, Associate Professor of Psychology at UC Berkeley. Steven also runs the computation and language lab (colala) at UC Berkeley, which studies the basic computational processes involved in human language and cognition.


Steven talks about the emergence of large language models (LLMs) and how it has reshaped our understanding of language processing and language acquisition.

Steven breaks down his March 2023 paper, “Modern language models refute Chomsky’s approach to language”. He argues that LLMs demonstrate a wide range of powerful language abilities and disprove foundational assumptions underpinning Noam Chomsky’s theories and, as a consequence, negate parts of modern.

Steven shares how he prompted ChatGPT to generate coherent and sensible responses that go beyond its training data, showcasing its ability to produce creative outputs. While critics argue that it is merely an endless sequence of predicting the next token, Steven explains how the process allows the models to discover insights about language and potentially the world itself.

Steven acknowledges that LLMs operate differently from humans, as models excel at language generation but lack certain human modes of reasoning when it comes to complex questions or scenarios. He unpacks the BabyLM Challenge which explores whether models can be trained on human-sized amounts of data and still learn syntax or other linguistic aspects effectively.

Despite industry advancements and the trillion-dollar market opportunity, Steven agrees with Chomsky’s ethical concerns, including issues such as the presence of harmful content, misinformation, and the potential impact on job displacement.

Steven remains enthusiastic about the potential of LLMs and believes the recent advancements are a step toward achieving artificial general intelligence, but refrains from making any concrete predictions.

Thursday, May 11, 2023

Why Large Language Models Hallucinate When Machine Translating ‘in the Wild’

 Large language models (LLMs) have demonstrated impressive machine translation (MT) capabilities, but new research shows they can generate different types of hallucinations compared to traditional models when deployed in real-world settings. 

The findings, published in a paper on March 28, 2023, included evidence that the hallucinations were more prevalent when translating into low-resource languages and out of English and that they can introduce toxic text.

Hallucinations present a critical challenge in MT, as they may damage user trust and pose serious safety concerns, according to a 2022 research paper, though studies to improve the detection and mitigation of hallucinations in MT have been limited to small models trained on a single English-centric language pair.

This has left “a gap in our understanding of hallucinations […] across diverse translation scenarios,” explained Nuno M. Guerreiro and Duarte M. Alves from the University of Lisbon, Jonas Waldendorf, Barry Haddow, and Alexandra Birch from the University of Edinburgh, Pierre Colombo from the Université Paris-Saclay, and André F. T. Martins, Head of Research at Unbabel, in the newly published research paper.

Looking to fill that gap, the researchers conducted a comprehensive analysis of various massively multilingual translation models and LLMs, including ChatGPT. The study covered a broad spectrum of conditions, spanning over 100 translation directions across various resource levels and going beyond English-centric language pairs.

According to the authors, this research provides key insights into the prevalence, properties, and mitigation of hallucinations, “paving the way towards more responsible and reliable MT systems.”

Detach from the Source 

The authors found that hallucinations are more frequent when translating into low-resource languages and out of English, leading them to conclude that “models tend to detach more from the source text when translating out of English.”

In terms of type of hallucinations, oscillatory hallucinations — erroneous repetitions of words and phrases — are less prevalent in low-resource language pairs, while detached hallucinations — translations that bear minimal or no relation at all to the source — occur more frequently. 

According to the authors, “this reveals that models tend to rely less on the source context when translating to or from low-resource languages.”
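Oscillatory hallucinations, described above as erroneous repetitions of words and phrases, lend themselves to a simple illustration. The sketch below flags a translation when its most frequent n-gram repeats much more often than anything in the source. It is an illustrative heuristic only, not the detection method used in the paper; the function names and the threshold of four repeats are assumptions chosen for the example.

```python
from collections import Counter

def top_ngram_count(text, n=2):
    """Return how many times the most frequent word n-gram occurs in text."""
    tokens = text.lower().split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0
    return Counter(ngrams).most_common(1)[0][1]

def looks_oscillatory(source, translation, n=2, min_repeats=4):
    """Flag translations that repeat an n-gram far more than the source does."""
    src_count = top_ngram_count(source, n)
    tgt_count = top_ngram_count(translation, n)
    # Require both an absolute threshold and a clear excess over the source,
    # so that sources which are themselves repetitive are not flagged.
    return tgt_count >= min_repeats and tgt_count > src_count + 1

# Made-up examples for illustration.
source = "Das Wetter ist heute schön."
good = "The weather is nice today."
bad = "the weather the weather the weather the weather is"

print(looks_oscillatory(source, good))  # False
print(looks_oscillatory(source, bad))   # True
```

A check like this cannot catch detached hallucinations, since a fluent translation that ignores the source contains no unusual repetition.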

The rate of hallucinations exceeded 10% in some language pairs, such as English-Pashto, Tamil-English, Azerbaijani-English, English-Azerbaijani, Welsh-English, English-Welsh, and English-Asturian. However, the authors suggest that hallucination rates can be reduced by increasing the size of the model (scaling up) or using smaller distilled models.

Hallucinations and Toxicity

The authors also found that hallucinations may contain toxic text, mainly when translating out of English and into low-resource languages, and that scaling up the model size may not reduce these hallucinations.

This indicates that hallucinations might be attributed to toxic patterns in the training data and underlines the need to filter the training data rigorously to ensure the safe and responsible use of these models in real-world applications.

The authors emphasize that while massive multilingual models have significantly improved the translation quality for low-resource languages, the latest findings underscore potential safety concerns and the need for improvement.

To mitigate hallucinations and improve overall translation quality, they explored fallback systems, finding that hallucinations can be “sticky and difficult to reverse when using models that share the same training data and architecture.” 

However, external tools, such as NLLB, can be leveraged as fallback systems to improve translation quality and eliminate pathologies such as oscillatory hallucinations.

ChatGPT Surprise

The authors also found that ChatGPT produces different hallucinations compared to traditional MT models. These errors may include off-target translations, overgeneration, or even failed attempts to translate. 

Furthermore, unlike traditional MT models, which frequently produce oscillatory hallucinations, ChatGPT does not generate any such hallucinations under perturbation. “This is further evidence that translation errors, even severely critical ones, obtained via prompting an LLM are different from those produced by traditional machine translation models,” explained the authors.

Moreover, the results revealed that ChatGPT generates more hallucinations for mid-resource languages than for low-resource languages, highlighting that “it surprisingly produces fewer hallucinations for low-resource languages than any other model.”

The authors note that while the majority of the hallucinations can be reversed with further sampling from the model, this does not necessarily indicate a defect in the model’s ability to generate adequate translations, but rather may be a result of “bad luck” during generation, as Guerreiro, Martins, and Elena Voita, AI Research Scientist at Meta, wrote in a 2022 research paper.

To facilitate future research in this area, the authors have made their code openly available and released over a million translations and detection results across several models and language pairs.

Saturday, January 28, 2023

Tencent Pits ChatGPT Translation Quality Against DeepL and Google Translate

 


Since OpenAI launched ChatGPT in November 2022, headlines have asked whether workers in a range of fields should worry about being replaced by the advanced AI chatbot. Now, a January 2023 paper from a Chinese tech company, Tencent, asks the question on behalf of the language industry: Is ChatGPT A Good Translator?

The Tencent team goes about answering the question by reviewing, shall we say, a limited set of data. The team said “obtaining the translation results from ChatGPT is time-consuming since it can only be interacted with manually and can not respond to large batches. Thus, we randomly sample 50 sentences from each set for evaluation.” So, let’s see what insights the team gathered by evaluating those 50 sentences.

According to the paper, ChatGPT performs “competitively” with commercial machine translation (MT) products, such as Google Translate, DeepL, and Tencent’s own system, on high-resource European languages, but struggles with low-resource or unrelated language pairs.

In other words, as one observer on Twitter quipped, “Potential alternative headline/interpretation: ‘ChatGPT was trained for translation on common publicly available parallel corpora.’”

For this “preliminary study,” Tencent AI Lab researchers Wenxiang Jiao, Wenxuan Wang, Jen-tse Huang, Xing Wang, and Zhaopeng Tu evaluated translation prompts, multilingual translation, and translation robustness.

Meta Moment

The experiment started with a “meta” moment when the team asked ChatGPT itself for prompts or templates that would trigger its MT ability. The prompt that produced the best Chinese–English translations was then used for the rest of the study — 12 directions total between Chinese, English, German, and Romanian.

Researchers were curious as to how ChatGPT’s performance might vary by language pair. While ChatGPT performed “competitively” with Google Translate and DeepL for English–German translation, its BLEU score for English–Romanian translation was 46.4% lower than that of Google Translate.

The team attributed the poor performance to the pronounced difference in monolingual data for English and Romanian, which “limits the language modeling capability of Romanian.”

Romanian–English translation, on the other hand, “can benefit from the strong language modeling capability of English such that the resource gap of parallel data can be somewhat compensated,” for a BLEU score just 10.3% below Google Translate.
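The percentages above are relative gaps, not differences in BLEU points. As a minimal sketch of the arithmetic, the helper below computes how far a system falls below a baseline as a share of the baseline score. The function name and the scores are hypothetical and are not the figures reported in the paper.

```python
def relative_gap_percent(baseline, system):
    """Percentage by which system falls below baseline, relative to baseline."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return (baseline - system) / baseline * 100.0

# Hypothetical BLEU scores, chosen only to show the calculation.
gap = relative_gap_percent(baseline=40.0, system=30.0)
print(gap)  # 25.0
```

Here a 10-point difference on a baseline of 40 is a 25% relative gap, which is why the same point difference looks larger for language pairs with low baseline scores.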

Beyond the Family

Beyond resource differences, the authors wrote, translating between language families is considered more difficult than translating within language families. The difference in the quality of ChatGPT’s output for German–English versus Chinese–English translation seems to bear this out.  

Researchers observed an even greater performance gap between ChatGPT and commercial MT systems for low-resource language pairs from different families, such as Romanian–Chinese. 

“Since ChatGPT handles different tasks in one model, low-resource translation tasks not only compete with high-resource translation tasks but also with other NLP tasks for the model capacity, which explains their poor performance,” they wrote.

Google Translate and DeepL both surpassed ChatGPT in translation robustness on two out of three test sets: WMT19 Bio (Medline abstracts) and WMT20 Rob2 (Reddit comments), likely thanks to their continuous improvement as real-world applications fed by domain-specific and noisy sentences. 

However, ChatGPT outperformed Google Translate and DeepL “significantly” on the WMT20 Rob3 test set, which contained a crowdsourced speech recognition corpus. The authors believe this finding suggests that ChatGPT is “capable of generating more natural spoken languages than these commercial translation systems,” hinting at a possible future area of study.

