Monday, July 1, 2024

eBay Launches New In-House Large Language Model for E-commerce with Translation Capabilities

In a June 17, 2024 paper, eBay introduced a series of large language models (LLMs) tailored specifically for the e-commerce sector.


These models, named LiLiuM 1B, 7B, and 13B, were developed in-house to meet eBay’s specific needs across various e-commerce applications, including translation, while giving the company full control over licenses, data, vocabulary, and architecture.

The authors said that “these models are meant to eliminate dependency on third-party LLMs within eBay.”

eBay explained that using foundation models like LLaMA-2, which can be accessed and adapted for specific purposes, poses risks related to licensing, data security, and future-proofing. The authors also noted that such models are generally trained on English-centric data and are quite generic.

To address these concerns, eBay developed its models entirely in-house from scratch. They were trained on a vast dataset of 3 trillion tokens, comprising both general texts and e-commerce-specific content in multiple languages; the training data drew on the ParaCrawl corpus alongside a smaller proprietary corpus from the e-commerce domain. According to the authors, this approach ensures robustness in handling diverse languages and tasks specific to e-commerce.
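The paper summarized here does not spell out the exact mixing procedure, but weighted sampling is the standard way to blend a large general corpus with a smaller domain corpus during pretraining. Below is a minimal sketch in Python; the example documents, the 90/10 weights, and the batch size are assumptions for illustration, not values from the paper.

```python
import random
from itertools import cycle

# A minimal sketch of weighted corpus mixing for pretraining.
# The weights and example documents are hypothetical; the paper reports
# ~3 trillion training tokens but not its exact mixing ratios.
def mix_corpora(sources, weights, seed=0):
    """Yield documents, picking a source in proportion to its weight."""
    rng = random.Random(seed)
    while True:
        source = rng.choices(sources, weights=weights, k=1)[0]
        yield next(source)

general = cycle(["general multilingual web text ..."])   # e.g., ParaCrawl
ecommerce = cycle(["item title and description ..."])    # proprietary e-commerce data

stream = mix_corpora([general, ecommerce], weights=[0.9, 0.1])
batch = [next(stream) for _ in range(8)]  # documents then go to tokenization/packing
```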

Additionally, eBay created a custom tokenizer and model vocabulary tailored to e-commerce applications. According to eBay, this approach offers several advantages: full control over the vocabulary, including special tokens; enhanced support for multilingual capabilities; and better adaptation to the specific use cases of e-commerce.
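The paper does not publish the tokenizer recipe, but the Hugging Face tokenizers library illustrates what owning the vocabulary looks like in practice. The sketch below trains a byte-level BPE tokenizer; the vocabulary size, file names, and special tokens are assumptions for illustration.

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Byte-level BPE covers arbitrary multilingual text without an [UNK] fallback.
tokenizer = Tokenizer(models.BPE())
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=False)

trainer = trainers.BpeTrainer(
    vocab_size=65_536,                           # hypothetical size
    special_tokens=["<pad>", "<bos>", "<eos>"],  # reserved IDs fully under the owner's control
)

# Train on the same mix of general and e-commerce text used for pretraining
# (file names hypothetical).
tokenizer.train(files=["general_corpus.txt", "ecommerce_corpus.txt"], trainer=trainer)
tokenizer.save("lilium_tokenizer.json")
```

Training the tokenizer on domain text means frequent e-commerce strings (brand names, condition labels, size codes) tend to become single tokens rather than multi-token fragments.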

Eliminating Dependencies

According to the authors, their models perform on par with, or better than, the popular LLaMA-2 models, excelling in particular at non-English machine translation, natural language understanding (NLU) tasks, and e-commerce-specific applications.

The authors attributed the improved performance primarily to the extensive inclusion of non-English and e-commerce-specific data during pretraining, which strengthens the models' understanding of languages other than English. Additionally, the customized vocabulary tailored for e-commerce tasks speeds up text generation by up to 34% compared to LLaMA-2.
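The speedup follows from token counts: autoregressive generation runs one forward pass per output token, so a vocabulary that encodes the same text in fewer tokens generates it proportionally faster. A hypothetical illustration (the tokenizer files are placeholders, not released artifacts):

```python
from tokenizers import Tokenizer

# Hypothetical comparison of two tokenizers on a non-English e-commerce listing title.
generic = Tokenizer.from_file("generic_tokenizer.json")
custom = Tokenizer.from_file("lilium_tokenizer.json")

text = "Neues Apple iPhone 15 Pro Max 256GB, originalverpackt, kostenloser Versand"
n_generic = len(generic.encode(text).ids)
n_custom = len(custom.encode(text).ids)

# If the custom vocabulary emits, say, 25% fewer tokens on text like this,
# generation needs ~25% fewer decoding steps; that mechanism is how a
# vocabulary change can yield speedups in the reported 34% range.
print(f"generic: {n_generic} tokens, custom: {n_custom} tokens")
```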

The authors anticipate these models will serve as a foundational base for fine-tuning and instruction-tuning, reducing reliance on external models.

Future work will concentrate on enhancing the data pipeline by integrating more eBay-specific data, training larger models, and exploring the Mixture-of-Experts architecture to improve efficiency.
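Mixture-of-Experts is only named as future work, but the efficiency argument behind it is concrete: each token activates only k of E expert feed-forward networks, so total parameters can grow without a proportional rise in per-token compute. A minimal top-k routing sketch in PyTorch, with all sizes hypothetical:

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    """Sketch of a Mixture-of-Experts layer with top-k routing (sizes hypothetical)."""

    def __init__(self, d_model=512, d_ff=2048, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)  # scores each token per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):                            # x: (tokens, d_model)
        weights, idx = self.router(x).topk(self.k, dim=-1)
        weights = weights.softmax(dim=-1)            # normalize over the chosen k experts
        out = torch.zeros_like(x)
        for slot in range(self.k):                   # only k experts run per token
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

y = TopKMoE()(torch.randn(16, 512))                  # 16 tokens, each routed to 2 of 8 experts
```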


