
Monday, July 1, 2024

eBay Launches New In-House Large Language Model for E-commerce with Translation Capabilities

In a June 17, 2024 paper, eBay introduced its own series of large language models (LLMs), tailored specifically for the e-commerce sector.

eBay’s New In-House Large Language Model for E-commerce Can Also Translate

These models, named LiLiuM 1B, 7B, and 13B, were developed in-house to meet eBay’s specific needs across various e-commerce applications, including translation, while providing full control over licenses, data, vocabulary, and architecture.

The authors said that “these models are meant to eliminate dependency on third-party LLMs within eBay.”

eBay explained that using foundation models like the LLaMA-2 models, which can be accessed and adapted for specific purposes, poses risks related to licensing, data security, and future-proofing. They noted that such models are generally trained on English-centric data and are quite generic.

To address these concerns, eBay developed its LLMs entirely from scratch in-house. The models were trained on a dataset of 3 trillion tokens, combining general texts and e-commerce-specific content in multiple languages: the ParaCrawl corpus alongside a smaller proprietary corpus from the e-commerce domain. This approach is meant to ensure robustness across diverse languages and e-commerce-specific tasks.
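For illustration, below is a minimal sketch of how such a data mixture might be sampled during pretraining. The corpus contents, the mixing weight, and the sampling scheme are assumptions for demonstration only; the article does not describe eBay's actual data pipeline.

```python
import random

# Hypothetical stand-ins for the two data sources named in the article:
# general multilingual web text (e.g., ParaCrawl) and a smaller
# proprietary e-commerce corpus. A real pipeline would stream from disk.
general_corpus = [f"general multilingual sentence {i}" for i in range(1000)]
ecommerce_corpus = [f"e-commerce listing text {i}" for i in range(100)]

def sample_pretraining_batch(batch_size: int, ecommerce_weight: float = 0.2):
    """Draw a mixed batch, up-weighting the smaller in-domain corpus.

    The 0.2 weight is an assumed value; the article gives no mixing ratio.
    """
    batch = []
    for _ in range(batch_size):
        source = ecommerce_corpus if random.random() < ecommerce_weight else general_corpus
        batch.append(random.choice(source))
    return batch

print(sample_pretraining_batch(4))
```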

Additionally, eBay created its own custom tokenizer and model vocabulary tailored specifically for e-commerce applications. According to eBay, this approach offers several advantages: full control over the vocabulary, including special tokens; enhanced support for multilingual capabilities; and better adaptation to the specific use cases of e-commerce.
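As a rough sketch of what training a custom vocabulary can look like, the snippet below uses the open-source Hugging Face tokenizers library to train a BPE tokenizer on a mixed corpus. This is not necessarily the tooling eBay used, and the vocabulary size, special tokens, and file name are illustrative assumptions.

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import BpeTrainer

# Assumed setup: a BPE model with placeholder special tokens, kept under
# full in-house control (one advantage eBay cites for a custom vocabulary).
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

trainer = BpeTrainer(
    vocab_size=65_536,  # assumed size; the article gives no figure
    special_tokens=["[UNK]", "[BOS]", "[EOS]", "[PAD]"],
)

# "mixed_corpus.txt" is a hypothetical file holding the multilingual
# general-domain and e-commerce training text.
tokenizer.train(files=["mixed_corpus.txt"], trainer=trainer)
tokenizer.save("ecommerce_tokenizer.json")
```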

Eliminating Dependencies

According to the authors, their models perform on par with, or better than, the popular LLaMA-2 models, particularly excelling in non-English machine translation, as well as natural language understanding (NLU) tasks and e-commerce-specific applications.

The authors explained that the improved performance is primarily due to the extensive inclusion of non-English and e-commerce-specific data during pretraining, which strengthens the models' understanding of languages other than English. Additionally, the customized vocabulary tailored for e-commerce tasks speeds up text generation by up to 34% compared to LLaMA-2.
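That kind of speedup follows from tokenization efficiency: autoregressive decoding takes one forward pass per generated token, so a vocabulary that encodes domain text in fewer tokens needs fewer steps. The toy comparison below illustrates the mechanism with made-up tokenizations; the token counts and the resulting ratio are not from the paper.

```python
text = "Apple iPhone 15 Pro Max 256GB Unlocked - Fast Free Shipping"

# Stand-in for a generic subword tokenizer: split on whitespace.
generic_tokens = text.split()

# A domain vocabulary might merge frequent listing phrases into single
# tokens (hypothetical merges, for illustration only).
domain_tokens = ["Apple iPhone 15 Pro Max", "256GB", "Unlocked", "-", "Fast Free Shipping"]

# Decoding cost scales roughly with the number of tokens generated.
print(f"generic: {len(generic_tokens)} tokens, domain: {len(domain_tokens)} tokens")
print(f"~{len(generic_tokens) / len(domain_tokens):.1f}x fewer decode steps")
```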

The authors anticipate these models will serve as a foundational base for fine-tuning and instruction-tuning, reducing reliance on external models.

Future work will focus on improving the data pipeline by integrating more eBay-specific data, training larger models, and exploring the Mixture-of-Experts architecture to enhance efficiency.


