Yandex YaLM

Unlock the power of YaLM 100B, a GPT-like neural network that generates and processes text with 100 billion parameters. Free for developers and researchers worldwide.

What is Yandex YaLM?

YaLM 100B is a GPT-like neural network for generating and processing text. It can be used freely by developers and researchers from all over the world.

The model has 100 billion parameters. Training took 65 days on a cluster of 800 A100 graphics cards, using 1.7 TB of online texts, books, and countless other sources in both English and Russian.
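To put that parameter count in perspective, here is a back-of-envelope sketch of the memory needed just to hold the weights. The fp16 precision (2 bytes per parameter) and the 8-way GPU split are illustrative assumptions, not the repository's actual serving configuration:

```python
# Rough memory estimate for the weights of a 100B-parameter model.
# Assumptions: fp16 storage (2 bytes/parameter); activations and
# KV cache are excluded; 1 GB = 1e9 bytes.
PARAMS = 100e9
BYTES_PER_PARAM_FP16 = 2

def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    """Return approximate weight storage in gigabytes."""
    return params * bytes_per_param / 1e9

total_gb = weight_memory_gb(PARAMS, BYTES_PER_PARAM_FP16)
per_gpu_gb = total_gb / 8  # hypothetical sharding across 8 GPUs
print(f"~{total_gb:.0f} GB total, ~{per_gpu_gb:.0f} GB per GPU")
```

Even under these optimistic assumptions, the weights alone exceed the memory of a single 80 GB A100, which is why inference requires sharding the model across multiple GPUs.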

Training details and best practices for acceleration and stabilization can be found in articles on Medium (English) and Habr (Russian).

Yandex used DeepSpeed to train the model and drew inspiration from the Megatron-LM example. However, the code in the repository is not the code that was used to train the model; rather, it is the stock example from the DeepSpeed repository with the minimal changes needed to run inference with the model.

More information on Yandex YaLM

Launched: 2023
Pricing Model: Free
Monthly Visits: <5k
Yandex YaLM was manually vetted by our editorial team and was first featured on September 4th 2024.

Yandex YaLM Alternatives

  1. YandexGPT 2, an AI language model, has shown significant improvements in language modeling but may still provide answers and suggestions that are not based on facts.

  2. GPT-NeoX-20B is a 20 billion parameter autoregressive language model trained on the Pile using the GPT-NeoX library.

  3. Alfred-40B-0723 is a finetuned version of Falcon-40B, obtained with Reinforcement Learning from Human Feedback (RLHF).

  4. Ongoing research on training transformer models at scale.

  5. The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.