New open weights models: GigaChat-3.1-Ultra-702B and GigaChat-3.1-Lightning-10B-A1.8B

Hey, folks!

We've released the weights of our GigaChat-3.1-Ultra and Lightning models under the MIT license on our Hugging Face page. These models are pretrained from scratch on our own hardware and target both high-resource environments (Ultra is a large 702B MoE) and local inference (Lightning is a tiny 10B A1.8B MoE, i.e. 10B total parameters with 1.8B active).

Why?

- Because we believe that having open-weight models is better for the ecosystem
- Because we want to create a good language model that is native to the CIS

About the models:

- Both models are pretrained from scratch using our own data and compute, so they are not DeepSeek finetunes.