Request: Training a pretrained, MoE version of Mistral Nemo
r/LocalLLaMA
I converted Mistral Nemo from a dense model into a sixteen-expert MoE model. The core problem is that I am a student with budget constraints and can’t afford full-parameter or extended fine-tuning. I did my best to restore coherence, and it worked, but the model currently gets a lot of things wrong and ignores instructions half the time. I can’t offer anything in return, but I hope someone takes an interest in this model. I worked pretty hard on it, but I have kind of hit the limit of what I can do with my budget and a rental.