Mistral medium 3.5 128B, MLX 4bit, ~70 GB
r/LocalLLaMA
This model seems utterly broken for now. I do not recommend downloading or using it unless you are planning to help troubleshoot it. This is not a problem with the conversion, but with the model itself.

I converted Mistral medium 3.5 128B to MLX 4bit. The Eagle model for speculative decoding is not yet supported by MLX. The vision encoder is included (full BF16, unquantized). Thinking mode works (reasoning_effort="high" gives you the [THINKTHINK] chain), tool calling works, and context is 256K.

There was a bug in mlx-vlm's mistral3 sanitize function: it wasn't stripping the model.