Mistral Medium 3.5 128B and Qwen 3.5 122B A10B on 4x RTX 3080 20GB
r/LocalLLaMA
Mistral Medium 3.5 128B on 4x RTX 3080 20GB with layer split:
CUDA_VISIBLE_DEVICES=0,1,2,3 ./build/bin/llama-bench --model /data/huggingface/Mistral-Medium-3.5-GGUF/Mistral-Medium-3.5-128B-IQ4_XS-00001-of-00003.
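For anyone wondering whether a 128B model at IQ4_XS even fits on this rig, here is a rough back-of-the-envelope sketch. It assumes a nominal 4.25 bits per weight for IQ4_XS (real GGUF files mix quant types per tensor, so the actual size will differ somewhat), takes the 128B parameter count from the model name, and uses the 20 GB per card from the title. The function names are my own, not part of llama.cpp.

```python
def estimate_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    # params (in billions) * bits per weight / 8 bits per byte -> billions of bytes
    return params_billion * bits_per_weight / 8.0


def vram_budget(params_billion, bits_per_weight, num_gpus, vram_gb_per_gpu):
    """Return (weights_gb, total_vram_gb, headroom_gb) for an even layer split."""
    weights_gb = estimate_weight_gb(params_billion, bits_per_weight)
    total_gb = num_gpus * vram_gb_per_gpu
    # Headroom is what remains for KV cache, compute buffers, and CUDA overhead.
    return weights_gb, total_gb, total_gb - weights_gb


# Assumed inputs: 128B params, 4.25 bpw (nominal IQ4_XS), 4 GPUs with 20 GB each.
weights_gb, total_gb, headroom_gb = vram_budget(128, 4.25, 4, 20)
print(f"weights ~{weights_gb:.1f} GB, VRAM {total_gb} GB, headroom ~{headroom_gb:.1f} GB")
```

Under those assumptions the weights take roughly 68 GB of the 80 GB total, which leaves about 12 GB across the four cards for KV cache and buffers. Treat it as an estimate only; the real file size on disk is the number to trust.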