Mistral Medium 3.5 on AMD Strix Halo
r/LocalLLaMA
TL;DR: it's slow as heck. Run it overnight. I asked it a question about codebase architecture. End to end, a 48k-token prompt plus 4k thinking tokens took about 2 hours.

llama-server -hf unsloth/Mistral-Medium-3.
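For a rough sense of scale, here's the back-of-envelope math on those numbers. It's just a sketch: the function name is made up for illustration, and it blends prompt processing and generation into one rate because the post only gives the total time, not the split between the two phases.

```python
def end_to_end_rate(prompt_tokens: int, thinking_tokens: int, hours: float) -> float:
    """Blended tokens per second over the whole run (prompt + thinking)."""
    total_tokens = prompt_tokens + thinking_tokens
    seconds = hours * 3600
    return total_tokens / seconds


# Numbers from the post: 48k prompt + 4k thinking tokens in about 2 hours.
rate = end_to_end_rate(48_000, 4_000, 2.0)
print(f"{rate:.1f} tokens/s blended")  # prints "7.2 tokens/s blended"
```

Since "about 2 hours" is approximate, treat the result as a ballpark. Prompt processing and token generation usually run at very different speeds, so neither phase's actual rate can be read off this number.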