Terrible speeds with LM Studio? (Is LM Studio bad?)
r/LocalLLaMA
I've decided to try LM Studio today, and using quants of Qwen 3.5 that should fit on my 3090, I'm getting between 4 and 8 tok/s. Going from other people's comments, I should be getting about 30-60 tok/s. Is this an issue with LM Studio or am I just somehow stupid?

Tried so far:

Qwen3.5-35B-A3B-UD-Q5_K_XL.gguf
Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf
Qwen3.5-27B-UD-Q5_K_XL.gguf

It's true that I've got slower ECC RAM, but that's why I chose lower quants. Task manager does show that the VRAM gets used too.
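
For anyone wanting to sanity-check the "should fit on my 3090" part, here is a rough back-of-envelope sketch. The parameter counts are just the nominal sizes from the file names, and the bits-per-weight figures and the 2 GiB reserve for KV cache and desktop use are assumptions for illustration, not measured values. The actual .gguf file size on disk is the better number to compare against the card's 24 GB.

```python
# Rough estimate of how much VRAM the weights alone would need.
# All bits-per-weight values below are ASSUMED averages, not measured.

VRAM_GIB = 24.0     # RTX 3090
RESERVE_GIB = 2.0   # assumed headroom for KV cache, context, desktop


def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GiB for a given parameter count."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(size_gib: float, vram_gib: float = VRAM_GIB,
         reserve_gib: float = RESERVE_GIB) -> bool:
    """True if the weights plus the assumed reserve stay within VRAM."""
    return size_gib + reserve_gib <= vram_gib


# (nominal billions of parameters, assumed average bits per weight)
candidates = {
    "Qwen3.5-35B-A3B-UD-Q5_K_XL": (35, 5.5),
    "Qwen3.5-35B-A3B-UD-Q4_K_XL": (35, 4.5),
    "Qwen3.5-27B-UD-Q5_K_XL": (27, 5.5),
}

for name, (params_b, bpw) in candidates.items():
    size = weight_gib(params_b, bpw)
    verdict = "fits" if fits(size) else "likely spills to system RAM"
    print(f"{name}: ~{size:.1f} GiB -> {verdict}")
```

If an estimate lands near or above the VRAM limit, some layers probably end up in system RAM, which would be one possible explanation for single-digit tok/s. VRAM showing as "used" in Task Manager does not by itself show that every layer is on the GPU.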