What tokens/sec do you get when running Qwen 3.5 27B?

r/LocalLLaMA

I have a 4090 with just 32 GB of RAM, and I wanted to get an idea of what speeds other users get with the 27B. I see many posts reporting X tokens/sec, but they rarely mention the max context used. My setup is not optimal: I'm using LM Studio to run the models. I have tried Bartowski's Q4_K_M and Unsloth's Q4_K_XL quants, and speeds are nearly the same for both. It does depend on the context I use, though. With a smaller context, under 50k, I get between 32 and 38 tokens/sec.