Local Qwen 3.5 on 16GB GPU vs Kimi K2.5 on the cloud

r/LocalLLaMA
AI Hardware Open Source AI

Kimi K2.5 is a great model, and I'm happy they released the weights, but I decided to give Qwen 3.5 a spin on my local machine with a 16 GB AMD RX 9070 XT using the unsloth q2_k_xl with 64k context, and it nailed the car wash question that Kimi struggled with with a sweet 120 t/s speed. The Linux distro is Bazzite Deck KDE. LM Studio is running it locally with the Vulkan engine set. Here's the prompt to copy-paste: "I need to wash my car. The car wash is only 50 meters from my home.