Running Qwen 3.6 35B A3B on 2x 5060 Ti
r/LocalLLaMA
AI Hardware
Open Source AI
I'm running Qwen 3.6 35B A3B on two 5060 Ti 16 GB cards (32 GB VRAM total; I also have 32 GB of DRAM, but I don't like offloading). I used Q4 in LM Studio to get full context, and I get 90 t/s. Any tricks to optimize this so I can upgrade to Q6 or Q8? Thanks!

Another thing: can you recommend something for cooling? I'm using two stacked GPUs with zero gap (I have an mATX motherboard). Right now the second GPU is not that hot, but it is hotter than the bottom one.

submitted by /u/chocofoxy
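For the Q6/Q8 question, a rough back-of-the-envelope check may help. The sketch below is not a measurement. It assumes about 35e9 total parameters (read off the model name) and approximate bits-per-weight figures for the llama.cpp quant types Q4_K_M, Q6_K and Q8_0. The post only says "Q4", so the exact Q4 variant is a guess. It ignores KV cache, compute buffers and the fact that usable VRAM is a bit below the nominal 32 GB.

```python
# Rough estimate of weight size per quantization level vs. 32 GB of VRAM.
# Assumptions (not from the post): ~35e9 total parameters and approximate
# bits-per-weight (bpw) values for llama.cpp quant types.

PARAMS = 35e9      # assumed total parameter count
VRAM_GB = 32.0     # 2 x 16 GB, nominal

# Approximate bpw; real GGUF files vary because some tensors use other types.
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q6_K": 6.56, "Q8_0": 8.5}


def weights_gb(params: float, bpw: float) -> float:
    """Size of the quantized weights in decimal gigabytes."""
    return params * bpw / 8 / 1e9


def headroom_gb(params: float, bpw: float, vram_gb: float = VRAM_GB) -> float:
    """VRAM left for KV cache and buffers after loading weights (negative = does not fit)."""
    return vram_gb - weights_gb(params, bpw)


if __name__ == "__main__":
    for name, bpw in BITS_PER_WEIGHT.items():
        size = weights_gb(PARAMS, bpw)
        left = headroom_gb(PARAMS, bpw)
        print(f"{name}: weights ~{size:.1f} GB, headroom ~{left:.1f} GB")
```

Under these assumptions, the Q8_0 weights alone come out larger than 32 GB, so Q8 would need some offloading to DRAM. Q6_K would fit, but with only a few GB left over, so full context would probably not fit unless the KV cache is also quantized or the context is reduced.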