LM Studio CPU thread pool size vs. tk/s with some MoE layers offloaded to CPU

r/LocalLLaMA

Submitted by /u/bonobomaster