LM Studio CPU thread pool size vs. tk/s with some MoE layers offloaded to CPU
r/LocalLLaMA
Submitted by /u/bonobomaster