Best Llama Config for Turboquant_Plus? (Stats below)
r/LocalLLaMA
So I'm running the setup below, and I've seen people run this same setup with TurboQuant_plus and get 35 tokens/second. The speeds I'm getting are acceptable, but if I could hit 30-35 tokens/second I'd be soooooo happy.