PSA: If you haven’t updated llama.cpp in a couple of days and MTP isn’t performing well for you, update llama.cpp.
r/LocalLLaMA
I thought MTP (multi-token prediction) had horrible performance and was a nothingburger, and I had spent about an hour benchmarking it. I updated yesterday and got a roughly 1.5-1.8x token speed boost. They even mostly fixed the pp (prompt processing) issue. Now my pp is really big.

submitted by /u/Borkato