Lemonade SDK on Strix Halo
r/LocalLLaMA
•
Generative AI
Open Source AI
Just for whoever might find it useful, I recently converted over from base setup llama.cpp to Lemonade SDK on my AMD Strix Halo and it instantly feels so much better. I’m seeing on average 20% bumps in tokens per second running the same models on the same hardware. AMD specific, and might take some tweaking but it’s been a huge quality of life improvement for me. Like actually going back and forth with agents, deep research running smooth, a lot of things that felt like they could hang it up before are moving much cleaner and faster. Either way, just sharing.