I tested Strix Halo (Ryzen AI Max+ 395) performance as context length increases
r/LocalLLaMA
Hi all, I've seen a lot of test videos and posts about how good a Strix Halo machine (GTR9 PRO) really is for local LLMs at long context lengths. So I put together a small benchmark project to test how local llama.cpp models behave as context length increases on an AMD Strix Halo 128GB machine.