Strix Halo ROCm + MTP Notes (May 2026)
r/LocalLLaMA
With MTP merged into mainline llama.cpp, I wanted to try out some other optimizations I could think of. I ended up testing backends, MTP, and a bump to the ROCm nightlies.

What's changed:

- ROCm 7.13 works on gfx1151 (7.2.2 could see the GPU but couldn't compile shaders)
- MTP merged to llama.cpp main yesterday (May 16)
- I ran 3 models x 2 backends x 3 prompt lengths + a full-context decode test

The headline: ROCm drops 64% at full context, but MTP recovers most of it. Vulkan barely drops.
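For anyone wanting to sanity-check a "drops X%" figure against their own runs, here is a minimal Python sketch of the arithmetic. It assumes the drop is measured on decode tokens/s relative to a short-context baseline, which is my reading of the headline rather than something spelled out above. The function name `percent_drop` and the example numbers are hypothetical, not measurements from this post.

```python
def percent_drop(baseline_tps: float, full_context_tps: float) -> float:
    """Percentage of throughput lost relative to the baseline run."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return (baseline_tps - full_context_tps) / baseline_tps * 100.0


if __name__ == "__main__":
    # Hypothetical tokens/s values chosen only to illustrate the formula.
    baseline = 50.0   # short-context decode speed (placeholder)
    full_ctx = 18.0   # full-context decode speed (placeholder)
    print(f"{percent_drop(baseline, full_ctx):.0f}% drop")
```

Put another way, a 64% drop means the full-context run keeps roughly 36% of the baseline speed.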