You can run LLMs on your AMD NPU on Linux!

If you have a Ryzen™ AI 300/400-series PC and run Linux, we have good news! You can now run LLMs directly on the AMD NPU on Linux at high speed and very low power, quietly and entirely on-device. Not just small demos, but real local inference.

Get Started

🍋 Lemonade Server
Lightweight local server for running models on the AMD NPU.
Guide:
GitHub:

⚡ FastFlowLM (FLM)
Lightweight runtime optimized for AMD NPUs.