vLLM ROCm has been added to Lemonade as an experimental backend
r/LocalLLaMA
•
Generative AI
Open Source AI
AI Tools
VLLM has the ability to run.safetensors LLMs before they are converted to GGUF and represents a new engine to explore. I personally had never tried it out until u/krishna2910-amd/ u/mikkoph and u/sa1sr1 made it as easy as running llama.cpp in Lemonade: lemonade backends install vllm:rocm lemonade run Qwen3.5-0.8B-vLLM This is an experimental backend for us in the sense that the essentials are implemented, but there are known rough edges. We want the community's feedback to see where and how far we should take this.