AI RESEARCH

New in llama.cpp: Model Management

Hugging Face Blog

Quick Start Features Examples Chat with a specific model List available models Manually load a model Unload a model to free VRAM Key Options Also available in the Web UI Join the Conversation llama.cpp server now ships with router mode, which lets you dynamically load, unload, and switch between multiple models without restarting.