What llamacpp's webui has and what it lacks
r/LocalLLaMA
I've been on a quest testing chat UIs for development. So far, out of Jan.ai, AnythingLLM, LibreChat, and Open WebUI, llamacpp's webui is my favourite.

The killer feature

Counting my context used. I don't need to guess that my context is full from the model suddenly becoming dumb. The token counter you get during prefill and response is way better than the loading spinner every other UI gives you.

What's missing

If a tool call fails, it kills the entire conversation. I sort of work around this by forking conversations regularly, but it would sure be nice if I didn't have to.
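To make the tool call complaint concrete, here is a rough sketch of the behaviour I'd like instead: a failed tool call gets turned into an ordinary tool message that the model can read and react to, rather than ending the conversation. This is not how the webui works internally, as far as I know. The `run_tool_call` helper, the `tools` registry, and the `divide` tool are all made up for illustration, and the message shape assumes an OpenAI-style chat format with `role`, `tool_call_id`, and `content` fields.

```python
import json
from typing import Any, Callable, Dict, List

# Hypothetical registry type: tool name -> plain Python function.
ToolFn = Callable[..., Any]


def run_tool_call(tools: Dict[str, ToolFn], call_id: str, name: str, raw_args: str) -> Dict[str, str]:
    """Run one tool call and always return a tool message, even on failure."""
    try:
        # Arguments arrive as a JSON string; an empty string means no arguments.
        args = json.loads(raw_args) if raw_args else {}
        result = tools[name](**args)
        content = json.dumps({"ok": True, "result": result})
    except Exception as exc:
        # Bad JSON, an unknown tool, or the tool itself raising all end up here,
        # so the error is reported back to the model instead of propagating.
        content = json.dumps({"ok": False, "error": f"{type(exc).__name__}: {exc}"})
    return {"role": "tool", "tool_call_id": call_id, "content": content}


def divide(a: float, b: float) -> float:
    """Illustrative tool that can fail (division by zero)."""
    return a / b


tools: Dict[str, ToolFn] = {"divide": divide}
messages: List[Dict[str, str]] = []

# One call that succeeds and one that fails; both produce a tool message.
messages.append(run_tool_call(tools, "call_1", "divide", '{"a": 6, "b": 3}'))
messages.append(run_tool_call(tools, "call_2", "divide", '{"a": 1, "b": 0}'))
```

With something like this, the second call would leave an `"ok": false` entry in the history and the chat could carry on, so forking before every risky tool call would no longer be necessary.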