VRAM.cpp: Running llama-fit-params directly in your browser

r/LocalLLaMA

People on this subreddit are constantly asking whether their system can run a certain model. Most of the "VRAM calculators" I've found either give only very rough estimates or support only a small number of models. Both problems come from the same place: it's genuinely complicated to work out how much memory each of the many attention variants in use today actually needs.
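To give a feel for why the attention type matters so much, here's a rough back-of-the-envelope sketch of the KV cache size for plain multi-head attention versus grouped-query attention. This is not how llama-fit-params or this tool works; the function name and the model shape (32 layers, head size 128, 8192 context, f16 cache) are made up for illustration, and it ignores weights, compute buffers, sliding-window layers, MLA and everything else a real estimate has to handle.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem=2):
    """Approximate KV cache size in bytes for a standard MHA/GQA transformer."""
    # Two tensors (K and V) per layer, each holding
    # n_ctx * n_kv_heads * head_dim elements.
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem


# Hypothetical model shape, assumed only for this example.
N_LAYERS, HEAD_DIM, N_CTX = 32, 128, 8192

# MHA: every one of the 32 query heads has its own KV head.
mha = kv_cache_bytes(N_LAYERS, n_kv_heads=32, head_dim=HEAD_DIM, n_ctx=N_CTX)

# GQA: the same 32 query heads share 8 KV heads.
gqa = kv_cache_bytes(N_LAYERS, n_kv_heads=8, head_dim=HEAD_DIM, n_ctx=N_CTX)

print(f"MHA: {mha / 2**30:.1f} GiB, GQA: {gqa / 2**30:.1f} GiB")
```

Same layer count, same context, and the cache differs by 4x just from the number of KV heads. A calculator that assumes one attention layout for every model will be off by about that much.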