Unlimited tokens through sharing GPUs

r/LocalLLaMA

Sllm is an experiment in sharing GPUs between developers. I think everyone, at some point in their agentic development journey, thinks about hosting their own LLM. If you can afford it, great, but I looked pretty deep into the economics and it's incredibly wasteful: most of the time your GPU is sitting idle. So I built sllm to see whether a single LLM node can be shared between hundreds of developers, with everyone getting unlimited tokens at a flat rate. Honestly, I'm not sure how well this will work.
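
To put rough numbers on the idle-GPU argument, here's a back-of-envelope sketch. Every figure in it is a made-up placeholder, not a measurement from sllm, and the function names are hypothetical. It also assumes requests are served one at a time, which is pessimistic, since real inference servers batch concurrent requests.

```python
MINUTES_PER_DAY = 24 * 60


def solo_utilization(busy_minutes_per_day: int) -> float:
    """Fraction of the day a GPU is busy when one developer has it to themselves."""
    return busy_minutes_per_day / MINUTES_PER_DAY


def developers_per_node(busy_minutes_per_day: int, target_utilization: float) -> int:
    """How many developers fit on one node if each keeps it busy for
    `busy_minutes_per_day` and we only fill it to `target_utilization`,
    leaving the rest as headroom for bursts."""
    usable_minutes = int(target_utilization * MINUTES_PER_DAY)
    return usable_minutes // busy_minutes_per_day


def flat_rate(node_cost_per_month: float, developers: int) -> float:
    """Monthly price per developer if the node cost is split evenly."""
    return node_cost_per_month / developers


if __name__ == "__main__":
    # Hypothetical inputs: 5 minutes of actual GPU compute per developer per day,
    # a node filled to 50% on average, and a node that costs 2000 per month.
    busy = 5
    devs = developers_per_node(busy, target_utilization=0.5)
    print(f"solo utilization: {solo_utilization(busy):.2%}")
    print(f"developers per node: {devs}")
    print(f"flat rate per developer: {flat_rate(2000, devs):.2f}")
```

The point is just the shape of the math: if one developer only keeps a GPU busy for a few minutes a day, a dedicated node is mostly idle, and splitting it drops the per-person cost by roughly the number of people sharing. What this ignores is that usage is bursty and people tend to be active at the same time, which is exactly the part I'm unsure about.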