Mapping GPUs to LLMs (and back): A bandwidth-based estimator for local inference
r/LocalLLaMA
Generative AI
Submitted by /u/alexp_lt