Are the RAM-rich / GPU-poor people wrong here?
r/LocalLLaMA
AI Hardware
Hello guys, I know everyone has their own definition of local models, but for me there are two "reasonable" types of frontier local model: a dense one that barely fits in 24GB or 32GB of VRAM, for the more "reasonable" GPU-wealthy people, and an MoE in the 100B-parameter range. A 100B-ish model can run with hybrid CPU/GPU offload at a decent speed on 128GB of RAM, since 128GB is the max a standard motherboard can take. Again, it's not cheap, but ordinary people can still afford it; it's still cheaper than a car.
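For anyone who wants to sanity-check the sizing, here is a rough back-of-the-envelope sketch in Python. It only counts the weights (parameters × bits per weight) and ignores KV cache, context length, and runtime overhead. The 4.5 bits per weight figure and the 85% headroom factor are my own assumptions standing in for a typical 4-bit quant, and the function names are made up for illustration. It also treats the advertised "GB" as GiB, which is close enough for a rough estimate.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GiB.

    Counts weights only: no KV cache, activations, or runtime overhead.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         budget_gib: float, headroom: float = 0.85) -> bool:
    """True if the weights fit in `budget_gib`, keeping some headroom.

    `headroom` is an assumed fraction of memory usable for weights;
    the rest is left for context, the OS, and everything else.
    """
    return weight_gib(params_billion, bits_per_weight) <= budget_gib * headroom


if __name__ == "__main__":
    ASSUMED_BPW = 4.5  # assumption: roughly a 4-bit quant

    for name, params_b, budget in [
        ("32B dense on a 24GB GPU", 32, 24),
        ("100B MoE on a 24GB GPU", 100, 24),
        ("100B MoE in 128GB of RAM", 100, 128),
    ]:
        size = weight_gib(params_b, ASSUMED_BPW)
        verdict = "fits" if fits(params_b, ASSUMED_BPW, budget) else "does not fit"
        print(f"{name}: ~{size:.1f} GiB of weights, {verdict}")
```

Under those assumptions, a ~32B dense model squeezes onto a 24GB card, while a ~100B MoE does not and has to live mostly in system RAM, which is the hybrid-offload case. Whether the speed is actually "decent" depends on how many parameters are active per token and on memory bandwidth, which this sketch does not model.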