I spent 96 hours setting up dual DGX Sparks and a Mac Studio M3 Ultra for the same 397B model. Neither won.
r/LocalLLaMA
•
Generative AI
Follow up to my last post comparing these two platforms. This time I am documenting what actually happened during the first week with both machines running simultaneously. To the people complaining that I am not doing like-for-like comparison to that I say these are not like for like products so I am optimizing my deployment for both of them individually. This post will go into detail about what results I got and how they changed my thinking. The gap that tells you everything The Mac Studio was serving Qwen3.5-397B inference four hours after I plugged it in. The DGX Sparks took four days.