Anybody using LMStudio on an AMD Strix 395 AI Max (128GB unified memory)? I keep on getting errors and it always loads to RAM.

r/LocalLLaMA
Generative AI AI Hardware Open Source AI

Hey all, I have a Framework AI Max+ AMD 395 Strix system, the one with 128GB of unified RAM that can have a huge chunk dedicated towards its GPU. I'm trying to use LMStudio but I can't get it to work at all and I feel as if it is user error. My issue is two-fold. First, all models appear to load into RAM. For example, a Qwen3 model that is 70GB will load into RAM and then try to load to GPU and fail. If I type something into the chat, it fails. I can't seem to get it to stop loading the model into RAM despite setting the GPU as the llama.cpp.