Gemma 4 26B on oMLX with OpenCode, M4 Max, 64GB unified - am I doing something wrong/miscalibrated on capabilities here?

r/LocalLLaMA

So this might very well be user error on my end, but please let me know if what I am doing is somehow wrong:

- M4 Max (highest core count version), 64GB of unified memory
- oMLX 0.3.5dev1 for serving, Gemma 4 26B-A4B (it, 4-bit) with 200k context
- OpenCode harness for running the model - no custom instructions for now

Consistently I see the LLM not doing what it is supposed to do. For example - I have some here:

- I don't see it thinking all the time.