How to settle on a coding LLM? What parameters to watch out for?

r/LocalLLaMA

Hey guys, I'm new to local LLMs and I have set up Claude Code locally, hooked up to oMLX. I have an M4 Max with 40 cores and 64 GB of RAM. I wanted to quickly benchmark Qwen 3.5 27B against 35B-A3B, both at 8-bit quantization. I didn't configure any parameters and just gave it a go with the following instruction: "Make me a small web based bomberman game". Each model took approximately 3-10 minutes, but the result was completely unplayable. Even two or three prompts later, after I described the issues, the game still wouldn't work. Each subsequent prompt also takes significantly longer to produce output.