(Interactive)OpenCode Racing Game Comparison Qwen3.6 35B vs Qwen3.5 122B vs Qwen3.5 27B vs Qwen3.5 4B vs Gemma 4 31B vs Gemma 4 26B vs Qwen3 Coder Next vs GLM 4.7 Flash

You can play them here: This started out as a simple test for Qwen3 Coder Next vs Qwen3.5 4B because they have similar benchmark numbers and then I just kept trying other models and decided I might as well share it even if I'm not that happy with how I did it. Read the "How this works" in the top right if you want to know how it was but the TLDR is: Disabled vision, sent same initial prompt in Plan mode, enabled Playwright MCP and sent the same start prompt, and then spent 3 turns testing the games and pointing out what issues I saw to the LLMs.