Benchmarked MiniMax M2.7 through 2 benchmarks. Here's how it did
r/LocalLLaMA
•
Generative AI
AI Research
MiniMax just dropped M2.7, their best model yet. I work with the Kilo Code team and we always test new models when they come out, so we ran M2.7 against Qwen3.5-plus, GLM-5, Kimi K2.5, and Qwen3.5-397b across two benchmarks: PinchBench OpenClaw agent benchmark, Kilo Bench, an 89-task evaluation that tests autonomous coding across everything from git operations to cryptanalysis to QEMU automation. TL;DR: M2.7 scores 86.2% on PinchBench, placing 5th overall and within 1.2 points of Claude Opus 4.6.