I Tested GPT-5.4 vs Claude Opus 4.6 on 20 Real Tasks — The #1 Model on LMSYS Isn't What You Think

Towards AI
Generative AI

Two days ago, Claude Opus 4.6 quietly took the top spot on the LMSYS Chatbot Arena with an Elo score of 1504, the highest any model has…