Introducing KellyBench (2 minute read)

TLDR AI
Generative AI

KellyBench evaluates sequential decision-making in sports betting by simulating the 2023-24 English Premier League season. Models such as Claude Opus 4.6 and GPT-5.4 struggled: none achieved a positive return, which points to their difficulty adapting a strategy over a long horizon. The result underscores the need for complex environments in which agents can learn from experience under uncertainty.
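
The summary does not say how KellyBench sizes or scores bets. Assuming the name refers to the Kelly criterion, the rough sketch below shows what sequential bet sizing over a season could look like. The function names, the `(p_win, decimal_odds, won)` tuple layout, and the starting bankroll are illustrative assumptions, not part of the benchmark.

```python
# Illustrative sketch only: not KellyBench's actual environment or scoring.

def kelly_fraction(p_win: float, decimal_odds: float) -> float:
    """Kelly fraction f* = (b*p - q) / b, floored at 0 when there is no edge."""
    b = decimal_odds - 1.0  # net profit per unit staked if the bet wins
    if b <= 0:
        return 0.0
    q = 1.0 - p_win
    return max((b * p_win - q) / b, 0.0)


def run_season(bets, bankroll: float = 1000.0) -> float:
    """Place Kelly-sized stakes in order and return the final bankroll.

    Each bet is a hypothetical tuple (p_win, decimal_odds, won), where p_win
    is the bettor's own probability estimate for the outcome.
    """
    for p_win, decimal_odds, won in bets:
        stake = bankroll * kelly_fraction(p_win, decimal_odds)
        if won:
            bankroll += stake * (decimal_odds - 1.0)
        else:
            bankroll -= stake
    return bankroll
```

Because each stake depends on the bankroll left by earlier bets, a poor probability estimate early in the season compounds through every later bet, which is one reason long-horizon betting is hard to get right.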