⚡️The End of SWE-Bench Verified — Mia Glaese & Olivia Watkins, OpenAI Frontier Evals & Human Data

Latent Space
Generative AI

It's time to take the next step up in frontier agent evals.