On a difficult new SWE benchmark, ProgramBench, GPT5.5 high/xhigh solves a task for the first time and significantly outperforms Opus 4.7

r/singularity
Generative AI AI Research

Link to tweets: Link to GitHub: Link to ProgramBench website:

submitted by /u/socoolandawesome