On a difficult new SWE benchmark, ProgramBench, GPT5.5 high/xhigh solves a task for the first time and significantly outperforms Opus 4.7
r/singularity
Generative AI
AI Research
Link to tweets:
Link to GitHub:
Link to ProgramBench website:
Submitted by /u/socoolandawesome