New AI math benchmark finds GPT-5.4 Pro has made progress on two unsolved math problems

Using a new AI math benchmark of 100 unsolved math problems, Oxford researchers find that GPT-5.4 Pro has made progress beyond what humans have achieved on two of them.

"After reasoning for roughly an hour, GPT 5.4 Pro beats AlphaEvolve's baseline on a Kakeya-type problem by ~4.9% via an optimized triangle overlap and uses a quintic correction to drop the constant of the diagonal Ramsey bound by ~2.7%. We are validating these with experts now."

Paper link:

Twitter thread:

Disclaimer: this is our work, so feel free to ask questions here.

submitted by /u/armytricks