GPT-5.5 Ties Claude Mythos in Enterprise Cyber Attack Tests, AISI Finds
Dev.to AI • Generative AI
UK AISI found that GPT-5.5 matches Claude Mythos Preview in autonomously solving a full enterprise network attack simulation. OpenAI's model scored 71.4% on expert-level capture-the-flag (CTF) tasks, edging out Anthropic's 68.6%.

Key facts:
GPT-5.5 scored 71.4% on expert CTF tasks vs. 68.6% for Mythos.
GPT-5.5 is only the second model to fully solve the enterprise network simulation TLO.
GPT-5.5 succeeded in 2 of 10 TLO attempts; Mythos succeeded in 3 of 10.
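
To illustrate why 2 of 10 and 3 of 10 can reasonably be read as a tie, here is a minimal sketch of a two-sided Fisher exact test on the reported TLO counts. It assumes the ten attempts per model are independent trials, which the source does not state, and the function name `fisher_exact_two_sided` is a hypothetical helper, not anything AISI is reported to have used.

```python
from math import comb


def fisher_exact_two_sided(a_success, a_total, b_success, b_total):
    """Two-sided Fisher exact p-value for two success counts (illustrative helper)."""
    successes = a_success + b_success
    total = a_total + b_total

    # Unnormalized hypergeometric weight: model A gets k of the pooled successes.
    def weight(k):
        return comb(successes, k) * comb(total - successes, a_total - k)

    lo = max(0, a_total - (total - successes))
    hi = min(successes, a_total)
    observed = weight(a_success)

    # Sum every outcome that is no more likely than the observed one.
    extreme = sum(weight(k) for k in range(lo, hi + 1) if weight(k) <= observed)
    return extreme / comb(total, a_total)


# Counts from the article: GPT-5.5 solved 2 of 10, Mythos solved 3 of 10.
p_value = fisher_exact_two_sided(2, 10, 3, 10)
print(f"p = {p_value:.2f}")  # prints "p = 1.00"
```

Under that independence assumption, a p-value of 1.00 means a one-run gap across ten attempts gives no evidence that either model is better on TLO.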