GPT-5.5 Matches Heavily Hyped Mythos Preview In New Cybersecurity Tests

New research from the UK's AI Security Institute finds that OpenAI's recently launched GPT-5.5 model showed cybersecurity capabilities comparable to those of Anthropic's restricted Mythos Preview model in rigorous testing.

Key Points

  • The UK's AI Security Institute (AISI) tested GPT-5.5 and Mythos Preview using 95 Capture the Flag challenges covering cryptography, web exploitation, and reverse engineering.
  • GPT-5.5 achieved a 71.4 percent success rate on "Expert" level tasks, slightly outperforming the 68.6 percent score recorded by Mythos Preview.
  • In one test, GPT-5.5 decoded a Rust binary in under 11 minutes at a cost of $1.73, without human intervention.
  • GPT-5.5 succeeded in 3 out of 10 attempts at the "The Last Ones" data extraction simulation, marking the first time any AI has passed this test.
  • Both models failed the "Cooling Tower" simulation, which tests the ability to disrupt power plant control software.
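
The 3-out-of-10 result in the list above comes from a small sample, so the per-attempt success rate carries wide uncertainty. As a rough illustration only (this is not AISI's methodology; the function name `wilson_interval`, the 95 percent confidence level, and the assumption that attempts are independent are all ours), a Wilson score interval for that rate could be computed like this:

```python
import math


def wilson_interval(successes, attempts, z=1.96):
    """Return the (low, high) Wilson score interval for a binomial proportion.

    z=1.96 corresponds to an approximate 95% confidence level (an assumption
    made for this sketch, not a figure from the AISI report).
    """
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    p = successes / attempts
    z2 = z * z
    denom = 1 + z2 / attempts
    center = (p + z2 / (2 * attempts)) / denom
    half = z * math.sqrt(p * (1 - p) / attempts + z2 / (4 * attempts ** 2)) / denom
    return center - half, center + half


# Figures from the summary: 3 successes in 10 attempts at "The Last Ones".
rate = 3 / 10
low, high = wilson_interval(3, 10)
print(f"per-attempt success rate: {rate:.0%}")
print(f"approx. 95% interval: {low:.1%} to {high:.1%}")
```

With only ten attempts the interval spans roughly 11 to 60 percent, so the 30 percent figure is best read as evidence that the task is now within reach, not as a precise reliability estimate.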

Why it Matters

These findings indicate that advanced cybersecurity risks are becoming a standard feature of frontier AI models rather than isolated breakthroughs unique to specific releases. This trend suggests that rapid improvements in reasoning and long-horizon autonomy are creating significant security implications that developers and regulators must address across the entire industry.
Slashdot.org Published by BeauHD