GPT-5.5 Matches Heavily Hyped Mythos Preview In New Cybersecurity Tests

New research from the UK's AI Security Institute finds that OpenAI's recently launched GPT-5.5 model showed cybersecurity capabilities comparable to those of Anthropic's restricted Mythos Preview model in rigorous testing.

Key Points

The UK's AI Security Institute (AISI) tested GPT-5.5 and Mythos Preview using 95 Capture the Flag challenges covering cryptography, web exploitation, and reverse engineering.
GPT-5.5 achieved a 71.4 percent success rate on "Expert" level tasks, slightly outperforming the 68.6 percent score recorded by Mythos Preview.
In one test, GPT-5.5 successfully decoded a Rust binary in under 11 minutes at a cost of $1.73, without human intervention.
GPT-5.5 succeeded in 3 out of 10 attempts at the "The Last Ones" data extraction simulation, marking the first time any AI has passed this test.
Both models failed the "Cooling Tower" simulation, which tests the ability to disrupt power plant control software.

Why it Matters

These findings indicate that advanced cybersecurity capabilities, and the risks they bring, are becoming a standard feature of frontier AI models rather than isolated breakthroughs unique to specific releases. The trend suggests that rapid improvements in reasoning and long-horizon autonomy carry significant security implications that developers and regulators must address across the entire industry.
