Anthropic Keeps Its Most Powerful AI Model Secret Due to Unprecedented Cybersecurity Risks

2026-04-14 data

San Francisco, Tuesday, 14 April 2026.
Anthropic’s decision to withhold its advanced Mythos AI model represents a watershed moment in artificial intelligence governance. The model demonstrated the ability to autonomously discover and exploit thousands of previously unknown vulnerabilities in major operating systems and browsers, including bugs that were decades old. Government officials and banking executives are now in urgent discussions about the implications, as Mythos can perform cybersecurity tasks that would typically require expert hackers weeks to complete—all within hours and for minimal cost.

A Model Too Dangerous for Public Release

Anthropic announced on April 6, 2026, that its Claude Mythos Preview model would not be made generally available to the public due to unprecedented cybersecurity capabilities [1][2]. The San Francisco-based company revealed that Mythos can identify and exploit zero-day vulnerabilities in major operating systems and web browsers, including discovering bugs that are decades old [2]. In internal testing, the model found vulnerabilities spanning up to 27 years, with over 99% of the discovered security flaws remaining unpatched as of the announcement date [2]. The decision marks a rare instance of an AI company voluntarily restricting access to its most advanced technology due to safety concerns.

Extraordinary Technical Capabilities Reshape Cybersecurity Landscape

The technical prowess of Mythos Preview has fundamentally altered expectations for AI-driven cybersecurity operations. In comparative testing, the model outperformed Claude Opus 4.6 by developing working exploits for Firefox vulnerabilities 181 times versus just 2 times for the earlier model [2]. More remarkably, Mythos Preview achieved full control flow hijack on ten separate, fully patched targets in OSS-Fuzz corpus testing, while older models barely reached tier 3 performance [2]. The model demonstrated its capability by autonomously identifying and exploiting a 17-year-old remote code execution vulnerability (CVE-2026-4747) in FreeBSD, allowing attackers to gain root access [2]. These exploits, which expert penetration testers estimate would take weeks to develop, were completed by Mythos Preview in mere hours [2].

Government and Financial Sector Response

The cybersecurity implications of Mythos Preview triggered immediate high-level government attention across multiple countries. US Treasury Secretary Scott Bessent convened a meeting of senior American bankers in Washington on April 6, 2026, specifically to discuss the Mythos model and its potential threats to financial infrastructure [1][4]. Officials from the United States, Canada, and the United Kingdom have held discussions with banking officials regarding Mythos-related threats [4]. The UK government’s AI Security Institute issued a warning on April 14, 2026, stating that Mythos represented a “step up” over previous models in terms of cyber threat capabilities [1]. David Solomon, Goldman Sachs chief executive, acknowledged on April 6, 2026, that his firm was “hyper-aware of the enhanced capabilities of these new models” and working closely with Anthropic to improve cyber protection [6].

Project Glasswing: A Defensive Strategy

Rather than releasing Mythos to the public, Anthropic launched Project Glasswing, a collaborative initiative uniting major technology companies and financial institutions to strengthen cybersecurity defenses [3]. The project includes Amazon Web Services, Apple, Broadcom, Cisco, Crowdstrike, Google, JPMorgan Chase, The Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks [3]. Anthropic committed up to $100 million in usage credits for Mythos Preview access and $4 million in direct donations to open-source security organizations as part of the initiative [3]. Access to Mythos Preview has been extended to over 40 organizations that build or maintain critical software infrastructure [3]. JPMorgan Chase described its participation as “a unique, early-stage opportunity to evaluate next-generation AI tools for defensive cybersecurity across critical infrastructure” [4]. Within 90 days of April 14, 2026, Anthropic plans to report publicly on lessons learned, vulnerabilities fixed, and improvements made through the project [3].

Bronnen

AI safety model governance