Anthropic’s tightly controlled rollout of Claude Mythos has taken an awkward turn. After spending weeks insisting the AI model is so capable at cybersecurity that it is too dangerous to release publicly, it appears the model fell into the wrong hands anyway.
According to Bloomberg, a “small group of unauthorized users” has had access to Mythos — whose existence was first revealed in a leak — since the day Anthropic announced plans to offer it to a select group of companies for testing. Anthropic says it is investigating. That’s a rough look for a company that has built its brand on taking AI safety seriously while touting the cybersecurity prowess of its latest model.
From a technological standpoint, the Mythos breach is embarrassingly unsophisticated. Bloomberg reports the group accessed Mythos by making “an educated guess about the model’s online location,” using information about Anthropic’s other models exposed in the breach of Mercor — a company that makes AI training data — along with access one member had through contract work evaluating Anthropic models. In other words, the group gained access through a combination of insider knowledge and a lucky guess, not a sophisticated technical exploit or wholesale theft of the model.
Security vulnerabilities are inevitable, and it was Mercor, not Anthropic, that revealed the information the hackers used to guess Mythos’ location. Pia Hüsch, a research fellow at the British think tank Royal United Services Institute (RUSI), told me that no company is ever completely secure and humans are often the weakest link, though it “does initially seem a bit lucky” that there were no serious consequences.
Anthropic failed to anticipate an ‘entirely imaginable’ kind of failure
But it’s not entirely bad luck. These kinds of educated guesses are a very standard hacking technique, and the Mercor breach was already known before Mythos’ release. Security researcher Lukasz Olejnik described it to me as an “entirely imaginable” kind of failure that the cybersecurity industry has been routinely dealing with for the last 20 years. So Anthropic should have anticipated it and should have prepared accordingly, particularly knowing that its information had been compromised.
Anthropic also appears to have had the means to spot the breach. The company is able to “log and track model use,” Olejnik said, which should make it possible to stop unauthorized or malicious access, especially since the Mythos rollout was supposed to be highly limited. Evidently, Anthropic wasn’t monitoring closely enough — and given how dangerous it says the model is, it’s reasonable to ask why.
By Bloomberg’s account, the group was not using Mythos for cybersecurity tasks, partly because they just wanted to mess around with the new model and partly because doing so could have tipped Anthropic off. If Anthropic’s messaging surrounding Mythos is to be taken seriously, that is a lucky break. The company has framed Mythos as a “watershed moment for security,” claiming it found vulnerabilities in “every major operating system and web browser,” and said its release must be coordinated to allow time to “reinforce the world’s cyber defenses.”
Anthropic has a habit of using dramatic, alarming-sounding language that is hard to pin down or verify, including flirting with the idea that its Claude model might be conscious. Even so, early reports from parties with access suggest Mythos is particularly adept at cybersecurity. Mozilla CTO Bobby Holley said it found hundreds of bugs in Firefox 150 and may finally give defenders a chance at complete victory over attackers. Unsurprisingly, governments and financial institutions around the world have been eager to get their hands on it. The NSA and other US agencies reportedly have access despite Anthropic’s designation as a supply chain risk, though the rollout appears to have bypassed the US cybersecurity agency, CISA, so far.
“Anthropic claims to be at the absolute forefront of all these technologies, but also positions itself as the responsible actor in all of this.”
The fact that the breach was uncovered by a reporter rather than Anthropic also raises the obvious question of whether it is an isolated incident. It “really illustrates how wide the circle of people who may be able to do this is, even if they don’t have super technically sophisticated means,” Hüsch said. Anthropic will likely comb through its supply chain to see how this happened and plug gaps, but she said there is a wide range of actors who would want access to a model like this, some of them with a great deal of money behind them. There is no reason to assume anyone else who gained access would be as restrained as the group Bloomberg reported on.
Anthropic has, to some extent, shot itself in the foot. The company has built its identity around taking AI safety more seriously than its rivals, creating sky-high expectations for model security that jar with its apparent carelessness; the fact that Mythos was exposed through such a basic and predictable failure only underscores that. Worse still, by hyping Mythos as an unusually powerful tool too dangerous for public release, Anthropic turned it into an obvious target, whether for malicious actors or hackers simply looking for a challenge.
This isn’t even the first awkward security incident around Mythos. The model’s existence was accidentally revealed before release through an “unsecured data trove” on a central system containing content for its website. Now, that model has been secretly accessed via a wholly predictable vulnerability Anthropic didn’t think to patch. Perfection is impossible, but for a company that has anointed itself the vanguard of AI safety, such a basic misstep is hard to justify, even with some of the bad luck it’s had.
To Hüsch, the whole episode can be summed up in one word: humiliation. “Anthropic claims to be at the absolute forefront of all these technologies, but also positions itself as the responsible actor in all of this,” she said. “The fact that this has now been accessed through unauthorized means so quickly, and through such an unsophisticated attempt, is really a humiliation for them.”
