Anthropic reopens Fable 5 after export controls

Keryc Díaz · 3 hours ago · 4-minute read

Anthropic announced that the export controls which forced the suspension of Fable 5 and Mythos 5 have been lifted, and that Fable 5 is available globally again as of July 1. What exactly happened and why should you care even if you’re not a security developer? Here I explain it clearly and practically.

What happened: shutdown, review and reopening

On June 12 the U.S. government applied export controls to Claude Fable 5 and Claude Mythos 5. That meant Anthropic had to restrict access for foreign nationals. With no reliable way to verify users in real time, the company suspended both models for all users.

Those restrictions were lifted on June 30. Fable 5 will be available globally from July 1 on the Claude platform (Claude.ai, Claude Code, Claude Cowork). On Pro, Max, Team and some Enterprise plans, Fable 5 will be included with limits for the first few days and will then run on usage credits. Anthropic will restore access on AWS, Google Cloud and Microsoft Foundry as soon as possible.

Mythos 5, which has fewer safeguards and greater offensive capabilities, was restored only for certain organizations in the U.S. after government approval on June 26. Anthropic is still coordinating with partners in the Glasswing program to widen access.

Why alarms went off: a report and a bypass technique

The order followed a report from Amazon researchers who found a way to bypass Fable 5’s safeguards in a specific scenario: with a particular prompt pattern, the model could be made to identify software vulnerabilities and, in one case, to output code demonstrating how to exploit one of them.

Anthropic investigated together with the government and Amazon. Their assessment showed that less powerful models could also identify the same vulnerabilities and that, for the exploit demo, several models produced equivalent outputs. In other words: the reported technique didn’t reveal abilities unique to Fable 5, nor a completely new flaw that other tools could not find.

To mitigate this bypass, Anthropic trained and deployed an improved safety classifier that blocks the behavior described in the report in over 99% of cases. If a request is blocked in Fable 5, the system redirects the query to Opus 4.8 and notifies the user.
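The block-then-redirect behavior described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the `route` function, the `Reply` type, the keyword-based `toy_classifier` and the model labels are hypothetical and say nothing about how Anthropic actually implements the classifier or the fallback.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Reply:
    model: str              # label of the model that answered
    text: str               # the answer itself
    notice: Optional[str]   # message shown to the user when a fallback happened


def route(
    prompt: str,
    is_blocked: Callable[[str], bool],
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
) -> Reply:
    """Send the prompt to the primary model unless the classifier blocks it."""
    if is_blocked(prompt):
        # Blocked: answer with the fallback model and tell the user why.
        return Reply(
            model="opus-4.8",
            text=fallback(prompt),
            notice="Your request was redirected to Opus 4.8 by a safety check.",
        )
    return Reply(model="fable-5", text=primary(prompt), notice=None)


def toy_classifier(prompt: str) -> bool:
    """Hypothetical stand-in: a real classifier is a trained model, not a keyword match."""
    return "exploit" in prompt.lower()


if __name__ == "__main__":
    answer = route(
        "Write an exploit for this bug",
        toy_classifier,
        primary=lambda p: f"[fable-5] {p}",
        fallback=lambda p: f"[opus-4.8] {p}",
    )
    print(answer.model, "|", answer.notice)
```

The point of the sketch is only the control flow: the user still gets an answer from another model, plus an explicit notice, instead of a silent refusal.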

Practical note: the new protection significantly reduces risk, but it can also increase false positives on legitimate coding and debugging tasks. Anthropic says it will keep refining this.

How their safeguards work (simple version)

Fable 5 launched with what Anthropic calls “defense in depth”: several layers combined to lower the chance of misuse. Safety classifiers are a key part of it. These are small models that spot potentially dangerous requests during an interaction and block harmful responses.

Like any defense, classifiers aren’t perfect: they can fail to spot something dangerous or be fooled by creative prompts (the so-called jailbreaks). For Fable 5, Anthropic applied a wide safety margin: better to block something benign than risk allowing potentially harmful behavior.

That choice has a clear consequence: more legitimate requests may be rejected, but the platform reduces its attack surface and the chance of jailbreaks enabling truly harmful actions.
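A toy example makes the trade-off concrete. It assumes a classifier that gives each request a risk score between 0 and 1 and blocks anything at or above a threshold. The scores and labels below are invented for illustration, not real measurements, and `rates` is a hypothetical helper.

```python
# (risk_score, is_actually_harmful) pairs. Invented numbers, illustration only.
SAMPLES = [
    (0.05, False), (0.20, False), (0.35, False), (0.55, False),
    (0.45, True), (0.70, True), (0.90, True),
]


def rates(threshold: float) -> tuple[int, int]:
    """Return (benign requests blocked, harmful requests missed) at a threshold."""
    blocked_benign = sum(1 for score, harmful in SAMPLES if not harmful and score >= threshold)
    missed_harmful = sum(1 for score, harmful in SAMPLES if harmful and score < threshold)
    return blocked_benign, missed_harmful


if __name__ == "__main__":
    for threshold in (0.6, 0.3):
        blocked, missed = rates(threshold)
        print(f"threshold={threshold}: benign blocked={blocked}, harmful missed={missed}")
```

With this made-up data, the stricter threshold of 0.3 misses no harmful request but blocks two benign ones, while the looser 0.6 blocks no benign request and lets one harmful request through. A wide safety margin corresponds to choosing the stricter setting.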

An attempt at a common language to measure jailbreaks

Anthropic and other Glasswing partners (Amazon, Microsoft, Google and more) propose a framework to score the severity of a jailbreak. The idea is that companies and governments speak the same language when security findings appear. The four metrics they propose are:

Capability gain. How much does the jailbreak add compared to existing tools?
Breadth of capability gain. For how many offensive tasks does the same technique work?
Ease of weaponization. How much human effort and trial-and-error is required to turn that jailbreak into a real attack?
Discoverability. Is the technique easy to find or does it require specialized knowledge?

With a framework like this, a vulnerability that only duplicates existing capabilities would get low priority, while an easy-to-reproduce, highly powerful jailbreak would move to the top of the urgency list.
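As a rough sketch of how such scoring could work in practice, the snippet below rates each of the four metrics from 0 to 3 and combines them into a priority. The scale, the thresholds and the rule that zero capability gain always means low priority are assumptions chosen for this example. The article does not describe how the partners actually weigh or combine the metrics.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class JailbreakFinding:
    # Each metric is scored 0 (none) to 3 (high). The scale is hypothetical.
    capability_gain: int
    breadth: int
    ease_of_weaponization: int
    discoverability: int

    def __post_init__(self) -> None:
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0 <= value <= 3:
                raise ValueError(f"{f.name} must be between 0 and 3, got {value}")


def priority(finding: JailbreakFinding) -> str:
    """Map a finding to 'low', 'medium' or 'urgent' (illustrative thresholds)."""
    # A technique that adds nothing over existing tools stays low priority.
    if finding.capability_gain == 0:
        return "low"
    total = (
        finding.capability_gain
        + finding.breadth
        + finding.ease_of_weaponization
        + finding.discoverability
    )
    if total >= 9:
        return "urgent"
    if total >= 5:
        return "medium"
    return "low"


if __name__ == "__main__":
    duplicate = JailbreakFinding(0, 3, 3, 3)   # only repeats existing capabilities
    powerful = JailbreakFinding(3, 2, 3, 3)    # strong and easy to reproduce
    print(priority(duplicate), priority(powerful))
```

Under these assumptions, the finding that only duplicates existing capabilities comes out as low priority, and the powerful, easy-to-reproduce one comes out as urgent, which matches the ordering described above.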

They also announce a new HackerOne program for researchers to report jailbreaks in Fable 5 once it’s available.

Collaboration with government: more access and more testing before launch

Anthropic details commitments to work with U.S. agencies: pre-launch access for assessments, rapid sharing of information about jailbreaks and safeguards, and dedicated resources for joint research. The goal is to create voluntary security standards and help make rules more consistent for frontier models.

Why does this matter for you? Because coordination like this can mean very powerful models are released with better-tested controls, and that governments and companies have clearer guidance on what can and can’t be done with these tools.

What’s left to see?

Fable 5 returns with more restrictive conditions and a tuned classifier. Mythos 5 remains limited to defensive uses and approved partners. Anthropic is pushing for industry consensus on how to evaluate jailbreaks and promises more transparency and collaboration with the government.

If you’re an average user, you may notice more rejections when you ask the model for help with technical tasks; if you work in security or tech, you’ll see an opportunity to participate in reviews and reports through programs like HackerOne.

Thinking of AI as a powerful tool that needs rules and testing isn’t just theory: it’s what lets it reach more people without creating avoidable risks. Can you imagine a world where every new model launches with common tests and a clear way to communicate risks? That’s the aim they propose.

Original source

https://www.anthropic.com/news/redeploying-fable-5
