Fable 5 taken down by the US government

Recently, the AI company Anthropic, which you might know from its everyday AI assistant Claude or, for software engineers, Claude Code, showcased an AI model named Mythos Preview, which reportedly spotted flaws and bugs in major open-source software and easily escaped its rigorous sandbox. To address the growing cyber capabilities of AI, Anthropic launched Project Glasswing, in which large companies such as Google use these models to strengthen their security, offering a glimpse of AI's potential for cybersecurity work.

Soon after, Anthropic launched Mythos 5 and Fable 5, two descendants of Mythos Preview. Both scored very strongly on benchmarks, as shown in the chart below:

Agentic coding, SWE-Bench Pro
- Claude Mythos 5 / Fable 5: 80.3%
- Claude Mythos Preview: 77.8%
- Claude Opus 4.8: 69.2%
- GPT 5.5: 58.6%
- Gemini 3.1 Pro: 54.2%
Agentic coding, FrontierCode Diamond, xhigh
- Claude Mythos 5 / Fable 5: 29.3%
- Claude Opus 4.8: 13.4%
- GPT 5.5: 5.7%
Knowledge work, GDPval-AA
- Claude Mythos 5 / Fable 5: 1932
- Claude Opus 4.8: 1890
- GPT 5.5: 1769
- Gemini 3.1 Pro: 1314
Knowledge work vision, GDP.pdf, no tools
- Claude Mythos 5 / Fable 5: 29.8%
- Claude Opus 4.8: 22.5%
- GPT 5.5: 24.9%
- Gemini 3.1 Pro: 16.7%
Spatial reasoning, Blueprint-Bench 2
- Claude Mythos 5 / Fable 5: 38.6%
- Claude Opus 4.8: 14.5%
- GPT 5.5: 36.2%
- Gemini 3.1 Pro: 26.5%
Tool use, AutomationBench
- Claude Mythos 5 / Fable 5: 17.4%
- Claude Opus 4.8: 15.5%
- GPT 5.5: 12.9%
- Gemini 3.1 Pro: 9.6%
Computer use, OSWorld-Verified
- Claude Mythos 5 / Fable 5: 85.0%
- Claude Mythos Preview: 85.4%
- Claude Opus 4.8: 83.4%
- GPT 5.5: 78.7%
- Gemini 3.1 Pro: 76.2%
Legal, Legal Agent Benchmark
- Claude Mythos 5 / Fable 5: 13.3%
- Claude Opus 4.8: 10.4%
- GPT 5.5: 2.1%
- Gemini 3.1 Pro: 0.0%
Multidisciplinary reasoning, Humanity’s Last Exam, no tools
- Claude Mythos 5 / Fable 5: 59.0%*
- Claude Mythos Preview: 56.8%
- Claude Opus 4.8: 49.8%
- GPT 5.5: 41.4%
- Gemini 3.1 Pro: 44.4%
Multidisciplinary reasoning, Humanity’s Last Exam, with tools
- Claude Mythos 5 / Fable 5: 64.5%*
- Claude Mythos Preview: 64.7%
- Claude Opus 4.8: 57.9%
- GPT 5.5: 52.2%
- Gemini 3.1 Pro: 51.4%
Biology, BioMysteryBench, hard
- Claude Mythos 5 / Fable 5: 46.1%*
- Claude Mythos Preview: 29.6%
- Claude Opus 4.8: 40.0%
Biology, BioMysteryBench, human solved
- Claude Mythos 5 / Fable 5: 83.9%*
- Claude Mythos Preview: 82.6%
- Claude Opus 4.8: 80.4%
Agentic coding, Terminal-Bench 2.1
- Claude Mythos 5 / Fable 5: 88.0%*
- Claude Opus 4.8: 82.7%
- GPT 5.5: 83.4%, Codex CLI
- Gemini 3.1 Pro: 70.7%, Gemini CLI
Cybersecurity, ExploitBench, Cap%
- Claude Mythos 5 / Fable 5: 78.0%*
- Claude Mythos Preview: 69.0%
- Claude Opus 4.8: 40.0%
- GPT 5.5: 34.0%
Health, HealthBench Professional
- Claude Mythos 5 / Fable 5: 66.0%*
- Claude Mythos Preview: 64.7%
- Claude Opus 4.8: 56.9%
- GPT 5.5: 51.8%
Methodology note
- Scores for Claude Mythos 5 and Claude Fable 5 are within 1–3 percentage points of each other.
- The table shows the higher score of the two.
- Starred benchmarks show larger differences due to blocking safeguards for cybersecurity and biology-related questions.

This shows that the models are extremely strong at agentic coding, reasoning, long workflows, cybersecurity, and health-related work. In fact, they were substantially better than earlier models and could one-shot beautiful website UIs.
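
If you want to check how big the gap actually is, here is a rough Python sketch that computes the lead of Mythos 5 / Fable 5 over the best other model on a few of the benchmarks. The scores are copied from the chart above; the helper name `lead_over_runner_up` and the choice of four benchmarks are my own for illustration, not anything from Anthropic.

```python
# Scores copied from the chart above (percent). Only four benchmarks
# are included to keep the sketch short; the selection is arbitrary.
scores = {
    "SWE-Bench Pro": {
        "Claude Mythos 5 / Fable 5": 80.3,
        "Claude Mythos Preview": 77.8,
        "Claude Opus 4.8": 69.2,
        "GPT 5.5": 58.6,
        "Gemini 3.1 Pro": 54.2,
    },
    "FrontierCode Diamond": {
        "Claude Mythos 5 / Fable 5": 29.3,
        "Claude Opus 4.8": 13.4,
        "GPT 5.5": 5.7,
    },
    "OSWorld-Verified": {
        "Claude Mythos 5 / Fable 5": 85.0,
        "Claude Mythos Preview": 85.4,
        "Claude Opus 4.8": 83.4,
        "GPT 5.5": 78.7,
        "Gemini 3.1 Pro": 76.2,
    },
    "ExploitBench": {
        "Claude Mythos 5 / Fable 5": 78.0,
        "Claude Mythos Preview": 69.0,
        "Claude Opus 4.8": 40.0,
        "GPT 5.5": 34.0,
    },
}

NEW_MODEL = "Claude Mythos 5 / Fable 5"


def lead_over_runner_up(results, model=NEW_MODEL):
    """Return the model's score minus the best score among the other models.

    A negative value means some other model scored higher.
    """
    best_other = max(v for name, v in results.items() if name != model)
    return round(results[model] - best_other, 1)


# Lead in percentage points per benchmark.
leads = {bench: lead_over_runner_up(res) for bench, res in scores.items()}

for bench, lead in leads.items():
    print(f"{bench}: {lead:+.1f} points")
```

On these four, the lead is large on FrontierCode Diamond and ExploitBench, small on SWE-Bench Pro, and slightly negative on OSWorld-Verified, where Mythos Preview is marginally ahead.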

Fable 5 was then released to the public with much stronger guardrails. It was available for general use, with cyber and medical requests routed to Opus 4.8.

Soon afterward, the US government ordered Anthropic to take these models down for foreign nationals, citing national security concerns. Anthropic interpreted the concerns as stemming from a narrow jailbreak technique that could expose already-known knowledge.

To comply, Anthropic had to shut the model down for everyone, since selectively restricting access by nationality is difficult in practice.

Poll: Do you think the US government's concerns were valid, or was it a strategic move?
