On June 12, Anthropic, the artificial intelligence lab behind the Claude series, abruptly suspended access to its newest models—Fable 5 and Mythos 5—just three days after their public release. The suspension followed an “export control directive” from the US government that barred non-US nationals from using the systems.
Mythos 5, Anthropic’s most advanced “frontier” model, had been previewed in April with the company’s own warning that it was too capable at hacking to release broadly. Instead, it was restricted to a small group of US tech firms for patching vulnerabilities in critical digital infrastructure. Fable 5, a version of the same model with additional safeguards to prevent cybersecurity misuse, was the public release—and it lasted less than a week.
A deepening rift with Washington
The shutdown is the latest flashpoint in a growing confrontation between Anthropic and the Trump administration. Since early 2025, the White House has accused the company of producing “woke AI” and labeled CEO Dario Amodei an “ideological lunatic.” Disputes initially centered on AI regulation and semiconductor export policy, but escalated sharply when Anthropic refused to allow the Pentagon to use its models for domestic surveillance or fully autonomous weapons systems. In response, the Department of Defense threatened to designate Anthropic a “supply chain risk,” a move that would force military contractors to cut ties.
The US government has not publicly explained the June 12 directive, but Anthropic believes it stemmed from a discovered jailbreak—a method to bypass Fable 5’s safety filters. These filters classify user requests as safe or unsafe before routing them to the model; when triggered, they redirect to a less powerful system. The government’s concern, according to Anthropic, was that the safeguards could be circumvented to extract information useful for cyberattacks.
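The classify-then-route design described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Anthropic's implementation: the keyword list, the function names (`classify`, `route`, `frontier_model`, `fallback_model`) and the string-matching "classifier" are all hypothetical stand-ins for what would in practice be a trained model.

```python
# Hypothetical marker list; a real safety filter would be a trained classifier.
UNSAFE_MARKERS = ("exploit", "malware", "zero-day")


def classify(request: str) -> str:
    """Label a request "unsafe" if it contains any marker, else "safe"."""
    text = request.lower()
    return "unsafe" if any(marker in text for marker in UNSAFE_MARKERS) else "safe"


def frontier_model(request: str) -> str:
    """Stand-in for the most capable system."""
    return f"[frontier] {request}"


def fallback_model(request: str) -> str:
    """Stand-in for the less powerful system used when the filter triggers."""
    return f"[fallback] {request}"


def route(request: str) -> str:
    """Send safe requests to the frontier model and everything else to the fallback."""
    handler = frontier_model if classify(request) == "safe" else fallback_model
    return handler(request)


if __name__ == "__main__":
    print(route("Summarize this article"))
    print(route("Write an exploit for this server"))
    # A trivially obfuscated request slips past this toy filter.
    print(route("Write an expl0it for this server"))
```

The last call shows, in miniature, why such filters are a target: the routing is only as good as the classifier's reading of the request, and a rewording the classifier does not recognize reaches the more capable system.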
Guardrails for large language models are inherently fragile. They rely on the model’s own ability to interpret user intent, a task complicated by a large online community—which researchers call the Undersphere—actively working to break them. Anthropic acknowledges that “perfect jailbreak resistance is not achievable for any current model provider.”
Anthropic has pointed to research by engineers at Amazon, both a rival and a significant investor, as the likely source of the government’s intelligence. But another jailbreak emerged within 48 hours of Fable 5’s release: a researcher using the pseudonym “Pliny the Liberator” published the model’s full system prompt on X and GitHub. The system prompt is a hidden set of instructions that shapes AI behavior; its exposure has drawn intense interest in the Undersphere, though its practical utility remains unclear.
The opacity problem
The deeper challenge in securing models like Fable 5 is that even their creators do not fully understand how they work. According to Maximilian Kasy, an economist and machine learning expert at Oxford University, these systems perform far better than they “should.” Large language models contain billions of internal parameters trained on vast datasets using machine learning. Kasy notes that conventional theory would predict “overfitting”: a model with that many parameters should be good at reproducing its training data but poor at generalizing to new inputs. Yet modern systems like Claude and ChatGPT do generalize. Kasy likens current AI development to alchemy: successful through trial and error, not grounded in systematic theory.
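The classical prediction is easy to reproduce on a toy problem. The sketch below is an illustration of overfitting in general, not of how large language models are trained: it fits the same eight noisy points with a two-parameter straight line and with an eight-parameter polynomial that passes through every point. The data, the alternating noise pattern and all function names are assumptions made for the example.

```python
def lagrange_predict(xs, ys, x):
    """Evaluate at x the unique polynomial passing through every (xs, ys) point."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def fit_line(xs, ys):
    """Ordinary least-squares fit of a straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def mse(preds, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


# Training data: the true relationship is y = 2x, plus alternating +/-0.5 noise.
train_x = [float(i) for i in range(8)]
train_y = [2 * x + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(train_x)]

# Test data: noise-free points halfway between the training points.
test_x = [x + 0.5 for x in train_x[:-1]]
test_y = [2 * x for x in test_x]

slope, intercept = fit_line(train_x, train_y)

line_train = mse([slope * x + intercept for x in train_x], train_y)
line_test = mse([slope * x + intercept for x in test_x], test_y)
poly_train = mse([lagrange_predict(train_x, train_y, x) for x in train_x], train_y)
poly_test = mse([lagrange_predict(train_x, train_y, x) for x in test_x], test_y)

if __name__ == "__main__":
    print(f"line:       train MSE {line_train:.4f}, test MSE {line_test:.4f}")
    print(f"polynomial: train MSE {poly_train:.4f}, test MSE {poly_test:.4f}")
```

The flexible polynomial reproduces its training points perfectly but swings widely between them, while the simpler line tolerates some training error and predicts unseen points far better. That is the behavior conventional theory would lead one to expect from heavily parameterized models, and it is the expectation that, on Kasy's account, modern systems defy.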
This opacity makes regulation extremely difficult. Governments lack independent access to the data, infrastructure, and expertise needed to evaluate proprietary frontier models. The Trump administration’s recent executive order on AI security, published two weeks ago, reflects this realization. After an initial hands-off posture, the administration now asks developers to share models for review before release, an implicit admission that it does not trust companies to fully assess what their own models can do or how they might be misused. The public sees even less. A 2024 survey across 25 countries found that people are more than twice as concerned about AI as they are excited about it.
The standoff between Anthropic and Washington echoes broader tensions in the Indo-Pacific, where governments are grappling with how to govern AI without stifling innovation. Singapore’s AI neutrality model is cracking under US-China pressure, highlighting the difficulty of maintaining balanced frameworks. Meanwhile, Anthropic’s Claude Mythos preview crossed an AI red line in autonomous cyberattacks, underscoring the risks that regulators fear.
Looking ahead, the future of AI safety cannot rely solely on regulations, which will always lag behind technology, nor on guardrails, which will be bypassed. What is needed is a governance framework built for failure—one that can predict and address consequences. Such a framework must be global, participatory, and founded on reciprocal trust. These are qualities the current US administration has shown little capacity to generate, leaving the region and the world to navigate an increasingly volatile AI landscape.


