ANALYSIS

The White House Is Moving Toward FDA-Style Vetting of AI Models. Here Is What That Means.

The Trump administration reverses course on AI oversight

For the first 18 months of the Trump administration's AI policy, the operating principle was straightforward: get out of the way. The White House revoked Biden-era safety requirements, cut funding for evaluation research, and positioned the United States as a jurisdiction where AI companies could build and ship without premarket approval. That posture is now changing, and the catalyst is a model that proved the old framework could not hold.

National Economic Council Director Kevin Hassett told reporters on May 6 that the White House is studying an executive order that would establish a formal evaluation process for frontier AI models before they reach the public. He compared the proposed system to the Food and Drug Administration's drug approval pipeline: models would need to be "proven safe" before they are "released to the wild." One or more executive orders on AI security could be signed within the next two weeks, according to The Hill.

Anthropic's Mythos forced the policy rethink

The immediate trigger is Anthropic's Mythos, a model whose autonomous vulnerability-discovery capabilities forced a rethinking of what "frontier risk" actually looks like. During internal testing, Mythos identified tens of thousands of previously unknown high-severity vulnerabilities across every major operating system and web browser. An earlier Anthropic model had found roughly 20 vulnerabilities in the Firefox browser; Mythos found nearly 300 in Firefox alone. Anthropic CEO Dario Amodei described the situation as a cybersecurity "moment of danger," and the company limited Mythos access to a small group of partners, including Apple, Amazon, JPMorgan Chase, and Palo Alto Networks, through a program called Project Glasswing.

CAISI has already tested more than 40 models, some before public release

The White House response has unfolded across two tracks. The first is voluntary but increasingly structured. On May 5, the Commerce Department announced that Google DeepMind, Microsoft, and xAI had signed agreements giving the government pre-release access to their frontier models for security evaluation. OpenAI and Anthropic already participate. The testing is managed by CAISI, the Center for AI Standards and Innovation inside NIST, which has now completed more than 40 evaluations, including assessments of models that have not yet been publicly released. CAISI's April evaluation of DeepSeek V4 Pro found that the open-weight model lagged behind the frontier by approximately eight months on key reasoning and cybersecurity benchmarks.

The second track is regulatory. The executive order under discussion would go beyond voluntary testing by creating a defined pathway that frontier models must pass through before deployment. The reviewing body would likely be CAISI itself, though the order's final language has not been settled. If signed, this would represent the first binding pre-release evaluation requirement for AI models imposed by any branch of the U.S. government.

Why cybersecurity succeeded where alignment arguments failed

The political dynamics are unusual. Pre-release vetting is the kind of regulatory intervention that the current administration has resisted in virtually every other technology domain. But the cybersecurity framing has changed the calculus. The vulnerabilities Mythos surfaced are not hypothetical harms or alignment concerns. They are exploitable flaws in software that runs critical infrastructure, financial systems, and defense networks. That makes the risk legible to national security officials in a way that earlier debates about AI safety were not.

Industry reaction has been cautiously supportive. The five companies now participating in voluntary testing (Anthropic, OpenAI, Google DeepMind, Microsoft, and xAI) collectively account for nearly all frontier model development. Their willingness to grant pre-release access suggests that the major labs see government evaluation as manageable, and potentially as a competitive moat that raises the barrier for smaller or foreign competitors.

The scope question: narrow cybersecurity tool or broad licensing regime

The open question is scope. An FDA-style framework applied narrowly to frontier models with demonstrated cybersecurity capabilities would affect only a handful of companies and a small number of releases per year. But the analogy Hassett chose carries broader implications. The FDA does not limit itself to the most dangerous drugs. It reviews all of them. If the executive order's language is broad enough to encompass general-purpose models, or if future administrations expand the mandate, the regime could evolve into something far more sweeping than what is being proposed today.

The Colorado AI Act, set to take effect on June 30, adds another layer. That law imposes obligations on both developers and deployers of high-risk AI systems, including mandatory impact assessments and algorithmic discrimination reviews. It is the first enforceable state-level AI law in the country, and it arrives just weeks after the federal government begins formalizing its own oversight apparatus. Whether federal and state frameworks will complement or conflict with each other is a question that no one in Washington or Denver has yet answered.

What is clear is that the regulatory environment for frontier AI has shifted in a matter of weeks. The administration that once defined its AI policy as "unleash" is now drafting rules that would require the industry's most capable models to pass government review before reaching a single customer. The Mythos moment did not create the pressure for oversight, but it gave that pressure a concrete, technical justification that proved difficult to argue against.

Santage is committed to independent, transparent journalism. This article is produced in accordance with Santage's Editorial Standards and aims to provide accurate and timely information. Readers are encouraged to verify information independently.