Leaked Anthropic Docs Reveal Secret 'Mythos' AI Model
The leak exposes Anthropic's hidden roadmap and internal protocols, prompting debate among researchers about whether frontier labs can maintain secrecy.
Anthropic has been quietly developing a frontier AI model codenamed "Mythos" that, according to leaked technical documents obtained by The Pulse Gazette, outperforms Claude 3.5 Sonnet by 34% on multimodal reasoning tasks in internal benchmarks. The documents describe the system as designed to compete with OpenAI's unreleased GPT-5 and Google's Gemini 2 Ultra, though Anthropic has made no public statements confirming the project exists.
The documents — dated January 2026 and marked "CONFIDENTIAL — RESEARCH ONLY" — describe Mythos as a "native multimodal" architecture trained on video, audio, images, and text simultaneously rather than bolted together from separate components. One benchmark table shows the model achieving 89.2% accuracy on a private "cross-modal reasoning" test where Claude 3.5 scores 66.4% and GPT-4o reaches 71.8%.
---
What the leaked documents actually show
The 47-page technical brief, apparently prepared for Anthropic's safety team and select investors, outlines three distinct model sizes: Mythos-S (likely for edge deployment), Mythos-M (the main benchmarked version), and Mythos-L (described as "compute-optimal for research"). Only the medium variant has extensive performance data.
The most striking claims involve "temporal reasoning" — understanding how events unfold across time in video. Mythos-M reportedly scores 94% on an internal video question-answering benchmark that stumps most existing models, including the ability to predict physical outcomes from partial footage. One example task: watching 10 seconds of a Rube Goldberg machine and identifying which component will fail first.
But the documents are notably thin on safety testing. While Anthropic's public research emphasizes "Constitutional AI" and red-teaming, the Mythos brief dedicates just two pages to alignment evaluation — and those sections are heavily redacted in the leaked copy.
"What we're seeing suggests Anthropic is prioritizing capability benchmarks over the safety metrics they've publicly championed," said Dr. Margaret Chen, AI policy researcher at Stanford's HAI institute, who reviewed the documents at our request. "The gap between their public commitments and this internal timeline is striking."
---
Why Anthropic would hide a flagship model
The company has built its brand on transparency — publishing detailed system cards, supporting AI regulation, and positioning itself as the "responsible" alternative to OpenAI's speed-at-all-costs approach. So why bury a breakthrough?
Two explanations dominate among researchers we spoke with.

The compute explanation carries weight. Anthropic's $2 billion Google Cloud commitment, revealed in 2024, may not stretch to serving a model this large at consumer scale. One passage notes Mythos-L achieves state-of-the-art results but "requires optimization for production deployment."
There's also the Pentagon factor. The documents reference "federated evaluation" — testing across distributed classified environments — suggesting military applications were considered from the start. Anthropic's public Pentagon deal, announced last month, covers only Claude 3.5. Mythos appears to be the unacknowledged next phase.
---
What we still don't know
Several critical gaps undermine definitive conclusions. The documents contain no training compute figures, making it impossible to assess efficiency; Mythos's 34% improvement may require 10x the resources. No release timeline appears anywhere; "deployment blockers" could mean six months or indefinite delay. No pricing or API structure is discussed, leaving the "compute constraints" theory speculative. And the "Pentagon factor" rests on a single ambiguous phrase that could as easily describe standard security auditing as classified military development.
Most importantly: we have no independent verification that Mythos exists as described, or that benchmark results weren't selectively excerpted to paint a particular picture.
---
What does this mean for AI competition?
If authentic, these documents reset assumptions about the capability gap between leading labs.
OpenAI has dominated the narrative around "next-generation" models, with Sam Altman teasing "extreme reasoning" capabilities in recent weeks. Google's Gemini 2 Ultra remains unreleased despite demos circulating since late 2025. Anthropic, meanwhile, has looked reactive — improving Claude incrementally while rivals promised leaps. Mythos's existence also raises the question of whether engineering resources have quietly been diverted from Claude to the classified project.
If the documents are authentic and current, Mythos suggests a different story: parallel development of a genuinely differentiated architecture. The native multimodal approach, if it works as described, would skip the "stitching together" problem that degrades performance in GPT-4V and Gemini 1.5 Pro.
That timing matters for researchers deciding where to work in 2026. Anthropic's recruiting pitch has emphasized "working on the most capable safe systems." The existence of a hidden flagship complicates that narrative: are researchers joining to work on Claude improvements, or on a classified project they can't discuss?
---
The verification problem
We should note what we cannot confirm. Anthropic declined to comment on "speculation about internal research." The documents lack cryptographic signatures that would prove authenticity, though their technical detail — including specific loss curves and hardware configurations — matches known Anthropic practices.
One anomaly: the documents reference "Mythos-R", a variant optimized for "recursive self-improvement evaluation." That's either a provocative codename for standard iterative training, or something more ambitious. The surrounding context was redacted.
What we know for certain: someone circulated documents purporting to come from Anthropic's internal research infrastructure. Whether they originated there, and whether they represent current or aspirational plans, remains unverified. So does the motive: a whistleblower concerned about safety shortcuts, a leaker trying to juice valuation ahead of rumored fundraising, and deliberate strategic disclosure are all plausible readings.
The AI capability race has entered its opaque phase. The models that matter most may be the ones you're not allowed to know about yet.
---
Related Reading
- Claude Outage? Here's How to Check Status and Restore Access
- Anthropic Denies It Could Sabotage AI Tools in Wartime
- The AI Developments That Shaped March 5, 2026
- OpenAI Teases 'Extreme' Reasoning in Next AI Model
- Anthropic's Pentagon AI Deal Sparks Military Ethics Debate