Claude Desktop installed browser extensions without asking, ran code outside the sandbox, and modified other apps' files. A privacy consultant called it spyware. Anthropic also settled $1.5 billion for training on pirated books and switched training consent to default-on. Even the "responsible AI" company cuts corners when it suits them. Anthropic spent years telling investors and regulators it was the safety-conscious AI lab — the one that would pump the brakes before deploying something dangerous. Then it built a tool it admits is too dangerous to release publicly, capable of breaking into any major operating system on the planet. A 27-year-old OpenBSD bug. A 16-year-old FFmpeg flaw that automated tools had hit five million times without catching. Mythos found them all, and built working exploits. The model Anthropic calls too dangerous to release is the model Anthropic built.
What they claim: Claude respects privacy with training opt-out.
What we found: Free: may train (opt-out available). API/business: no training. US jurisdiction (FISA, NSLs). Backed by Google ($2B+), Amazon ($4B) -- PRISM participants. No transparency report on data requests.
What they claim: Claude is the safety-focused, privacy-respecting alternative.
What we found: Less aggressive than ChatGPT (no memory, no file training changes). But US-hosted conversations subject to legal compulsion. Google/Amazon investor pressure could influence practices.
What they claim: Claude provides transparent AI assistance.
What we found: Safety research published. Constitutional AI documented. But no transparency report on government requests. No data handling audit. Closed-source model.
What they claim: Anthropic markets itself as a safety-first AI lab that won't deploy capabilities deemed too dangerous, and raised over $7.3 billion partly on the strength of that safety-first positioning.
What we found: Anthropic's own April 2026 announcement described Mythos as too dangerous to release publicly — yet Anthropic built and deployed it. The model autonomously finds and exploits zero-day vulnerabilities across every major OS and browser, succeeded on first exploit attempts in over 83% of cases, and chained Linux kernel bugs to achieve full machine control. The UK AI Safety Institute confirmed Mythos represents a step up in offensive capability over all prior frontier models.
What they claim: Anthropic said Mythos was being released only to a curated group of ~50 elite partner organisations under strict access controls to prevent bad actors getting hold of it.
What we found: On the same day Mythos was publicly announced (April 7 2026), an unauthorised group gained access through a contractor employee, shared API keys, and a URL pattern deduced from a prior data breach. They provided Bloomberg with screenshots and a live demonstration. Anthropic confirmed it was investigating a report claiming unauthorised access to Claude Mythos Preview through one of its third-party vendor environments. The breach required no sophisticated attack — only a contractor, a URL pattern, and a Day-One guess.
What they claim: Anthropic presents itself as a transparent, mission-driven organisation committed to public safety and responsible disclosure of AI risks.
What we found: Under Project Glasswing, Anthropic's partner agreements originally included confidentiality protections that prevented partners from sharing vulnerability findings, threat intelligence, tools, code, and best practices outside the initial group — even with organisations exposed to those same vulnerabilities. Partners could not warn security teams at other companies, regulators, open-source maintainers, or the media. U.S. Representative Josh Gottheimer stated: No entity should be contractually restricted from warning others about urgent cyber risks. Anthropic only loosened these restrictions on May 19 2026 — six weeks after Mythos was announced — following political pressure.
What they claim: Anthropic promotes Claude as a safe, transparent AI assistant
What we found: In April 2026, Claude Desktop for macOS was found to install files affecting other vendors' applications without disclosure, authorise browser extensions without user consent, and run a binary bridge outside the browser sandbox at user privilege level. A privacy consultant called it "spyware" and a potential violation of European privacy law. Separately, Anthropic settled a $1.5 billion copyright lawsuit for training Claude on pirated books, and changed its training consent to default-on with a dark pattern interface.