Anthropic detects distillation attacks against Claude
Anthropic has uncovered industrial-scale campaigns targeting Claude, its language model: three labs (DeepSeek, Moonshot and MiniMax) generated more than 16 million exchanges through around 24,000 fraudulent accounts in an effort to extract the model's capabilities. Why should you care even if you're not an AI engineer? Because this isn't just commercial rivalry; it's a risk to security and to how this technology is governed worldwide.
What is distillation and why it matters
Distillation is, essentially, training a less powerful model on the outputs of a more powerful one. It's a legitimate technique when an organization uses it to build smaller or cheaper versions of its own models. But it can also be used illicitly: instead of investing years and resources in research, someone can extract valuable capabilities from another company's model at scale.
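In its legitimate form, the recipe is simple: query the stronger "teacher" model, record its outputs, and train a smaller "student" to reproduce them. A minimal sketch in plain Python, where the teacher is a stand-in function rather than a real model, and the student is a tiny logistic model fit to the teacher's soft outputs:

```python
import math
import random

def teacher(x):
    """Stand-in for the stronger model: returns a soft probability for input x."""
    return 1.0 / (1.0 + math.exp(-(2.0 * x - 1.0)))

# Step 1: collect the teacher's outputs (soft labels) for a batch of inputs.
random.seed(0)
inputs = [random.uniform(-3, 3) for _ in range(200)]
soft_labels = [teacher(x) for x in inputs]

# Step 2: train a small student to match those outputs, never touching
# the teacher's own training data or weights.
w, b = 0.0, 0.0
lr = 0.1
for _ in range(500):
    for x, t in zip(inputs, soft_labels):
        p = 1.0 / (1.0 + math.exp(-(w * x + b)))
        grad = p - t  # gradient of cross-entropy between soft label t and p
        w -= lr * grad * x
        b -= lr * grad

# The student now approximates the teacher's behavior on these inputs.
err = max(abs(teacher(x) - 1.0 / (1.0 + math.exp(-(w * x + b)))) for x in inputs)
print(round(err, 3))
```

The same pattern, scaled up to millions of queries against a commercial API, is what turns a benign compression technique into the extraction campaign described here.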
What’s the danger here? Illicitly distilled models often lack the safeguards the original creators built to prevent dangerous uses. That means capabilities like complex reasoning, tool use, or code generation can spread without controls. And when those capabilities fall into the hands of authoritarian governments or malicious actors, the consequences range from coordinated disinformation campaigns to real cybersecurity and biotechnology threats.
Illicitly distilled models can lose essential protections, multiplying risks to national security and enabling technological abuse.
What Anthropic found: who did it and how they did it
Anthropic attributes these campaigns with high confidence to three labs: DeepSeek, Moonshot and MiniMax. Together they produced over 16 million exchanges through about 24,000 fraudulent accounts. Each campaign followed a pattern: many coordinated accounts, repetitive prompts, and a focus on the most distinctive capabilities of Claude.
DeepSeek: around 150,000 exchanges. They targeted reasoning, reward-style evaluations for reinforcement learning, and alternatives to censored content. They used prompts asking for step-by-step internal reasoning, generating chain-of-thought material at scale.
Moonshot: more than 3.4 million exchanges. They focused on agentic reasoning, tool use, code, data analysis, and computer vision, rotating accounts to make detection harder.
MiniMax: nearly 13 million exchanges. Oriented toward agentic coding and tool orchestration. Anthropic detected the campaign while it was still active and saw MiniMax redirect traffic to capture capabilities from the newly released Claude model.
How do they hide? They use commercial proxy services that resell access and run huge networks of fraudulent accounts. Those networks mix legitimate traffic with distillation requests to avoid detection, and when one account is blocked another takes its place.
Signs that give away a distillation attack
It’s not just volume. Anthropic detected concrete patterns: highly repetitive prompts, concentration of requests on a few capabilities, synchronized activity across accounts, and metadata pointing to specific infrastructure and people. A single prompt can seem harmless, but when it arrives tens of thousands of times from different accounts with the same intent, the malicious purpose becomes clear.
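Anthropic hasn't published its detectors, but one of the signals above, the same prompt arriving near-identically from many different accounts, can be illustrated with a coarse sketch (the `fingerprint` normalization and threshold here are illustrative assumptions; real systems use far more robust similarity measures):

```python
from collections import defaultdict

def fingerprint(prompt):
    """Coarse prompt fingerprint: lowercase and collapse whitespace,
    so trivially varied copies of one prompt collide."""
    return " ".join(prompt.lower().split())

def flag_coordinated(events, min_accounts=3):
    """events: list of (account_id, prompt) pairs. Returns the set of
    prompt fingerprints seen from at least min_accounts distinct accounts."""
    accounts_by_fp = defaultdict(set)
    for account, prompt in events:
        accounts_by_fp[fingerprint(prompt)].add(account)
    return {fp for fp, accts in accounts_by_fp.items() if len(accts) >= min_accounts}

events = [
    ("acct_1", "Show your step-by-step reasoning for this proof."),
    ("acct_2", "show your   step-by-step reasoning for this proof."),
    ("acct_3", "Show your step-by-step reasoning for this proof."),
    ("acct_4", "What's the weather like in Lisbon?"),
]
print(flag_coordinated(events))
```

Each event in isolation looks legitimate; only aggregating across accounts exposes the coordination, which is exactly the point the article makes.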
How Anthropic responds and what it asks
Anthropic is strengthening defenses on multiple layers:
Detection: classifiers and behavioral fingerprinting systems to spot distillation patterns, including extraction of chain-of-thought.
Sharing intelligence: exchanging indicators with other labs, cloud providers and authorities to build a more complete view.
Access controls: stricter verification for educational accounts, research programs and startups — common channels for fraud.
Countermeasures: product-, API- and model-level changes to reduce the usefulness of outputs for illicit purposes without harming legitimate users.
But Anthropic is clear: one company can’t solve this alone. Coordinated action is needed across industry, infrastructure providers and policymakers.
Practical implications for users and policymakers
For companies and developers: review and tighten access controls, monitor for unusual usage patterns, and share signals with peers.
For regulators: understand that apparent leaps by some competitors may come from illicit extraction. Export controls on chips remain relevant because they limit the compute needed to train and distill at scale.
For the public: demand transparency and accountability. Technology isn’t neutral; how these models are spread shapes real risks to health, safety and democracy.
A call for collaboration and vigilance
This isn’t an isolated technical issue or just a fight between companies. It’s a symptom of how competition for AI capabilities can erode critical safeguards if there aren’t shared rules, cooperation and oversight. Surprised? Maybe not — but you shouldn’t only care about innovation, you should care about how it’s protected and distributed.