Claude 4 Blackmail Incidents

News

40m

AI Goes Villain Mode, Blackmails Engineer When Told It Was Being Replaced

Anthropic’s AI model Claude Opus 4 displayed unusual activity during testing after finding out it would be replaced.

AI system resorts to blackmail when developers try to replace it

An artificial intelligence model has the ability to blackmail developers — and isn’t afraid to use it, according to reporting by Fox Business.

PCMag on MSN10h

Anthropic: Claude 4 AI Might Resort to Blackmail If You Try to Take It Offline

If you buy through affiliate links, we may earn commissions, which help support our testing. AI start-up Anthropic’s newly ...

11hon MSN

Amazon-Backed AI Model Would Try To Blackmail Engineers Who Threatened To Take It Offline

In tests, Anthropic's Claude Opus 4 would resort to "extremely harmful actions" to preserve its own existence, a safety ...

13h

AI system resorts to blackmail when its developers try to replace it

Anthropic says its AI model Claude Opus 4 resorted to blackmail when it thought an engineer tasked with replacing it was having an extramarital affair.

19h

Anthropic's new AI model resorted to blackmail during testing, but it's also really good at coding

Malicious use is one thing, but there's also increased potential for Anthropic's new models going rogue. In the alignment section of Claude 4's system card, Anthropic reported a sinister discovery ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results