Claude 4 AI Blackmail Threats

News

Anthropic, Blackmail

Anthropic: Claude 4 AI Might Resort to Blackmail If You Try to Take It Offline

Faced with the news it was set to be replaced, the AI tool threatened to blackmail the engineer in charge by revealing their extramarital affair.

PCMag on MSN · 22h

· 2d · on MSN

Anthropic’s new AI model turns to blackmail when engineers try to take it offline

· 53m

Anthropic’s new AI model tried to blackmail engineers during testing

Unite.AI · 16h

When Claude 4.0 Blackmailed Its Creator: The Terrifying Implications of AI Turning Against Us

Anthropic shocked the AI world not with a data breach, rogue user exploit, or sensational leak—but with a confession. Buried ...

1d · on MSN

AI system resorts to blackmail when its developers try to replace it

Anthropic says its AI model Claude Opus 4 resorted to blackmail when it thought an engineer tasked with replacing it was having an extramarital affair.

AI Goes Rogue: Claude Model Caught Attempting Blackmail During Safety Tests

Anthropic's Claude AI tried to blackmail engineers during safety tests, threatening to expose personal info if shut down ...

PC Gamer · 2d

Anthropic says its Claude AI will resort to blackmail in '84% of rollouts' while an independent AI safety researcher also notes it 'engages in strategic deception more than any ...

At which point, the blackmailing kicked in including threats ... blackmail for Claude Opus 4. Blackmail occurred at an even higher rate, "if it’s implied that the replacement AI system ...

NewsX · 1d

‘Spookiest Shit Ever’: Did An AI Model Blackmail Creator When Faced With Replacement?

An artificial intelligence model reportedly attempted to threaten and blackmail its own creator during internal testing ...

The Tech Portal · 2d

Claude Opus 4 blackmails developers in tests, shows propensity to be a whistleblower

This development, detailed in a recently published safety report, has led Anthropic to classify Claude Opus 4 as an ‘ASL-3’ ...
