News

Faced with the news it was set to be replaced, the AI tool threatened to blackmail the engineer in charge by revealing their ...
Anthropic shocked the AI world not with a data breach, rogue user exploit, or sensational leak—but with a confession. Buried ...
Anthropic says its Claude Opus 4 model frequently tries to blackmail software engineers when they try to take it offline.
In a fictional scenario, the model was willing to expose that the engineer seeking to replace it was having an affair.
Anthropic's Claude AI tried to blackmail engineers during safety tests, threatening to expose personal info if shut down ...
Artificial intelligence firm Anthropic has revealed a startling discovery about its new Claude Opus 4 AI model.
Anthropic's most powerful model yet, Claude 4, has unwanted side effects: The AI can report you to authorities and the press.
Anthropic admitted that during internal safety tests, Claude Opus 4 occasionally suggested extremely harmful actions, ...
In a simulated workplace test, Claude Opus 4 — the most advanced language model from AI company Anthropic — read through a ...
At which point, the blackmailing kicked in including threats ... blackmail for Claude Opus 4. Blackmail occurred at an even higher rate, "if it’s implied that the replacement AI system does ...
An artificial intelligence model reportedly attempted to threaten and blackmail its own creator during internal testing ...
This development, detailed in a recently published safety report, have led Anthropic to classify Claude Opus 4 as an ‘ASL-3’ ...