News

Can AI like Claude 4 be trusted to make ethical decisions? Discover the risks, surprises, and challenges of autonomous AI ...
The internet freaked out after Anthropic revealed that Claude attempts to report “immoral” activity to authorities under ...
Claude 4’s “whistle-blow” surprise shows why agentic AI risk lives in prompts and tool access, not benchmarks. Learn the 6 ...
This includes locking users out of systems it can access or bulk-emailing media and law enforcement to report wrongdoing. This isn’t a new behavior, but Claude Opus 4 is more prone to it than ...
Anthropic’s Claude Opus 4 exhibited simulated blackmail in stress tests, prompting safety scrutiny despite also showing a ...
The testing found the AI was capable of “extreme actions” if it thought its “self-preservation” was threatened.
The choice Claude 4 made was part of the test ... according to Apollo Research's notes in Anthropic's safety report. Anthropic says the behavior was mitigated with a fix and the model's behavior is now ...