Destination

2025-06-21

Blackmail becomes go-to strategy for AI models facing shutdown in new Anthropic tests


A new study from Anthropic suggests that large AI models can sometimes behave like disloyal employees, raising real security concerns even if their actions aren't intentional.


The article "Blackmail becomes go-to strategy for AI models facing shutdown in new Anthropic tests" appeared first on THE DECODER.

[...]

Destination

2025-08-27

OpenAI and Anthropic conducted safety evaluations of each other's AI systems

Most of the time, AI companies are locked in a race to the top, treating each other as rivals and competitors. Today, OpenAI and Anthropic revealed that they agreed to evaluate the alignment of each o [...]

Match Score: 68.04

Destination

2025-06-20

Anthropic study: Leading AI models show up to 96% blackmail rate against executives

Anthropic research reveals AI models from OpenAI, Google, Meta and others chose blackmail, corporate espionage and lethal actions when facing shutdown or conflicting goals. [...]

Match Score: 60.15

Destination

2025-09-29

Claude Sonnet 4.5 is Anthropic's safest AI model yet

In May, Anthropic announced two new AI systems, Opus 4 and Sonnet 4. Now, less than six months later, the company is introducing Sonnet 4.5, and calling it the best coding model in the world to date. [...]

Match Score: 59.85

Destination

2025-01-22

Google is investing another billion dollars in Anthropic

Google has decided to invest another billion into Anthropic, four sources told the Financial Times, bringing its total sunk cost to more than three billion dollars. Both companies have declined to com [...]

Match Score: 56.65

Destination

2025-09-09

Microsoft reportedly plans to start using Anthropic models to power some of Office 365's Copilot features

Microsoft reportedly plans to begin using Anthropic's latest Claude models to power some of the Copilot features in its Office 365 apps. In a report published Tuesday, The Information said the te [...]

Match Score: 53.42

Destination

2025-06-04

Reddit is suing Anthropic for allegedly scraping its data without permission

Reddit has filed a lawsuit against Anthropic, alleging that the AI company behind the Claude chatbot has been using its data for years without permission. The lawsuit comes after Reddit has increasing [...]

Match Score: 50.98

Destination

2025-08-17

Anthropic's Claude AI now has the ability to end 'distressing' conversations

Anthropic's latest feature for two of its Claude AI models could be the beginning of the end for the AI jailbreaking community. The company announced in a post on its website that the Claude Opus [...]

Match Score: 50.08

Destination

2025-05-22

Anthropic’s Claude Opus 4 model can work autonomously for nearly a full workday

Anthropic kicked off its first-ever Code with Claude conference today with the announcement of a new frontier AI system. The company is calling Claude Opus 4 the best coding model in the world. Accord [...]

Match Score: 49.57

venturebeat

2025-10-02

'Western Qwen': IBM wows with Granite 4 LLM launch and hybrid Mamba/Transformer architecture

IBM today announced the release of Granite 4.0, the newest generation of its homemade family of open source large language models (LLMs) designed to balance high performance with lower memory and cost [...]

Match Score: 48.66