Destination

2025-06-21

Blackmail becomes go-to strategy for AI models facing shutdown in new Anthropic tests


A new study from Anthropic suggests that large AI models can sometimes behave like disloyal employees, raising real security concerns even if their actions aren't intentional.


The article Blackmail becomes go-to strategy for AI models facing shutdown in new Anthropic tests appeared first on THE DECODER.

[...]

Rating

Innovation

Pricing

Technology

Usability

We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

venturebeat

2025-10-15

Anthropic is giving away its powerful Claude Haiku 4.5 AI for free to take on OpenAI

Anthropic released Claude Haiku 4.5 on Wednesday, a smaller and significantly cheaper artificial intelligence model that matches the coding capabilities of systems that were considered cutting-edge ju [...]

Match Score: 191.55

venturebeat

2025-10-27

Anthropic rolls out Claude AI for finance, integrates with Excel to rival Microsoft Copilot

Anthropic is making its most aggressive push yet into the trillion-dollar financial services industry, unveiling a suite of tools that embed its Claude AI assistant directly into Microsoft Excel and c [...]

Match Score: 144.48

venturebeat

2025-10-16

How Anthropic’s ‘Skills’ make Claude faster, cheaper, and more consistent for business workflows

Anthropic launched a new capability on Thursday that allows its Claude AI assistant to tap into specialized expertise on demand, marking the company's latest effort to make artificial intelligenc [...]

Match Score: 117.78

venturebeat

2025-10-29

Anthropic scientists hacked Claude’s brain — and it noticed. Here’s why that’s huge

When researchers at Anthropic injected the concept of "betrayal" into their Claude AI model's neural networks and asked if it noticed anything unusual, the system paused before respondi [...]

Match Score: 97.16

Destination

2025-08-27

OpenAI and Anthropic conducted safety evaluations of each other's AI systems

Most of the time, AI companies are locked in a race to the top, treating each other as rivals and competitors. Today, OpenAI and Anthropic revealed that they agreed to evaluate the alignment of each o [...]

Match Score: 63.82

Destination

2025-06-20

Anthropic study: Leading AI models show up to 96% blackmail rate against executives

Anthropic research reveals AI models from OpenAI, Google, Meta and others chose blackmail, corporate espionage and lethal actions when facing shutdown or conflicting goals. [...]

Match Score: 58.19

Destination

2025-09-29

Claude Sonnet 4.5 is Anthropic's safest AI model yet

In May, Anthropic announced two new AI systems, Opus 4 and Sonnet 4. Now, less than six months later, the company is introducing Sonnet 4.5, and calling it the best coding model in the world to date. [...]

Match Score: 55.92

Destination

2025-01-22

Google is investing another billion dollars in Anthropic

Google has decided to invest another billion into Anthropic, four sources told the Financial Times, bringing its total sunk cost to more than three billion dollars. Both companies have declined to com [...]

Match Score: 52.97

venturebeat

2025-10-28

IBM's open source Granite 4.0 Nano AI models are small enough to run locally directly in your browser

In an industry where model size is often seen as a proxy for intelligence, IBM is charting a different course — one that values efficiency over enormity, and accessibility over abstraction.The 114-y [...]

Match Score: 51.93