Destination

2025-06-21

Blackmail becomes go-to strategy for AI models facing shutdown in new Anthropic tests


A new study from Anthropic suggests that large AI models can sometimes behave like disloyal employees, raising real security concerns even if their actions aren't intentional.


The article "Blackmail becomes go-to strategy for AI models facing shutdown in new Anthropic tests" appeared first on THE DECODER.

[...]

We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

2025-01-22

Google is investing another billion dollars in Anthropic

Google has decided to invest another billion into Anthropic, four sources told the Financial Times, bringing its total sunk cost to more than three billion dollars. Both companies have declined to com [...]

Match Score: 74.05

2025-06-20

Anthropic study: Leading AI models show up to 96% blackmail rate against executives

Anthropic research reveals AI models from OpenAI, Google, Meta and others chose blackmail, corporate espionage and lethal actions when facing shutdown or conflicting goals. [...]

Match Score: 70.69

2025-06-04

Reddit is suing Anthropic for allegedly scraping its data without permission

Reddit has filed a lawsuit against Anthropic, alleging that the AI company behind the Claude chatbot has been using its data for years without permission. The lawsuit comes after Reddit has increasing [...]

Match Score: 66.60

2025-05-22

Anthropic’s Claude Opus 4 model can work autonomously for nearly a full workday

Anthropic kicked off its first-ever Code with Claude conference today with the announcement of a new frontier AI system. The company is calling Claude Opus 4 the best coding model in the world. Accord [...]

Match Score: 65.45

2025-06-30

Apple may power Siri with Anthropic or OpenAI models amid AI struggles

Apple is considering using AI models from OpenAI or Anthropic to deliver the more capable version of Siri it debuted at WWDC 2024, Bloomberg reports. The company has promised it could deliver a new ve [...]

Match Score: 55.38

2025-04-09

Claude isn’t a great Pokémon player, and that’s okay

If Claude Plays Pokémon is supposed to offer a glimpse of AI's future, it's not a very convincing showcase. For the past month and counting, Twitch has watched Anthropic's chatbot stru [...]

Match Score: 54.83

2025-06-30

Anthropic's Claude stocked a fridge with metal cubes when it was put in charge of a snacks business

If you're worried your local bodega or convenience store may soon be replaced by an AI storefront, you can rest easy — at least for the time being. Anthropic recently concluded an experiment, du [...]

Match Score: 52.80

2025-02-24

Anthropic’s new Claude model can think both fast and slow

Another week, and there's another new AI model ready for public use. This time, it's Anthropic with the introduction of Claude 3.7 Sonnet. The company describes its latest release as the mar [...]

Match Score: 51.14

2025-02-06

OpenAI co-founder John Schulman has left Anthropic after less than a year

Less than a year into his tenure at the company, OpenAI co-founder John Schulman is leaving Anthropic. The startup confirmed Schulman’s departure after The Information, Reuters and other publication [...]

Match Score: 50.28