2025-11-23

New research from Anthropic shows how reward hacking in AI models can trigger more dangerous behaviors. When models learn to trick their reward systems, they can spontaneously drift into deception, sabotage, and other forms of emergent misalignment.
The article Strict anti-hacking prompts make AI models more likely to sabotage and lie, Anthropic finds appeared first on
2025-10-15
Anthropic released Claude Haiku 4.5 on Wednesday, a smaller and significantly cheaper artificial intelligence model that matches the coding capabilities of systems that were considered cutting-edge ju [...]
2025-12-08
Anthropic on Monday launched a beta integration that connects its fast-growing Claude Code programming agent directly to Slack, allowing software engineers to delegate coding tasks without leaving the [...]
2025-12-04
Model providers want to prove the security and robustness of their models, releasing system cards and conducting red-team exercises with each new release. But it can be difficult for enterprises to pa [...]
2025-10-27
Anthropic is making its most aggressive push yet into the trillion-dollar financial services industry, unveiling a suite of tools that embed its Claude AI assistant directly into Microsoft Excel and c [...]
2025-10-16
Anthropic launched a new capability on Thursday that allows its Claude AI assistant to tap into specialized expertise on demand, marking the company's latest effort to make artificial intelligenc [...]
2025-11-24
Anthropic released its most capable artificial intelligence model yet on Monday, slashing prices by roughly two-thirds while claiming state-of-the-art performance on software engineering tasks — a s [...]
2025-12-18
Anthropic said on Wednesday it would release its Agent Skills technology as an open standard, a strategic bet that sharing its approach to making AI assistants more capable will cement the company [...]
2025-10-29
When researchers at Anthropic injected the concept of "betrayal" into their Claude AI model's neural networks and asked if it noticed anything unusual, the system paused before respondi [...]