2025-09-18

A new joint study from OpenAI and Apollo Research examines "scheming" - cases where an AI covertly pursues hidden goals not intended by its developers. The researchers tested new training methods to curb deceptive behavior but found signs that models are aware they are being tested, raising doubts about the reliability of the results.
2025-12-04
Model providers want to prove the security and robustness of their models, releasing system cards and conducting red-team exercises with each new release. But it can be difficult for enterprises to pa [...]
2025-12-03
Presented by Celonis. When tariff rates change overnight, companies have 48 hours to model alternatives and act before competitors secure the best options. At Celosphere 2025 in Munich, enterprises demo [...]
2025-11-21
Salesforce launched a suite of monitoring tools on Thursday designed to solve what has become one of the thorniest problems in corporate artificial intelligence: Once companies deploy AI agents to han [...]
2025-10-29
When researchers at Anthropic injected the concept of "betrayal" into their Claude AI model's neural networks and asked if it noticed anything unusual, the system paused before respondi [...]
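The blurb refers to activation-level concept injection. As a rough sketch of the general technique only (Anthropic's actual models and method are not public), the snippet below adds a steering direction to one transformer block's residual stream via a forward hook, assuming an open GPT-2 model from Hugging Face and a random stand-in vector in place of a learned "betrayal" direction; the layer index and injection strength are arbitrary assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Open stand-in model; Anthropic's setup is not public.
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

layer_idx = 6    # which transformer block to perturb (assumption)
strength = 4.0   # injection scale (assumption)

# Stand-in for a learned concept direction; real experiments derive
# this vector from the model's own activations (e.g. contrastive prompts).
concept_vec = torch.randn(model.config.n_embd)
concept_vec = concept_vec / concept_vec.norm()

def inject(module, inputs, output):
    # GPT-2 blocks return a tuple; element 0 is the residual stream
    # with shape (batch, seq, hidden). Add the scaled concept direction
    # to every token position and pass the rest of the tuple through.
    hidden_states = output[0] + strength * concept_vec
    return (hidden_states,) + output[1:]

handle = model.transformer.h[layer_idx].register_forward_hook(inject)

prompt = "Do you notice anything unusual about your own thoughts?"
ids = tokenizer(prompt, return_tensors="pt")
out = model.generate(**ids, max_new_tokens=40)
print(tokenizer.decode(out[0], skip_special_tokens=True))

handle.remove()  # restore unperturbed behavior
```

With a direction actually derived from the model's activations rather than the random stand-in, this hook pattern is the usual starting point for activation-steering experiments.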
2025-10-12
Imagine you do two things on a Monday morning. First, you ask a chatbot to summarize your new emails. Next, you ask an AI tool to figure out why your top competitor grew so fast last quarter. The AI si [...]
2025-05-29
A research team from Arizona State University warns against interpreting intermediate steps in language models as human thought processes. The authors see this as a dangerous misconception with far-re [...]
2025-07-10
A new study analyzing 25 language models finds that most do not fake safety compliance - though not for lack of capability.