Destination

2025-09-18

Study cautions that monitoring chains of thought soon may no longer ensure genuine AI alignment

OpenAI study: AI models reveal undesirable behavior in their


A new joint study from OpenAI and Apollo Research examines "scheming" - cases where an AI covertly pursues hidden goals not intended by its developers. The researchers tested new training methods to curb deceptive behavior but found signs that models are aware they are being tested, raising doubts about the reliability of the results.


The article Discover Copy

Rating

Innovation

Pricing

Technology

Usability

We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

venturebeat

2025-12-04

Anthropic vs. OpenAI red teaming methods reveal different security priorities for enterprise AI

Model providers want to prove the security and robustness of their models, releasing system cards and conducting red-team exercises with each new release. But it can be difficult for enterprises to pa [...]

Match Score: 79.86

venturebeat

2025-12-03

Tariff turbulence exposes costly blind spots in supply chains and AI

Presented by CelonisWhen tariff rates change overnight, companies have 48 hours to model alternatives and act before competitors secure the best options. At Celosphere 2025 in Munich, enterprises demo [...]

Match Score: 61.19

venturebeat

2025-11-21

Salesforce Agentforce Observability lets you watch your AI agents think in real time

Salesforce launched a suite of monitoring tools on Thursday designed to solve what has become one of the thorniest problems in corporate artificial intelligence: Once companies deploy AI agents to han [...]

Match Score: 48.79

venturebeat

2025-10-29

Anthropic scientists hacked Claude’s brain — and it noticed. Here’s why that’s huge

When researchers at Anthropic injected the concept of "betrayal" into their Claude AI model's neural networks and asked if it noticed anything unusual, the system paused before respondi [...]

Match Score: 47.28

venturebeat

2025-10-12

We keep talking about AI agents, but do we ever know what they are?

Imagine you do two things on a Monday morning.First, you ask a chatbot to summarize your new emails. Next, you ask an AI tool to figure out why your top competitor grew so fast last quarter. The AI si [...]

Match Score: 44.21

Destination

2025-05-29

Wait a minute! Researchers say AI's "chains of thought" are not signs of human-like reasoning

A research team from Arizona State University warns against interpreting intermediate steps in language models as human thought processes. The authors see this as a dangerous misconception with far-re [...]

Match Score: 43.57

venturebeat

2025-11-01

Large reasoning models almost certainly can think

Recently, there has been a lot of hullabaloo about the idea that large reasoning models (LRM) are unable to think. This is mostly due to a research article published by Apple, "The Illusion of Th [...]

Match Score: 43.42

Destination

2025-11-22

How to wirelessly charge your phone with max power

Wireless charging has become one of those small but satisfying conveniences of modern smartphones. You drop your device on a pad and watch the battery percentage climb without fiddling with cables or [...]

Match Score: 42.95

Destination

2025-07-10

Most AI models can fake alignment, but safety training suppresses the behavior, study finds

A new study analyzing 25 language models finds that most do not fake safety compliance - though not due to a lack of capability.<br /> The article Most AI models can fake alignment, but safety t [...]

Match Score: 41.52