2025-07-10
A new study analyzing 25 language models finds that most do not fake safety compliance, though not for lack of capability.
2025-02-10
Roblox, Discord, OpenAI and Google are launching a nonprofit organization called ROOST, or Robust Open Online Safety Tools, which hopes "to build scalable, interoperable safety infrastructure su [...]
2025-09-18
A new joint study from OpenAI and Apollo Research examines "scheming" - cases where an AI covertly pursues hidden goals not intended by its developers. The researchers tested new training me [...]
2025-10-02
IBM today announced the release of Granite 4.0, the newest generation of its in-house family of open source large language models (LLMs) designed to balance high performance with lower memory and cost [...]
2025-09-30
Meta’s AI research team has released a new large language model (LLM) for coding that enhances code understanding by learning not only what code looks like, but also what it does when executed. The [...]
2025-10-02
A new study by Shanghai Jiao Tong University and SII Generative AI Research Lab (GAIR) shows that training large language models (LLMs) for complex, autonomous tasks does not require massive datasets. [...]
2025-10-01
Thinking Machines, the AI startup founded earlier this year by former OpenAI CTO Mira Murati, has launched its first product: Tinker, a Python-based API designed to make large language model (LLM) fin [...]
2025-10-01
It’s been a rough time for Peloton. Last year was marred by deep staff cuts, a change of CEO, and a reckoning over where the home fitness company belonged after the post-pandemic boom. The answer is, unfortunat [...]
2025-08-27
Most of the time, AI companies are locked in a race to the top, treating each other as rivals. Today, OpenAI and Anthropic revealed that they agreed to evaluate the alignment of each o [...]