2025-05-13
OpenAI has released a new benchmark for testing AI systems in healthcare. Called HealthBench, it's designed to evaluate how well language models handle realistic medical conversations. According to OpenAI, its latest models outperform doctors on the test.
The article OpenAI says its latest models outperform doctors in medical benchmark appeared first on THE DECODER. [...]
2025-10-03
OpenAI will host more than 1,500 developers at its largest annual conference on Monday, as the company behind ChatGPT seeks to maintain its edge in an increasingly competitive artificial intelligence [...]
2025-10-02
IBM today announced the release of Granite 4.0, the newest generation of its homemade family of open source large language models (LLMs) designed to balance high performance with lower memory and cost [...]
2025-07-20
ARC-AGI-3 aims to test how well AI systems can handle brand new problems. While people breeze through the challenges, the latest AI models still come up short.<br /> The article New ARC-AGI-3 be [...]
2025-05-01
Microsoft is expanding its Phi series of compact language models with three new variants designed for advanced reasoning tasks.<br /> The article Microsoft's Phi-4-reasoning models outperfo [...]
2025-01-29
OpenAI claims that Chinese startups are persistently trying to copy the technology of American AI companies. Aligned with that, OpenAI says it and partner Microsoft have been banning accounts suspecte [...]