2025-05-13
OpenAI has released a new benchmark for testing AI systems in healthcare. Called HealthBench, it's designed to evaluate how well language models handle realistic medical conversations. According to OpenAI, its latest models outperform doctors on the test.
The article OpenAI says its latest models outperform doctors in medical benchmark appeared first on THE DECODER. [...]
2025-01-29
OpenAI claims that Chinese startups are persistently trying to copy the technology of American AI companies. Aligned with that, OpenAI says it and partner Microsoft have been banning accounts suspecte [...]
2025-07-20
ARC-AGI-3 aims to test how well AI systems can handle brand new problems. While people breeze through the challenges, the latest AI models still come up short.<br /> The article New ARC-AGI-3 be [...]
2025-02-27
In what has already been a busy past few days for new model releases, OpenAI is capping off the week with a research preview of GPT-4.5. The company is touting the new system as its largest and best m [...]
2025-05-01
Microsoft is expanding its Phi series of compact language models with three new variants designed for advanced reasoning tasks.<br /> The article Microsoft's Phi-4-reasoning models outperfo [...]
2025-05-06
OpenAI has abandoned its controversial restructuring plan. In a dramatic reversal, the company said Monday it would no longer try to separate control of its for-profit arm from the non-profit board th [...]