2025-05-13

OpenAI has released a new benchmark for testing AI systems in healthcare. Called HealthBench, it's designed to evaluate how well language models handle realistic medical conversations. According to OpenAI, its latest models outperform doctors on the test.
The article OpenAI says its latest models outperform doctors in medical benchmark appeared first on THE DECODER. [...]
2025-11-20
Elon Musk's frontier generative AI startup xAI formally opened developer access to its Grok 4.1 Fast models last night and introduced a new Agent Tools API—but the technical milestones were imm [...]
2025-11-14
OpenAI researchers are experimenting with a new approach to designing neural networks, with the aim of making AI models easier to understand, debug, and govern. Sparse models can provide enterprises w [...]
2025-10-28
In an industry where model size is often seen as a proxy for intelligence, IBM is charting a different course — one that values efficiency over enormity, and accessibility over abstraction.The 114-y [...]
2025-11-06
Even as concern and skepticism grows over U.S. AI startup OpenAI's buildout strategy and high spending commitments, Chinese open source AI providers are escalating their competition and one has e [...]
2025-10-03
OpenAI will host more than 1,500 developers at its largest annual conference on Monday, as the company behind ChatGPT seeks to maintain its edge in an increasingly competitive artificial intelligence [...]
2025-10-02
IBM today announced the release of Granite 4.0, the newest generation of its homemade family of open source large language models (LLMs) designed to balance high performance with lower memory and cost [...]
2025-11-07
The developers of Terminal-Bench, a benchmark suite for evaluating the performance of autonomous AI agents on real-world terminal-based tasks, have released version 2.0 alongside Harbor, a new framewo [...]
2025-10-09
OpenAI’s annual developer conference on Monday was a spectacle of ambitious AI product launches, from an app store for ChatGPT to a stunning video-generation API that brought creative concepts to li [...]
2025-10-29
Enterprises, eager to ensure any AI models they use adhere to safety and safe-use policies, fine-tune LLMs so they do not respond to unwanted queries. However, much of the safeguarding and red teamin [...]