2025-12-10
There's no shortage of generative AI benchmarks designed to measure the performance and accuracy of a given model on completing various helpful enterprise tasks — from coding to instruction following to agentic web browsing and tool use. But many of these benchmarks have one major shortcoming: they measure the AI's ability to complete specific problems and requests, not how factual the model is in its outputs — how well it generates objectively correct information tied to real-world data — especially when dealing with information contained in imagery or graphics.
< [...]2025-10-01
In the race to deploy generative AI for coding, the fastest tools are not winning enterprise deals. A new VentureBeat analysis, combining a comprehensive survey of 86 engineering teams with our own ha [...]
2025-12-09
There is no shortage of AI benchmarks in the market today, with popular options like Humanity's Last Exam (HLE), ARC-AGI-2 and GDPval, among numerous others.AI agents excel at solving abstract ma [...]
2025-12-16
Zoom Video Communications, the company best known for keeping remote workers connected during the pandemic, announced last week that it had achieved the highest score ever recorded on one of artificia [...]
2025-12-11
A new benchmark from Google Deepmind aims to measure AI model reliability more comprehensively than ever before. The results reveal that even top-tier models like Gemini 3 Pro and GPT-5.1 are far from [...]
2025-11-18
After more than a month of rumors and feverish speculation — including Polymarket wagering on the release date — Google today unveiled Gemini 3, its newest proprietary frontier model family and th [...]
2025-10-09
The friction of having to open a separate chat window to prompt an agent could be a hassle for many enterprises. And AI companies are seeing an opportunity to bring more and more AI services into one [...]
2025-11-29
As AI systems enter production, reliability and governance can’t depend on wishful thinking. Here’s how observability turns large language models (LLMs) into auditable, trustworthy enterprise syst [...]
2025-12-22
While Silicon Valley debates whether artificial intelligence has become an overinflated bubble, Salesforce's enterprise AI platform quietly added 6,000 new customers in a single quarter — a 48% [...]