2025-12-09
There is no shortage of AI benchmarks in the market today, with popular options like Humanity's Last Exam (HLE), ARC-AGI-2 and GDPval, among numerous others.
AI agents excel at solving abstract math problems and passing PhD-level exams that most benchmarks are based on, but Databricks has a question for the enterprise: Can they actually handle the document-heavy work most enterprises need them to do?
The answer, according to new research from the data and AI platform company, is sobering. Even the best-performing AI agents achieve less than 45% accuracy on tasks that mirror real enterprise workloads, exposing a critical gap between academic benchmarks a [...]
2025-11-14
There is a lot of enterprise data trapped in PDF documents. To be sure, gen AI tools have been able to ingest and analyze PDFs, but accuracy, time and cost have been less than ideal. New technology fr [...]
2025-10-01
Many enterprises running PostgreSQL databases for their applications face the same expensive reality. When they need to analyze that operational data or feed it to AI models, they build ETL (Extract, [...]
2025-07-23
2024 was an awful year for Sonos. Its long-awaited entry into a crowded headphones market was eclipsed by a bungled app launch which had a knock-on effect that impacted everything the company had plan [...]
2025-11-04
The intelligence of AI models isn't what's blocking enterprise deployments. It's the inability to define and measure quality in the first place.That's where AI judges are now playi [...]
2025-10-16
A new framework from Stanford University and SambaNova addresses a critical challenge in building robust AI agents: context engineering. Called Agentic Context Engineering (ACE), the framework automat [...]
2025-11-13
Artificial intelligence agents powered by the world's most advanced language models routinely fail to complete even straightforward professional tasks on their own, according to groundbreaking re [...]
2025-10-01
In the race to deploy generative AI for coding, the fastest tools are not winning enterprise deals. A new VentureBeat analysis, combining a comprehensive survey of 86 engineering teams with our own ha [...]
2025-10-29
Enterprise AI agents today face a fundamental timing problem: They can't easily act on critical business events because they aren't always aware of them in real-time.The challenge is infrast [...]