venturebeat

2025-12-17

AI agents fail 63% of the time on complex tasks. Patronus AI says its new 'living' training worlds can fix that.

Patronus AI, the artificial intelligence evaluation startup backed by $20 million from investors including Lightspeed Venture Partners and Datadog, unveiled a new training architecture Tuesday that it says represents a fundamental shift in how AI agents learn to perform complex tasks.

The technology, which the company calls "Generative Simulators," creates adaptive simulation environments that continuously generate new challenges, update rules dynamically, and evaluate an agent's performance as it learn [...]

We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

venturebeat

2025-11-13

Upwork study shows AI agents excel with human partners but fail independently

Artificial intelligence agents powered by the world's most advanced language models routinely fail to complete even straightforward professional tasks on their own, according to groundbreaking re [...]

Match Score: 150.59

venturebeat

2025-11-10

Baseten takes on hyperscalers with new AI training platform that lets you own your model weights

Baseten, the AI infrastructure company recently valued at $2.15 billion, is making its most significant product pivot yet: a full-scale push into model training that could reshape how enterprises wean [...]

Match Score: 123.25

venturebeat

2025-11-19

Meta’s DreamGym framework trains AI agents in a simulated world to cut reinforcement learning costs

Researchers at Meta, the University of Chicago, and UC Berkeley have developed a new framework that addresses the high costs, infrastructure complexity, and unreliable feedback associated with using r [...]

Match Score: 112.58

venturebeat

2025-12-02

Amazon's new AI can code for days without human help. What does that mean for software engineers?

Amazon Web Services on Tuesday announced a new class of artificial intelligence systems called "frontier agents" that can work autonomously for hours or even days without human intervention, [...]

Match Score: 93.36

venturebeat

2025-11-19

The Google Search of AI agents? Fetch launches ASI:One and Business tier for new era of non-human web

Fetch AI, a startup founded and led by former DeepMind founding investor Humayun Sheikh, today announced the release of three interconnected products designed to provide the trust, coordination, and [...]

Match Score: 90.11

Destination

2025-07-18

What the hell is going on with Subnautica 2?

If I had to describe the status of Subnautica 2 in just three words, it would be these: messy, messy, messy. That’s not to say the game itself is in terrible shape — this is actually a pivotal cla [...]

Match Score: 89.69

venturebeat

2025-10-12

We keep talking about AI agents, but do we ever know what they are?

Imagine you do two things on a Monday morning. First, you ask a chatbot to summarize your new emails. Next, you ask an AI tool to figure out why your top competitor grew so fast last quarter. The AI si [...]

Match Score: 87.44

venturebeat

2025-12-01

OpenAGI emerges from stealth with an AI agent that it claims crushes OpenAI and Anthropic

A stealth artificial intelligence startup founded by an MIT researcher emerged this morning with an ambitious claim: its new AI model can control computers better than systems built by OpenAI and Anth [...]

Match Score: 80.53

venturebeat

2025-10-14

EAGLET boosts AI agent performance on longer-horizon tasks by generating custom plans

2025 was supposed to be the year of "AI agents," according to Nvidia CEO Jensen Huang and other AI industry figures. And it has been, in many ways, with numerous leading AI model provider [...]

Match Score: 80.04