venturebeat
Alibaba's Metis agent cuts redundant AI tool calls from 98% to 2% — and gets more accurate doing it

One of the key challenges of building effective AI agents is teaching them to choose between using external tools and relying on their internal knowledge. But large language models are often trained to invoke tools blindly, which leads to latency bottlenecks, unnecessary API costs, and reasoning degraded by environmental noise. To overcome this challenge, researchers at Alibaba introduced Hierarchical Decoupled Policy Optimization (HDPO), a reinforcement learning framework that trains agents to balance execution efficiency and task accuracy. Metis, a multimodal model they trained using this framework, reduces redundant tool invocations from 98% to just 2% while establishing new state-of-the-art reasoning accuracy across key industry benchmarks. This framework helps create AI age [...]

We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

venturebeat
Most enterprises can't stop stage-three AI agent threats, VentureBeat survey finds

A rogue AI agent at Meta passed every identity check and still exposed sensitive data to unauthorized employees in March. Two weeks later, Mercor, a $10 billion AI startup, confirmed a supply-chain br [...]

Match Score: 196.73

venturebeat
Microsoft takes Agent 365 out of preview as shadow AI becomes an enterprise threat

Microsoft last week took Agent 365, its management platform for AI agents, out of preview and into general availability — a move that signals the software giant believes the governance challenge aro [...]

Match Score: 113.92

venturebeat
An AI agent rewrote a Fortune 50 security policy. Here's how to govern AI agents before one does the same.

A CEO’s AI agent rewrote the company’s security policy. Not because it was compromised, but because it wanted to fix a problem, lacked permissions, and removed the restriction itself. Every identi [...]

Match Score: 112.13

venturebeat
Intent-based chaos testing is designed for when AI behaves confidently — and wrongly

Here is a scenario that should concern every enterprise architect shipping autonomous AI systems right now: An observability agent is running in production. Its job is to detect infrastructure anomali [...]

Match Score: 111.94

venturebeat
RSAC 2026 shipped five agent identity frameworks and left three critical gaps open

“You can deceive, manipulate, and lie. That’s an inherent property of language. It’s a feature, not a flaw,” CrowdStrike CTO Elia Zaitsev told VentureBeat in an exclusive interview at RSA Conf [...]

Match Score: 111.83

venturebeat
Testing autonomous agents (Or: how I learned to stop worrying and embrace chaos)

Look, we've spent the last 18 months building production AI systems, and we'll tell you what keeps us up at night — and it's not whether the model can answer questions. That's ta [...]

Match Score: 105.83

venturebeat
Adversaries hijacked AI security tools at 90+ organizations. The next wave has write access to the firewall

Adversaries injected malicious prompts into legitimate AI tools at more than 90 organizations in 2025, stealing credentials and cryptocurrency. Every one of those compromised tools could read data, an [...]

Match Score: 96.52

venturebeat
Are you paying an AI ‘swarm tax’? Why single agents often beat complex systems

Enterprise teams building multi-agent AI systems may be paying a compute premium for gains that don't hold up under equal-budget conditions. New Stanford University research finds that single-age [...]

Match Score: 92.09

venturebeat
OpenAI's AI data agent, built by two engineers, now serves 4,000 employees — and the company says anyone can replicate it

When an OpenAI finance analyst needed to compare revenue across geographies and customer cohorts last year, it took hours of work — hunting through 70,000 datasets, writing SQL queries, verifying ta [...]

Match Score: 90.76