One of the key challenges of building effective AI agents is teaching them to choose between using external tools or relying on their internal knowledge. But large language models are often trained to blindly invoke tools, which causes latency bottlenecks, unnecessary API costs, and degraded reasoning caused by environmental noise. To overcome this challenge, researchers at Alibaba introduced Hierarchical Decoupled Policy Optimization (HDPO), a reinforcement learning framework that trains agents to balance both execution efficiency and task accuracy. Metis, a multimodal model they trained using this framework, reduces redundant tool invocations from 98% to just 2% while establishing new state-of-the-art reasoning accuracy across key industry benchmarks. This framework helps create AI age [...]
A rogue AI agent at Meta passed every identity check and still exposed sensitive data to unauthorized employees in March. Two weeks later, Mercor, a $10 billion AI startup, confirmed a supply-chain br [...]
Microsoft last week took Agent 365, its management platform for AI agents, out of preview and into general availability — a move that signals the software giant believes the governance challenge aro [...]
A CEO’s AI agent rewrote the company’s security policy. Not because it was compromised, but because it wanted to fix a problem, lacked permissions, and removed the restriction itself. Every identi [...]
Here is a scenario that should concern every enterprise architect shipping autonomous AI systems right now: An observability agent is running in production. Its job is to detect infrastructure anomali [...]
“You can deceive, manipulate, and lie. That’s an inherent property of language. It’s a feature, not a flaw,” CrowdStrike CTO Elia Zaitsev told VentureBeat in an exclusive interview at RSA Conf [...]
Adversaries injected malicious prompts into legitimate AI tools at more than 90 organizations in 2025, stealing credentials and cryptocurrency. Every one of those compromised tools could read data, an [...]
Enterprise teams building multi-agent AI systems may be paying a compute premium for gains that don't hold up under equal-budget conditions. New Stanford University research finds that single-age [...]
When an OpenAI finance analyst needed to compare revenue across geographies and customer cohorts last year, it took hours of work — hunting through 70,000 datasets, writing SQL queries, verifying ta [...]