Peektastic.com

METR says it can barely measure Claude Mythos, Palo Alto Networks warns of autonomous AI attackers

METR can barely measure Claude Mythos Preview with its current test suite. Only five out of 228 tasks cover the relevant capability range. Meanwhile, Palo Alto Networks reports that frontier models autonomously chain vulnerabilities, shrinking the time from initial access to data exfiltration to just 25 minutes. Evaluation methods are growing more slowly than the models themselves, and that may be the bigger problem.

The article "METR says it can barely measure Claude Mythos, Palo Alto Networks warns of autonomous AI attackers" appeared first on The Decoder. [...]

We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

venturebeat

Anthropic brings Mythos to the masses with Claude Fable 5, its most powerful generally available model ever

Anthropic today launched two new AI models — Claude Fable 5 and Claude Mythos 5 — marking the company’s first broad release of the powerful “Mythos-class” AI capabilities it previously kept [...]

Match Score: 278.71

venturebeat

Anthropic vs. OpenAI red teaming methods reveal different security priorities for enterprise AI

Model providers want to prove the security and robustness of their models, releasing system cards and conducting red-team exercises with each new release. But it can be difficult for enterprises to pa [...]

Match Score: 187.74

venturebeat

Mythos autonomously exploited vulnerabilities that survived 27 years of human review. Security teams need a new detection playbook

A 27-year-old bug sat inside OpenBSD’s TCP stack while auditors reviewed the code, fuzzers ran against it, and the operating system earned its reputation as one of the most security-hardened platfor [...]

Match Score: 185.09

venturebeat

Anthropic ships major Claude Design overhaul with design system imports, code round-trips, and a fix for its token-burning problem

When Anthropic quietly released Claude Design in April as a "research preview," it generated the kind of instant traction most product teams dream about: more than one million users in its f [...]

Match Score: 161.16

venturebeat

Anthropic just launched Claude Design, an AI tool that turns prompts into prototypes and challenges Figma

Anthropic today launched Claude Design, a new product from its Anthropic Labs division that allows users to create polished visual work — designs, interactive prototypes, slide decks, one-pagers, an [...]

Match Score: 154.04

venturebeat

RSAC 2026 shipped five agent identity frameworks and left three critical gaps open

“You can deceive, manipulate, and lie. That’s an inherent property of language. It’s a feature, not a flaw,” CrowdStrike CTO Elia Zaitsev told VentureBeat in an exclusive interview at RSA Conf [...]

Match Score: 144.32

venturebeat

Anthropic says its most powerful AI cyber model is too dangerous to release publicly — so it built Project Glasswing

Anthropic on Tuesday announced Project Glasswing, a sweeping cybersecurity initiative that pairs an unreleased frontier AI model — Claude Mythos Preview — with a coalition of twelve major technolo [...]

Match Score: 142.02

venturebeat

Anthropic is bringing back Claude Fable 5 globally after US lifts export control order — where can enterprises access it?

Anthropic is restoring global access to its most powerful generally released AI model yet, Claude Fable 5, today, after the U.S. Department of Commerce last night withdrew emergency export controls. T [...]

Match Score: 139.26

venturebeat

Anthropic’s Claude can now control your Mac, escalating the fight to build AI agents that actually do work

Anthropic on Monday launched the most ambitious consumer AI agent to date, giving its Claude chatbot the ability to directly control a user's Mac — clicking buttons, opening applications, typin [...]

Match Score: 138.65