Destination
How easily can Russian propaganda fool AI models? A new benchmark finds out

The Institute of the Estonian Language has released a benchmark measuring how susceptible AI language models are to Russian propaganda. The article "How easily can Russian propaganda fool AI models? A new benchmark finds out" appeared first on The Decoder. [...]


Related articles, ranked by match score:

Destination
Russian fake news network floods western AI chatbots with millions of propaganda articles

A Moscow-based disinformation operation is systematically feeding Russian propaganda into Western AI systems through a vast network of fake news sites called "Pravda" (Russian for "trut [...]

Match Score: 77.87

Destination
Russia blocks Roblox, citing 'LGBT propaganda' as a reason

Russia has blocked the popular gaming platform Roblox, according to a report by Reuters. The country's communications watchdog Roskomnadzor accused the developers of distributing extremist materi [...]

Match Score: 66.76

venturebeat
DeepSWE blows up the AI coding leaderboard, crowns GPT-5.5, and finds Claude Opus exploiting a benchmark loophole

For months, the leading AI coding benchmarks have told enterprise buyers a comforting but misleading story: the top models are all roughly the same. OpenAI's GPT-5 family, Anthropic's Claude [...]

Match Score: 63.63

venturebeat
Frontier models are failing one in three production attempts — and getting harder to audit

AI agents are now embedded in real enterprise workflows, and they're still failing roughly one in three attempts on structured benchmarks. That gap between capability and reliability is the defin [...]

Match Score: 62.04

venturebeat
Why Weibo’s tiny VibeThinker-3B has the AI world arguing over benchmarks again

On Sunday, a team of nine researchers at Sina Weibo — the Chinese social media giant better known for its microblogging platform than for cutting-edge artificial intelligence — quietly posted a 14 [...]

Match Score: 56.74

venturebeat
Is Anthropic 'nerfing' Claude? Users increasingly report performance degradation as leaders push back

A growing number of developers and AI power users are taking to social media to accuse Anthropic of degrading the performance of Claude Opus 4.6 and Claude Code — intentionally or as an outcome of c [...]

Match Score: 50.01

venturebeat
Anthropic brings Mythos to the masses with Claude Fable 5, its most powerful generally available model ever

Anthropic today launched two new AI models — Claude Fable 5 and Claude Mythos 5 — marking the company’s first broad release of the powerful “Mythos-class” AI capabilities it previously kept [...]

Match Score: 49.33

venturebeat
The 70% factuality ceiling: why Google’s new ‘FACTS’ benchmark is a wake-up call for enterprise AI

There's no shortage of generative AI benchmarks designed to measure the performance and accuracy of a given model on completing various helpful enterprise tasks — from coding to instruction fol [...]

Match Score: 49.28

Destination
Russia's recent blocking of Telegram is reportedly disrupting its military operations in Ukraine

A decision to ban Telegram on home soil may have backfired on the Kremlin. Last week, Russia went on a blocking spree, banning a number of Western apps in an effort to push domestic users towards Max, [...]

Match Score: 43.73