Destination

2025-04-22

So-called reasoning models are more efficient but not more capable than regular LLMs, study finds

A new study from Tsinghua University and Shanghai Jiao Tong University questions whether reinforcement learning with verifiable rewards (RLVR) actually improves the reasoning abilities of large language models, or merely makes them more efficient at reproducing known solution paths.


We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

venturebeat

2025-10-02

'Western Qwen': IBM wows with Granite 4 LLM launch and hybrid Mamba/Transformer architecture

IBM today announced the release of Granite 4.0, the newest generation of its homemade family of open source large language models (LLMs) designed to balance high performance with lower memory and cost [...]

Match Score: 117.35

Destination

2025-06-07

Apple study finds "a fundamental scaling limitation" in reasoning models' thinking abilities

LLMs designed for reasoning, like Claude 3.7 and Deepseek-R1, are supposed to excel at complex problem-solving by simulating thought processes. But a new study by Apple researchers suggests that these [...]

Match Score: 92.37

Destination

2025-02-18

xAI launches Grok 3 AI, claiming it is capable of 'human reasoning'

xAI has launched its Grok 3 models during a livestream with Elon Musk, who said they were "an order of magnitude more capable than Grok 2." The Grok 3 mini model can answer questions quickly [...]

Match Score: 77.20

venturebeat

2025-09-30

Meta’s new CWM model learns how code works, not just what it looks like

Meta’s AI research team has released a new large language model (LLM) for coding that enhances code understanding by learning not only what code looks like, but also what it does when executed. The [...]

Match Score: 74.68

Destination

2025-02-28

Engadget Podcast: iPhone 16e review and Amazon's AI-powered Alexa+

The keyword for the iPhone 16e seems to be "compromise." In this episode, Devindra chats with Cherlynn about her iPhone 16e review and tries to figure out who this phone is actually for. Also, [...]

Match Score: 72.37

Destination

2025-07-04

Apple's claims about large reasoning models face fresh scrutiny from a new study

A replication study of Apple's controversial "The Illusion of Thinking" paper confirms some of its main criticisms, but challenges the study's central conclusion. The a [...]

Match Score: 70.35

Destination

2025-09-01

LLMs struggle with clinical reasoning and are just matching patterns, study finds

A new study in JAMA Network Open raises fresh doubts about whether large language models (LLMs) can actually reason through medical cases or if they're just matching patterns they've seen be [...]

Match Score: 69.49

Destination

2025-08-05

OpenAI's first new open-weight LLMs in six years are here

For the first time since GPT-2 in 2019, OpenAI is releasing new open-weight large language models. It's a major milestone for a company that has increasingly been accused of forgoing its original [...]

Match Score: 64.53

Destination

2025-05-27

How Phi-4-Reasoning Redefines AI Reasoning by Challenging “Bigger is Better” Myth

Microsoft's recent release of Phi-4-reasoning challenges a key assumption in building artificial intelligence systems capable of reasoning. Since the introduction of chain-of-thought reasoning in [...]

Match Score: 61.86