Destination

2025-07-20

Alibaba's Qwen2.5 only excels at math thanks to memorized training data

Calculator with four question marks on the display against a green 3D grid background.


A new study finds that Alibaba's Qwen2.5 models achieve high math scores mainly by memorizing training data rather than through genuine reasoning.


The article Alibaba's Qwen2.5 only excels at math thanks to memorized training data appeared first on THE DECODER.< [...]

Rating

Innovation

Pricing

Technology

Usability

We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

venturebeat

2025-11-17

Phi-4 proves that a 'data-first' SFT methodology is the new differentiator

AI engineers often chase performance by scaling up LLM parameters and data, but the trend toward smaller, more efficient, and better-focused models has accelerated. The Phi-4 fine-tuning methodology [...]

Match Score: 136.30

venturebeat

2025-11-10

Baseten takes on hyperscalers with new AI training platform that lets you own your model weights

Baseten, the AI infrastructure company recently valued at $2.15 billion, is making its most significant product pivot yet: a full-scale push into model training that could reshape how enterprises wean [...]

Match Score: 125.81

venturebeat

2025-11-14

Google’s new AI training method helps small models tackle complex reasoning

Researchers at Google Cloud and UCLA have proposed a new reinforcement learning framework that significantly improves the ability of language models to learn very challenging multi-step reasoning task [...]

Match Score: 109.48

Destination

2025-03-26

Alibaba's Qwen2.5-VL-32B matches larger models with just 32B parameters

Alibaba has added a multimodal visual language model to its Qwen2.5 series, marking another step in the Chinese tech company's effort to compete in the commercial AI space.<br /> The articl [...]

Match Score: 98.59

venturebeat

2025-11-12

Weibo's new open source AI model VibeThinker-1.5B outperforms DeepSeek-R1 on $7,800 post-training budget

Another day in late 2025, another impressive result from a Chinese company in open source artificial intelligence.Chinese social networking company Weibo's AI division recently released its open [...]

Match Score: 90.43

venturebeat

2025-11-26

Alibaba's AgentEvolver lifts model performance in tool use by ~30% using synthetic, auto-generated tasks

Researchers at Alibaba’s Tongyi Lab have developed a new framework for self-evolving agents that create their own training data by exploring their application environments. The framework, AgentEvolv [...]

Match Score: 75.73

venturebeat

2025-12-11

Nous Research just released Nomos 1, an open-source AI that ranks second on the notoriously brutal Putnam math exam

Nous Research, the San Francisco-based artificial intelligence startup, released on Tuesday an open-source mathematical reasoning system called Nomos 1 that achieved near-elite human performance on th [...]

Match Score: 65.59

venturebeat

2025-10-09

Nvidia researchers boost LLMs reasoning skills by getting them to 'think' during pre-training

Researchers at Nvidia have developed a new technique that flips the script on how large language models (LLMs) learn to reason. The method, called reinforcement learning pre-training (RLP), integrates [...]

Match Score: 64.94

venturebeat

2025-12-02

Arcee aims to reboot U.S. open source AI with new Trinity models released under Apache 2.0

For much of 2025, the frontier of open-weight language models has been defined not in Silicon Valley or New York City, but in Beijing and Hangzhou.Chinese research labs including Alibaba's Qwen, [...]

Match Score: 63.63