venturebeat

2025-10-09

Nvidia researchers boost LLMs reasoning skills by getting them to 'think' during pre-training

Researchers at Nvidia have developed a new technique that flips the script on how large language models (LLMs) learn to reason.

The method, called reinforcement learning pre-training (RLP), integrates RL into the initial training phase rather than saving it for the end.

This approach encourages the model to “think for itself before predicting what comes next, thus teaching an independent thinking behavior earlier in the pretraining,” the researchers state in their paper.

By learning to reason on plain text without needing external verifiers, models trained with RLP show significant improvements in learning complex reasoning tasks downstream, hinting at a future of more capable and adaptable AI fo [...]

Rating

Innovation

Pricing

Technology

Usability

We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

Destination

2025-07-26

Surfshark VPN review: A fast VPN for casual users

Surfshark is one of the youngest major VPNs, but it's grown rapidly over the last seven years. Since 2018, it's expanded its network to 100 countries, added a suite of apps to its Surfshark [...]

Match Score: 196.69

Destination

2025-06-27

NordVPN Review 2025: Innovative features, a few missteps

When we say that NordVPN is a good VPN that's not quite great, it's important to put that in perspective. Building a good VPN is hard, as evidenced by all the shovelware VPNs flooding the ma [...]

Match Score: 187.92

Destination

2025-05-30

ExpressVPN review 2025: Fast speeds and a low learning curve

ExpressVPN is good at its job. It's easy to be skeptical of any service with a knack for self-promotion, but don't let ExpressVPN's hype distract you from the fact that it keeps its fro [...]

Match Score: 187.18

Destination

2025-06-12

Proton VPN review 2025: A nonprofit service with premium performance

Proton VPN stands out for two main reasons: it's one of the only virtual private networks (VPNs) to include a free plan with no data limits, and it's one of the few services majority-owned b [...]

Match Score: 172.89

venturebeat

2025-11-17

Phi-4 proves that a 'data-first' SFT methodology is the new differentiator

AI engineers often chase performance by scaling up LLM parameters and data, but the trend toward smaller, more efficient, and better-focused models has accelerated. The Phi-4 fine-tuning methodology [...]

Match Score: 164.85

venturebeat

2025-11-14

Google’s new AI training method helps small models tackle complex reasoning

Researchers at Google Cloud and UCLA have proposed a new reinforcement learning framework that significantly improves the ability of language models to learn very challenging multi-step reasoning task [...]

Match Score: 145.32

venturebeat

2025-11-10

Baseten takes on hyperscalers with new AI training platform that lets you own your model weights

Baseten, the AI infrastructure company recently valued at $2.15 billion, is making its most significant product pivot yet: a full-scale push into model training that could reshape how enterprises wean [...]

Match Score: 143.47

venturebeat

2025-10-30

Meta researchers open the LLM black box to repair flawed AI reasoning

Researchers at Meta FAIR and the University of Edinburgh have developed a new technique that can predict the correctness of a large language model's (LLM) reasoning and even intervene to fix its [...]

Match Score: 140.46

venturebeat

2025-11-01

Large reasoning models almost certainly can think

Recently, there has been a lot of hullabaloo about the idea that large reasoning models (LRM) are unable to think. This is mostly due to a research article published by Apple, "The Illusion of Th [...]

Match Score: 137.17