Researchers from Stanford, Nvidia, and Together AI have developed a new technique that can discover new solutions to very complex problems. For example, they managed to optimize a critical GPU kernel to run 2x faster than the previous state-of-the-art written by human experts.Their technique, called “Test-Time Training to Discover” (TTT-Discover), challenges the current paradigm of letting models “think longer” for reasoning problems. TTT-Discover allows the model to continue training during the inference process and update its weights for the problem at hand.The limits of 'frozen' reasoningCurrent enterprise AI strategies often rely on "frozen" models. Whether you use a closed or open reasoning model, the model's parameters are static. When you prompt thes [...]
A new study from researchers at Stanford University and Nvidia proposes a way for AI models to keep learning after deployment — without increasing inference costs. For enterprise agents that have to [...]
Baseten, the AI infrastructure company recently valued at $2.15 billion, is making its most significant product pivot yet: a full-scale push into model training that could reshape how enterprises wean [...]
Every GPU cluster has dead time. Training jobs finish, workloads shift and hardware sits dark while power and cooling costs keep running. For neocloud operators, those empty cycles are lost margin.The [...]
When the transformer architecture was introduced in 2017 in the now seminal Google paper "Attention Is All You Need," it became an instant cornerstone of modern artificial intelligence. Ever [...]
Enterprises expanding AI deployments are hitting an invisible performance wall. The culprit? Static speculators that can't keep up with shifting workloads.Speculators are smaller AI models that w [...]
The standard guidelines for building large language models (LLMs) optimize only for training costs and ignore inference costs. This poses a challenge for real-world applications that use inference-tim [...]
Lowering the cost of inference is typically a combination of hardware and software. A new analysis released Thursday by Nvidia details how four leading inference providers are reporting 4x to 10x redu [...]
Nvidia’s $20 billion strategic licensing deal with Groq represents one of the first clear moves in a four-front fight over the future AI stack. 2026 is when that fight becomes obvious to enterprise [...]