Lowering the cost of inference is typically a combination of hardware and software. A new analysis released Thursday by Nvidia details how four leading inference providers are reporting 4x to 10x reductions in cost per token. The dramatic cost reductions were achieved using Nvidia's Blackwell platform with open-source models. Production deployment data from Baseten, DeepInfra, Fireworks AI and Together AI shows significant cost improvements across healthcare, gaming, agentic chat, and customer service as enterprises scale AI from pilot projects to millions of users. The 4x to 10x cost reductions reported by inference providers required combining Blackwell hardware with two other elements: optimized software stacks and switching from proprietary to open-source models that now match front [...]
The big news this week from Nvidia, splashed in headlines across all forms of media, was the company's announcement about its Vera Rubin GPU. This week, Nvidia CEO Jensen Huang used his CES keynot [...]
Nvidia on Monday took the wraps off Vera Rubin, a sweeping new computing platform built from seven chips now in full production — and backed by an extraordinary lineup of customers that includes Ant [...]
Jensen Huang walked onto the GTC stage Monday wearing his trademark leather jacket and carrying, as it turned out, the blueprints for a new kind of monopoly. The Nvidia CEO unveiled the Agent Toolkit, [...]
Nvidia on Monday unveiled a deskside supercomputer powerful enough to run AI models with up to one trillion parameters — roughly the scale of GPT-4 — without touching the cloud. The machine, calle [...]
Baseten, the AI infrastructure company recently valued at $2.15 billion, is making its most significant product pivot yet: a full-scale push into model training that could reshape how enterprises wean [...]
Presented by Microsoft and NVIDIA. As the world's leading platform providers and champions for advancing AI globally, NVIDIA and Microsoft continue to deliver unequaled value for organizations investi [...]
Nvidia’s $20 billion strategic licensing deal with Groq represents one of the first clear moves in a four-front fight over the future AI stack. 2026 is when that fight becomes obvious to enterprise [...]
Enterprises expanding AI deployments are hitting an invisible performance wall. The culprit? Static speculators that can't keep up with shifting workloads. Speculators are smaller AI models that w [...]
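The speculator idea above, a small draft model proposing several tokens that the large target model then verifies in one pass, can be sketched with toy stand-ins. The deterministic `target_token` rule, the 70% draft accuracy, and the token vocabulary here are illustrative assumptions, not any provider's actual models:

```python
import random

def target_token(context):
    # Toy stand-in for the large target model: a deterministic next-token rule.
    return (sum(context) + len(context)) % 5

def draft_next(context):
    # Toy stand-in for the small speculator: usually agrees with the target,
    # but guesses wrong some of the time (assumed 70% accuracy).
    t = target_token(context)
    return t if random.random() < 0.7 else (t + 1) % 5

def speculative_decode(prompt, k=4, max_len=12):
    """Draft k tokens cheaply, then let the target verify them as a batch,
    keeping the longest matching prefix plus one corrected token."""
    out = list(prompt)
    while len(out) < max_len:
        # 1) The speculator drafts k tokens ahead without consulting the target.
        drafts, ctx = [], list(out)
        for _ in range(k):
            d = draft_next(ctx)
            drafts.append(d)
            ctx.append(d)
        # 2) The target checks the drafts in order; at the first mismatch it
        #    substitutes its own token and the remaining drafts are discarded.
        ctx = list(out)
        for d in drafts:
            t = target_token(ctx)
            if d == t:
                ctx.append(d)
            else:
                ctx.append(t)
                break
        out = ctx
    return out[:max_len]

print(speculative_decode([1, 2, 3]))
```

When the speculator tracks the target well, most drafted tokens are accepted and the expensive model runs far fewer sequential steps; a stale speculator (the "static" problem the article describes) gets rejected often and the speedup evaporates.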