2025-10-08
The latest addition to the small model wave for enterprises comes from AI21 Labs, which is betting that bringing models to devices will free up traffic in data centers.
AI21’s Jamba Reasoning 3B is a “tiny” open-source model that can run extended reasoning, generate code and produce responses grounded in fact. It handles a context window of more than 250,000 tokens and can run inference on edge devices.
The company said Jamba Reasoning 3B works on devices such as laptops and mobile phones (see the sketch after this entry).
Ori Goshen, co-CEO of AI21, told VentureBeat that the company sees more enterprise use cases for small models, mainly because moving most inference onto devices frees up data center capacity.
“What we're seeing right now in the industry [...]
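To make the on-device claim concrete, here is a minimal sketch of running a small open-weights model locally with Hugging Face transformers. It is an illustration under stated assumptions, not AI21's code: the checkpoint id below is a guess based on the company's naming and should be verified on the Hugging Face Hub.

```python
# Minimal sketch: local inference with a small open-weights model via
# Hugging Face transformers. MODEL_ID is an assumption based on AI21's
# naming conventions; verify the real checkpoint name on the Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "ai21labs/AI21-Jamba-Reasoning-3B"  # assumed id, not confirmed here

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    device_map="auto",   # GPU if present, otherwise CPU on a laptop
    torch_dtype="auto",  # load in the checkpoint's native precision
)

prompt = "Explain why on-device inference reduces data center load."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

At 3B parameters the weights fit comfortably in laptop RAM, which is what makes the on-device pitch plausible in the first place.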
2025-10-02
IBM today announced the release of Granite 4.0, the newest generation of its homegrown family of open source large language models (LLMs) designed to balance high performance with lower memory and cost [...]
2025-10-16
A new framework from Stanford University and SambaNova addresses a critical challenge in building robust AI agents: context engineering. Called Agentic Context Engineering (ACE), the framework automat [...]
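The excerpt cuts off, but the core loop ACE describes (a generator attempts the task, a reflector distills lessons from the attempt, and a curator merges them into an evolving context "playbook" as incremental updates) can be sketched schematically. All helper names below are hypothetical, and llm() is a stand-in for any chat-completion call; this mirrors the roles described, not the authors' implementation.

```python
# Schematic ACE-style loop: generate an attempt, reflect to extract lessons,
# curate them into an evolving context "playbook" as small delta updates
# rather than rewriting the whole prompt. Helper names are hypothetical.
from dataclasses import dataclass, field

def llm(prompt: str) -> str:
    """Stand-in for any chat-completion call (canned reply keeps this runnable)."""
    return "- Check tool output for errors before acting on it"

@dataclass
class Playbook:
    bullets: list[str] = field(default_factory=list)

    def render(self) -> str:
        return "\n".join(f"- {b}" for b in self.bullets)

def generate(task: str, playbook: Playbook) -> str:
    """Generator: attempt the task with the current playbook as context."""
    return llm(f"Context:\n{playbook.render()}\n\nTask: {task}")

def reflect(task: str, attempt: str) -> list[str]:
    """Reflector: distill reusable lessons from the attempt."""
    raw = llm(f"Task: {task}\nAttempt: {attempt}\nList reusable lessons:")
    return [line.lstrip("- ").strip() for line in raw.splitlines() if line.strip()]

def curate(playbook: Playbook, lessons: list[str]) -> Playbook:
    """Curator: merge lessons as incremental deltas, skipping duplicates."""
    for lesson in lessons:
        if lesson not in playbook.bullets:
            playbook.bullets.append(lesson)
    return playbook

def ace_step(task: str, playbook: Playbook) -> Playbook:
    attempt = generate(task, playbook)
    return curate(playbook, reflect(task, attempt))

pb = ace_step("Book the cheapest refundable flight", Playbook())
print(pb.render())
```

The key design choice is that curation appends deltas instead of regenerating the whole context, which avoids losing accumulated knowledge when the prompt is rewritten.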
2025-10-27
Watch out, DeepSeek and Qwen! There's a new king of open source large language models (LLMs), especially when it comes to something enterprises are increasingly valuing: agentic tool use — that [...]
2025-10-08
The trend of AI researchers developing new, small open source generative models that outperform far larger, proprietary peers continued this week with yet another staggering advancement. Alexia Jolicoe [...]
2025-05-27
Microsoft's recent release of Phi-4-reasoning challenges a key assumption in building artificial intelligence systems capable of reasoning. Since the introduction of chain-of-thought reasoning in [...]
2025-10-20
Researchers at Mila have proposed a new technique that makes large language models (LLMs) vastly more efficient when performing complex reasoning. Called Markovian Thinking, the approach allows LLMs t [...]
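Based on the description, the mechanism can be illustrated schematically: rather than letting the chain of thought grow the context without bound, the model reasons in fixed-size chunks and carries only a compact state across chunk boundaries, so per-step compute stays bounded instead of growing with the full thought length. The sketch below is a reconstruction with hypothetical helpers and assumed constants, not Mila's implementation.

```python
# Sketch of Markovian-style chunked reasoning: reason in fixed-size chunks,
# carrying only a short state across chunk boundaries so the context (and
# per-step compute) stays bounded. Constants and helpers are assumptions.

CHUNK_TOKENS = 1024  # fixed reasoning window per chunk (assumed value)
CARRY_CHARS = 512    # roughly 128 tokens of carried-over state (assumed)

def llm_generate(prompt: str, max_tokens: int) -> str:
    """Stand-in for a bounded generation call to any LLM."""
    return "...partial reasoning..."  # stub so the sketch runs

def markovian_think(question: str, max_chunks: int = 8) -> str:
    carry = ""  # the Markovian state: all the model sees of earlier chunks
    for _ in range(max_chunks):
        prompt = f"{question}\n\nState so far:\n{carry}\n\nContinue reasoning:"
        chunk = llm_generate(prompt, max_tokens=CHUNK_TOKENS)
        if "FINAL ANSWER" in chunk:  # the model signals completion in-band
            return chunk
        carry = chunk[-CARRY_CHARS:]  # keep only the tail as next chunk's state
    return carry

print(markovian_think("What is 17 * 23?"))
```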
2025-10-09
Researchers at Nvidia have developed a new technique that flips the script on how large language models (LLMs) learn to reason. The method, called reinforcement learning pre-training (RLP), integrates [...]
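The excerpt is truncated, but the idea of rewarding reasoning during pretraining can be sketched as an information-gain signal: a sampled "thought" earns positive reward when it raises the model's log-likelihood of the actual next token relative to predicting without it. The code below is a schematic reconstruction under that assumption (it presumes Hugging Face-style causal LMs whose forward pass exposes .logits), not Nvidia's published implementation.

```python
# Schematic information-gain reward for an RLP-style setup: a sampled
# "thought" is rewarded by how much it improves the log-likelihood of the
# true next token versus predicting without it. Assumes Hugging Face-style
# causal LMs whose forward pass returns an object with .logits.
import torch
import torch.nn.functional as F

def next_token_logprob(model, context_ids: torch.Tensor, next_id: int) -> torch.Tensor:
    """Log-probability the model assigns to next_id after context_ids."""
    logits = model(context_ids.unsqueeze(0)).logits[0, -1]
    return F.log_softmax(logits, dim=-1)[next_id]

def thought_reward(model, baseline, context_ids, thought_ids, next_id):
    """Reward = logp(next | context + thought) - logp(next | context)."""
    with_thought = torch.cat([context_ids, thought_ids])
    lp_with = next_token_logprob(model, with_thought, next_id)
    lp_without = next_token_logprob(baseline, context_ids, next_id)
    return lp_with - lp_without  # positive iff the thought actually helped
```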