2025-05-06
Mixture-of-Experts (MoE) models are revolutionizing the way we scale AI. By activating only a subset of a model’s components at any given time, MoEs offer a novel approach to managing the trade-off [...]
2025-10-02
IBM today announced the release of Granite 4.0, the newest generation of its homegrown family of open source large language models (LLMs) designed to balance high performance with lower memory and cost [...]
2025-06-15
Social media company Rednote has released its first open-source large language model. The Mixture-of-Experts (MoE) system, called dots.llm1, is designed to match the performance of competing models at [...]
2025-03-11
Breaking: A Big Tech company is ramping up its AI development. (Whaaat??) In this case, the protagonist of this now-familiar tale is Meta, which Reuters reports is testing its first in-house chip for [...]
2025-04-24
As Artificial Intelligence (AI) technology advances, the need for efficient and scalable inference solutions has grown rapidly. Soon, AI inference is expected to become more important than training as [...]
2025-04-10
In a major leap for edge AI processing, NTT Corporation has announced a groundbreaking AI inference chip that can process real-time 4K video at 30 frames per second—using less than 20 watts of power [...]
2025-05-16
Agentic AI has the potential to reshape several industries by enabling autonomous decision-making, real-time adaptability, and proactive problem-solving. As businesses strive to enhance operational ef [...]
2025-10-08
The latest addition to the small model wave for enterprises comes from AI21 Labs, which is betting that bringing models to devices will free up traffic in data centers. AI21’s Jamba Reasoning 3B, a [...]
2025-10-03
IBM has released the fourth generation of its Granite language models. Granite 4.0 uses a hybrid Mamba/Transformer architecture aimed at lowering memory requirements during inference without cutting p [...]