Researchers at MiroMind AI and several Chinese universities have released OpenMMReasoner, a new training framework that improves the capabilities of language models in multimodal reasoning.The framework uses a two-stage process. It first refines a base model with a curated dataset in a supervised fine-tuning (SFT) stage. Then, a reinforcement learning (RL) stage guides the model to reason more effectively in tasks that involve both text and visual data. Experiments show that models trained with OpenMMReasoner outperform other leading visual reasoning models, often while being trained on a smaller, higher-quality dataset. The framework and all its assets, including a trained 7B model, are fully open source, providing a reliable foundation for building applications that require traceability [...]
Microsoft on Tuesday released Phi-4-reasoning-vision-15B, a compact open-weight multimodal AI model that the company says matches or exceeds the performance of systems many times its size — while co [...]
AI engineers often chase performance by scaling up LLM parameters and data, but the trend toward smaller, more efficient, and better-focused models has accelerated. The Phi-4 fine-tuning methodology [...]
Chinese AI startup Zhipu AI aka Z.ai has released its GLM-4.6V series, a new generation of open-source vision-language models (VLMs) optimized for multimodal reasoning, frontend automation, and high-e [...]
AI models are only as good as the data they're trained on. That data generally needs to be labeled, curated and organized before models can learn from it in an effective way.One of the big missin [...]
Researchers at Google Cloud and UCLA have proposed a new reinforcement learning framework that significantly improves the ability of language models to learn very challenging multi-step reasoning task [...]
Researchers at Nvidia have developed a new technique that flips the script on how large language models (LLMs) learn to reason. The method, called reinforcement learning pre-training (RLP), integrates [...]
Deploying AI agents for repository-scale tasks like bug detection, patch verification, and code review requires overcoming significant technical hurdles. One major bottleneck: the need to set up dynam [...]
Baseten, the AI infrastructure company recently valued at $2.15 billion, is making its most significant product pivot yet: a full-scale push into model training that could reshape how enterprises wean [...]