2025-10-07

A research team from MIT, IBM, and the University of Washington has released TOUCAN, the largest open dataset to date for training AI agents. The dataset contains 1.5 million real tool interactions, aiming to help open models handle external tools more effectively.
The article TOUCAN is the largest open training dataset for AI agents appeared first on THE DECODER.
[...]2025-11-10
Baseten, the AI infrastructure company recently valued at $2.15 billion, is making its most significant product pivot yet: a full-scale push into model training that could reshape how enterprises wean [...]
2025-11-17
AI engineers often chase performance by scaling up LLM parameters and data, but the trend toward smaller, more efficient, and better-focused models has accelerated. The Phi-4 fine-tuning methodology [...]
2025-10-17
AI models are only as good as the data they're trained on. That data generally needs to be labeled, curated and organized before models can learn from it in an effective way.One of the big missin [...]
2025-11-19
Researchers at Meta, the University of Chicago, and UC Berkeley have developed a new framework that addresses the high costs, infrastructure complexity, and unreliable feedback associated with using r [...]
2025-12-02
For much of 2025, the frontier of open-weight language models has been defined not in Silicon Valley or New York City, but in Beijing and Hangzhou.Chinese research labs including Alibaba's Qwen, [...]
2025-11-13
Artificial intelligence agents powered by the world's most advanced language models routinely fail to complete even straightforward professional tasks on their own, according to groundbreaking re [...]
2025-12-02
Researchers at MiroMind AI and several Chinese universities have released OpenMMReasoner, a new training framework that improves the capabilities of language models in multimodal reasoning.The framewo [...]
2025-11-10
Meta has just released a new multilingual automatic speech recognition (ASR) system supporting 1,600+ languages — dwarfing OpenAI’s open source Whisper model, which supports just 99. Is architectu [...]
2025-04-17
Wikipedia has been struggling with the impact that AI crawlers — bots that are scraping text and multimedia from the encyclopedia to train generative artificial intelligence models — have been hav [...]