Destination

2025-10-07

TOUCAN is the largest open training dataset for AI agents


A research team from MIT, IBM, and the University of Washington has released TOUCAN, the largest open dataset to date for training AI agents. The dataset contains 1.5 million real tool interactions, aiming to help open models handle external tools more effectively.


The article TOUCAN is the largest open training dataset for AI agents appeared first on THE DECODER.

[...]

Rating

Innovation

Pricing

Technology

Usability

We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

Destination

2025-04-17

Wikipedia offers AI developers a training dataset to maybe get scraper bots off its back

Wikipedia has been struggling with the impact that AI crawlers — bots that are scraping text and multimedia from the encyclopedia to train generative artificial intelligence models — have been hav [...]

Match Score: 105.97

venturebeat

2025-10-02

New AI training method creates powerful software agents with just 78 examples

A new study by Shanghai Jiao Tong University and SII Generative AI Research Lab (GAIR) shows that training large language models (LLMs) for complex, autonomous tasks does not require massive datasets. [...]

Match Score: 89.15

Destination

2025-06-06

Researchers build massive AI training dataset using only openly licensed sources

The Common Pile is the first large-scale text dataset built entirely from openly licensed sources, offering an alternative to web data restricted by copyright.<br /> The article Researchers buil [...]

Match Score: 59.03

Destination

2025-07-10

How exactly did Grok go full 'MechaHitler?'

Earlier this week, Grok, X's built-in chatbot, took a hard turn toward antisemitism following a recent update. Amid unprompted, hateful rhetoric against Jews, it even began referring to itself as [...]

Match Score: 57.47

venturebeat

2025-10-01

Thinking Machines' first official product is here: meet Tinker, an API for distributed LLM fine-tuning

Thinking Machines, the AI startup founded earlier this year by former OpenAI CTO Mira Murati, has launched its first product: Tinker, a Python-based API designed to make large language model (LLM) fin [...]

Match Score: 55.62

Destination

2025-08-05

OpenAI's first new open-weight LLMs in six years are here

For the first time since GPT-2 in 2019, OpenAI is releasing new open-weight large language models. It's a major milestone for a company that has increasingly been accused of forgoing its original [...]

Match Score: 51.32

engadget

2025-10-01

Peloton updates its Bike, Tread and Row machines with form-checking cameras, rotating screens and lots of AI

It’s been a rough time for Peloton. Last year was marred by deep staff cuts, a change of CEO and a reckoning of where the home fitness company belonged, post-Pandemic boom. The answer is, unfortunat [...]

Match Score: 48.32

venturebeat

2025-10-02

'Western Qwen': IBM wows with Granite 4 LLM launch and hybrid Mamba/Transformer architecture

IBM today announced the release of Granite 4.0, the newest generation of its homemade family of open source large language models (LLMs) designed to balance high performance with lower memory and cost [...]

Match Score: 44.60

venturebeat

2025-10-07

IBM claims 45% productivity gains with Project Bob, its multi-model IDE that orchestrates LLMs with full repository context

For many enterprises, there continue to be barriers to fully adopting and benefiting from agentic AI.IBM is betting the blocker isn't building AI agents but governing them in production.At its Te [...]

Match Score: 44.33