Destination

2025-04-20

OpenAI's o3 achieves near-perfect performance on long context benchmark

One of the most compelling results in recent o3 benchmarks comes from its performance on long-context tasks.


The article OpenAI's o3 achieves near-perfect performance on long context benchmark appeared first on THE DECODER.

[...]

We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

venturebeat

2025-10-16

ACE prevents context collapse with ‘evolving playbooks’ for self-improving AI agents

A new framework from Stanford University and SambaNova addresses a critical challenge in building robust AI agents: context engineering. Called Agentic Context Engineering (ACE), the framework automat [...]

Match Score: 144.97

venturebeat

2025-10-27

MiniMax-M2 is the new king of open source LLMs (especially for agentic tool calling)

Watch out, DeepSeek and Qwen! There's a new king of open source large language models (LLMs), especially when it comes to something enterprises are increasingly valuing: agentic tool use — that [...]

Match Score: 102.12

venturebeat

2025-10-21

DeepSeek drops open-source model that compresses text 10x through images, defying conventions

DeepSeek, the Chinese artificial intelligence research company that has repeatedly challenged assumptions about AI development costs, has released a new model that fundamentally reimagines how large l [...]

Match Score: 83.14

venturebeat

2025-10-29

The missing data link in enterprise AI: Why agents need streaming context, not just better prompts

Enterprise AI agents today face a fundamental timing problem: They can't easily act on critical business events because they aren't always aware of them in real-time. The challenge is infrast [...]

Match Score: 71.52

venturebeat

2025-10-02

'Western Qwen': IBM wows with Granite 4 LLM launch and hybrid Mamba/Transformer architecture

IBM today announced the release of Granite 4.0, the newest generation of its homemade family of open source large language models (LLMs) designed to balance high performance with lower memory and cost [...]

Match Score: 63.71

venturebeat

2025-10-03

OpenAI's DevDay 2025 preview: Will Sam Altman launch the ChatGPT browser?

OpenAI will host more than 1,500 developers at its largest annual conference on Monday, as the company behind ChatGPT seeks to maintain its edge in an increasingly competitive artificial intelligence [...]

Match Score: 61.52

venturebeat

2025-10-09

The most important OpenAI announcement you probably missed at DevDay 2025

OpenAI’s annual developer conference on Monday was a spectacle of ambitious AI product launches, from an app store for ChatGPT to a stunning video-generation API that brought creative concepts to li [...]

Match Score: 58.56

venturebeat

2025-10-29

Agentic AI is all about the context — engineering, that is

Presented by Elastic. As organizations scramble to enact agentic AI solutions, accessing proprietary data from all the nooks and crannies will be key. By now, most organizations have heard of agentic AI, [...]

Match Score: 57.22

Destination

2025-11-02

Pangram achieves near-perfect results in AI text detection tests, study reveals

A new study from the University of Chicago finds major differences among commercial AI text detectors. While one tool performs nearly flawlessly, others fall short in key areas. The articl [...]

Match Score: 53.24