OpenAI releases Evals API for systematic prompt testing

OpenAI has introduced an Evals API that enables programmatic test creation and automation. The article "OpenAI releases Evals API for systematic prompt testing" appeared first on THE DECODER. [...]

We have found tools similar to what you are looking for. Check out our suggestions below.

venturebeat
Three AI coding agents leaked secrets through a single prompt injection. One vendor's system card predicted it

A security researcher, working with colleagues at Johns Hopkins University, opened a GitHub pull request, typed a malicious instruction into the PR title, and watched Anthropic’s Claude Code Securit [...]

Match Score: 94.87

venturebeat
Grok 4.1 Fast's compelling dev access and Agent Tools API overshadowed by Musk glazing

Elon Musk's frontier generative AI startup xAI formally opened developer access to its Grok 4.1 Fast models last night and introduced a new Agent Tools API—but the technical milestones were imm [...]

Match Score: 89.77

venturebeat
When Claude changed, everything changed: Managing AI blast radius in production

Our system did one thing, and it did it well: It turned natural-language questions into API calls. The users were analysts, account managers, and operations leads. They knew what data they needed, but [...]

Match Score: 89.36

venturebeat
Microsoft and OpenAI gut their exclusive deal, freeing OpenAI to sell on AWS and Google Cloud

Microsoft and OpenAI on Monday announced a sweeping overhaul of the partnership that has defined the commercial AI era, dismantling key pillars of exclusivity and revenue-sharing that bound the two co [...]

Match Score: 87.65

venturebeat
OpenAI deploys Cerebras chips for 15x faster code generation in first major move beyond Nvidia

OpenAI on Thursday launched GPT-5.3-Codex-Spark, a stripped-down coding model engineered for near-instantaneous response times, marking the company's first significant inference partnership outsi [...]

Match Score: 77.63

venturebeat
OpenAI launches GPT-5.4 with native computer use mode, financial plugins for Microsoft Excel, Google Sheets

The AI updates aren't slowing down. Literally two days after OpenAI launched a new underlying AI model for ChatGPT called GPT-5.3 Instant, the company has unveiled another, even more massive upgr [...]

Match Score: 75.01

venturebeat
Why Google's new Interactions API is such a big deal for AI developers

For the last two years, the fundamental unit of generative AI development has been the "completion." You send a text prompt to a model, it sends text back, and the transaction ends. If you w [...]

Match Score: 71.95

venturebeat
OpenAI admits prompt injection is here to stay as enterprises lag on defenses

It's refreshing when a leading AI company states the obvious. In a detailed post on hardening ChatGPT Atlas against prompt injection, OpenAI acknowledged what security practitioners have known fo [...]

Match Score: 63.18

venturebeat
OpenAI's AI data agent, built by two engineers, now serves 4,000 employees — and the company says anyone can replicate it

When an OpenAI finance analyst needed to compare revenue across geographies and customer cohorts last year, it took hours of work — hunting through 70,000 datasets, writing SQL queries, verifying ta [...]

Match Score: 62.44