2025-08-27
Most of the time, AI companies are locked in a race to the top, treating each other as rivals and competitors. Today, OpenAI and Anthropic revealed that they agreed to evaluate the alignment of each other's publicly available systems and shared the results of their analyses. The full reports get pretty technical, but are worth a read for anyone who's following the nuts and bolts of AI development. A broad summary showed some flaws with each company's offerings and revealed pointers for how to improve future safety tests.
Anthropic said it evaluated OpenAI models for "sycophancy, whistleblowing, self-preservation, and supporting human misuse, as well as capabili [...]
2025-10-15
Anthropic released Claude Haiku 4.5 on Wednesday, a smaller and significantly cheaper artificial intelligence model that matches the coding capabilities of systems that were considered cutting-edge ju [...]
2025-10-27
Anthropic is making its most aggressive push yet into the trillion-dollar financial services industry, unveiling a suite of tools that embed its Claude AI assistant directly into Microsoft Excel and c [...]
2025-10-16
Anthropic launched a new capability on Thursday that allows its Claude AI assistant to tap into specialized expertise on demand, marking the company's latest effort to make artificial intelligenc [...]
2025-05-14
OpenAI has launched a new web page called the safety evaluations hub to publicly share information such as the hallucination rates of its models. The hub will also highlight if a model [...]
2025-10-03
OpenAI will host more than 1,500 developers at its largest annual conference on Monday, as the company behind ChatGPT seeks to maintain its edge in an increasingly competitive artificial intelligence [...]
2025-10-29
When researchers at Anthropic injected the concept of "betrayal" into their Claude AI model's neural networks and asked if it noticed anything unusual, the system paused before respondi [...]
2025-11-13
Artificial intelligence agents powered by the world's most advanced language models routinely fail to complete even straightforward professional tasks on their own, according to groundbreaking re [...]
2025-02-10
Roblox, Discord, OpenAI and Google are launching a nonprofit organization called ROOST, or Robust Open Online Safety Tools, which hopes "to build scalable, interoperable safety infrastructure su [...]