Fields Medalist Timothy Gowers had ChatGPT 5.5 Pro tackle open problems in number theory. The model improved an exponential bound to a polynomial one in under an hour. An MIT researcher involved calls the key idea "completely original." Gowers' takeaway: the bar for mathematical contributions is now proving something LLMs can't.<br /> The article Fields Medalist says ChatGPT 5.5 Pro delivered "PhD-level" math research in under two hours with zero human help appeared first on The Decoder. [...]
Google on Monday unveiled the most significant upgrade to its autonomous research agent capabilities since the product's debut, launching two new agents — Deep Research and Deep Research Max †[...]
Artificial intelligence agents powered by the world's most advanced language models routinely fail to complete even straightforward professional tasks on their own, according to groundbreaking re [...]
One year after emerging from stealth, Strella has raised $14 million in Series A funding to expand its AI-powered customer research platform, the company announced Thursday. The round, led by Bessemer [...]
AI engineers often chase performance by scaling up LLM parameters and data, but the trend toward smaller, more efficient, and better-focused models has accelerated. The Phi-4 fine-tuning methodology [...]
Large language models (LLMs) have astounded the world with their capabilities, yet they remain plagued by unpredictability and hallucinations – confidently outputting incorrect information. In high- [...]
Nous Research, the San Francisco-based artificial intelligence startup, released on Tuesday an open-source mathematical reasoning system called Nomos 1 that achieved near-elite human performance on th [...]
OpenAI on Monday launched a set of interactive visual tools inside ChatGPT that let users manipulate mathematical and scientific formulas in real time — a genuinely impressive education feature that [...]
OpenAI launched Codex Security on March 6, entering the application security market that Anthropic had disrupted 14 days earlier with Claude Code Security. Both scanners use LLM reasoning instead of p [...]