Benchmarks - Latest News & Analysis

In-depth coverage, analysis, and updates on Benchmarks in AI and tech. 6 articles on AI Pulse.

GPT-5 Beats Human Experts on Every Major Benchmark

The new model scores higher than PhD-level humans on medical, legal, and scientific reasoning tests. Sam Altman warns the next version will be 'qualitatively different.'

ARC-AGI-2 Test: Why GPT-5 Failed Human-Level AI

GPT-5 Pro scores 18.3% on the new benchmark. On the previous version of the test? 70.2%. Francois Chollet's test exposes what AI still can't do, and it's not what you'd expect.