SWE-Bench - Latest News & Analysis

Claude Opus 4 Shatters SWE-Bench Record with 72% Score

Anthropic's latest model autonomously fixes real GitHub issues better than any previous AI model. Developers report that it can now handle multi-file refactors that used to take hours.

news The Pulse Gazette Feb 8, 2026