ARC-AGI v2

multimodal

About

ARC-AGI v2 is an updated version of the ARC-AGI benchmark, featuring newly curated tasks designed for finer-grained assessment of abstract reasoning and problem solving. It provides a wider scoring range, uses tasks that are less susceptible to brute-force solutions, and emphasizes deeper, human-like reasoning. ARC-AGI v2 targets higher levels of fluid intelligence, with task difficulty empirically calibrated against human performance.

Evaluation Stats

Total Models: 4

Organizations: 4

Verified Results: 0

Self-Reported: 1

Benchmark Details

Max Score: 1

Language

Performance Overview

Score distribution and top performers

Score Distribution

4 models

Top Score

15.9%

Average Score

9.0%

High Performers (80%+): 0

Top Organizations

#1	xAI	1 model	15.9%
#2	Anthropic	1 model	8.6%
#3	OpenAI	1 model	6.5%
#4	Google	1 model	4.9%

Leaderboard

4 models ranked by performance on ARC-AGI v2

Rank	Model	Organization	Date	License	Score
#01	Grok-4	xAI	Jul 9, 2025	Proprietary	15.9%
#02	Claude Opus 4	Anthropic	May 22, 2025	Proprietary	8.6%
#03	o3	OpenAI	Apr 16, 2025	Proprietary	6.5%
#04	Gemini 2.5 Pro	Google	May 20, 2025	Proprietary	4.9%
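
The summary figures in the Performance Overview (Top Score, Average Score, High Performers) can be recomputed from the leaderboard rows. The following is a minimal sketch, not the site's actual method: the `LEADERBOARD` list, the `summarize` helper, and the dictionary keys are illustrative names, and the scores are transcribed by hand from the rows above.

```python
from statistics import mean

# Scores in percent, transcribed from the leaderboard rows (assumed accurate).
LEADERBOARD = [
    ("Grok-4", "xAI", 15.9),
    ("Claude Opus 4", "Anthropic", 8.6),
    ("o3", "OpenAI", 6.5),
    ("Gemini 2.5 Pro", "Google", 4.9),
]

# Threshold taken from the "High Performers (80%+)" label, in percent.
HIGH_PERFORMER_THRESHOLD = 80.0


def summarize(rows):
    """Compute summary figures from (model, organization, score) rows."""
    scores = [score for _, _, score in rows]
    return {
        "total_models": len(rows),
        "organizations": len({org for _, org, _ in rows}),
        "top_score": max(scores),
        "average_score": mean(scores),
        "high_performers": sum(s >= HIGH_PERFORMER_THRESHOLD for s in scores),
    }


summary = summarize(LEADERBOARD)
print(f"Top score: {summary['top_score']:.1f}%")
print(f"Average score: {summary['average_score']:.1f}%")
print(f"High performers: {summary['high_performers']}")
```

The unrounded mean is 8.975%, which matches the 9.0% shown above once rounded to one decimal place.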
