GDPVal

Category: Agents

About

GDPVal evaluates AI models on well-specified professional tasks across finance, healthcare, government, and other GDP-contributing sectors, measuring readiness for real-world occupational work.

Evaluation Stats

Total Models: 11

Organizations: 4

Verified Results: 0

Self-Reported: 0

Benchmark Details

Max Score: 100

Performance Overview

Score distribution and top performers

Score Distribution (11 models)

Top Score: 49.7%

Average Score: 33.3%

High Performers (80%+): 0 (the top score is 49.7%)

Top Organizations

Rank	Organization	Models	Average Score
#1	Anthropic	3	43.9%
#2	Google DeepMind	2	31.8%
#3	OpenAI	5	30.1%
#4	xAI	1	21.1%
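The per-organization figures above are consistent with a plain mean of each organization's model scores, although the page does not state how they are computed. The sketch below, under that assumption, recomputes them from the scores listed on this page. Attributing GPT-4o (9.9, from the Additional Metrics table) to OpenAI is also an assumption, made because the page counts five OpenAI models. The names `SCORES`, `org_averages`, and `overall_average` are illustrative only.

```python
from collections import defaultdict

# (model, organization, score) copied from this page's tables.
# Assumption: GPT-4o belongs to OpenAI (the page counts 5 OpenAI models).
SCORES = [
    ("GPT-5.2", "OpenAI", 49.7),
    ("Claude Opus 4.5", "Anthropic", 45.5),
    ("Claude Opus 4.1", "Anthropic", 43.6),
    ("Claude Sonnet 4.5", "Anthropic", 42.5),
    ("Gemini 3 Pro", "Google DeepMind", 40.3),
    ("GPT-5", "OpenAI", 34.8),
    ("o3", "OpenAI", 30.8),
    ("o4 mini", "OpenAI", 25.3),
    ("Gemini 2.5 Pro", "Google DeepMind", 23.3),
    ("Grok 4", "xAI", 21.1),
    ("GPT-4o", "OpenAI", 9.9),
]


def org_averages(rows):
    """Return the mean score per organization, rounded to one decimal."""
    by_org = defaultdict(list)
    for _model, org, score in rows:
        by_org[org].append(score)
    return {org: round(sum(v) / len(v), 1) for org, v in by_org.items()}


def overall_average(rows):
    """Return the mean score across all models, rounded to one decimal."""
    return round(sum(score for _m, _o, score in rows) / len(rows), 1)


if __name__ == "__main__":
    for org, avg in sorted(org_averages(SCORES).items(), key=lambda kv: -kv[1]):
        print(f"{org}: {avg}%")
    print(f"Overall: {overall_average(SCORES)}%")
```

Under these assumptions the recomputed means match the page's figures (43.9%, 31.8%, 30.1%, 21.1%, and an overall average of 33.3%).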

Leaderboard

11 models ranked by performance on GDPVal

Rank	Model	Organization	Release Date	License	Score
#01	GPT-5.2	OpenAI	Dec 11, 2025	Proprietary	49.7%
#02	Claude Opus 4.5	Anthropic	Nov 1, 2025	Proprietary	45.5%
#03	Claude Opus 4.1	Anthropic	Aug 5, 2025	Proprietary	43.6%
#04	Claude Sonnet 4.5	Anthropic	Sep 29, 2025	Proprietary	42.5%
#05	Gemini 3 Pro	Google DeepMind	Nov 18, 2025	Proprietary	40.3%
#06	GPT-5	OpenAI	Aug 7, 2025	Proprietary	34.8%
#07	o3	OpenAI	Apr 16, 2025	Proprietary	30.8%
#08	o4 mini	OpenAI	Apr 16, 2025	Proprietary	25.3%
#09	Gemini 2.5 Pro	Google DeepMind	May 20, 2025	Proprietary	23.3%
#10	Grok 4	xAI	Jul 10, 2025	Proprietary	21.1%

Showing 1 to 10 of 11 models. The eleventh model, GPT-4o (9.9%), appears only in the Additional Metrics table below.

Additional Metrics

Extended metrics for top models on GDPVal

Model	Score	Human Equiv. Rate
GPT-5.2	49.7	70.9%
Claude Opus 4.5	45.5	59.6%
Claude Opus 4.1	43.6	47.6%
Claude Sonnet 4.5	42.5	50.3%
Gemini 3 Pro	40.3	53.5%
GPT-5	34.8	38.0%
o3	30.8	34.1%
o4 mini	25.3	27.8%
Gemini 2.5 Pro	23.3	25.5%
Grok 4	21.1	24.3%
GPT-4o	9.9	12.3%
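The two columns in the table above do not order the models identically. The page does not define how Human Equiv. Rate is computed, so the sketch below makes no assumption about its meaning; it only compares the rank each model gets under each column. The names `ROWS`, `rank_by`, and `rank_changes` are illustrative.

```python
# (model, score, human_equiv_rate) copied from the table above.
ROWS = [
    ("GPT-5.2", 49.7, 70.9),
    ("Claude Opus 4.5", 45.5, 59.6),
    ("Claude Opus 4.1", 43.6, 47.6),
    ("Claude Sonnet 4.5", 42.5, 50.3),
    ("Gemini 3 Pro", 40.3, 53.5),
    ("GPT-5", 34.8, 38.0),
    ("o3", 30.8, 34.1),
    ("o4 mini", 25.3, 27.8),
    ("Gemini 2.5 Pro", 23.3, 25.5),
    ("Grok 4", 21.1, 24.3),
    ("GPT-4o", 9.9, 12.3),
]


def rank_by(rows, column):
    """Map each model to its 1-based rank, sorting descending by a column index."""
    ordered = sorted(rows, key=lambda row: row[column], reverse=True)
    return {row[0]: rank for rank, row in enumerate(ordered, start=1)}


def rank_changes(rows):
    """Return {model: (rank_by_score, rank_by_rate)} for models whose rank differs."""
    by_score = rank_by(rows, 1)
    by_rate = rank_by(rows, 2)
    return {
        model: (by_score[model], by_rate[model])
        for model in by_score
        if by_score[model] != by_rate[model]
    }


if __name__ == "__main__":
    for model, (score_rank, rate_rank) in rank_changes(ROWS).items():
        print(f"{model}: #{score_rank} by score, #{rate_rank} by Human Equiv. Rate")
```

On this data only two models change position: Claude Opus 4.1 drops from third to fifth and Gemini 3 Pro rises from fifth to third, while Claude Sonnet 4.5 stays fourth.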
