GDPVal

Agents
About

GDPVal evaluates AI models on well-specified professional tasks across finance, healthcare, government, and other GDP-contributing sectors, measuring readiness for real-world occupational work.

Evaluation Stats
Total Models: 11
Organizations: 4
Verified Results: 0
Self-Reported: 0
Benchmark Details
Max Score: 100
Performance Overview
Score distribution and top performers

Score Distribution

Models: 11
Top Score: 49.7%
Average Score: 33.3%
High Performers (80%+): 0
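
The summary figures above can be reproduced from the per-model scores listed under Additional Metrics. The sketch below is a minimal illustration, assuming the average is a plain arithmetic mean rounded to one decimal place; the page does not state how it is computed, and the `summarize` helper is hypothetical.

```python
from statistics import mean

# Scores (percent) copied from this page's Additional Metrics table.
scores = {
    "GPT-5.2": 49.7,
    "Claude Opus 4.5": 45.5,
    "Claude Opus 4.1": 43.6,
    "Claude Sonnet 4.5": 42.5,
    "Gemini 3 Pro": 40.3,
    "GPT-5": 34.8,
    "o3": 30.8,
    "o4 mini": 25.3,
    "Gemini 2.5 Pro": 23.3,
    "Grok 4": 21.1,
    "GPT-4o": 9.9,
}


def summarize(scores, high_threshold=80.0):
    """Hypothetical helper: recompute the Score Distribution card.

    Assumes an unweighted mean and an inclusive 80% threshold.
    """
    values = list(scores.values())
    return {
        "models": len(values),
        "top": max(values),
        "average": round(mean(values), 1),
        "high_performers": sum(v >= high_threshold for v in values),
    }


summary = summarize(scores)
print(summary)
```

Under these assumptions the recomputed values agree with the card: 11 models, a top score of 49.7, an average of 33.3, and no model at or above 80.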

Top Organizations

#1 Anthropic: 3 models, 43.9%
#2 Google DeepMind: 2 models, 31.8%
#3 OpenAI: 5 models, 30.1%
#4 xAI: 1 model, 21.1%
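
The page does not say how the per-organization percentage is derived. A minimal sketch, assuming it is the unweighted mean of each organization's model scores: the model-to-organization mapping below is an assumption based on each model's known developer, and `org_averages` is a hypothetical helper.

```python
from collections import defaultdict

# (organization, model, score) rows. Scores come from this page's
# Additional Metrics table; the organization labels are an assumption.
results = [
    ("OpenAI", "GPT-5.2", 49.7),
    ("Anthropic", "Claude Opus 4.5", 45.5),
    ("Anthropic", "Claude Opus 4.1", 43.6),
    ("Anthropic", "Claude Sonnet 4.5", 42.5),
    ("Google DeepMind", "Gemini 3 Pro", 40.3),
    ("OpenAI", "GPT-5", 34.8),
    ("OpenAI", "o3", 30.8),
    ("OpenAI", "o4 mini", 25.3),
    ("Google DeepMind", "Gemini 2.5 Pro", 23.3),
    ("xAI", "Grok 4", 21.1),
    ("OpenAI", "GPT-4o", 9.9),
]


def org_averages(results):
    """Return (organization, model count, mean score) sorted by mean, descending."""
    by_org = defaultdict(list)
    for org, _model, score in results:
        by_org[org].append(score)
    ranked = [
        (org, len(vals), round(sum(vals) / len(vals), 1))
        for org, vals in by_org.items()
    ]
    return sorted(ranked, key=lambda row: row[2], reverse=True)


for rank, (org, count, avg) in enumerate(org_averages(results), start=1):
    print(f"#{rank} {org}: {count} models, {avg}%")
```

With this grouping the computed means reproduce the ranking above, which suggests (but does not confirm) that the listed percentages are simple per-organization averages.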
Leaderboard
11 models ranked by performance on GDPVal
Rank  Model              Date          License      Score
1     GPT-5.2            Dec 11, 2025  Proprietary  49.7%
2     Claude Opus 4.5    Nov 24, 2025  Proprietary  45.5%
3     Claude Opus 4.1    Aug 5, 2025   Proprietary  43.6%
4     Claude Sonnet 4.5  Sep 29, 2025  Proprietary  42.5%
5     Gemini 3 Pro       Nov 18, 2025  Proprietary  40.3%
6     GPT-5              Aug 7, 2025   Proprietary  34.8%
7     o3                 Apr 16, 2025  Proprietary  30.8%
8     o4 mini            Apr 16, 2025  Proprietary  25.3%
9     Gemini 2.5 Pro     Mar 25, 2025  Proprietary  23.3%
10    Grok 4             Jul 9, 2025   Proprietary  21.1%
Showing 1 to 10 of 11 models
Additional Metrics
Extended metrics for top models on GDPVal
Model              Score  Human Equiv. Rate
GPT-5.2            49.7   70.9%
Claude Opus 4.5    45.5   59.6%
Claude Opus 4.1    43.6   47.6%
Claude Sonnet 4.5  42.5   50.3%
Gemini 3 Pro       40.3   53.5%
GPT-5              34.8   38%
o3                 30.8   34.1%
o4 mini            25.3   27.8%
Gemini 2.5 Pro     23.3   25.5%
Grok 4             21.1   24.3%
GPT-4o             9.9    12.3%
Resources