MMLU-Pro

text
About

MMLU-Pro is an enhanced version of MMLU with more challenging, reasoning-focused questions and a choice set expanded from four to ten options. It removes trivial questions from the original MMLU and is more stable under varying prompts. Model accuracy drops by 16% to 33% compared to standard MMLU, which reveals differences in model capabilities more clearly. Models also tend to perform better on MMLU-Pro with chain-of-thought reasoning than with direct answering.
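One consequence of the larger choice set is a lower random-guess baseline. The sketch below illustrates that arithmetic, under the simplifying assumption that every question offers the full number of options and that a guesser picks uniformly at random. The function name `chance_accuracy` is hypothetical and is not part of any benchmark tooling.

```python
from fractions import Fraction


def chance_accuracy(num_options: int) -> Fraction:
    """Expected accuracy of uniform random guessing on a single question."""
    if num_options < 1:
        raise ValueError("num_options must be at least 1")
    return Fraction(1, num_options)


# Assumption: four options per question for MMLU, ten for MMLU-Pro.
mmlu_chance = chance_accuracy(4)
mmlu_pro_chance = chance_accuracy(10)

print(f"MMLU chance accuracy:     {float(mmlu_chance):.0%}")
print(f"MMLU-Pro chance accuracy: {float(mmlu_pro_chance):.0%}")
```

A lower chance baseline means that a given score is harder to reach by guessing alone, which is one reason scores on the two benchmarks are not directly comparable.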

Evaluation Stats
Total Models: 68
Organizations: 12
Verified Results: 0
Self-Reported: 68
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

68 models
Top Score: 85.0%
Average Score: 65.6%
High Performers (80%+): 14
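The summary figures above (top score, average score, and the count of models at 80% or higher) can be recomputed from a list of scores. The sketch below shows one way to do that, using only the ten scores visible in the leaderboard on this page. Because the remaining 58 models are not shown, its results will not match the page-wide figures. The helper `summarize` and the 80% threshold parameter are illustrative assumptions.

```python
from statistics import mean

# Only the ten scores visible in the leaderboard on this page (percent).
# The full set of 68 scores is not available here.
visible_scores = [85.0, 85.0, 84.6, 84.4, 83.7, 83.0, 82.7, 82.5, 81.4, 81.2]


def summarize(scores, threshold=80.0):
    """Return the top score, mean score, and count of scores at or above threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    return {
        "top": max(scores),
        "average": mean(scores),
        "high_performers": sum(1 for s in scores if s >= threshold),
    }


summary = summarize(visible_scores)
print(f"Top score:       {summary['top']:.1f}%")
print(f"Average score:   {summary['average']:.2f}%")
print(f"High performers: {summary['high_performers']}")
```

All ten visible entries clear the 80% threshold, which is consistent with the page reporting 14 high performers across all 68 models.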

Top Organizations

#1 Zhipu AI: 2 models, 83.0%
#2 DeepSeek: 5 models, 82.2%
#3 Moonshot AI: 4 models, 78.5%
#4 OpenAI: 2 models, 73.6%
#5 Anthropic: 5 models, 68.8%
Leaderboard
68 models ranked by performance on MMLU-Pro
Date | License | Score
Sep 29, 2025 | MIT | 85.0%
May 28, 2025 | MIT | 85.0%
Jul 28, 2025 | MIT | 84.6%
Jul 25, 2025 | Apache 2.0 | 84.4%
Jan 10, 2025 | MIT | 83.7%
Jul 22, 2025 | Apache 2.0 | 83.0%
Sep 10, 2025 | Apache 2.0 | 82.7%
Sep 5, 2025 | Proprietary | 82.5%
Jul 28, 2025 | MIT | 81.4%
Mar 25, 2025 | MIT + Model License (Commercial use allowed) | 81.2%
Showing 1 to 10 of 68 models
...
Resources