MMLU-Pro
text
About
MMLU-Pro is an enhanced version of MMLU that features more challenging, reasoning-focused questions and expands each question's choice set from four options to ten. It removes trivial questions from the original MMLU and produces more stable scores when the prompt is varied. Model accuracy drops by 16% to 33% relative to standard MMLU, which better separates models of differing capability, and models generally need chain-of-thought reasoning to perform at their best.
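One consequence of the larger choice set is a lower floor for random guessing. The following is a minimal sketch of that arithmetic, assuming every question has exactly one correct answer and the stated number of options (four for MMLU, ten for MMLU-Pro). The function name `chance_accuracy` is hypothetical and used only for illustration.

```python
from fractions import Fraction


def chance_accuracy(num_options: int) -> Fraction:
    """Expected accuracy of uniform random guessing on single-answer questions.

    Assumption: every question has exactly `num_options` choices and one
    correct answer.
    """
    if num_options < 1:
        raise ValueError("num_options must be at least 1")
    return Fraction(1, num_options)


mmlu_floor = chance_accuracy(4)       # four options per question
mmlu_pro_floor = chance_accuracy(10)  # ten options per question

print(f"MMLU: {float(mmlu_floor):.0%}, MMLU-Pro: {float(mmlu_pro_floor):.0%}")
# prints "MMLU: 25%, MMLU-Pro: 10%"
```

Under these assumptions the guessing floor falls from 25% to 10%, so part of any accuracy drop reflects fewer lucky guesses and the rest reflects harder questions.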
Evaluation Stats
Total Models: 68
Organizations: 12
Verified Results: 0
Self-Reported: 68
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 68 models
Top Score: 85.0%
Average Score: 65.6%
High Performers (80%+): 14
Top Organizations
#1 Zhipu AI, 2 models, 83.0%
#2 DeepSeek, 5 models, 82.2%
#3 Moonshot AI, 4 models, 78.5%
#4 OpenAI, 2 models, 73.6%
#5 Anthropic, 5 models, 68.8%
Leaderboard
68 models ranked by performance on MMLU-Pro
Date | License | Score
---|---|---
Sep 29, 2025 | MIT | 85.0%
May 28, 2025 | MIT | 85.0%
Jul 28, 2025 | MIT | 84.6%
Jul 25, 2025 | Apache 2.0 | 84.4%
Jan 10, 2025 | MIT | 83.7%
Jul 22, 2025 | Apache 2.0 | 83.0%
Sep 10, 2025 | Apache 2.0 | 82.7%
Sep 5, 2025 | Proprietary | 82.5%
Jul 28, 2025 | MIT | 81.4%
Mar 25, 2025 | MIT + Model License (Commercial use allowed) | 81.2%
Showing 1 to 10 of 68 models
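The overview figures (top score, average score, count of models at 80% or above) are simple aggregates over the score column. The sketch below shows one way to compute them, using only the ten scores visible on this page. Because the other 58 entries are not shown, the average here will not match the 65.6% reported for all 68 models. The function name `summarize` and the `threshold` parameter are hypothetical.

```python
from statistics import mean

# Scores (in percent) copied from the ten leaderboard rows shown on this page.
# Assumption: this is only the first page of results, not the full set of 68.
visible_scores = [85.0, 85.0, 84.6, 84.4, 83.7, 83.0, 82.7, 82.5, 81.4, 81.2]


def summarize(scores, threshold=80.0):
    """Return count, top score, mean score, and number of scores >= threshold."""
    return {
        "count": len(scores),
        "top": max(scores),
        "average": mean(scores),
        "high_performers": sum(1 for s in scores if s >= threshold),
    }


summary = summarize(visible_scores)
print(f"Top score: {summary['top']:.1f}%")
print(f"High performers (80%+): {summary['high_performers']} of {summary['count']}")
```

All ten visible rows clear the 80% threshold, which is consistent with the page reporting 14 high performers across the full leaderboard.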
...