MMLU-redux-2.0

About

MMLU-Redux 2.0 is the second iteration of the refined Massive Multitask Language Understanding (MMLU) benchmark. It incorporates further improvements in question quality, evaluation metrics, and domain coverage, with the aim of assessing language models' multidisciplinary knowledge and reasoning abilities more reliably and comprehensively.

Evaluation Stats
Total Models: 1
Organizations: 1
Verified Results: 0
Self-Reported: 1
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

Models: 1
Top Score: 90.2%
Average Score: 90.2%
High Performers (80%+): 1
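
The summary figures above can be reproduced from raw scores with a few lines of arithmetic. The following is a minimal sketch, assuming scores are stored on the benchmark's 0-to-1 scale (Max Score: 1) and displayed as percentages. The `summarize` function, the `scores` mapping, and the `example-model` key are hypothetical names introduced for illustration; they are not part of the leaderboard itself.

```python
from statistics import mean

# Hypothetical input: one entry per model, on the 0-to-1 scale (Max Score = 1).
# "example-model" is a placeholder name, not a name taken from the leaderboard.
scores = {"example-model": 0.902}

# The "High Performers (80%+)" cutoff, expressed on the same 0-to-1 scale.
HIGH_PERFORMER_THRESHOLD = 0.80


def summarize(scores):
    """Return the score-distribution figures as percentages and counts."""
    values = list(scores.values())
    return {
        "models": len(values),
        "top_score_pct": round(max(values) * 100, 1),
        "average_score_pct": round(mean(values) * 100, 1),
        "high_performers": sum(v >= HIGH_PERFORMER_THRESHOLD for v in values),
    }


summary = summarize(scores)
print(summary)
```

With a single ranked model, the top and average scores are necessarily equal, which is why both read 90.2% here.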

Top Organizations

#1 Moonshot AI (1 model): 90.2%
Leaderboard
1 model ranked by performance on MMLU-redux-2.0
Date: Jul 11, 2025
License: MIT
Score: 90.2%
Resources