MMLU-Base
About
MMLU-Base represents the foundational version of the Massive Multitask Language Understanding benchmark, providing baseline evaluation across 57 academic and professional domains. It serves as the standard reference point for measuring language models' broad knowledge and reasoning capabilities, covering subjects from elementary mathematics to advanced professional fields like law and medicine.
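Scores on this page are multiple-choice accuracy: the fraction of questions where the model's selected option matches the gold answer. A minimal sketch of that scoring, using hypothetical example data (not actual MMLU items):

```python
# Minimal sketch of MMLU-style accuracy scoring, assuming each item
# has one gold answer letter (A-D) and one predicted letter.
# The answer lists below are hypothetical examples, not real MMLU data.

def mmlu_accuracy(gold, predicted):
    """Fraction of items where the predicted choice matches the gold answer."""
    if not gold:
        return 0.0
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)

gold = ["A", "C", "B", "D", "A"]       # hypothetical gold answers
predicted = ["A", "C", "D", "D", "B"]  # hypothetical model choices
print(f"{mmlu_accuracy(gold, predicted):.1%}")  # 3/5 correct -> 60.0%
```

A reported score like 68.0% is this accuracy averaged over all of the benchmark's questions across its 57 subjects.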
Evaluation Stats
Total Models: 1
Organizations: 1
Verified Results: 0
Self-Reported: 1
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 1 model
Top Score: 68.0%
Average Score: 68.0%
High Performers (80%+): 0

Top Organizations
#1 Alibaba Cloud / Qwen Team: 1 model, 68.0%
Leaderboard
1 model ranked by performance on MMLU-Base
| Release Date | License | Score |
|---|---|---|
| Sep 19, 2024 | Apache 2.0 | 68.0% |