
Qwen2 72B Instruct
by Alibaba Cloud / Qwen Team
Zero-eval rankings: #1 CMMLU, #1 TheoremQA, #2 EvalPlus, +1 more
About
Qwen2 72B Instruct is an instruction-tuned language model developed by Alibaba Cloud / Qwen Team. It averages 73.6% across the 17 benchmarks tracked here, with its strongest results on GSM8k (91.1%), CMMLU (90.1%), and HellaSwag (87.6%). It was released in 2024 as the flagship instruct model of the Qwen2 series.
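A minimal usage sketch in Python, assuming the weights are published on Hugging Face under the Qwen/Qwen2-72B-Instruct identifier and loaded through the standard Transformers chat-template flow; the prompt and generation settings are illustrative and not taken from this page.

```python
# Minimal sketch, assuming the model is available on Hugging Face as
# "Qwen/Qwen2-72B-Instruct"; device_map="auto" shards the 72B weights
# across whatever GPUs are available.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2-72B-Instruct"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick bf16/fp16 automatically where supported
    device_map="auto",    # spread layers across available devices
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Briefly explain what GSM8k measures."},
]
# The chat template formats the conversation the way the instruct model expects.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Strip the prompt tokens and decode only the generated continuation.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```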
Timeline
Announced: Jul 23, 2024
Released: Jul 23, 2024
License & Family
License: tongyi-qianwen
Performance Overview
Performance metrics and category breakdown
Overall Performance (17 benchmarks)
Average Score: 73.6%
Best Score: 91.1%
High Performers (80%+): 9
All Benchmark Results for Qwen2 72B Instruct
Complete list of benchmark scores with detailed information
Benchmark | Modality | Score | Percentage | Source
GSM8k | text | 0.91 | 91.1% | Self-reported
CMMLU | text | 0.90 | 90.1% | Self-reported
HellaSwag | text | 0.88 | 87.6% | Self-reported
HumanEval | text | 0.86 | 86.0% | Self-reported
Winogrande | text | 0.85 | 85.1% | Self-reported
C-Eval | text | 0.84 | 83.8% | Self-reported
BBH | text | 0.82 | 82.4% | Self-reported
MMLU | text | 0.82 | 82.3% | Self-reported
MBPP | text | 0.80 | 80.2% | Self-reported
EvalPlus | text | 0.79 | 79.0% | Self-reported
Showing the top 10 of 17 benchmarks.
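A small sketch of how the overview figures relate to the rows above. Only 10 of the 17 benchmarks are listed here, so the recomputed average will not match the page's 73.6% figure, which covers all 17; the best score and the 80%+ count do match the listed rows.

```python
# Tally the benchmark scores shown in the table above (percentages).
scores = {
    "GSM8k": 91.1, "CMMLU": 90.1, "HellaSwag": 87.6, "HumanEval": 86.0,
    "Winogrande": 85.1, "C-Eval": 83.8, "BBH": 82.4, "MMLU": 82.3,
    "MBPP": 80.2, "EvalPlus": 79.0,
}

average = sum(scores.values()) / len(scores)
best = max(scores.values())
high_performers = [name for name, s in scores.items() if s >= 80.0]

print(f"Average of listed scores: {average:.1f}%")    # ~84.8%, vs 73.6% over all 17
print(f"Best score: {best:.1f}%")                     # 91.1% (GSM8k)
print(f"Benchmarks at 80%+: {len(high_performers)}")  # 9
```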