Qwen2 72B Instruct
Zero-eval
#1 CMMLU
#1 TheoremQA
#2 EvalPlus
by Alibaba Cloud / Qwen Team
About
Qwen2 72B is the flagship model of the Qwen2 generation, with 72 billion parameters. It was built to perform strongly across diverse language tasks and, when introduced, marked a significant advance in the capabilities of the Qwen model line.
Timeline
Announced: Jul 23, 2024
Released: Jul 23, 2024
License & Family
License: tongyi-qianwen
Performance Overview
Performance metrics and category breakdown
Overall Performance
Benchmarks: 17
Average Score: 73.6%
Best Score: 91.1%
High Performers (80%+): 9+
All Benchmark Results for Qwen2 72B Instruct
Complete list of benchmark scores with detailed information
| Benchmark | Modality | Score (0-1) | Score (%) | Source |
| --- | --- | --- | --- | --- |
| GSM8k | text | 0.91 | 91.1% | Self-reported |
| CMMLU | text | 0.90 | 90.1% | Self-reported |
| HellaSwag | text | 0.88 | 87.6% | Self-reported |
| HumanEval | text | 0.86 | 86.0% | Self-reported |
| Winogrande | text | 0.85 | 85.1% | Self-reported |
| C-Eval | text | 0.84 | 83.8% | Self-reported |
| BBH | text | 0.82 | 82.4% | Self-reported |
| MMLU | text | 0.82 | 82.3% | Self-reported |
| MBPP | text | 0.80 | 80.2% | Self-reported |
| EvalPlus | text | 0.79 | 79.0% | Self-reported |
Showing 1 to 10 of 17 benchmarks
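The summary figures can be partly cross-checked against the ten rows listed here. The following is a minimal sketch in Python, not the page's own calculation: the names `listed_scores` and `HIGH_THRESHOLD` are illustrative, and the remaining 7 of the 17 benchmarks are not shown, so the mean it computes covers the listed rows only.

```python
# Percent scores copied from the ten benchmark rows listed on this page.
# Assumption: the other 7 of the 17 benchmarks are not shown, so they are excluded.
listed_scores = {
    "GSM8k": 91.1,
    "CMMLU": 90.1,
    "HellaSwag": 87.6,
    "HumanEval": 86.0,
    "Winogrande": 85.1,
    "C-Eval": 83.8,
    "BBH": 82.4,
    "MMLU": 82.3,
    "MBPP": 80.2,
    "EvalPlus": 79.0,
}

# Threshold taken from the "High Performers (80%+)" label.
HIGH_THRESHOLD = 80.0

# Highest-scoring benchmark among the listed rows.
best_name, best_score = max(listed_scores.items(), key=lambda kv: kv[1])

# Benchmarks at or above the threshold.
high_performers = [name for name, s in listed_scores.items() if s >= HIGH_THRESHOLD]

# Mean over the listed rows only (not the 17-benchmark average).
listed_mean = sum(listed_scores.values()) / len(listed_scores)

print(f"Best: {best_name} at {best_score}%")
print(f"High performers (>= {HIGH_THRESHOLD}%): {len(high_performers)}")
print(f"Mean of listed rows: {listed_mean:.2f}%")
```

Among the listed rows, the best score is GSM8k at 91.1% and nine benchmarks clear the 80% threshold, in line with the summary. The mean of the listed rows is higher than the reported 73.6% average because that figure covers all 17 benchmarks.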