Alibaba Cloud / Qwen Team

Qwen2.5 32B Instruct

Zero-eval
#1 MMLU-STEM
#1 MBPP+
#2 TheoremQA

About

Qwen2.5 32B Instruct is a language model developed by Alibaba Cloud / Qwen Team. It averages 74.3% across 18 benchmarks, with its highest scores on GSM8k (95.9%), HumanEval (88.4%), and HellaSwag (85.2%). It is released under the Apache 2.0 license, which permits commercial use. The model was announced and released on September 19, 2024.

Timeline
Announced: Sep 19, 2024
Released: Sep 19, 2024
Specifications
Training Tokens: 18.0T
License & Family
License: Apache 2.0
Performance Overview
Performance metrics and category breakdown

Overall Performance

Benchmarks: 18
Average Score: 74.3%
Best Score: 95.9%
High Performers (80%+): 10
All Benchmark Results for Qwen2.5 32B Instruct
Complete list of benchmark scores with detailed information
Benchmark     Type   Score   Score (%)   Source
GSM8k         text   0.96    95.9%       Self-reported
HumanEval     text   0.88    88.4%       Self-reported
HellaSwag     text   0.85    85.2%       Self-reported
BBH           text   0.84    84.5%       Self-reported
MBPP          text   0.84    84.0%       Self-reported
MMLU-Redux    text   0.84    83.9%       Self-reported
MMLU          text   0.83    83.3%       Self-reported
MATH          text   0.83    83.1%       Self-reported
Winogrande    text   0.82    82.0%       Self-reported
MMLU-STEM     text   0.81    80.9%       Self-reported
Showing 1 to 10 of 18 benchmarks
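
Only ten of the 18 benchmarks are listed here, so the visible scores cannot reproduce the 74.3% average directly. The Python sketch below recomputes the summary figures for the ten visible rows and works out what mean the eight unlisted benchmarks would need for the reported overall average to hold. The function names are illustrative, and the implied figure is only approximate because the page rounds its averages to one decimal place.

```python
from statistics import mean

# Scores (percent) for the ten benchmarks listed on this page.
shown = {
    "GSM8k": 95.9,
    "HumanEval": 88.4,
    "HellaSwag": 85.2,
    "BBH": 84.5,
    "MBPP": 84.0,
    "MMLU-Redux": 83.9,
    "MMLU": 83.3,
    "MATH": 83.1,
    "Winogrande": 82.0,
    "MMLU-STEM": 80.9,
}

TOTAL_BENCHMARKS = 18    # from "Overall Performance"
REPORTED_AVERAGE = 74.3  # percent, from "Overall Performance"
THRESHOLD = 80.0         # "High Performers (80%+)" cutoff


def summarize(scores, threshold=THRESHOLD):
    """Summary statistics for a mapping of benchmark name -> percent score."""
    values = list(scores.values())
    return {
        "count": len(values),
        "mean": round(mean(values), 2),
        "best": max(values),
        "high_performers": sum(v >= threshold for v in values),
    }


def implied_mean_of_unlisted(shown_scores, total, reported_avg):
    """Mean the unlisted benchmarks must have for the reported average to hold."""
    unlisted = total - len(shown_scores)
    unlisted_sum = reported_avg * total - sum(shown_scores.values())
    return round(unlisted_sum / unlisted, 1)


summary = summarize(shown)
unlisted_mean = implied_mean_of_unlisted(shown, TOTAL_BENCHMARKS, REPORTED_AVERAGE)

print(summary)
print(f"Implied mean of the {TOTAL_BENCHMARKS - len(shown)} unlisted benchmarks: "
      f"{unlisted_mean}%")
```

All ten visible rows clear the 80% threshold, which matches the page's count of 10 high performers. That means each of the eight unlisted benchmarks scores below 80%, and together they average roughly 61%.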