Alibaba Cloud / Qwen Team

Qwen2.5 14B Instruct

Zero-eval
#2 MMLU-STEM
#2 MBPP+


About

Qwen2.5 14B Instruct is a language model developed by Alibaba Cloud / Qwen Team and released on Sep 19, 2024. It averages 70.0% across 16 self-reported benchmarks, with its strongest results on GSM8k (94.8%), HumanEval (83.5%), and MBPP (82.0%). It is released under the Apache 2.0 license, which permits commercial use and makes it suitable for enterprise applications.

Timeline
Announced: Sep 19, 2024
Released: Sep 19, 2024
Specifications
Training Tokens: 18.0T
License & Family
License: Apache 2.0
Performance Overview
Performance metrics and category breakdown

Overall Performance

Benchmarks: 16
Average Score: 70.0%
Best Score: 94.8%
High Performers (80%+): 5
All Benchmark Results for Qwen2.5 14B Instruct
Complete list of benchmark scores with detailed information
Benchmark    Modality  Score  Percent  Source
GSM8k        text      0.95   94.8%    Self-reported
HumanEval    text      0.83   83.5%    Self-reported
MBPP         text      0.82   82.0%    Self-reported
MMLU-Redux   text      0.80   80.0%    Self-reported
MATH         text      0.80   80.0%    Self-reported
MMLU         text      0.80   79.7%    Self-reported
BBH          text      0.78   78.2%    Self-reported
MMLU-STEM    text      0.76   76.4%    Self-reported
MultiPL-E    text      0.73   72.8%    Self-reported
ARC-C        text      0.67   67.3%    Self-reported
Showing 1 to 10 of 16 benchmarks
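
The summary figures can be cross-checked against the listed scores. The sketch below is illustrative and not part of the original page: it recomputes the number of benchmarks at or above 80% and the average over the 10 listed benchmarks, then estimates the average of the 6 benchmarks not shown. The helper names (`summarize`, `implied_unlisted_average`) are hypothetical. The last estimate assumes the reported 70.0% is a plain unweighted mean over all 16 benchmarks, which the page does not state, and it is only approximate because 70.0% is itself rounded.

```python
from statistics import mean

# Self-reported scores (percent) for the 10 benchmarks listed on this page.
scores = {
    "GSM8k": 94.8,
    "HumanEval": 83.5,
    "MBPP": 82.0,
    "MMLU-Redux": 80.0,
    "MATH": 80.0,
    "MMLU": 79.7,
    "BBH": 78.2,
    "MMLU-STEM": 76.4,
    "MultiPL-E": 72.8,
    "ARC-C": 67.3,
}

REPORTED_AVERAGE = 70.0  # page's average across all 16 benchmarks
TOTAL_BENCHMARKS = 16


def summarize(scores, threshold=80.0):
    """Return the listed-benchmark average, high performers, and best benchmark."""
    high = [name for name, s in scores.items() if s >= threshold]
    best = max(scores, key=scores.get)
    return {
        "listed_average": round(mean(scores.values()), 2),
        "high_performers": high,
        "best": best,
    }


def implied_unlisted_average(scores, reported_avg, total):
    """Estimate the mean of the benchmarks not shown.

    Assumption: reported_avg is an unweighted mean over `total` benchmarks.
    """
    remaining = total - len(scores)
    return (reported_avg * total - sum(scores.values())) / remaining


summary = summarize(scores)
print(summary["best"], scores[summary["best"]])
print(len(summary["high_performers"]), "benchmarks at 80%+")
print("average over listed benchmarks:", summary["listed_average"])
print(
    "implied average of unlisted benchmarks:",
    round(implied_unlisted_average(scores, REPORTED_AVERAGE, TOTAL_BENCHMARKS), 1),
)
```

Under these assumptions, the 10 listed benchmarks average well above the overall 70.0%, so the 6 benchmarks not shown here must score noticeably lower on average. The count of 5 benchmarks at 80% or higher matches the "High Performers" figure, provided none of the unlisted benchmarks reaches 80%.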