Alibaba Cloud / Qwen Team

Qwen2.5-Coder 7B Instruct

Zero-eval
#1 MMLU-Base
#1 CRUXEval-Input-CoT
#1 CRUXEval-Output-CoT
+3 more

About

Qwen2.5-Coder 7B Instruct is a language model developed by Alibaba Cloud / Qwen Team. This page lists its results on 19 benchmarks, with an average score of 58.0%. Its strongest results are on HumanEval (88.4%), GSM8k (83.9%), and MBPP (83.5%). It is released under the Apache 2.0 license, which permits commercial use. It was announced and released on Sep 19, 2024.

Timeline
Announced: Sep 19, 2024
Released: Sep 19, 2024
Specifications
Training Tokens: 5.5T
License & Family
License: Apache 2.0
Base Model: Qwen2.5 7B Instruct
Performance Overview
Performance metrics and category breakdown

Overall Performance

Benchmarks: 19
Average Score: 58.0%
Best Score: 88.4%
High Performers (80%+): 3
All Benchmark Results for Qwen2.5-Coder 7B Instruct
Complete list of benchmark scores with detailed information

Benchmark            Modality  Score  Percent  Source
HumanEval            text      0.88   88.4%    Self-reported
GSM8k                text      0.84   83.9%    Self-reported
MBPP                 text      0.83   83.5%    Self-reported
HellaSwag            text      0.77   76.8%    Self-reported
Winogrande           text      0.73   72.9%    Self-reported
MMLU-Base            text      0.68   68.0%    Self-reported
MMLU                 text      0.68   67.6%    Self-reported
MMLU-Redux           text      0.67   66.6%    Self-reported
ARC-C                text      0.61   60.9%    Self-reported
CRUXEval-Input-CoT   text      0.56   56.5%    Self-reported
Showing 1 to 10 of 19 benchmarks
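
The summary figures in the performance overview (benchmark count, average, best score, number of results at 80% or above) can be recomputed from a list of scores. The sketch below does this for the ten scores shown on this page only. The `summarize` helper and its `threshold` parameter are illustrative names, not part of any leaderboard API. Because nine of the 19 benchmarks are not listed here, the average it produces covers the visible rows and is not expected to match the 58.0% overall average.

```python
# Percent scores for the ten benchmarks listed on this page.
# The remaining nine of the 19 benchmarks are not shown, so they are omitted.
scores = {
    "HumanEval": 88.4,
    "GSM8k": 83.9,
    "MBPP": 83.5,
    "HellaSwag": 76.8,
    "Winogrande": 72.9,
    "MMLU-Base": 68.0,
    "MMLU": 67.6,
    "MMLU-Redux": 66.6,
    "ARC-C": 60.9,
    "CRUXEval-Input-CoT": 56.5,
}


def summarize(scores, threshold=80.0):
    """Hypothetical helper: recompute overview-style statistics.

    `threshold` mirrors the page's "High Performers (80%+)" cutoff.
    """
    values = list(scores.values())
    return {
        "count": len(values),
        "average": round(sum(values) / len(values), 1),
        "best": max(values),
        "high_performers": sorted(
            name for name, value in scores.items() if value >= threshold
        ),
    }


summary = summarize(scores)
print(summary)
```

On these ten rows the best score (88.4%) and the three results at or above 80% agree with the overview. The ten-row average comes out at 72.5%, above the reported 58.0%, which implies the nine unlisted scores are lower on average.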