Alibaba Cloud / Qwen Team

Qwen2.5-Omni-7B

Multimodal
Zero-eval
#1 VocalSound
#1 GiantSteps Tempo
#1 MMBench-V1.1
+24 more

About

Qwen2.5-Omni-7B is a multimodal language model developed by Alibaba Cloud / Qwen Team and released on March 27, 2025. It has been evaluated on 45 benchmarks, with self-reported scores averaging 59.2%. Its strongest results are on DocVQA (95.2%), VocalSound (93.9%), and GSM8k (88.7%). As a multimodal model, it accepts text, image, and audio inputs, among other formats. It is released under the Apache 2.0 license, which permits commercial use.

Timeline
Announced: Mar 27, 2025
Released: Mar 27, 2025
Specifications
Capabilities: Multimodal
License & Family
License: Apache 2.0
Performance Overview
Performance metrics and category breakdown

Overall Performance

Benchmarks: 45
Average Score: 59.2%
Best Score: 95.2%
High Performers (80%+): 8
All Benchmark Results for Qwen2.5-Omni-7B
Complete list of benchmark scores with detailed information
Benchmark         Category    Score (0-1)  Score (%)  Source
DocVQA            multimodal  0.95         95.2%      Self-reported
VocalSound        audio       0.94         93.9%      Self-reported
GSM8k             text        0.89         88.7%      Self-reported
GiantSteps Tempo  audio       0.88         88.0%      Self-reported
ChartQA           multimodal  0.85         85.3%      Self-reported
TextVQA           multimodal  0.84         84.4%      Self-reported
AI2D              multimodal  0.83         83.2%      Self-reported
MMBench-V1.1      multimodal  0.82         81.8%      Self-reported
HumanEval         text        0.79         78.7%      Self-reported
CRPErelation      text        0.77         76.5%      Self-reported
Showing 1 to 10 of 45 benchmarks
...
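
The summary figures in the overview (average score, best score, count of results at 80% or above) can be recomputed for the ten rows listed on this page. The following is a minimal sketch, not the site's actual method: the names `SCORES`, `summarize`, and `mean_by_category` are hypothetical, and the data covers only the ten visible rows, not all 45 benchmarks.

```python
from collections import defaultdict
from statistics import mean

# The ten benchmark rows visible on this page: (benchmark, category, score in %).
# The remaining 35 benchmarks are not listed here, so they are not included.
SCORES = [
    ("DocVQA", "multimodal", 95.2),
    ("VocalSound", "audio", 93.9),
    ("GSM8k", "text", 88.7),
    ("GiantSteps Tempo", "audio", 88.0),
    ("ChartQA", "multimodal", 85.3),
    ("TextVQA", "multimodal", 84.4),
    ("AI2D", "multimodal", 83.2),
    ("MMBench-V1.1", "multimodal", 81.8),
    ("HumanEval", "text", 78.7),
    ("CRPErelation", "text", 76.5),
]

# Cutoff taken from the page's "High Performers (80%+)" label.
HIGH_PERFORMER_THRESHOLD = 80.0


def summarize(rows, threshold=HIGH_PERFORMER_THRESHOLD):
    """Return the row count, mean score, best row, and count at or above threshold."""
    percents = [pct for _, _, pct in rows]
    return {
        "count": len(rows),
        "mean": round(mean(percents), 2),
        "best": max(rows, key=lambda row: row[2]),
        "high_performers": sum(pct >= threshold for pct in percents),
    }


def mean_by_category(rows):
    """Group scores by category and return the mean score of each group."""
    groups = defaultdict(list)
    for _, category, pct in rows:
        groups[category].append(pct)
    return {category: round(mean(values), 2) for category, values in groups.items()}


if __name__ == "__main__":
    print(summarize(SCORES))
    print(mean_by_category(SCORES))
```

On these ten rows, the best score is DocVQA at 95.2% and eight results are at or above 80%, matching the overview. The mean of the ten rows is about 85.6%, well above the 59.2% overall average, because the page lists only the ten highest of the 45 scores.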