Qwen2.5-Omni-7B
Multimodal
by Alibaba / Qwen
About
Qwen2.5-Omni-7B is a 7-billion-parameter end-to-end multimodal model from Alibaba, released in March 2025 as part of the Omni series. It is designed to unify perception and generation across text, images, audio, and video in a single architecture. Unlike pipeline-based multimodal systems, it handles all modalities within one model and can generate both text and speech outputs, targeting voice assistants, multimodal agents, and real-time interactive applications. Its compact size makes it notable for on-device and resource-constrained multimodal deployments.
Timeline
Released: Mar 26, 2025
Specifications
Capabilities
Multimodal
License & Family
License
Apache 2.0
Performance Overview
Performance metrics and category breakdown
Overall Performance
1 benchmark
Average Score
73.2%
Best Score
73.2%
High Performers (80%+)
0
Top Categories
Coding
73.2%
All Benchmark Results for Qwen2.5-Omni-7B
Complete list of benchmark scores with detailed information
| Benchmark | Category | Score | Score (%) | Status |
| --- | --- | --- | --- | --- |
| MBPP | Coding | 73.20 | 73.2% | Unverified |