Qwen2.5-Omni-7B

Multimodal

by Alibaba / Qwen

About

Qwen2.5-Omni-7B is a 7-billion-parameter end-to-end multimodal model from Alibaba's Qwen team, released in March 2025. It belongs to the Omni series, which is designed to unify perception and generation across text, images, audio, and video in a single model architecture. Unlike pipeline-based multimodal systems, it processes all modalities end-to-end and can generate both text and speech output. Target use cases include voice assistants, multimodal agents, and real-time interactive applications. Its compact size made it notable for on-device and resource-constrained multimodal deployments.

Timeline
Released
Mar 26, 2025
Specifications
Capabilities
Multimodal
License & Family
License
Apache 2.0
Performance Overview
Performance metrics and category breakdown

Overall Performance

1 benchmark
Average Score
73.2%
Best Score
73.2%
High Performers (80%+)
0
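
The summary figures above follow directly from the individual benchmark results listed on this page. A minimal sketch of that arithmetic, assuming the average is a plain mean over the listed scores and that "High Performers" counts scores at or above 80%; the `summarize` helper and the `results` dictionary are illustrative names, not part of any official tooling:

```python
from statistics import mean

# Scores (in percent) copied from the benchmark results listed on this page.
results = {"MBPP": 73.2}

# Assumed cutoff for the "High Performers (80%+)" count.
HIGH_PERFORMER_THRESHOLD = 80.0


def summarize(scores):
    """Derive the overview figures from a benchmark-name -> score mapping."""
    values = list(scores.values())
    return {
        "benchmarks": len(values),
        "average": round(mean(values), 1),
        "best": max(values),
        "high_performers": sum(v >= HIGH_PERFORMER_THRESHOLD for v in values),
    }


summary = summarize(results)
print(summary)
```

With a single benchmark, the average and best score are necessarily identical, so the overview says nothing beyond the one MBPP result.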

Top Categories

Coding
73.2%
All Benchmark Results for Qwen2.5-Omni-7B
Complete list of benchmark scores with detailed information
MBPP
Coding
73.20
73.2%
Unverified