Qwen2.5-Omni-7B

Name: Qwen2.5-Omni-7B
Rating: 73.2 (1 review)
Author: Alibaba / Qwen

Multimodal

About

Qwen2.5-Omni-7B is a 7-billion-parameter end-to-end multimodal model from Alibaba's Qwen team, released in March 2025. It belongs to the Omni series, which is designed to unify perception and generation across text, images, audio, and video in a single model architecture. Unlike pipeline-based multimodal systems, it processes all modalities end-to-end and can generate both text and speech output. It targets voice assistants, multimodal agents, and real-time interactive applications, and its compact size makes it notable for on-device and resource-constrained multimodal deployments.

Timeline

Released: Mar 26, 2025

Specifications

Capabilities

Multimodal

License & Family

License

Apache 2.0

Performance Overview

Performance metrics and category breakdown

1 benchmark

Average Score

73.2%

Best Score

73.2%

High Performers (80%+)

0

Coding

73.2%

All Benchmark Results for Qwen2.5-Omni-7B

Complete list of benchmark scores with detailed information


MBPP	Coding		73.20	73.2%	Unverified

Resources