MVBench

multimodal

About

MVBench is a comprehensive multimodal video understanding benchmark covering 20 challenging video tasks that cannot be solved effectively from a single frame. It uses a novel static-to-dynamic method to generate temporal tasks and evaluates models' temporal understanding, from perception to cognition, in a multiple-choice QA format built on ground-truth video annotations.
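Because every question is multiple-choice, scoring reduces to comparing a predicted option with the annotated one. The following is a minimal sketch of that idea, not the benchmark's official evaluation script: the record layout, the task names, and the choice to average per-task accuracies with equal weight are all assumptions made for illustration.

```python
from collections import defaultdict


def multiple_choice_accuracy(records):
    """Score multiple-choice predictions.

    `records` is an iterable of (task, predicted_option, correct_option)
    tuples. This layout is hypothetical, chosen only for this sketch.
    Returns (per_task_accuracy, overall), with both on a 0-to-1 scale.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for task, predicted, answer in records:
        total[task] += 1
        # Compare option letters case-insensitively, ignoring stray whitespace.
        if predicted.strip().upper() == answer.strip().upper():
            correct[task] += 1
    if not total:
        raise ValueError("no records to score")
    per_task = {task: correct[task] / total[task] for task in total}
    # Assumption: each task counts equally in the overall score.
    overall = sum(per_task.values()) / len(per_task)
    return per_task, overall


# Hypothetical predictions for two made-up task names.
example_records = [
    ("action_order", "A", "A"),
    ("action_order", "B", "C"),
    ("object_count", "d", "D"),
]
per_task, overall = multiple_choice_accuracy(example_records)
```

With these three records, one of the two `action_order` answers and the single `object_count` answer are correct, so the equally weighted overall score is 0.75.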

Evaluation Stats

Total Models: 4

Organizations: 1

Verified Results: 0

Self-Reported: 4

Benchmark Details

Max Score: 1

Language

Performance Overview

Score distribution and top performers

Score Distribution

4 models

Top Score

73.6%

Average Score

71.0%

High Performers (80%+)

0

Top Organizations

#1 Alibaba Cloud / Qwen Team

4 models

71.0%

Leaderboard

4 models ranked by performance on MVBench

Rank	Model	Organization	Date	License	Score
#01	Qwen2-VL-72B-Instruct	Alibaba Cloud / Qwen Team	Aug 29, 2024	tongyi-qianwen	73.6%
#02	Qwen2.5 VL 72B Instruct	Alibaba Cloud / Qwen Team	Jan 26, 2025	tongyi-qianwen	70.4%
#03	Qwen2.5-Omni-7B	Alibaba Cloud / Qwen Team	Mar 27, 2025	Apache 2.0	70.3%
#04	Qwen2.5 VL 7B Instruct	Alibaba Cloud / Qwen Team	Jan 26, 2025	Apache 2.0	69.6%
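The summary figures on this page (top score, average score, and the 80%+ count) can be reproduced from the four leaderboard scores. The sketch below copies the scores from the leaderboard by hand; the variable names are illustrative, and rounding the average to one decimal place is an assumption about how the page displays it.

```python
from statistics import mean

# Scores in percent, copied from the leaderboard rows.
scores = {
    "Qwen2-VL-72B-Instruct": 73.6,
    "Qwen2.5 VL 72B Instruct": 70.4,
    "Qwen2.5-Omni-7B": 70.3,
    "Qwen2.5 VL 7B Instruct": 69.6,
}

# Highest-scoring model and its score.
top_model = max(scores, key=scores.get)
top_score = scores[top_model]

# Mean of all scores, rounded to one decimal place (assumed display precision).
average_score = round(mean(scores.values()), 1)

# Models at or above the page's 80% "high performer" threshold.
high_performers = [name for name, score in scores.items() if score >= 80.0]

print(f"Top: {top_model} ({top_score}%)")
print(f"Average: {average_score}%")
print(f"High performers (80%+): {len(high_performers)}")
```

The unrounded mean is 70.975, which matches the 71.0% average shown above, and no listed model reaches the 80% threshold.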

Resources

Research Paper