MLVU-M

About

MLVU-M is a variant of the Multi-task Long Video Understanding Benchmark (MLVU) designed for multimodal evaluation. It extends the core MLVU framework to assess how well models understand and reason about long video content from multimodal inputs, across different video genres and temporal contexts.

Evaluation Stats

Total Models: 1

Organizations: 1

Verified Results: 0

Self-Reported: 1

Benchmark Details

Max Score: 1

Language

Performance Overview

Score distribution and top performers

Score Distribution

1 model

Top Score

74.6%

Average Score

74.6%

High Performers (80%+)

Top Organizations

#1 Alibaba Cloud / Qwen Team

1 model

74.6%

Leaderboard

1 model ranked by performance on MLVU-M

Rank	Model	Organization	Date	License	Score
#1	Qwen2.5 VL 72B Instruct	Alibaba Cloud / Qwen Team	Jan 26, 2025	tongyi-qianwen	74.6%