VideoMMMU

multimodal

About

Video-MMMU is a massive multimodal, multi-disciplinary video benchmark that evaluates how well large multimodal models acquire knowledge from educational videos across diverse academic disciplines. It comprises 300 expert-level videos and 900 human-annotated questions, and tests three abilities (perception, comprehension, and adaptation) to measure how effectively AI models learn from educational video content.

Evaluation Stats

Total Models: 4

Organizations: 2

Verified Results: 0

Self-Reported: 4

Benchmark Details

Max Score: 1

Language

Performance Overview

Score distribution and top performers

Score Distribution

4 models

Top Score

84.6%

Average Score

78.2%

High Performers (80%+)

3 models

Top Organizations

#1 Google

1 model

83.6%

#2 OpenAI

3 models

76.4%

Leaderboard

4 models ranked by performance on VideoMMMU

Rank	Model	Organization	Date	License	Score
#01	GPT-5	OpenAI	Aug 7, 2025	Proprietary	84.6%
#02	Gemini 2.5 Pro Preview 06-05	Google	Jun 5, 2025	Proprietary	83.6%
#03	o3	OpenAI	Apr 16, 2025	Proprietary	83.3%
#04	GPT-4o	OpenAI	Aug 6, 2024	Proprietary	61.2%
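
The summary figures in the Performance Overview can be rechecked from the four leaderboard rows. The sketch below is illustrative only: the names `LEADERBOARD` and `summarize` are hypothetical, and it assumes that "High Performers (80%+)" means a score of at least 80.0 and that each organization's figure is the plain mean of its models' scores, rounded to one decimal.

```python
from collections import defaultdict
from statistics import mean

# (model, organization, score in percent), copied from the leaderboard rows.
LEADERBOARD = [
    ("GPT-5", "OpenAI", 84.6),
    ("Gemini 2.5 Pro Preview 06-05", "Google", 83.6),
    ("o3", "OpenAI", 83.3),
    ("GPT-4o", "OpenAI", 61.2),
]


def summarize(rows, high_threshold=80.0):
    """Recompute the overview statistics from leaderboard rows.

    high_threshold is an assumption about how "80%+" is defined.
    """
    scores = [score for _, _, score in rows]

    # Group scores by organization so each one can be averaged.
    by_org = defaultdict(list)
    for _, org, score in rows:
        by_org[org].append(score)

    # Rank organizations by their mean score, highest first.
    org_averages = sorted(
        ((org, round(mean(vals), 1), len(vals)) for org, vals in by_org.items()),
        key=lambda item: item[1],
        reverse=True,
    )

    return {
        "top_score": max(scores),
        "average_score": round(mean(scores), 1),
        "high_performers": sum(score >= high_threshold for score in scores),
        "org_averages": org_averages,
    }


if __name__ == "__main__":
    for key, value in summarize(LEADERBOARD).items():
        print(f"{key}: {value}")
```

Under these assumptions the recomputed top score (84.6), average score (78.2), and organization averages (Google 83.6, OpenAI 76.4) agree with the figures shown on the page.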

Resources

Research Paper