MMMUval

multimodal

About

MMMUval is the validation split of the Massive Multi-discipline Multimodal Understanding (MMMU) benchmark and serves as the standard development set for evaluating multimodal models. It covers the same range of academic disciplines as the full MMMU benchmark and holds models to the same standards.

Evaluation Stats

Total Models: 2

Organizations: 2

Verified Results: 0

Self-Reported: 2

Benchmark Details

Max Score: 1

Language

Performance Overview

Score distribution and top performers

Score Distribution

2 models

Top Score

77.8%

Average Score

71.2%

High Performers (80%+): 0

Top Organizations

#1 Anthropic

1 model

77.8%

#2 Alibaba Cloud / Qwen Team

1 model

64.5%

Leaderboard

2 models ranked by performance on MMMUval

Rank	Model	Organization	Release Date	License	Score
#01	Claude Sonnet 4.5	Anthropic	Sep 29, 2025	Proprietary	77.8%
#02	Qwen2-VL-72B-Instruct	Alibaba Cloud / Qwen Team	Aug 29, 2024	tongyi-qianwen	64.5%
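
The summary figures in the Performance Overview can be checked against the two leaderboard rows. The sketch below is illustrative only: the `summarize` helper is a hypothetical name, and rounding the average half-up to one decimal place is an assumption about how the page displays it. `Decimal` is used so that 71.15 rounds deterministically rather than depending on binary floating-point representation.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores copied from the leaderboard, in percent.
scores = {
    "Claude Sonnet 4.5": Decimal("77.8"),
    "Qwen2-VL-72B-Instruct": Decimal("64.5"),
}


def summarize(scores, threshold=Decimal("80")):
    """Hypothetical helper: top score, mean score, and count at or above threshold.

    The mean is rounded half-up to one decimal place (an assumed display rule).
    """
    values = list(scores.values())
    mean = (sum(values) / len(values)).quantize(
        Decimal("0.1"), rounding=ROUND_HALF_UP
    )
    return {
        "top": max(values),
        "average": mean,
        "high_performers": sum(1 for v in values if v >= threshold),
    }


summary = summarize(scores)
print(summary)
```

Under these assumptions the result matches the listed top score of 77.8% and average of 71.2%, and no model reaches the 80% high-performer threshold.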

Resources

Research Paper