MMStar

multimodal
About

MMStar is a vision-indispensable multimodal benchmark of 1,500 challenge samples selected by human reviewers. It evaluates large vision-language models on tasks that cannot be solved without visual understanding, excluding questions that can be answered from the text alone. Because every sample depends on the image, the benchmark gives a more accurate assessment of a model's true multimodal capabilities.

Evaluation Stats
Total Models: 7
Organizations: 2
Verified Results: 0
Self-Reported: 7
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution (7 models)

Top Score: 70.8%
Average Score: 61.8%
High Performers (80%+): 0

Top Organizations

#1 Alibaba Cloud / Qwen Team: 4 models, 67.1%
#2 DeepSeek: 3 models, 54.7%
Leaderboard
7 models ranked by performance on MMStar
Date | License | Score
Jan 26, 2025 | tongyi-qianwen | 70.8%
Feb 28, 2025 | Apache 2.0 | 69.5%
Mar 27, 2025 | Apache 2.0 | 64.0%
Jan 26, 2025 | Apache 2.0 | 63.9%
Dec 13, 2024 | deepseek | 61.3%
Dec 13, 2024 | deepseek | 57.0%
Dec 13, 2024 | deepseek | 45.9%
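
The summary figures in the Performance Overview can be checked against the seven leaderboard scores. The sketch below is a minimal illustration, not the site's actual aggregation code. It assumes that the four rows licensed under tongyi-qianwen or Apache 2.0 belong to Alibaba Cloud / Qwen Team and the three rows licensed under deepseek belong to DeepSeek, which matches the model counts listed under Top Organizations. The names `ROWS` and `summarize` are hypothetical.

```python
from collections import defaultdict

# Scores (in percent) transcribed from the leaderboard rows.
# ASSUMPTION: the organization for each row is inferred from its license
# column; the page does not state this mapping explicitly.
ROWS = [
    ("Alibaba Cloud / Qwen Team", 70.8),
    ("Alibaba Cloud / Qwen Team", 69.5),
    ("Alibaba Cloud / Qwen Team", 64.0),
    ("Alibaba Cloud / Qwen Team", 63.9),
    ("DeepSeek", 61.3),
    ("DeepSeek", 57.0),
    ("DeepSeek", 45.9),
]


def summarize(rows, high_threshold=80.0):
    """Return top score, mean score, high-performer count, and per-org means."""
    scores = [score for _, score in rows]

    # Group scores by organization so each group can be averaged separately.
    by_org = defaultdict(list)
    for org, score in rows:
        by_org[org].append(score)

    return {
        "top": max(scores),
        "average": sum(scores) / len(scores),
        "high_performers": sum(1 for s in scores if s >= high_threshold),
        "org_average": {org: sum(v) / len(v) for org, v in by_org.items()},
    }


if __name__ == "__main__":
    stats = summarize(ROWS)
    print(f"Top score: {stats['top']:.1f}%")
    print(f"Average score: {stats['average']:.2f}%")
    print(f"High performers (80%+): {stats['high_performers']}")
    for org, avg in stats["org_average"].items():
        print(f"{org}: {avg:.2f}%")
```

Under these assumptions the computed means agree with the reported 61.8% overall average and the 67.1% and 54.7% organization averages to within rounding.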
Resources