MMStar
multimodal
About
MMStar is an elite vision-indispensable multimodal benchmark comprising 1,500 challenge samples selected by human reviewers. It evaluates large vision-language models on tasks that require visual understanding and excludes questions that can be answered from text alone. Because every sample depends on the image, the benchmark provides a more accurate assessment of a model's multimodal capabilities.
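The idea of excluding questions that can be answered from text alone can be illustrated with a small filter. This is a minimal sketch of one way such a check could work, not MMStar's actual curation pipeline: the sample fields (`id`, `question`, `answer`), the function `filter_vision_dependent`, and the stand-in answerer `knowledge_only` are all hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List

Sample = Dict[str, str]
# A text-only answerer sees the question but never the image (hypothetical interface).
TextOnlyAnswerer = Callable[[str], str]


def filter_vision_dependent(
    samples: List[Sample], answerers: List[TextOnlyAnswerer]
) -> List[Sample]:
    """Keep only samples that no text-only answerer gets right."""
    kept = []
    for sample in samples:
        solved_without_image = any(
            answer(sample["question"]) == sample["answer"] for answer in answerers
        )
        if not solved_without_image:
            kept.append(sample)
    return kept


def knowledge_only(question: str) -> str:
    """Toy stand-in for a language model answering from world knowledge."""
    return "B" if "capital of France" in question else "A"


samples = [
    {"id": "q1", "question": "What is the capital of France? A) Rome B) Paris", "answer": "B"},
    {"id": "q2", "question": "What color is the car in the image? A) Red B) Blue", "answer": "B"},
]

kept = filter_vision_dependent(samples, [knowledge_only])
print([s["id"] for s in kept])  # ['q2']
```

In practice the answerers would be real language models queried without the image, and a surviving sample would still need human review, since a model missing a question without the image does not prove the image is required.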
Evaluation Stats
Total Models: 7
Organizations: 2
Verified Results: 0
Self-Reported: 7
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 7 models
Top Score: 70.8%
Average Score: 61.8%
High Performers (80%+): 0
Top Organizations
#1 Alibaba Cloud / Qwen Team: 4 models, 67.1%
#2 DeepSeek: 3 models, 54.7%
Leaderboard
7 models ranked by performance on MMStar
Date | License | Score
---|---|---
Jan 26, 2025 | tongyi-qianwen | 70.8%
Feb 28, 2025 | Apache 2.0 | 69.5%
Mar 27, 2025 | Apache 2.0 | 64.0%
Jan 26, 2025 | Apache 2.0 | 63.9%
Dec 13, 2024 | deepseek | 61.3%
Dec 13, 2024 | deepseek | 57.0%
Dec 13, 2024 | deepseek | 45.9%
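
The summary figures in the Performance Overview can be reproduced from the seven leaderboard scores. The sketch below is one way to do that. It assumes that the per-organization figures are plain means, and that the four rows licensed `tongyi-qianwen` or Apache 2.0 belong to Alibaba Cloud / Qwen Team while the three `deepseek` rows belong to DeepSeek. The page does not state this mapping; it is inferred from the model counts. `Decimal` with half-up rounding is used so that a mean of exactly 67.05 rounds to 67.1 rather than being affected by binary floating point.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP

# (organization, score in percent). The organization column is an ASSUMPTION
# inferred from the license column and the per-organization model counts.
ROWS = [
    ("Alibaba Cloud / Qwen Team", Decimal("70.8")),
    ("Alibaba Cloud / Qwen Team", Decimal("69.5")),
    ("Alibaba Cloud / Qwen Team", Decimal("64.0")),
    ("Alibaba Cloud / Qwen Team", Decimal("63.9")),
    ("DeepSeek", Decimal("61.3")),
    ("DeepSeek", Decimal("57.0")),
    ("DeepSeek", Decimal("45.9")),
]


def mean_pct(values):
    """Arithmetic mean rounded half-up to one decimal place."""
    mean = sum(values) / len(values)
    return mean.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


scores = [score for _, score in ROWS]
top_score = max(scores)
average_score = mean_pct(scores)
high_performers = sum(1 for s in scores if s >= Decimal("80"))

# Group scores by organization, then average each group.
by_org = defaultdict(list)
for org, score in ROWS:
    by_org[org].append(score)
org_average = {org: mean_pct(vals) for org, vals in by_org.items()}

print(f"Top score: {top_score}%")          # Top score: 70.8%
print(f"Average score: {average_score}%")  # Average score: 61.8%
print(f"High performers (80%+): {high_performers}")  # High performers (80%+): 0
for org, avg in org_average.items():
    print(f"{org}: {len(by_org[org])} models, {avg}%")
```

Under these assumptions the output matches the page: a 61.8% overall average, 67.1% for the four Qwen models, and 54.7% for the three DeepSeek models.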