MMBench-V1.1
Multilingual
multimodal
About
MMBench-V1.1 is an updated version of the MMBench benchmark for evaluating vision-language models, with revised evaluation protocols and a more diverse question set. It incorporates community feedback, refined evaluation metrics, and additional bilingual visual reasoning tasks, aiming for a more comprehensive and accurate assessment of multimodal capabilities.
Evaluation Stats
Total Models: 4
Organizations: 2
Verified Results: 0
Self-Reported: 4
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution (4 models)
Top Score: 81.8%
Average Score: 77.2%
High Performers (80%+): 1
Top Organizations
#1 Alibaba Cloud / Qwen Team: 1 model, 81.8%
#2 DeepSeek: 3 models, 75.6%
Leaderboard
4 models ranked by performance on MMBench-V1.1
Release Date | License | Score |
---|---|---|
Mar 27, 2025 | Apache 2.0 | 81.8% |
Dec 13, 2024 | deepseek | 79.3% |
Dec 13, 2024 | deepseek | 79.2% |
Dec 13, 2024 | deepseek | 68.3% |
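
The summary figures in the Performance Overview can be reproduced from the four leaderboard scores. The sketch below is illustrative only: the function names are hypothetical, and it assumes that averages are rounded half-up to one decimal place and that the three rows with the `deepseek` license are the three DeepSeek models. The page states neither of these explicitly.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores copied from the leaderboard rows, in percent.
# Decimal avoids binary floating-point surprises when rounding values like 77.15.
SCORES = [Decimal("81.8"), Decimal("79.3"), Decimal("79.2"), Decimal("68.3")]


def one_decimal(value):
    """Round half-up to one decimal place (assumed rounding rule)."""
    return value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def summarize(scores, threshold=Decimal("80")):
    """Return the count, top score, mean, and number of scores at or above threshold."""
    return {
        "models": len(scores),
        "top": max(scores),
        "average": one_decimal(sum(scores) / len(scores)),
        "high_performers": sum(1 for s in scores if s >= threshold),
    }


summary = summarize(SCORES)
print(summary)

# Assumption: the last three rows are the DeepSeek models.
deepseek_average = one_decimal(sum(SCORES[1:]) / len(SCORES[1:]))
print(deepseek_average)
```

Under those assumptions the computed values agree with the figures reported above: a top score of 81.8%, an overall average of 77.2%, one model at or above 80%, and a DeepSeek average of 75.6%.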