MMBench-V1.1

Tags: Multilingual, Multimodal
About

MMBench v1.1 is an updated version of the MMBench benchmark featuring enhanced evaluation protocols and expanded question diversity for vision-language model assessment. This iteration incorporates improvements based on community feedback, refined evaluation metrics, and additional bilingual visual reasoning tasks to provide more comprehensive and accurate evaluation of multimodal capabilities.

Evaluation Stats
Total Models: 4
Organizations: 2
Verified Results: 0
Self-Reported: 4
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution (4 models)
Top Score: 81.8%
Average Score: 77.2%
High Performers (80%+): 1

Top Organizations

#1 Alibaba Cloud / Qwen Team (1 model): 81.8%
#2 DeepSeek (3 models): 75.6%
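The headline numbers in the two panels above are simple aggregates of the four leaderboard scores. A minimal sketch reproducing them in plain Python (the variable names are illustrative and not part of any MMBench tooling):

```python
# Sanity-check the summary statistics against the four self-reported scores.
# The score lists below are copied from the leaderboard on this page.
scores = [81.8, 79.3, 79.2, 68.3]      # all four models
deepseek_scores = [79.3, 79.2, 68.3]   # the three DeepSeek entries

top = max(scores)                                # 81.8 -> "Top Score"
avg = sum(scores) / len(scores)                  # 77.15, shown rounded as 77.2%
high_performers = sum(s >= 80 for s in scores)   # 1 model scores 80% or above
deepseek_avg = sum(deepseek_scores) / len(deepseek_scores)  # 75.6 for #2

print(f"top={top} avg={avg:.2f} high={high_performers} deepseek_avg={deepseek_avg:.2f}")
```

Note that the displayed 77.2% average is the exact mean 77.15 rounded to one decimal place.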
Leaderboard
4 models ranked by performance on MMBench-V1.1

Date          License     Score
Mar 27, 2025  Apache 2.0  81.8%
Dec 13, 2024  deepseek    79.3%
Dec 13, 2024  deepseek    79.2%
Dec 13, 2024  deepseek    68.3%
Resources