MME
multimodal
About
MME (Multimodal Large Language Model Evaluation) is a comprehensive benchmark that measures both perception and cognition abilities across 14 subtasks. All instruction-answer pairs are manually designed to avoid data leakage, and the benchmark has been used to evaluate 30+ advanced MLLMs on tasks ranging from basic visual recognition to complex reasoning. The results reveal significant room for improvement in current multimodal models and provide quantitative analysis to guide model optimization.
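
The MME protocol scores each subtask with two metrics over yes/no questions: accuracy per question, and accuracy+ per image, where an image counts only if both of its paired questions are answered correctly; the two are summed into the subtask score. The sketch below illustrates that scoring under the assumption of a simple `(image_id, prediction, ground_truth)` record format; the function name and record layout are hypothetical and not part of this page.

```python
from collections import defaultdict

def mme_subtask_score(records):
    """Compute MME-style acc and acc+ for one subtask.

    records: list of (image_id, prediction, ground_truth) tuples,
    where prediction and ground_truth are "yes" or "no" strings.
    Each image contributes two questions in MME.
    (Hypothetical record format, for illustration only.)
    """
    per_image = defaultdict(list)
    for image_id, pred, gt in records:
        per_image[image_id].append(pred.strip().lower() == gt.strip().lower())

    total_questions = sum(len(v) for v in per_image.values())
    correct_questions = sum(sum(v) for v in per_image.values())

    acc = 100.0 * correct_questions / total_questions                 # per-question accuracy
    acc_plus = 100.0 * sum(all(v) for v in per_image.values()) / len(per_image)  # both questions correct

    return acc + acc_plus  # subtask score, max 200

if __name__ == "__main__":
    demo = [
        ("img1", "yes", "yes"), ("img1", "no", "no"),  # both correct
        ("img2", "yes", "no"),  ("img2", "no", "no"),  # one wrong
    ]
    print(mme_subtask_score(demo))  # 75.0 acc + 50.0 acc+ = 125.0
```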
Evaluation Stats
Total Models: 3
Organizations: 1
Verified Results: 0
Self-Reported: 3
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 3 models
Top Score: 22.5%
Average Score: 21.0%
High Performers (80%+): 0

Top Organizations
#1 DeepSeek: 3 models, 21.0% average score
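
As a quick sanity check, the average shown above can be approximated from the three self-reported scores in the leaderboard below; the dashboard's 21.0% figure presumably averages the unrounded underlying scores, so the rounded inputs here give a slightly lower value.

```python
scores = [22.5, 21.2, 19.1]  # rounded self-reported scores from the leaderboard below
print(f"{sum(scores) / len(scores):.1f}%")  # ~20.9%, displayed as 21.0% on this page
```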
Leaderboard
3 models ranked by performance on MME
| Date | Organization | Score | License | Links |
|---|---|---|---|---|
| Dec 13, 2024 | deepseek | 22.5% | | |
| Dec 13, 2024 | deepseek | 21.2% | | |
| Dec 13, 2024 | deepseek | 19.1% | | |