MMT-Bench
multimodal
About
MMT-Bench is a comprehensive multimodal benchmark of 31,325 curated multiple-choice visual questions designed to assess Large Vision-Language Models (LVLMs) on a wide range of multimodal tasks that require expert knowledge. It covers 32 core meta-tasks and 162 subtasks, spanning scenarios such as vehicle driving and embodied navigation, and evaluates visual recognition, localization, reasoning, and planning capabilities, with the goal of advancing general-purpose multimodal intelligence.
Evaluation Stats
Total Models: 4
Organizations: 2
Verified Results: 0
Self-Reported: 4
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 4 models
Top Score: 63.6%
Average Score: 60.8%
High Performers (80%+): 0
Top Organizations
#1 Alibaba Cloud / Qwen Team: 1 model, 63.6%
#2 DeepSeek: 3 models, 59.9%
Leaderboard
4 models ranked by performance on MMT-Bench
Date | License | Score |
---|---|---|
Dec 13, 2024 | deepseek | 63.6% |
Jan 26, 2025 | Apache 2.0 | 63.6% |
Dec 13, 2024 | deepseek | 62.9% |
Dec 13, 2024 | deepseek | 53.2% |
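
The summary figures in the Performance Overview appear to follow directly from the four leaderboard scores. The sketch below recomputes them as a sanity check. It assumes that the three rows with the `deepseek` license belong to DeepSeek and that the Apache 2.0 row belongs to Alibaba Cloud / Qwen Team; the model names are not present in the table, so this mapping is an inference. It also assumes that the per-organization figure is a simple mean.

```python
from statistics import mean

# Scores in percent, copied from the leaderboard rows.
# ASSUMPTION: organization is inferred from the license column
# ("deepseek" -> DeepSeek, "Apache 2.0" -> Alibaba Cloud / Qwen Team).
rows = [
    ("DeepSeek", 63.6),
    ("Alibaba Cloud / Qwen Team", 63.6),
    ("DeepSeek", 62.9),
    ("DeepSeek", 53.2),
]

scores = [score for _, score in rows]
top_score = max(scores)
average_score = round(mean(scores), 1)
high_performers = sum(score >= 80.0 for score in scores)

# Group scores by organization, then take a simple mean per group.
by_org = {}
for org, score in rows:
    by_org.setdefault(org, []).append(score)
org_means = {org: round(mean(vals), 1) for org, vals in by_org.items()}

print(top_score, average_score, high_performers)
print(org_means)
```

Under these assumptions the recomputed values agree with the page: a top score of 63.6%, an average of 60.8%, no models at 80% or above, and a DeepSeek mean of 59.9% across its three entries.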