VideoMME w sub.

multimodal
About

VideoMME w/ Sub is the subtitle-enhanced variant of the VideoMME benchmark: textual captions and subtitles are supplied alongside each video. It tests how well a model integrates that text with the visual and audio content when analyzing a video and answering questions about it.

Evaluation Stats
Total Models: 4
Organizations: 2
Verified Results: 0
Self-Reported: 4
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

4 models
Top Score: 86.7%
Average Score: 77.2%
High Performers (80%+): 1

Top Organizations

#1 OpenAI: 1 model, 86.7%
#2 Alibaba Cloud / Qwen Team: 3 models, 74.0%
Leaderboard
4 models ranked by performance on VideoMME w sub.
Date           License       Score
Aug 7, 2025    Proprietary   86.7%
Feb 28, 2025   Apache 2.0    77.9%
Mar 27, 2025   Apache 2.0    72.4%
Jan 26, 2025   Apache 2.0    71.6%
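The summary figures in the Performance Overview appear to follow directly from the four leaderboard scores. The sketch below recomputes them as a sanity check. It rests on assumptions: the organization attached to each row is inferred from the Top Organizations counts (the proprietary entry as OpenAI, the three Apache 2.0 entries as Alibaba Cloud / Qwen Team), each organization's figure is taken to be a plain mean of its models, and the half-up rounding to one decimal is a guess at how the page rounds. The names `ROWS`, `mean_pct`, and `summarize` are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# (organization, score in percent). The organization labels are an
# assumption inferred from the per-organization model counts.
ROWS = [
    ("OpenAI", Decimal("86.7")),
    ("Alibaba Cloud / Qwen Team", Decimal("77.9")),
    ("Alibaba Cloud / Qwen Team", Decimal("72.4")),
    ("Alibaba Cloud / Qwen Team", Decimal("71.6")),
]


def mean_pct(scores):
    """Mean of percentage scores, rounded half-up to one decimal place."""
    avg = sum(scores, Decimal(0)) / len(scores)
    return avg.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def summarize(rows):
    """Recompute the overview statistics from the leaderboard rows."""
    scores = [score for _, score in rows]
    by_org = {}
    for org, score in rows:
        by_org.setdefault(org, []).append(score)
    return {
        "top": max(scores),
        "average": mean_pct(scores),
        "high_performers": sum(1 for s in scores if s >= 80),
        "org_average": {org: mean_pct(v) for org, v in by_org.items()},
    }


summary = summarize(ROWS)
for key, value in summary.items():
    print(key, value)
```

Under these assumptions the recomputed values match the page: a 77.2% overall average, one model at or above 80%, and a 74.0% mean for the three Alibaba Cloud / Qwen Team entries. Decimal arithmetic is used so that the 77.15 midpoint rounds predictably rather than depending on binary floating-point representation.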
Resources