QvQ-72B-Preview
Multimodal
Zero-eval
#1 OlympiadBench
#3 MathVision
by Alibaba Cloud / Qwen Team
About
QvQ-72B-Preview is an experimental visual question answering model from the Qwen team that combines vision and language understanding for complex reasoning tasks. Built on Qwen2-VL-72B-Instruct, it represents Qwen's exploration of models that can analyze and reason about visual information.
Timeline
Announced: Dec 25, 2024
Released: Dec 25, 2024
Specifications
Capabilities
Multimodal
License & Family
License: Qwen
Base Model: Qwen2-VL-72B-Instruct
Performance Overview
Performance metrics and category breakdown
Overall Performance
Benchmarks: 4
Average Score: 49.5%
Best Score: 71.4%
High Performers (80%+): 0
All Benchmark Results for QvQ-72B-Preview
Complete list of benchmark scores with detailed information
| Benchmark | Category | Score | Percentage | Source |
| --- | --- | --- | --- | --- |
| MathVista | multimodal | 0.71 | 71.4% | Self-reported |
| MMMU | multimodal | 0.70 | 70.3% | Self-reported |
| MathVision | multimodal | 0.36 | 35.9% | Self-reported |
| OlympiadBench | multimodal | 0.20 | 20.4% | Self-reported |
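
The summary figures in the Overall Performance section can be checked against the four self-reported percentages listed here. The sketch below assumes the page's "Average Score" is a simple unweighted mean and that "High Performers" counts scores at or above 80%. The function name `summarize` and the `high_threshold` parameter are illustrative and do not come from the page.

```python
# Self-reported benchmark percentages copied from the results table.
scores = {
    "MathVista": 71.4,
    "MMMU": 70.3,
    "MathVision": 35.9,
    "OlympiadBench": 20.4,
}


def summarize(scores, high_threshold=80.0):
    """Return average, best score, and count of scores >= high_threshold.

    Assumption: the average is an unweighted mean, rounded to one decimal.
    """
    values = list(scores.values())
    average = round(sum(values) / len(values), 1)
    best = max(values)
    high_performers = sum(1 for v in values if v >= high_threshold)
    return {
        "average": average,
        "best": best,
        "high_performers": high_performers,
    }


summary = summarize(scores)
print(summary)
```

Under these assumptions the result matches the reported average of 49.5%, the best score of 71.4%, and zero benchmarks at 80% or higher.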