MathVista
multimodal
About
MathVista is a benchmark for evaluating mathematical reasoning in visual contexts. It contains 6,141 examples drawn from 28 existing multimodal datasets and 3 newly created ones (IQTest, FunctionQA, and PaperQA). The tasks test a model's ability to reason mathematically over charts, plots, geometric figures, and scientific diagrams.
Evaluation Stats
Total Models: 35
Organizations: 10
Verified Results: 0
Self-Reported: 33
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution
35 models
Top Score
86.8%
Average Score
62.6%
High Performers (80%+)
2
Top Organizations
#1 Moonshot AI
1 model
74.9%
#2 Alibaba Cloud / Qwen Team
2 models
69.7%
#3 Anthropic
1 model
67.7%
#4 Mistral AI
3 models
64.8%
#5 OpenAI
11 models
63.5%
Leaderboard
35 models ranked by performance on MathVista
Date | License | Score
---|---|---
Apr 16, 2025 | Proprietary | 86.8%
Apr 16, 2025 | Proprietary | 84.3%
Jan 20, 2025 | Proprietary | 74.9%
Apr 5, 2025 | Llama 4 Community License Agreement | 73.7%
Apr 14, 2025 | Proprietary | 73.1%
Feb 27, 2025 | Proprietary | 72.3%
Apr 14, 2025 | Proprietary | 72.2%
Dec 17, 2024 | Proprietary | 71.8%
Dec 25, 2024 | Qwen | 71.4%
Apr 5, 2025 | Llama 4 Community License Agreement | 70.7%
Showing 1 to 10 of 35 models