MATH-500
About
MATH-500 is a curated subset of 500 problems from the MATH benchmark, spanning probability, algebra, trigonometry, and geometry. The smaller set allows efficient evaluation of a model's mathematical reasoning across multiple domains while retaining the competition-level difficulty of the full benchmark.
Evaluation Stats
Total Models: 25
Organizations: 9
Verified Results: 0
Self-Reported: 25
Benchmark Details
Max Score: 1
Language: en
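The maximum score of 1 and the percentages shown elsewhere on this page are consistent with scores being stored as a fraction in [0, 1] and displayed as a percentage. The sketch below illustrates that reading under one assumption not stated on this page: that a score is the share of the 500 problems answered correctly on a single run. The function names `accuracy` and `as_percent` are hypothetical. Some reported figures may instead be averages over several sampled runs, which this sketch does not model.

```python
NUM_PROBLEMS = 500  # MATH-500 contains 500 problems


def accuracy(num_correct: int, total: int = NUM_PROBLEMS) -> float:
    """Return the fraction of problems answered correctly, in [0, 1].

    Assumption: one scored attempt per problem (hypothetical scoring rule).
    """
    if not 0 <= num_correct <= total:
        raise ValueError("num_correct must be between 0 and total")
    return num_correct / total


def as_percent(score: float) -> str:
    """Format a [0, 1] score as a percentage with one decimal place."""
    return f"{score * 100:.1f}%"


if __name__ == "__main__":
    # 491 correct answers out of 500 corresponds to a score of 0.982.
    print(as_percent(accuracy(491)))
```

Under this assumption, each additional correct answer moves the displayed score by 0.2 percentage points.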
Performance Overview
Score distribution and top performers
Score Distribution: 25 models
Top Score: 98.2%
Average Score: 92.1%
High Performers (80%+): 23
Top Organizations
#1 Zhipu AI: 2 models, 98.2%
#2 Moonshot AI: 3 models, 97.0%
#3 NVIDIA: 4 models, 96.7%
#4 Anthropic: 1 model, 96.2%
#5 Microsoft: 1 model, 94.6%
Leaderboard
25 models ranked by performance on MATH-500
| Date | License | Score |
|---|---|---|
| Jul 28, 2025 | MIT | 98.2% |
| Jul 28, 2025 | MIT | 98.1% |
| Aug 18, 2025 | NVIDIA Open Model License Agreement | 97.8% |
| Jul 11, 2025 | MIT | 97.4% |
| Sep 5, 2025 | MIT | 97.4% |
| Apr 7, 2025 | Llama 3.1 Community License | 97.0% |
| Mar 18, 2025 | Llama 3.1 Community License | 96.6% |
| Feb 24, 2025 | Proprietary | 96.2% |
| Jan 20, 2025 | Proprietary | 96.2% |
| Jan 20, 2025 | MIT | 95.9% |
Showing 1 to 10 of 25 models