AIME 2024
About
The AIME 2024 benchmark evaluates AI models' mathematical reasoning on 15 problems from the 2024 American Invitational Mathematics Examination. This challenging test requires step-by-step problem solving across algebra, geometry, and number theory, with each answer being an integer from 000 to 999. The AIME is the exam that qualifies top high school students for the USAMO, so strong performance reflects olympiad-level mathematical capability. The benchmark uses exact-match scoring across multiple runs to assess advanced logical reasoning.
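A minimal sketch of how exact-match scoring averaged over runs could be computed, assuming answers are compared as zero-padded three-digit strings; the function names and data layout here are illustrative assumptions, not the benchmark's actual evaluation pipeline:

```python
# Hypothetical sketch of exact-match AIME scoring, averaged over runs.
# Assumes each run is a list of predicted answers aligned with the references.

def normalize(answer: str) -> str:
    """Normalize an AIME answer to a zero-padded three-digit string (000-999)."""
    return f"{int(answer):03d}"

def score_run(predictions: list[str], references: list[str]) -> float:
    """Fraction of problems answered exactly correctly in one run."""
    correct = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return correct / len(references)

def benchmark_score(runs: list[list[str]], references: list[str]) -> float:
    """Average exact-match accuracy across multiple runs, as a percentage."""
    return 100.0 * sum(score_run(run, references) for run in runs) / len(runs)
```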
Evaluation Stats
Total Models: 45
Organizations: 11
Verified Results: 0
Self-Reported: 45
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution: 45 models
Top Score: 95.8%
Average Score: 73.2%
High Performers (80%+): 24

Top Organizations
#1 xAI: 2 models, 94.5%
#2 Zhipu AI: 2 models, 90.2%
#3 Google: 3 models, 84.4%
#4 IBM: 2 models, 81.2%
#5 Anthropic: 1 model, 80.0%
Leaderboard
45 models ranked by performance on AIME 2024
Release Date | License | Score
---|---|---
Feb 17, 2025 | Proprietary | 95.8%
Apr 16, 2025 | Proprietary | 93.4%
Feb 17, 2025 | Proprietary | 93.3%
May 20, 2025 | Proprietary | 92.0%
Apr 16, 2025 | Proprietary | 91.6%
May 28, 2025 | MIT | 91.4%
Jul 28, 2025 | MIT | 91.0%
Jul 28, 2025 | MIT | 89.4%
May 20, 2025 | Proprietary | 88.0%
Jan 30, 2025 | Proprietary | 87.3%
Showing 1 to 10 of 45 models
...