FrontierMath
About
FrontierMath is an exceptionally challenging mathematical reasoning benchmark created by Epoch AI. It consists of hundreds of original, unpublished mathematics problems crafted by expert mathematicians. The problems span advanced mathematical domains and demand deep insight and creativity, testing mathematical reasoning at the frontier of what current AI systems can do.
Evaluation Stats
Total Models: 6
Organizations: 1
Verified Results: 0
Self-Reported: 6
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 6 models
Top Score: 26.3%
Average Score: 14.8%
High Performers (80%+): 0
Top Organizations
#1 OpenAI: 6 models, 14.8%
Leaderboard
6 models ranked by performance on FrontierMath
Rank | Date | License | Score
---|---|---|---
1 | Aug 7, 2025 | Proprietary | 26.3%
2 | Aug 7, 2025 | Proprietary | 22.1%
3 | Apr 16, 2025 | Proprietary | 15.8%
4 | Aug 7, 2025 | Proprietary | 9.6%
5 | Jan 30, 2025 | Proprietary | 9.2%
6 | Dec 17, 2024 | Proprietary | 5.5%
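
The summary figures in the Performance Overview (top score, average score, and the count of 80%+ performers) can be cross-checked against the six leaderboard scores. The following is a minimal sketch, not the site's actual computation: the variable names are illustrative, and it assumes the average is a plain mean rounded half-up to one decimal place. `Decimal` is used so the rounding is not affected by binary floating-point error.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores (in percent) copied from the six leaderboard rows.
scores = [Decimal(s) for s in ("26.3", "22.1", "15.8", "9.6", "9.2", "5.5")]

# The page's "High Performers (80%+)" cutoff.
HIGH_PERFORMER_THRESHOLD = Decimal("80")

top_score = max(scores)

# Assumption: the page reports a simple mean, rounded half-up to one decimal.
average_score = (sum(scores) / len(scores)).quantize(
    Decimal("0.1"), rounding=ROUND_HALF_UP
)

high_performers = sum(1 for s in scores if s >= HIGH_PERFORMER_THRESHOLD)

print(f"Top score: {top_score}%")
print(f"Average score: {average_score}%")
print(f"High performers (80%+): {high_performers}")
```

Under these assumptions the sketch gives a top score of 26.3%, an average of 14.8% (the exact mean is 14.75%), and zero high performers, which agrees with the figures reported on the page.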