FrontierMath

About

FrontierMath is an exceptionally challenging mathematical reasoning benchmark made up of hundreds of original, unpublished mathematics problems written by expert mathematicians. Created by Epoch AI, it tests mathematical reasoning at the frontier of current AI capabilities, with problems that require deep mathematical insight and creativity.

Evaluation Stats

Total Models: 6

Organizations: 1

Verified Results: 0

Self-Reported: 6

Benchmark Details

Max Score: 1

Language

Performance Overview

Score distribution and top performers

Score Distribution

6 models

Top Score: 26.3%

Average Score: 14.8%

High Performers (80%+)

Top Organizations

#1 OpenAI (6 models, 14.8%)

Leaderboard

6 models ranked by performance on FrontierMath

Rank	Model	Organization	Date	License	Score
#01	GPT-5	OpenAI	Aug 7, 2025	Proprietary	26.3%
#02	GPT-5 mini	OpenAI	Aug 7, 2025	Proprietary	22.1%
#03	o3	OpenAI	Apr 16, 2025	Proprietary	15.8%
#04	GPT-5 nano	OpenAI	Aug 7, 2025	Proprietary	9.6%
#05	o3-mini	OpenAI	Jan 30, 2025	Proprietary	9.2%
#06	o1	OpenAI	Dec 17, 2024	Proprietary	5.5%
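
The summary figures in the Performance Overview can be reproduced from the leaderboard rows. The following is a minimal sketch, not the site's actual implementation: the `summarize` helper, the half-up rounding to one decimal place, and the reading of "High Performers (80%+)" as a count of scores at or above 80 are all assumptions made for illustration. Scores are kept as `Decimal` values so that the average of 14.75 rounds predictably.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores in percent, copied from the leaderboard rows.
scores = {
    "GPT-5": Decimal("26.3"),
    "GPT-5 mini": Decimal("22.1"),
    "o3": Decimal("15.8"),
    "GPT-5 nano": Decimal("9.6"),
    "o3-mini": Decimal("9.2"),
    "o1": Decimal("5.5"),
}


def summarize(scores, high_threshold=Decimal("80")):
    """Hypothetical helper: derive the overview statistics from raw scores."""
    top_model = max(scores, key=scores.get)
    average = sum(scores.values()) / len(scores)
    return {
        "total_models": len(scores),
        "top_model": top_model,
        "top_score": scores[top_model],
        # Assumption: the page rounds half-up to one decimal place.
        "average_score": average.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP),
        # Assumption: "high performer" means a score at or above the threshold.
        "high_performers": sum(1 for s in scores.values() if s >= high_threshold),
    }


summary = summarize(scores)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under these assumptions the six scores sum to 88.5, so the unrounded mean is 14.75, which matches the reported 14.8% average after rounding.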

Resources

Research Paper