OpenAI

o1

#1 GPQA Physics
#1 GPQA Biology
#1 GPQA Chemistry
+3 more

by OpenAI

About

o1 was developed as part of OpenAI's reasoning-focused model series, designed to spend more time thinking before responding. Built to excel at complex reasoning tasks in science, coding, and mathematics, it employs extended internal reasoning processes to solve harder problems than traditional language models through careful step-by-step analysis.
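For context, below is a minimal sketch of querying o1 through the OpenAI Python SDK. The prompt text and the token limit are illustrative assumptions, not values taken from this page.

```python
# Minimal sketch: calling o1 via the OpenAI Python SDK (assumes OPENAI_API_KEY is set).
from openai import OpenAI

client = OpenAI()

# o1 performs its extended reasoning internally before producing the visible answer.
response = client.chat.completions.create(
    model="o1",
    messages=[
        {"role": "user", "content": "A train travels 120 km in 90 minutes. What is its average speed in km/h?"},
    ],
    max_completion_tokens=2000,  # o1 models take max_completion_tokens rather than max_tokens
)

print(response.choices[0].message.content)
```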

Pricing Range
Input (per 1M tokens): $15.00
Output (per 1M tokens): $60.00
Providers: 2
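As a quick illustration of how these rates translate into per-request cost, here is a small helper; the token counts in the example are made-up assumptions.

```python
# Rough cost estimate at o1's listed rates: $15.00 per 1M input tokens, $60.00 per 1M output tokens.
INPUT_RATE_PER_TOKEN = 15.00 / 1_000_000
OUTPUT_RATE_PER_TOKEN = 60.00 / 1_000_000

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    return input_tokens * INPUT_RATE_PER_TOKEN + output_tokens * OUTPUT_RATE_PER_TOKEN

# Example: a 2,000-token prompt with a 5,000-token response (hypothetical sizes).
# Note that o1's hidden reasoning tokens are billed as output tokens.
print(f"${estimate_cost(2_000, 5_000):.4f}")  # -> $0.3300
```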
Timeline
Announced: Dec 17, 2024
Released: Dec 17, 2024
License & Family
License: Proprietary
Performance Overview
Performance metrics and category breakdown

Overall Performance (19 benchmarks)
Average Score: 71.6%
Best Score: 97.1%
High Performers (80%+): 7
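These summary figures are simple aggregates over the per-benchmark scores listed further down. A sketch of how they could be derived is below, using a hypothetical subset of scores rather than the full 19-entry set.

```python
# Sketch: deriving the overview statistics from a list of benchmark scores (fractions of 1.0).
# The dictionary here is a hypothetical subset, not the full 19 benchmarks on this page.
scores = {"GSM8k": 0.971, "MATH": 0.964, "GPQA": 0.780, "AIME 2024": 0.743}

average = sum(scores.values()) / len(scores)
best = max(scores.values())
high_performers = sum(1 for s in scores.values() if s >= 0.80)

print(f"Average Score: {average:.1%}")
print(f"Best Score: {best:.1%}")
print(f"High Performers (80%+): {high_performers}")
```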

Performance Metrics
Max Context Window: 300.0K
Avg Throughput: 41.0 tok/s
Avg Latency: 8ms
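Throughput and latency figures like these can be combined into a rough end-to-end time estimate, as sketched below; the response length used in the example is an assumed value.

```python
# Rough wall-clock estimate from the listed averages: 41.0 tok/s throughput, 8 ms latency.
THROUGHPUT_TOK_PER_S = 41.0
LATENCY_S = 0.008

def estimate_seconds(output_tokens: int) -> float:
    """Latency to first token plus time to stream the remaining output."""
    return LATENCY_S + output_tokens / THROUGHPUT_TOK_PER_S

# Example: a 1,000-token response (hypothetical length).
print(f"{estimate_seconds(1_000):.1f} s")  # -> roughly 24.4 s
```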
All Benchmark Results for o1
Complete list of benchmark scores with detailed information
Benchmark      Modality     Score   Source
GSM8k          text         97.1%   Self-reported
MATH           text         96.4%   Self-reported
GPQA Physics   text         92.8%   Self-reported
MMLU           text         91.8%   Self-reported
MGSM           text         89.3%   Self-reported
HumanEval      text         88.1%   Self-reported
MMMLU          text         87.7%   Self-reported
GPQA           text         78.0%   Self-reported
MMMU           multimodal   77.6%   Self-reported
AIME 2024      text         74.3%   Self-reported

Showing 1 to 10 of 19 benchmarks