o3-mini
#1 MATH · #1 IFEval · #1 LiveBench · +8 more
by OpenAI
About
o3-mini was created as an efficient variant of the o3 reasoning model, designed to deliver advanced reasoning capabilities at reduced computational cost. Built to make next-generation reasoning accessible to a broader range of applications, it balances analytical depth with practical speed and cost considerations.
Pricing
Input (per 1M tokens): $1.10
Output (per 1M tokens): $4.40
Providers: 2
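
As a quick illustration of how the listed per-token rates translate into request costs, here is a minimal Python sketch. The token counts and the `request_cost` helper are hypothetical; only the $1.10 and $4.40 per-million prices come from the card above.

```python
# Rough cost estimate for a single o3-mini request at the listed rates.
# The token counts below are hypothetical; only the per-million prices
# come from the pricing card above.
INPUT_PRICE_PER_M = 1.10   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 4.40  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-1M-token rates."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: a 2,000-token prompt with a 1,500-token completion.
# Note that reasoning models typically bill hidden reasoning tokens as
# output, so real costs may be higher than this sketch suggests.
print(f"${request_cost(2_000, 1_500):.4f}")  # ≈ $0.0088
```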
Timeline
Announced: Jan 30, 2025
Released: Jan 30, 2025
Knowledge Cutoff: Sep 30, 2023
License & Family
License: Proprietary
Performance Overview
Performance metrics and category breakdown
Overall Performance (26 benchmarks)
Average Score: 56.9%
Best Score: 98.7%
High Performers (80%+): 8

Performance Metrics
Max Context Window: 300.0K tokens
Avg Throughput: 115.0 tok/s
Avg Latency: 5 ms
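
For a rough sense of what these figures imply for generation time, the sketch below assumes the listed latency is time-to-first-token and the throughput is a steady decode rate; both are simplifying assumptions, and real behaviour varies by provider and load.

```python
# Back-of-the-envelope generation-time estimate from the metrics above.
# Assumes latency = time to first token and throughput = steady decode rate.
AVG_LATENCY_S = 0.005        # 5 ms average latency
AVG_THROUGHPUT_TPS = 115.0   # average tokens per second

def estimated_seconds(output_tokens: int) -> float:
    """Approximate wall-clock time to stream `output_tokens` tokens."""
    return AVG_LATENCY_S + output_tokens / AVG_THROUGHPUT_TPS

print(f"{estimated_seconds(1_000):.1f} s")  # ≈ 8.7 s for 1,000 output tokens
```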
All Benchmark Results for o3-mini
Complete list of benchmark scores with detailed information
| Benchmark | Modality | Score | Score (%) | Source |
|---|---|---|---|---|
| COLLIE | text | 0.99 | 98.7% | Self-reported |
| MATH | text | 0.98 | 97.9% | Self-reported |
| IFEval | text | 0.94 | 93.9% | Self-reported |
| MGSM | text | 0.92 | 92.0% | Self-reported |
| AIME 2024 | text | 0.87 | 87.3% | Self-reported |
| MMLU | text | 0.87 | 86.9% | Self-reported |
| LiveBench | text | 0.85 | 84.6% | Self-reported |
| Multilingual MMLU | text | 0.81 | 80.7% | Self-reported |
| Multi-IF | text | 0.80 | 79.5% | Self-reported |
| GPQA | text | 0.77 | 77.2% | Self-reported |
Showing 1 to 10 of 26 benchmarks