o4-mini
by OpenAI
Tags: Multimodal, Zero-eval
Rankings: #2 AIME 2024, #2 MathVista, #2 BrowseComp, +2 more
About
o4-mini is part of the next generation of OpenAI's reasoning models, designed to balance analytical capability with operational efficiency. It brings advanced reasoning techniques to applications that require quick turnaround, continuing the line of compact, reasoning-focused models.
Pricing Range
Input (per 1M tokens): $1.10
Output (per 1M tokens): $4.40
Providers: 1
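The listed rates ($1.10 per 1M input tokens, $4.40 per 1M output tokens) make per-request cost a simple weighted sum. A minimal sketch, with illustrative token counts not taken from the page:

```python
# Estimating o4-mini request cost from the listed pricing.
# Rates are from the pricing section above; token counts are examples.

INPUT_PER_M = 1.10   # USD per 1M input tokens
OUTPUT_PER_M = 4.40  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    return (input_tokens / 1_000_000) * INPUT_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PER_M

# Example: 10,000 input tokens and 2,000 output tokens.
print(f"${request_cost(10_000, 2_000):.4f}")  # → $0.0198
```

Note that output tokens cost 4x input tokens, so output-heavy workloads (long completions, chain-of-thought) dominate the bill.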
Timeline
Announced: Apr 16, 2025
Released: Apr 16, 2025
Knowledge Cutoff: May 31, 2024
Specifications
Capabilities: Multimodal
License & Family
License: Proprietary
Performance Overview
Performance metrics and category breakdown
Overall Performance: 14 benchmarks
Average Score: 66.5%
Best Score: 93.4%
High Performers (80%+): 5
Performance Metrics
Max Context Window: 300.0K
Avg Throughput: 115.0 tok/s
Avg Latency: 5ms
All Benchmark Results for o4-mini
Complete list of benchmark scores with detailed information
| Benchmark | Modality | Raw Score | Percentage | Source |
|---|---|---|---|---|
| AIME 2024 | text | 0.93 | 93.4% | Self-reported |
| AIME 2025 | text | 0.93 | 92.7% | Self-reported |
| MathVista | multimodal | 0.84 | 84.3% | Self-reported |
| MMMU | multimodal | 0.82 | 81.6% | Self-reported |
| GPQA | text | 0.81 | 81.4% | Self-reported |
| CharXiv-R | multimodal | 0.72 | 72.0% | Self-reported |
| TAU-bench Retail | text | 0.72 | 71.8% | Self-reported |
| Aider-Polyglot | text | 0.69 | 68.9% | Self-reported |
| SWE-Bench Verified | text | 0.68 | 68.1% | Self-reported |
| Aider-Polyglot Edit | text | 0.58 | 58.2% | Self-reported |
Showing 1 to 10 of 14 benchmarks
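The summary figures above are straightforward to recompute from the table. A sketch over the ten visible rows (the page's 66.5% average covers all 14 benchmarks, so the average over these 10 will differ; the best score and the high-performer count do match):

```python
# Summary statistics over the ten benchmark rows shown in the table.
# Scores copied from the table; the page's overall average (66.5%)
# spans all 14 benchmarks, 4 of which are not visible here.

scores = {
    "AIME 2024": 93.4,
    "AIME 2025": 92.7,
    "MathVista": 84.3,
    "MMMU": 81.6,
    "GPQA": 81.4,
    "CharXiv-R": 72.0,
    "TAU-bench Retail": 71.8,
    "Aider-Polyglot": 68.9,
    "SWE-Bench Verified": 68.1,
    "Aider-Polyglot Edit": 58.2,
}

best = max(scores, key=scores.get)
avg = sum(scores.values()) / len(scores)
high_performers = [name for name, s in scores.items() if s >= 80.0]

print(best, scores[best])     # AIME 2024 93.4 (matches "Best Score" above)
print(len(high_performers))   # 5 (matches "High Performers (80%+)" above)
```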