o3
Multimodal
#2 on MMMU
by OpenAI
About
OpenAI o3, released in April 2025, is a large reasoning model that uses extended chain-of-thought processing to improve performance on complex math, science, and coding tasks. It has a 200K-token context window and native image understanding, and it posts strong results on mathematics and software engineering benchmarks. o3 targets demanding analytical and engineering tasks where deliberate, multi-step reasoning outperforms direct generation.
Pricing Range
Input (per 1M tokens): $10.00 - $10.00
Output (per 1M tokens): $40.00 - $40.00
Providers: 1
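To make the per-million pricing concrete, here is a minimal sketch of estimating the cost of a single request from the listed rates. The function name `estimate_cost` and the example token counts are illustrative assumptions, not part of any provider API. If a provider bills reasoning tokens as output, they would need to be included in `output_tokens`.

```python
from decimal import Decimal

# Per-million-token prices as listed above (USD).
INPUT_PRICE_PER_1M = Decimal("10.00")
OUTPUT_PRICE_PER_1M = Decimal("40.00")
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request at the listed prices.

    Hypothetical helper for illustration; real invoices may differ
    (caching discounts, batch pricing, provider-specific rounding).
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    cost = (
        Decimal(input_tokens) * INPUT_PRICE_PER_1M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_1M
    ) / TOKENS_PER_UNIT
    # Round to four decimal places for display.
    return cost.quantize(Decimal("0.0001"))


# Assumed example: 12,000 input tokens and 3,000 output tokens.
print(estimate_cost(12_000, 3_000))  # prints 0.2400
```

Because output tokens cost four times as much as input tokens at these rates, the 3,000 output tokens in the example cost as much as the 12,000 input tokens.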
Timeline
Released: Apr 16, 2025
Specifications
Capabilities
Multimodal
License & Family
License
Proprietary
Performance Overview
Performance metrics and category breakdown
Overall Performance
6 benchmarks
Average Score
43.7%
Best Score
76.4%
High Performers (80%+)
0
Performance Metrics
Max Context Window
300.0K
Top Categories
Multimodal
76.4%
Science
69.1%
Tool Use
43.6%
Agents
26.9%
Reasoning
19.2%
All Benchmark Results for o3
Complete list of benchmark scores with detailed information
| Benchmark | Category | Score | Score (%) | Status |
| --- | --- | --- | --- | --- |
| MMMU | Multimodal | 76.40 | 76.4% | Unverified |
| GPQA Diamond | Science | 69.10 | 69.1% | Unverified |
| MCP-Atlas | Tool Use | 43.60 | 43.6% | Unverified |
| GDPVal | Agents | 30.80 | 30.8% | Unverified |
| OSWorld | Agents | 23.00 | 23.0% | Unverified |
| Humanity's Last Exam | Reasoning | 19.20 | 19.2% | Unverified |
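The summary figures in the Performance Overview (average score, best score, high performers, and per-category scores) appear to be consistent with simple unweighted means over the six benchmark results. The sketch below reproduces them under that assumption. The aggregation method is not stated on the page, and `summarize` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean

# (benchmark, category, score in percent), copied from the results table.
RESULTS = [
    ("MMMU", "Multimodal", 76.4),
    ("GPQA Diamond", "Science", 69.1),
    ("MCP-Atlas", "Tool Use", 43.6),
    ("GDPVal", "Agents", 30.8),
    ("OSWorld", "Agents", 23.0),
    ("Humanity's Last Exam", "Reasoning", 19.2),
]


def summarize(results, high_threshold=80.0):
    """Aggregate scores, assuming unweighted means (not confirmed by the page)."""
    by_category = defaultdict(list)
    for _, category, score in results:
        by_category[category].append(score)
    scores = [score for _, _, score in results]
    return {
        "average": round(mean(scores), 1),
        "best": max(scores),
        # Count of benchmarks at or above the "high performer" threshold.
        "high_performers": sum(score >= high_threshold for score in scores),
        "categories": {c: round(mean(v), 1) for c, v in by_category.items()},
    }


summary = summarize(RESULTS)
print(summary["average"], summary["best"], summary["high_performers"])
print(summary["categories"]["Agents"])
```

Under this assumption, the Agents category score is the mean of the GDPVal and OSWorld results, which matches the 26.9% shown under Top Categories.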