Comprehensive side-by-side LLM comparison
o1-pro leads with a 12.0% higher average benchmark score and supports multimodal inputs, making it the stronger choice overall for coding tasks.
o1-pro (OpenAI)
o1-pro is an enhanced version of the o1 reasoning model, designed to provide deeper and more reliable extended reasoning. Built for professionals and advanced users tackling complex analytical tasks, it trades additional thinking time for higher reasoning quality on the most demanding applications.
Phi 4 Reasoning (Microsoft)
Phi-4 Reasoning incorporates extended analytical thinking into the Phi-4 architecture, spending more time on complex problem-solving. By combining compact-model efficiency with reasoning depth, it represents Microsoft's exploration of thoughtful small models.
Release dates:
- o1-pro (OpenAI): 2024-12-17
- Phi 4 Reasoning (Microsoft): 2025-04-30 (about 4 months newer)
Average performance across 2 common benchmarks: [benchmark chart comparing o1-pro and Phi 4 Reasoning; scores not preserved in this extract]
Knowledge cutoff:
- o1-pro: 2023-09-30
- Phi 4 Reasoning: 2025-03-01
Available providers and their performance metrics: [provider table for o1-pro and Phi 4 Reasoning; values not preserved in this extract]