Phi 4 Reasoning Plus
Zero-eval
#1 FlenQA
#1 OmniMath
#1 PhiBench
by Microsoft
About
Phi-4 Reasoning Plus is the enhanced reasoning variant of the Phi-4 family. It is built to maximize reasoning quality while keeping the efficiency benefits of a small model, and it is the most capable reasoning-focused model in the Phi-4 series.
Timeline
Announced: Apr 30, 2025
Released: Apr 30, 2025
Knowledge Cutoff: Mar 1, 2025
Specifications
Training Tokens: 16.0B
License & Family
License: MIT
Performance Overview
Performance metrics and category breakdown
Overall Performance
11 benchmarks
Average Score: 78.9%
Best Score: 97.9%
High Performers (80%+): 5+
All Benchmark Results for Phi 4 Reasoning Plus
Complete list of benchmark scores with detailed information
| Benchmark | Modality | Score | Score (%) | Source |
| --- | --- | --- | --- | --- |
| FlenQA | text | 0.98 | 97.9% | Self-reported |
| HumanEval+ | text | 0.92 | 92.3% | Self-reported |
| IFEval | text | 0.85 | 84.9% | Self-reported |
| OmniMath | text | 0.82 | 81.9% | Self-reported |
| AIME 2024 | text | 0.81 | 81.3% | Self-reported |
| Arena Hard | text | 0.79 | 79.0% | Self-reported |
| AIME 2025 | text | 0.78 | 78.0% | Self-reported |
| MMLU-Pro | text | 0.76 | 76.0% | Self-reported |
| PhiBench | text | 0.74 | 74.2% | Self-reported |
| GPQA | text | 0.69 | 68.9% | Self-reported |
Showing 1 to 10 of 11 benchmarks
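The summary figures can be partly cross-checked from the rows listed here. The following is a minimal sketch, not the site's own calculation: the `summarize` helper and the 80% threshold parameter are illustrative assumptions, and only the ten visible scores are included, so the result cannot be expected to match the 11-benchmark average reported above.

```python
# Percent scores copied from the ten benchmark rows shown on this page.
# The eleventh benchmark is not listed, so it is deliberately left out.
scores = {
    "FlenQA": 97.9,
    "HumanEval+": 92.3,
    "IFEval": 84.9,
    "OmniMath": 81.9,
    "AIME 2024": 81.3,
    "Arena Hard": 79.0,
    "AIME 2025": 78.0,
    "MMLU-Pro": 76.0,
    "PhiBench": 74.2,
    "GPQA": 68.9,
}


def summarize(scores, threshold=80.0):
    """Return count, mean, best score, and the benchmarks at or above threshold.

    Hypothetical helper for illustration; the threshold mirrors the
    "High Performers (80%+)" label used in the overview.
    """
    values = list(scores.values())
    return {
        "count": len(values),
        "mean": round(sum(values) / len(values), 2),
        "best": max(values),
        "high_performers": sorted(
            name for name, value in scores.items() if value >= threshold
        ),
    }


summary = summarize(scores)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Over these ten rows the mean comes out higher than the reported 78.9%, which is consistent with the unlisted eleventh benchmark scoring below the rest.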