Grok-4
Multimodal
Zero-eval
#1 ARC-AGI v2
#2 HMMT25
#2 GPQA
by xAI
About
Grok 4 is the fourth generation of xAI's language models. It is built to advance AI reasoning and knowledge and to handle increasingly complex tasks more reliably.
Pricing Range
Input (per 1M tokens): $3.00 - $3.00
Output (per 1M tokens): $15.00 - $15.00
Providers: 2
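As a rough illustration of how the per-1M-token prices above translate into a per-request cost, here is a minimal sketch. The helper name `estimate_cost` and the example token counts are hypothetical, and the sketch assumes flat per-token billing with no caching discounts, tiers, or minimum charges, which a given provider may not follow.

```python
# Prices copied from the Pricing Range section (USD per 1M tokens).
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: estimated USD cost of a single request.

    Assumes simple linear billing on input and output token counts.
    """
    input_cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_M
    output_cost = (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Example token counts are made up for illustration.
cost = estimate_cost(input_tokens=200_000, output_tokens=50_000)
print(f"Estimated cost: ${cost:.2f}")
```

Because output tokens cost five times as much as input tokens at these prices, long generations can dominate the bill even when the prompt is much larger than the response.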
Timeline
Announced: Jul 9, 2025
Released: Jul 9, 2025
Knowledge Cutoff: Dec 31, 2024
Specifications
Capabilities
Multimodal
License & Family
License
Proprietary
Performance Overview
Performance metrics and category breakdown
Overall Performance
7 benchmarks
Average Score
63.1%
Best Score
91.7%
High Performers (80%+)
3
Performance Metrics
Max Context Window
264.0K
Avg Throughput
100.0 tok/s
Avg Latency
1ms
All Benchmark Results for Grok-4
Complete list of benchmark scores with detailed information
| Benchmark | Modality | Score | Score (%) | Source |
| --- | --- | --- | --- | --- |
| AIME 2025 | text | 0.92 | 91.7% | Self-reported |
| HMMT25 | text | 0.90 | 90.0% | Self-reported |
| GPQA | text | 0.88 | 87.5% | Self-reported |
| LiveCodeBench | text | 0.79 | 79.0% | Self-reported |
| Humanity's Last Exam | multimodal | 0.40 | 40.0% | Self-reported |
| USAMO25 | text | 0.38 | 37.5% | Self-reported |
| ARC-AGI v2 | multimodal | 0.16 | 15.9% | Self-reported |
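The summary figures in the Performance Overview appear to follow directly from the percentage scores in the table above. The sketch below recomputes them, assuming the average is an unweighted mean over all seven benchmarks and that "high performer" means a score of at least 80%. The page does not document how it aggregates, so both are assumptions.

```python
# Percentage scores copied from the benchmark table (all self-reported).
scores = {
    "AIME 2025": 91.7,
    "HMMT25": 90.0,
    "GPQA": 87.5,
    "LiveCodeBench": 79.0,
    "Humanity's Last Exam": 40.0,
    "USAMO25": 37.5,
    "ARC-AGI v2": 15.9,
}

# Assumption: unweighted mean across all listed benchmarks.
average = sum(scores.values()) / len(scores)

# Highest single benchmark score.
best = max(scores.values())

# Assumption: "High Performers (80%+)" counts scores >= 80.0.
high_performers = [name for name, pct in scores.items() if pct >= 80.0]

print(f"Benchmarks: {len(scores)}")
print(f"Average score: {average:.1f}%")
print(f"Best score: {best:.1f}%")
print(f"High performers (80%+): {len(high_performers)}")
```

Under these assumptions the results match the reported overview: 7 benchmarks, a 63.1% average, a 91.7% best score, and 3 benchmarks at or above 80%. The low rows show that an unweighted mean across benchmarks of very different difficulty says little on its own.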