xAI

Grok-4

Multimodal
Zero-eval
#1 ARC-AGI v2
#2 HMMT25
#2 GPQA

by xAI

About

Grok-4 is the fourth generation of xAI's language models, developed to advance AI reasoning and knowledge. It is built to handle increasingly complex tasks with improved reliability.

Pricing Range
Input (per 1M): $3.00 - $3.00
Output (per 1M): $15.00 - $15.00
Providers: 2
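
As a rough illustration of how the listed rates translate into a per-request cost, the sketch below assumes that "per 1M" means per one million tokens and that billing is linear in token count. The function name `estimate_cost_usd` and the example token counts are hypothetical and not taken from any provider's API.

```python
# Rates copied from the Pricing Range listing (assumed to be USD per 1M tokens).
INPUT_USD_PER_1M = 3.00
OUTPUT_USD_PER_1M = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, assuming simple linear per-token billing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_1M + output_tokens * OUTPUT_USD_PER_1M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 200,000 prompt tokens and 50,000 completion tokens.
    print(f"${estimate_cost_usd(200_000, 50_000):.2f}")
```

Because the low and high ends of each range are identical, both listed providers appear to charge the same rates, so a single pair of constants is enough here.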
Timeline
Announced: Jul 9, 2025
Released: Jul 9, 2025
Knowledge Cutoff: Dec 31, 2024
Specifications
Capabilities
Multimodal
License & Family
License
Proprietary
Performance Overview
Performance metrics and category breakdown

Overall Performance

Benchmarks: 7
Average Score: 63.1%
Best Score: 91.7%
High Performers (80%+): 3

Performance Metrics

Max Context Window: 264.0K
Avg Throughput: 100.0 tok/s
Avg Latency: 1 ms
All Benchmark Results for Grok-4
Complete list of benchmark scores with detailed information
Benchmark              Modality     Score   Percent   Source
AIME 2025              text         0.92    91.7%     Self-reported
HMMT25                 text         0.90    90.0%     Self-reported
GPQA                   text         0.88    87.5%     Self-reported
LiveCodeBench          text         0.79    79.0%     Self-reported
Humanity's Last Exam   multimodal   0.40    40.0%     Self-reported
USAMO25                text         0.38    37.5%     Self-reported
ARC-AGI v2             multimodal   0.16    15.9%     Self-reported
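
The summary figures under Overall Performance can be checked against the seven scores listed above. The sketch below assumes the page's "Average Score" is an unweighted arithmetic mean of the percentage scores and that "High Performers" counts scores of 80% or more. Both readings are inferred from the numbers, not stated by the page.

```python
# Percentage scores copied from the benchmark list (all self-reported).
scores = {
    "AIME 2025": 91.7,
    "HMMT25": 90.0,
    "GPQA": 87.5,
    "LiveCodeBench": 79.0,
    "Humanity's Last Exam": 40.0,
    "USAMO25": 37.5,
    "ARC-AGI v2": 15.9,
}

# Assumed definitions: unweighted mean, maximum, and an 80% threshold.
average = round(sum(scores.values()) / len(scores), 1)
best = max(scores.values())
high_performers = [name for name, pct in scores.items() if pct >= 80.0]

print(f"Benchmarks: {len(scores)}")
print(f"Average Score: {average}%")
print(f"Best Score: {best}%")
print(f"High Performers (80%+): {len(high_performers)}")
```

Under these assumptions the computed values agree with the page: an average of 63.1%, a best score of 91.7%, and three benchmarks at or above 80%.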
Resources