Claude Opus 4.1
Multimodal
#1 MMMLU
#1 MMMU (validation)
#2 TAU-bench Retail
(+2 more)
by Anthropic
About
Claude Opus 4.1 is an incremental update in the Claude 4 Opus line, built to deliver refined performance on complex reasoning and analysis tasks. As part of Anthropic's flagship tier, it incorporates improvements to the core capabilities that define the Opus family of models.
Pricing Range
Input (per 1M tokens): $15.00 - $15.00
Output (per 1M tokens): $75.00 - $75.00
Providers: 4
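
For illustration only, a minimal sketch of how a single request's cost could be estimated from the listed per-1M-token rates; the token counts in the example are hypothetical, not real usage data.

```python
# Estimate the cost of one request at the listed Claude Opus 4.1 rates.
INPUT_PRICE_PER_1M = 15.00   # USD per 1M input tokens (from the pricing table)
OUTPUT_PRICE_PER_1M = 75.00  # USD per 1M output tokens (from the pricing table)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_1M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_1M

# Example: 2,000 input tokens and 500 output tokens (hypothetical values)
print(f"${estimate_cost(2_000, 500):.4f}")  # -> $0.0675
```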
Timeline
Announced: Aug 5, 2025
Released: Aug 5, 2025
Specifications
Capabilities
Multimodal
License & Family
License: Proprietary
Performance Overview
Performance metrics and category breakdown

Overall Performance (8 benchmarks)
Average Score: 72.7%
Best Score: 89.5%
High Performers (80%+): 3

Performance Metrics
Max Context Window: 232.0K
Avg Throughput: 76.0 tok/s
Avg Latency: 0ms
All Benchmark Results for Claude Opus 4.1
Complete list of benchmark scores with detailed information
| Benchmark | Modality | Score | Percentage | Source |
|---|---|---|---|---|
| MMMLU | text | 0.90 | 89.5% | Self-reported |
| TAU-bench Retail | text | 0.82 | 82.4% | Self-reported |
| GPQA | text | 0.81 | 80.9% | Self-reported |
| AIME 2025 | text | 0.78 | 78.0% | Self-reported |
| MMMU (validation) | multimodal | 0.77 | 77.1% | Self-reported |
| SWE-Bench Verified | text | 0.74 | 74.5% | Self-reported |
| TAU-bench Airline | text | 0.56 | 56.0% | Self-reported |
| Terminal-Bench | text | 0.43 | 43.3% | Self-reported |
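
As a sanity check on the summary figures in the Performance Overview, a minimal sketch that recomputes them from the table above, assuming a simple unweighted mean over the eight self-reported percentages (the exact aggregation method is not stated, so this is illustrative only).

```python
# Recompute the summary statistics from the per-benchmark percentages above.
scores = {
    "MMMLU": 89.5,
    "TAU-bench Retail": 82.4,
    "GPQA": 80.9,
    "AIME 2025": 78.0,
    "MMMU (validation)": 77.1,
    "SWE-Bench Verified": 74.5,
    "TAU-bench Airline": 56.0,
    "Terminal-Bench": 43.3,
}

average = sum(scores.values()) / len(scores)
best = max(scores.values())
high_performers = [name for name, pct in scores.items() if pct >= 80.0]

print(f"Average Score: {average:.1f}%")                   # 72.7%
print(f"Best Score: {best:.1f}%")                          # 89.5%
print(f"High Performers (80%+): {len(high_performers)}")   # 3
```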