Claude Opus 4.1
by Anthropic

Multimodal
Rankings: #1 MMMLU · #1 MMMU (validation) · #2 TAU-bench Retail (+2 more)
About

Claude Opus 4.1 is an incremental upgrade to Claude Opus 4, part of Anthropic's flagship Opus tier. It refines the capabilities that define the Opus family, with improved performance on complex reasoning, real-world coding, and agentic tasks.

Pricing Range
Input (per 1M tokens): $15.00
Output (per 1M tokens): $75.00
Providers: 4
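
As a rough illustration, the cost of a single request at these list prices can be computed from token counts (a minimal Python sketch; the token counts in the example are hypothetical, not measured usage):

# Estimated cost of one Claude Opus 4.1 request at the listed prices
# ($15.00 per 1M input tokens, $75.00 per 1M output tokens).
INPUT_PRICE_PER_M = 15.00
OUTPUT_PRICE_PER_M = 75.00

def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the approximate cost in USD for a single request."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Hypothetical example: 2,000 input tokens and 500 output tokens.
print(f"${request_cost_usd(2_000, 500):.4f}")  # -> $0.0675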
Timeline
Announced: Aug 5, 2025
Released: Aug 5, 2025
Specifications
Capabilities: Multimodal
License & Family
License: Proprietary
Performance Overview
Performance metrics and category breakdown

Overall Performance (8 benchmarks)
Average Score: 72.7%
Best Score: 89.5%
High Performers (80%+): 3

Performance Metrics
Max Context Window: 232.0K tokens
Avg Throughput: 76.0 tok/s
Avg Latency: 0 ms
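
For a rough sense of wall-clock generation time at the averaged provider throughput above, a minimal sketch (the 1,000-token output length is a hypothetical example, and real responses also include network and time-to-first-token overhead not captured here):

# Approximate streaming time at the averaged throughput of 76.0 tok/s.
AVG_THROUGHPUT_TOK_S = 76.0

def estimated_generation_seconds(output_tokens: int) -> float:
    """Approximate seconds to generate output_tokens at the average throughput."""
    return output_tokens / AVG_THROUGHPUT_TOK_S

# Hypothetical example: a 1,000-token response.
print(f"~{estimated_generation_seconds(1_000):.1f} s")  # -> ~13.2 s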
All Benchmark Results for Claude Opus 4.1
Complete list of benchmark scores with detailed information
Benchmark            Type         Score    Source
MMMLU                text         89.5%    Self-reported
TAU-bench Retail     text         82.4%    Self-reported
GPQA                 text         80.9%    Self-reported
AIME 2025            text         78.0%    Self-reported
MMMU (validation)    multimodal   77.1%    Self-reported
SWE-Bench Verified   text         74.5%    Self-reported
TAU-bench Airline    text         56.0%    Self-reported
Terminal-Bench       text         43.3%    Self-reported
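
As a sanity check, the overview figures above (average score, best score, and the count of 80%+ results) follow directly from these eight self-reported scores; a minimal Python sketch:

# Recompute the overview statistics from the eight benchmark scores listed above.
scores = {
    "MMMLU": 89.5,
    "TAU-bench Retail": 82.4,
    "GPQA": 80.9,
    "AIME 2025": 78.0,
    "MMMU (validation)": 77.1,
    "SWE-Bench Verified": 74.5,
    "TAU-bench Airline": 56.0,
    "Terminal-Bench": 43.3,
}

average = sum(scores.values()) / len(scores)                   # 72.7
best = max(scores.values())                                    # 89.5
high_performers = sum(1 for s in scores.values() if s >= 80)   # 3

print(f"Average: {average:.1f}%  Best: {best:.1f}%  High performers (80%+): {high_performers}")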