Anthropic

Claude Opus 4.6

Multimodal
#1 TAU2-Bench Telecom
#1 TAU2-Bench Retail
#1 BrowseComp

About

Claude Opus 4.6, released by Anthropic in February 2026, is a large language model in the Claude 4 family designed for complex agent orchestration, extended reasoning, and long-form code generation. It offers a 200K-token context window (extendable to 1M tokens in beta), up to 128K output tokens, native image understanding, and extended thinking with both standard and adaptive effort modes. Opus 4.6 targets multi-step agentic workflows, parallel tool use, and applications that require sustained reasoning over large contexts.

Pricing Range
Input (per 1M tokens): $5.00 - $5.00
Output (per 1M tokens): $25.00 - $25.00
Providers: 3
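
As a rough illustration of how the listed per-million-token prices translate into a per-request cost, here is a minimal sketch. The function name `estimate_cost_usd` is hypothetical, and the calculation assumes flat pricing at the listed rates; it ignores any caching, batch, long-context, or provider-specific adjustments, none of which are described on this page.

```python
from decimal import Decimal

# Listed prices from the Pricing Range section (USD per 1M tokens).
INPUT_USD_PER_MTOK = Decimal("5.00")
OUTPUT_USD_PER_MTOK = Decimal("25.00")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request at the flat listed rates.

    Hypothetical helper: assumes no discounts or surcharges apply.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    cost = (
        Decimal(input_tokens) * INPUT_USD_PER_MTOK
        + Decimal(output_tokens) * OUTPUT_USD_PER_MTOK
    ) / TOKENS_PER_MTOK
    # Round to a hundredth of a cent for display.
    return cost.quantize(Decimal("0.0001"))


if __name__ == "__main__":
    # 150,000 input tokens and 4,000 output tokens.
    print(estimate_cost_usd(150_000, 4_000))
```

Because output tokens are listed at five times the input rate, long generations tend to dominate the bill even when the prompt is much larger than the response.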
Timeline
Released: Feb 1, 2026
Knowledge Cutoff: May 1, 2025
Specifications
Capabilities
Multimodal
License & Family
License
Proprietary
Performance Overview
Performance metrics and category breakdown

Overall Performance

17 benchmarks
Average Score: 73.0%
Best Score: 99.3%
High Performers (80%+): 7

Performance Metrics

Max Context Window: 328.0K

Top Categories

Science: 91.3%
Knowledge: 91.1%
Agents: 85.6%
Multimodal: 75.6%
Coding: 66.0%
All Benchmark Results for Claude Opus 4.6
Complete list of benchmark scores with detailed information

| Benchmark | Category | Score | Normalized | Source |
| --- | --- | --- | --- | --- |
| TAU2-Bench Telecom | Agents | 99.30 | 99.3% | Self-reported |
| TAU2-Bench Retail | Agents | 91.90 | 91.9% | Self-reported |
| GPQA Diamond | Science | 91.30 | 91.3% | Self-reported |
| MMMLU | Knowledge | 91.10 | 91.1% | Self-reported |
| BrowseComp | Agents | 84.00 | 84.0% | Self-reported |
| SWE Bench Verified | Coding | 80.80 | 80.8% | Unverified |
| GDPVal AA ELO | Agents | 1606.00 | 80.3% | Self-reported |
| MMMU-Pro with Tools | Multimodal | 77.30 | 77.3% | Self-reported |
| MMMU-Pro | Multimodal | 73.90 | 73.9% | Self-reported |
| OSWorld | Agents | 72.70 | 72.7% | Self-reported |

Showing 1 to 10 of 17 benchmarks
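
The page does not say how the Top Categories figures are derived. The sketch below tests one plausible reading: each category score is the simple mean of the normalized scores of its benchmarks. Under that assumption, the ten rows listed above reproduce the Science, Knowledge, Agents, and Multimodal figures. Coding cannot be checked, since only one Coding benchmark is among the ten rows shown. The 2000-point Elo scale used to normalize GDPVal AA ELO is likewise an inference from 1606 mapping to 80.3%, not something the page states, and the names `VISIBLE_ROWS`, `ASSUMED_ELO_SCALE`, and `category_means` are hypothetical.

```python
from collections import defaultdict

# (benchmark, category, normalized score in percent) for the ten listed rows.
VISIBLE_ROWS = [
    ("TAU2-Bench Telecom", "Agents", 99.3),
    ("TAU2-Bench Retail", "Agents", 91.9),
    ("GPQA Diamond", "Science", 91.3),
    ("MMMLU", "Knowledge", 91.1),
    ("BrowseComp", "Agents", 84.0),
    ("SWE Bench Verified", "Coding", 80.8),
    ("GDPVal AA ELO", "Agents", 80.3),
    ("MMMU-Pro with Tools", "Multimodal", 77.3),
    ("MMMU-Pro", "Multimodal", 73.9),
    ("OSWorld", "Agents", 72.7),
]

# Assumption: Elo scores are normalized against a 2000-point scale.
ASSUMED_ELO_SCALE = 2000.0


def category_means(rows):
    """Return the mean normalized score per category, rounded to 0.1."""
    buckets = defaultdict(list)
    for _name, category, score in rows:
        buckets[category].append(score)
    return {cat: round(sum(vals) / len(vals), 1) for cat, vals in buckets.items()}


if __name__ == "__main__":
    for category, mean in sorted(category_means(VISIBLE_ROWS).items()):
        print(f"{category}: {mean}%")
    print("GDPVal normalized:", round(1606.0 / ASSUMED_ELO_SCALE * 100, 1))
```

The Coding mean from the visible rows is 80.8%, well above the 66.0% shown under Top Categories, which suggests that lower-scoring Coding benchmarks sit among the seven rows not displayed here.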