Claude Sonnet 4.5
Multimodal
#2 τ-bench
#3 OSWorld
by Anthropic
About
Claude Sonnet 4.5, released by Anthropic in September 2025, was built around long-duration agentic work: coding, computer use, and autonomous tasks that run in loops, call tools, and execute over extended sessions. Anthropic positioned it as its most capable model for building complex agents, reflecting a shift toward designing Claude releases primarily for autonomous agent use cases rather than conversational chat.
Pricing Range
Input (per 1M tokens): $3.00 - $3.00
Output (per 1M tokens): $15.00 - $15.00
Providers: 3
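
As a rough illustration of how the listed per-1M-token prices translate into a per-request cost, here is a minimal sketch. The function name `estimate_cost_usd` and the example token counts are hypothetical, and the calculation assumes flat pricing at the listed rates with no caching, batching, or provider-specific discounts.

```python
from decimal import Decimal

# Listed prices from the pricing table above, in USD per 1M tokens.
INPUT_USD_PER_MTOK = Decimal("3.00")
OUTPUT_USD_PER_MTOK = Decimal("15.00")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request at the listed rates (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_MTOK
        + Decimal(output_tokens) * OUTPUT_USD_PER_MTOK
    )
    return total / TOKENS_PER_MTOK


if __name__ == "__main__":
    # Hypothetical workload: 200,000 input tokens and 20,000 output tokens.
    print(estimate_cost_usd(200_000, 20_000))
```

Output tokens are priced at five times the input rate, so long generations tend to dominate the bill even when prompts are large.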
Timeline
Released: Sep 29, 2025
Knowledge Cutoff: Jan 1, 2025
Specifications
Capabilities
Multimodal
License & Family
License
Proprietary
Performance Overview
Performance metrics and category breakdown
Overall Performance
9 benchmarks
Average Score
44.6%
Best Score
84.7%
High Performers (80%+)
1
Performance Metrics
Max Context Window
264.0K
Top Categories
Agents
62.9%
Finance
54.5%
Coding
49.0%
Tool Use
43.8%
Reasoning
17.7%
All Benchmark Results for Claude Sonnet 4.5
Complete list of benchmark scores with detailed information
| Benchmark | Category | Score | Score (%) | Verification |
| --- | --- | --- | --- | --- |
| τ-bench | Agents | 84.70 | 84.7% | Unverified |
| OSWorld | Agents | 61.40 | 61.4% | Unverified |
| Finance Agent | Finance | 54.50 | 54.5% | Unverified |
| Terminal Bench 2.0 | Coding | 51.00 | 51.0% | Unverified |
| SWE-rebench | Coding | 47.10 | 47.1% | Unverified |
| MCP-Atlas | Tool Use | 43.80 | 43.8% | Unverified |
| GDPVal | Agents | 42.50 | 42.5% | Unverified |
| Humanity's Last Exam | Reasoning | 17.70 | 17.7% | Unverified |
| MMMU | Multimodal | -1.00 | -1.0% | Unverified |
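
The summary figures in the Performance Overview can be reproduced from the benchmark rows above. The sketch below is an illustration, not the page's actual aggregation code, and the function names are hypothetical. It assumes a plain arithmetic mean. The MMMU score of -1.00 looks like a placeholder for a missing result (an assumption), yet the listed 44.6% average is only reproduced when that row is included; leaving it out gives about 50.3%.

```python
from collections import defaultdict
from statistics import mean

# (benchmark, category, score) copied from the results table above.
RESULTS = [
    ("τ-bench", "Agents", 84.7),
    ("OSWorld", "Agents", 61.4),
    ("Finance Agent", "Finance", 54.5),
    ("Terminal Bench 2.0", "Coding", 51.0),
    ("SWE-rebench", "Coding", 47.1),
    ("MCP-Atlas", "Tool Use", 43.8),
    ("GDPVal", "Agents", 42.5),
    ("Humanity's Last Exam", "Reasoning", 17.7),
    ("MMMU", "Multimodal", -1.0),  # assumed placeholder for a missing score
]


def overall_average(results):
    """Arithmetic mean over every listed score."""
    return mean(score for _, _, score in results)


def category_averages(results):
    """Arithmetic mean of the scores within each category."""
    by_category = defaultdict(list)
    for _, category, score in results:
        by_category[category].append(score)
    return {cat: mean(scores) for cat, scores in by_category.items()}


def high_performers(results, threshold=80.0):
    """Names of benchmarks scoring at or above the threshold."""
    return [name for name, _, score in results if score >= threshold]


if __name__ == "__main__":
    print(f"Average: {overall_average(RESULTS):.1f}%")
    print(f"High performers (80%+): {high_performers(RESULTS)}")
    for category, avg in category_averages(RESULTS).items():
        print(f"{category}: {avg:.2f}%")
```

Under these assumptions the Agents average works out to about 62.9%, in line with the Top Categories listing, and τ-bench is the only benchmark at or above 80%.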