Claude Sonnet 4.6

Name: Claude Sonnet 4.6
Price: 3 USD per 1M input tokens
Average benchmark score: 72.0% (16 benchmarks)
Author: Anthropic

Multimodal

#1 GDPVal AA ELO

#1 Finance Agent

#2 TAU2-Bench Retail

About

Claude Sonnet 4.6 is a general-purpose language model from Anthropic, released in February 2026 as an update to the Sonnet 4 line that introduced adaptive thinking, a mode in which the model calibrates its reasoning depth to task complexity automatically instead of requiring manual configuration by the developer. The model accepts text and image inputs and integrates natively with web search and code execution tools, so capabilities that previously required separate toolchain setup are available through a single API surface. It became the primary workhorse model in the Claude 4 series for code assistance, agentic pipelines, and retrieval-augmented applications that benefit from built-in web access.

Pricing Range

Input (per 1M tokens): $3.00 - $3.00

Output (per 1M tokens): $15.00 - $15.00

Providers: 3
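
As a rough illustration of how the per-million-token rates above translate into a per-request cost, here is a minimal sketch. The helper name `estimate_cost_usd` and the token counts are hypothetical; the only figures taken from this page are the listed $3.00 (input) and $15.00 (output) per 1M tokens. It assumes flat pricing and ignores any provider-specific discounts or surcharges.

```python
# Listed rates from the pricing range above (USD per 1M tokens).
INPUT_USD_PER_MILLION = 3.00
OUTPUT_USD_PER_MILLION = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Illustrative request: 20,000 input tokens and 2,000 output tokens.
    print(f"${estimate_cost_usd(20_000, 2_000):.4f}")
```

Under these assumptions, a request with 20,000 input tokens and 2,000 output tokens comes to about $0.09, with the output side contributing a third of the cost despite being a tenth of the tokens.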

Timeline

Released: Feb 17, 2026

Knowledge Cutoff: Aug 1, 2025

Specifications

Capabilities: Multimodal

License & Family

License: Proprietary

Performance Overview

Performance metrics and category breakdown

Overall Performance

16 benchmarks

Average Score: 72.0%

Best Score: 97.9%

High Performers (80%+)

Performance Metrics

Max Context Window: 264.0K tokens

Top Categories

Science: 89.9%

Knowledge: 89.3%

Agents: 83.7%

Multimodal: 75.0%

Coding: 69.3%

All Benchmark Results for Claude Sonnet 4.6

Complete list of benchmark scores with detailed information


Benchmark	Category	Score	Normalized	Verification
TAU2-Bench Telecom	Agents	97.90	97.9%	Self-reported
TAU2-Bench Retail	Agents	91.70	91.7%	Self-reported
GPQA Diamond	Science	89.90	89.9%	Self-reported
MMMLU	Knowledge	89.30	89.3%	Self-reported
GDPVal AA ELO	Agents	1633.00	81.7%	Self-reported
SWE Bench Verified	Coding	79.60	79.6%	Unverified
MMMU-Pro with Tools	Multimodal	75.60	75.6%	Self-reported
BrowseComp	Agents	74.70	74.7%	Self-reported
MMMU-Pro	Multimodal	74.50	74.5%	Self-reported
OSWorld	Agents	72.50	72.5%	Self-reported

Showing 1 to 10 of 16 benchmarks
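
The page does not state how the category figures under Top Categories are derived. The sketch below assumes they are simple means of the normalized percentages and recomputes them from the ten visible rows only; `VISIBLE_ROWS` and `category_averages` are hypothetical names introduced for illustration. Because six of the 16 benchmarks are not shown, categories with hidden rows will not match the listed values (Coding is listed at 69.3%, while the only visible Coding row is 79.6%).

```python
from collections import defaultdict

# (benchmark, category, normalized score in %) for the ten rows listed above.
VISIBLE_ROWS = [
    ("TAU2-Bench Telecom", "Agents", 97.9),
    ("TAU2-Bench Retail", "Agents", 91.7),
    ("GPQA Diamond", "Science", 89.9),
    ("MMMLU", "Knowledge", 89.3),
    ("GDPVal AA ELO", "Agents", 81.7),
    ("SWE Bench Verified", "Coding", 79.6),
    ("MMMU-Pro with Tools", "Multimodal", 75.6),
    ("BrowseComp", "Agents", 74.7),
    ("MMMU-Pro", "Multimodal", 74.5),
    ("OSWorld", "Agents", 72.5),
]


def category_averages(rows):
    """Return {category: mean normalized score} over the given rows.

    Assumption: an unweighted mean; the page does not document its method.
    """
    scores = defaultdict(list)
    for _name, category, score in rows:
        scores[category].append(score)
    return {cat: sum(vals) / len(vals) for cat, vals in scores.items()}


if __name__ == "__main__":
    for category, mean in sorted(category_averages(VISIBLE_ROWS).items()):
        print(f"{category}: {mean:.1f}%")
```

Under this assumption the five visible Agents rows average to about 83.7%, which agrees with the listed Agents figure, though the page does not say whether hidden rows also feed that category.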

Resources

API Reference

Blog Post