Claude Sonnet 4.6 vs Grok 3 mini: Complete Benchmarks, Speed & Cost Comparison (2026)

Comprehensive side-by-side LLM comparison

Claude Sonnet 4.6 leads the average benchmark score by 73.0 percentage points (74.7% vs 1.7%), measured on the single benchmark the two models have in common. Its context window is 116.5K tokens larger than Grok 3 mini's. Grok 3 mini is $17.20 cheaper per million tokens when input and output prices are added together ($0.80 vs $18.00). Claude Sonnet 4.6 supports multimodal inputs and is available from 3 providers. Overall, Claude Sonnet 4.6 is the stronger choice for coding tasks.

Anthropic

Claude Sonnet 4.6 is a general-purpose language model from Anthropic, released in February 2026 as an update to the Sonnet 4 line. It introduced adaptive thinking, a mode in which the model calibrates its reasoning depth to task complexity automatically instead of requiring manual configuration by the developer. The model accepts text and image inputs and integrates natively with web search and code execution tools, consolidating capabilities that previously required separate toolchain setup into a single API surface. It became the primary workhorse model in the Claude 4 series for code assistance, agentic pipelines, and retrieval-augmented applications that benefit from built-in web access.

xAI

Grok 3 mini, released by xAI alongside Grok 3 in February 2025, is a compact reasoning model in the Grok 3 family with an RL-enhanced Think mode for extended chain-of-thought processing. It has a 131K token context window and targets STEM, mathematics, and coding tasks that need cost-efficient reasoning with configurable depth.

Grok 3 mini (xAI): released 2025-02-17

Claude Sonnet 4.6 (Anthropic): released 2026-02-17 (1 year newer)

Pricing Comparison

Cost per million tokens (USD)

Claude Sonnet 4.6: Input $3.00, Output $15.00

Grok 3 mini: Input $0.30, Output $0.50 ($17.20 cheaper than Claude Sonnet 4.6 on combined input and output price)
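The $17.20 figure appears to be the difference between the two models' combined input and output prices. The sketch below reproduces that arithmetic from the listed prices and then estimates the cost of a workload. The function names and the 2M-input / 500K-output token workload are hypothetical, chosen only for illustration, and real bills may differ from list prices.

```python
# List prices in USD per million tokens, copied from the pricing figures above.
PRICES = {
    "Claude Sonnet 4.6": {"input": 3.00, "output": 15.00},
    "Grok 3 mini": {"input": 0.30, "output": 0.50},
}


def combined_price(model: str) -> float:
    """Input price plus output price, per million tokens."""
    p = PRICES[model]
    return p["input"] + p["output"]


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a workload at list prices (hypothetical helper)."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


# Combined-price gap quoted in the comparison.
gap = combined_price("Claude Sonnet 4.6") - combined_price("Grok 3 mini")
print(f"Combined price gap: ${gap:.2f} per million tokens")

# Hypothetical workload: 2M input tokens, 500K output tokens.
for model in PRICES:
    cost = workload_cost(model, 2_000_000, 500_000)
    print(f"{model}: ${cost:.2f}")
```

Because output tokens cost more than input tokens for both models, the actual gap for a given workload depends on its input-to-output ratio, not only on the combined figure.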

Performance Metrics

Context window and performance specifications

Average performance across 1 common benchmark

Claude Sonnet 4.6: Average Score 74.7% (+73.0 percentage points)

Grok 3 mini: Average Score 1.7%
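The quoted lead of 73.0 is an absolute difference between the two scores, not a relative improvement. A minimal sketch of the distinction, using only the two averages listed here (the variable names are illustrative):

```python
# Average scores (percent) on the single common benchmark.
claude_score = 74.7
grok_score = 1.7

# Absolute difference, in percentage points.
points = claude_score - grok_score

# Relative comparison: how many times higher one score is than the other.
ratio = claude_score / grok_score

print(f"Gap: {points:.1f} percentage points")
print(f"Ratio: about {ratio:.1f}x")
```

With only one shared benchmark behind these averages, either number is best read as a single data point and not as a general ranking.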

Performance comparison across key benchmark categories

Claude Sonnet 4.6: Agents 74.7% (+73.0 percentage points)

Grok 3 mini: Agents 1.7%

Knowledge Cutoff

Training data recency comparison

Claude Sonnet 4.6: 2025-08

More recent knowledge cutoff means awareness of newer technologies and frameworks

Provider Availability & Performance

Available providers and their performance metrics

Claude Sonnet 4.6: 3 providers (Anthropic, AWS Bedrock, Google Cloud Vertex AI); Avg Score 74.7% (+73.0 percentage points)

Grok 3 mini: 1 provider; Avg Score 1.7%