Comprehensive side-by-side LLM comparison
GPT-5.1 Codex Max leads with a 1.3% higher average benchmark score. Claude Sonnet 4.6 supports multimodal (text and image) inputs and is available on 3 providers. Which model fits best depends on your specific coding needs.
Anthropic
Claude Sonnet 4.6 is a general-purpose language model from Anthropic, released in February 2026 as an update to the Sonnet 4 line. It introduced adaptive thinking, a mode in which the model automatically calibrates its reasoning depth to task complexity rather than requiring manual configuration by the developer. The model accepts text and image inputs and integrates natively with web search and code execution tools, consolidating capabilities that previously required separate toolchain setup into a unified API surface. It became the primary workhorse model in the Claude 4 series for code assistance, agentic pipelines, and retrieval-augmented applications that benefit from built-in web access.
OpenAI
GPT-5.1 Codex Max, released by OpenAI in November 2025, is an enhanced coding variant from the GPT-5.1 Codex line, designed for more complex software engineering tasks requiring additional reasoning depth. It targets large-scale code generation, automated refactoring, and sophisticated agentic development workflows.
Release dates (Claude Sonnet 4.6 is 3 months newer)
GPT-5.1 Codex Max (OpenAI): 2025-11
Claude Sonnet 4.6 (Anthropic): 2026-02-17
Context window and performance specifications
[Chart: average performance across 1 common benchmark, Claude Sonnet 4.6 vs. GPT-5.1 Codex Max]
[Chart: performance comparison across key benchmark categories, Claude Sonnet 4.6 vs. GPT-5.1 Codex Max]
Claude Sonnet 4.6
2025-08
Available providers and their performance metrics
Claude Sonnet 4.6: Anthropic, AWS Bedrock, Google Cloud Vertex AI