Claude 3.7 Sonnet vs. Gemma 3n E2B Instructed LiteRT (Preview): a comprehensive side-by-side LLM comparison
Claude 3.7 Sonnet leads with a 54.0% higher average benchmark score than Gemma 3n E2B Instructed LiteRT (Preview), and it is available on 4 providers. Overall, Claude 3.7 Sonnet is the stronger choice for coding tasks.
Claude 3.7 Sonnet is Anthropic's first hybrid reasoning model, capable of producing either near-instant responses or extended, step-by-step thinking that is visible to users. It was developed with particularly strong improvements in coding and front-end web development, lets users control how large a thinking budget the model may spend, and balances real-world task performance with reasoning capability for enterprise applications.
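As a sketch of how that thinking budget is exposed in practice, the snippet below calls the model through the Anthropic Python SDK with extended thinking enabled. The model ID and token numbers are illustrative assumptions; check Anthropic's documentation for current values.

```python
# Minimal sketch: Claude 3.7 Sonnet with an explicit thinking budget
# (assumes `pip install anthropic` and ANTHROPIC_API_KEY in the environment).
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-3-7-sonnet-20250219",   # assumed model ID; verify against the docs
    max_tokens=4096,                      # must exceed the thinking budget
    thinking={"type": "enabled", "budget_tokens": 2048},  # cap on visible reasoning tokens
    messages=[
        {"role": "user", "content": "Rewrite this recursive function iteratively: ..."}
    ],
)

# The reply interleaves visible "thinking" blocks with the final "text" answer.
for block in response.content:
    if block.type == "thinking":
        print("[thinking]", block.thinking)
    elif block.type == "text":
        print(block.text)
```

Omitting the `thinking` parameter gives the near-instant mode; raising `budget_tokens` trades latency for deeper step-by-step reasoning.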
Gemma 3n E2B Instructed LiteRT (Preview) was introduced by Google as an experimental version of Gemma 3n optimized for LiteRT deployment, designed to push the boundaries of on-device AI. Built to demonstrate the potential of running instruction-tuned models on mobile and edge devices, it represents ongoing efforts to make AI more accessible across hardware platforms.
Claude 3.7 Sonnet (Anthropic) — released 2025-02-24
Gemma 3n E2B Instructed LiteRT (Preview) — released 2025-05-20, about 2 months newer
Context window and performance specifications
Average performance across 2 common benchmarks: Claude 3.7 Sonnet vs. Gemma 3n E2B Instructed LiteRT (Preview). Gemma 3n E2B Instructed LiteRT (Preview) lists a knowledge cutoff of 2024-06-01.
Available providers and their performance metrics

Claude 3.7 Sonnet — listed providers include Anthropic, Bedrock, and ZeroEval.
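For the Bedrock route, a minimal sketch using the AWS SDK's Converse API is shown below; the region and Bedrock model ID are assumptions and should be checked against the Bedrock model catalog.

```python
# Minimal sketch: invoking Claude 3.7 Sonnet through Amazon Bedrock's Converse API
# (assumes `pip install boto3` and AWS credentials with Bedrock access enabled).
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")  # assumed region

response = client.converse(
    modelId="us.anthropic.claude-3-7-sonnet-20250219-v1:0",  # assumed ID; check the Bedrock catalog
    messages=[{"role": "user", "content": [{"text": "Summarize the trade-offs of hybrid reasoning."}]}],
    inferenceConfig={"maxTokens": 512, "temperature": 0.5},
)

print(response["output"]["message"]["content"][0]["text"])
```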