Comprehensive side-by-side LLM comparison
Qwen3-Next-80B-A3B-Thinking leads with a 19.8% higher average benchmark score and offers a context window 99.1K tokens larger than Phi 4's. Phi 4 is $1.44 cheaper per million tokens. Overall, Qwen3-Next-80B-A3B-Thinking is the stronger choice for coding tasks.
Microsoft
Phi-4 was introduced as the fourth generation of Microsoft's small language model series, designed to push the boundaries of what compact models can achieve. Built with advanced training techniques and architectural improvements, it demonstrates continued progress in efficient, high-quality language models.
Alibaba Cloud / Qwen Team
Qwen3-Next 80B Thinking was created as a reasoning-enhanced variant, designed to incorporate extended analytical capabilities into the Qwen3-Next architecture. Built to handle complex problem-solving with mixture-of-experts efficiency, it serves applications requiring both deep reasoning and computational practicality.
Release dates (Qwen3-Next-80B-A3B-Thinking is 9 months newer)

Phi 4 (Microsoft): 2024-12-12
Qwen3-Next-80B-A3B-Thinking (Alibaba Cloud / Qwen Team): 2025-09-10

[Chart: Cost per million tokens (USD), Phi 4 vs. Qwen3-Next-80B-A3B-Thinking]

Context window and performance specifications
[Chart: Average performance across 3 common benchmarks, Phi 4 vs. Qwen3-Next-80B-A3B-Thinking]
Phi 4
2024-06-01

Available providers and their performance metrics

Phi 4: DeepInfra
Qwen3-Next-80B-A3B-Thinking: Novita