Comprehensive side-by-side LLM comparison
GPT-5.1 leads with a 20.6% higher average benchmark score, based on the one benchmark the two models have in common. Overall, GPT-5.1 is the stronger choice for coding tasks.
OpenAI
GPT-5.1, released by OpenAI in November 2025, is a large language model in the GPT-5 family that delivers incremental improvements over GPT-5 in reasoning, instruction following, and multimodal understanding. It features a 400K-token context window and targets general-purpose development, long-context analysis, and agentic workflows.
Kunlun Tech
Skywork-R1V3-38B, released by Kunlun Tech's Skywork AI team on July 9, 2025, is a 38-billion-parameter multimodal reasoning model built on InternVL-38B, with reinforcement learning post-training that enhances both visual and textual reasoning. It uses the GRPO algorithm and cold-start fine-tuning to improve reasoning across image and text modalities. Skywork-R1V3-38B targets open-source multimodal reasoning deployments that require strong performance across vision-language benchmarks.
Release dates (GPT-5.1 is 3 months newer)
Skywork-R1V3-38B: Kunlun Tech, 2025-07-09
GPT-5.1: OpenAI, 2025-11
[Chart: average performance across 1 common benchmark, GPT-5.1 vs. Skywork-R1V3-38B]
[Chart: performance comparison across key benchmark categories, GPT-5.1 vs. Skywork-R1V3-38B]
[Table: available providers and their performance metrics, GPT-5.1 vs. Skywork-R1V3-38B]