Qwen3-VL-235B-A22B

Multimodal

by Alibaba / Qwen

About

Qwen3-VL-235B-A22B, released by Alibaba's Qwen team in September 2025, is a natively multimodal Mixture-of-Experts large language model with 235 billion total parameters, of which 22 billion are active per token. It has a 256K-token context window (with extrapolation to 1M tokens), native support for text, image, and video input, and joint visual-textual reasoning. The model targets complex visual reasoning, video understanding, and multimodal agentic tasks, and is released under the Apache 2.0 license.

Pricing Range
Input (per 1M tokens): $0.25 to $0.25
Output (per 1M tokens): $0.75 to $0.75
Providers: 1
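
As a rough illustration of what the listed rates mean for a single request, the sketch below multiplies token counts by the per-1M prices above. The function name `estimate_cost` and the example token counts are hypothetical, and the calculation assumes the listed prices apply uniformly with no tiering, caching discounts, or separate image/video charges, which this page does not specify.

```python
# Listed prices from the pricing section, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.25
OUTPUT_PRICE_PER_M = 0.75


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed flat rates.

    Assumption: flat per-token pricing only; real provider billing may differ.
    """
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical request: 200,000 input tokens and 4,000 output tokens.
cost = estimate_cost(200_000, 4_000)
print(f"Estimated cost: ${cost:.4f}")
```

At these rates, output tokens cost three times as much as input tokens, so long prompts with short answers stay inexpensive.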
Timeline
Released: Sep 23, 2025
Specifications
Capabilities
Multimodal
License & Family
License
Apache 2.0
Performance Overview
Performance metrics and category breakdown

Overall Performance

1 benchmark
Average Score
68.1%
Best Score
68.1%
High Performers (80%+)
0
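
The summary figures above are consistent with simple aggregates over the listed benchmark scores. The page does not say how they are computed, so the sketch below is an assumption: an arithmetic mean, a maximum, and a count of scores at or above 80%. The function name `summarize` is hypothetical.

```python
# Benchmark scores listed on this page, in percent.
scores = {"MMMU": 68.1}


def summarize(scores: dict[str, float], threshold: float = 80.0) -> dict[str, float]:
    """Compute the overview figures from per-benchmark scores.

    Assumption: the average is an unweighted arithmetic mean.
    """
    values = list(scores.values())
    return {
        "benchmarks": len(values),
        "average": sum(values) / len(values),
        "best": max(values),
        "high_performers": sum(1 for v in values if v >= threshold),
    }


print(summarize(scores))
```

With only one benchmark reported, the average and the best score coincide, so the overview says little beyond the single MMMU result.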

Performance Metrics

Max Context Window
270.3K

Top Categories

Multimodal
68.1%
All Benchmark Results for Qwen3-VL-235B-A22B
Complete list of benchmark scores with detailed information
MMMU (Multimodal): 68.1% (unverified)