Seed 1.5-VL
Multimodal
by ByteDance
About
Seed1.5-VL, released by ByteDance Seed on May 15, 2025, is a vision-language foundation model composed of a 532M-parameter vision encoder and a Mixture-of-Experts language model with 20 billion active parameters. It was pretrained on over 3 trillion multimodal tokens and achieved state-of-the-art performance on 38 out of 60 public VLM benchmarks at release. Seed1.5-VL targets complex visual reasoning, OCR, video comprehension, 3D spatial understanding, and multimodal agentic tasks.
Timeline
Released
May 15, 2025
Specifications
Capabilities
Multimodal
License & Family
License
Proprietary
Performance Overview
Performance metrics and category breakdown
Overall Performance
1 benchmark
Average Score
67.6%
Best Score
67.6%
High Performers (80%+)
0
Top Categories
Multimodal
67.6%
All Benchmark Results for Seed 1.5-VL
Complete list of benchmark scores with detailed information
| Benchmark | Category | Score | Normalized | Status |
| --- | --- | --- | --- | --- |
| MMMU | Multimodal | 67.60 | 67.6% | Unverified |
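
The figures in the Performance Overview appear to follow directly from the results table. The sketch below reproduces them under stated assumptions: the `summarize` helper and its field names are hypothetical, and treating a score of exactly 80% as a "high performer" is a guess at how the page counts that bucket, not a documented rule.

```python
from statistics import mean

# Rows copied from the results table: (benchmark, category, score in percent).
results = [("MMMU", "Multimodal", 67.6)]


def summarize(rows, high_threshold=80.0):
    """Hypothetical helper: aggregate benchmark rows into overview figures."""
    scores = [score for _, _, score in rows]

    # Group scores by category so each category gets its own average.
    by_category = {}
    for _, category, score in rows:
        by_category.setdefault(category, []).append(score)

    return {
        "benchmarks": len(scores),
        "average": round(mean(scores), 1),
        "best": max(scores),
        # Assumption: the "80%+" bucket includes a score of exactly 80.
        "high_performers": sum(s >= high_threshold for s in scores),
        "categories": {c: round(mean(v), 1) for c, v in by_category.items()},
    }


summary = summarize(results)
print(summary)
```

With a single benchmark listed, the average, the best score, and the category average all collapse to the same MMMU value, so the overview says little about the model beyond that one result.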