Seed 1.5-VL

Multimodal

by ByteDance

About

Seed1.5-VL, released by ByteDance Seed in May 2025, is a vision-language foundation model for general-purpose multimodal understanding and reasoning. It pairs a dedicated vision encoder with a Mixture-of-Experts language model and is built to handle complex visual reasoning tasks. ByteDance published a technical report alongside the release, covering the model's architecture and training methodology.

Timeline
Released: May 15, 2025
Specifications
Capabilities
Multimodal
License & Family
License
Proprietary
Performance Overview
Performance metrics and category breakdown

Overall Performance

1 benchmark
Average Score
67.6%
Best Score
67.6%
High Performers (80%+)
0

Top Categories

Multimodal
67.6%
All Benchmark Results for Seed 1.5-VL
Complete list of benchmark scores with detailed information
MMMU (Multimodal): 67.60 (67.6%), Unverified