Seed 1.5-VL

Name: Seed 1.5-VL
Rating: 67.6 (1 review)
Author: ByteDance

Multimodal

About

Seed1.5-VL, released by ByteDance Seed on May 15, 2025, is a vision-language foundation model composed of a 532M-parameter vision encoder and a Mixture-of-Experts language model with 20 billion active parameters. It was pretrained on over 3 trillion multimodal tokens and achieved state-of-the-art performance on 38 out of 60 public VLM benchmarks at release. Seed1.5-VL targets complex visual reasoning, OCR, video comprehension, 3D spatial understanding, and multimodal agentic tasks.

Timeline

Released: May 15, 2025

Specifications

Capabilities

Multimodal

License & Family

License

Proprietary

Performance Overview

Performance metrics and category breakdown

1 benchmark

Average Score: 67.6%

Best Score: 67.6%

High Performers (80%+): 0 (the only listed score, 67.6%, is below the 80% cutoff)

Category breakdown: Multimodal, 67.6%

All Benchmark Results for Seed 1.5-VL

Complete list of benchmark scores with detailed information


MMMU	Multimodal		67.60	67.6%	Unverified
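The overview figures follow directly from the single result listed here. As a minimal sketch, assuming the average, best score, and high-performer count are computed straight from the listed scores (the page does not state its method, and the names `results`, `summarize`, and `HIGH_PERFORMER_THRESHOLD` are hypothetical), the arithmetic might look like this:

```python
from statistics import mean

# Scores copied from the results listing: (benchmark, category, score in percent).
results = [("MMMU", "Multimodal", 67.6)]

# The "80%+" cutoff named in the overview; how the page applies it is an assumption.
HIGH_PERFORMER_THRESHOLD = 80.0


def summarize(rows):
    """Derive the overview figures from a list of benchmark rows."""
    scores = [score for _, _, score in rows]
    return {
        "benchmarks": len(scores),
        "average": round(mean(scores), 1),
        "best": max(scores),
        "high_performers": sum(s >= HIGH_PERFORMER_THRESHOLD for s in scores),
    }


summary = summarize(results)
print(summary)
```

With only one benchmark on record, the average and best score necessarily coincide, so the overview says little beyond the MMMU result itself.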

Resources