Doubao 1.5 Vision Pro

Multimodal

by ByteDance

About

Doubao 1.5 Vision Pro is a multimodal large language model in the Doubao 1.5 family, released by ByteDance via Volcano Engine on January 22, 2025. It upgrades visual reasoning, OCR, and fine-grained image understanding. Built on a large-scale sparse MoE architecture, it targets vision-intensive workflows including document analysis, chart interpretation, and visual question answering.

Pricing Range
Input (per 1M)
$0.08 - $0.08
Output (per 1M)
$0.08 - $0.08
Providers
1
Timeline
Released
Jan 22, 2025
Specifications
Capabilities
Multimodal
License & Family
License
Proprietary
Performance Overview
Performance metrics and category breakdown

Overall Performance

1 benchmark
Average Score
40.0%
Best Score
40.0%
High Performers (80%+)
0

Performance Metrics

Max Context Window
36.1K

Top Categories

Agents
40.0%
All Benchmark Results for Doubao 1.5 Vision Pro
Complete list of benchmark scores with detailed information
OSWorld
Agents
40.00
40.0%
Unverified
Resources