Doubao 1.5 Vision Pro
Multimodal
by ByteDance
About
Doubao 1.5 Vision Pro, released by ByteDance via Volcano Engine on January 22, 2025, is a multimodal large language model from the Doubao 1.5 family with comprehensive upgrades to visual reasoning, OCR, and fine-grained image understanding. Built on a large-scale sparse MoE architecture, it targets vision-intensive workflows including document analysis, chart interpretation, and visual question answering.
Pricing Range
Input (per 1M): $0.08 - $0.08
Output (per 1M): $0.08 - $0.08
Providers: 1
Timeline
Released: Jan 22, 2025
Specifications
Capabilities
Multimodal
License & Family
License
Proprietary
Performance Overview
Performance metrics and category breakdown
Overall Performance
1 benchmark
Average Score
40.0%
Best Score
40.0%
High Performers (80%+)
0
Performance Metrics
Max Context Window
36.1K
Top Categories
Agents
40.0%
All Benchmark Results for Doubao 1.5 Vision Pro
Complete list of benchmark scores with detailed information
| Benchmark | Category | Score | Score (%) | Verification |
| --- | --- | --- | --- | --- |
| OSWorld | Agents | 40.00 | 40.0% | Unverified |