Mistral AI

Pixtral-12B

Multimodal
Zero-eval
#1 MM IF-Eval
#2 VQAv2
#2 MM-MT-Bench

by Mistral AI

About

Pixtral-12B is a multimodal language model developed by Mistral AI. Its self-reported scores average 66.9% across 12 benchmarks, with its strongest results on DocVQA (90.7%), ChartQA (81.8%), and VQAv2 (78.6%). It supports a 136K-token context window for handling large documents and is available through 1 API provider. As a multimodal model, it accepts both text and image inputs. It is released under the Apache 2.0 license, which permits commercial use. It was announced and released on September 17, 2024.

Pricing Range
Input (per 1M tokens): $0.15
Output (per 1M tokens): $0.15
Providers: 1
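
The per-million-token prices above translate into a per-request cost with simple arithmetic. The following is a minimal sketch, not a provider billing formula: the function name `estimate_cost_usd` is hypothetical, and it assumes image inputs have already been converted to a token count, since the listing does not say how images are billed.

```python
# Prices copied from the listing above, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.15


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD (hypothetical helper).

    Assumes flat per-token pricing with no minimums, tiers, or
    separate image surcharges; real provider billing may differ.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # A 100,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost_usd(100_000, 2_000):.4f}")  # prints $0.0153
```

Because input and output are priced identically here, the cost depends only on the total token count; the two prices are kept separate so the sketch still works if they diverge.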
Timeline
Announced: Sep 17, 2024
Released: Sep 17, 2024
Specifications
Capabilities: Multimodal
License & Family
License: Apache 2.0
Performance Overview
Performance metrics and category breakdown

Overall Performance

12 benchmarks
Average Score: 66.9%
Best Score: 90.7%
High Performers (80%+): 2

Performance Metrics

Max Context Window: 136.2K tokens
Avg Throughput: 0.1 tok/s
Avg Latency: 1 ms
All Benchmark Results for Pixtral-12B
Complete list of benchmark scores with detailed information
Benchmark     Category     Score   Percent   Source
DocVQA        multimodal   0.91    90.7%     Self-reported
ChartQA       multimodal   0.82    81.8%     Self-reported
VQAv2         multimodal   0.79    78.6%     Self-reported
MT-Bench      text         0.77    76.8%     Self-reported
HumanEval     text         0.72    72.0%     Self-reported
MMLU          text         0.69    69.2%     Self-reported
IFEval        text         0.61    61.3%     Self-reported
MM-MT-Bench   multimodal   0.60    60.5%     Self-reported
MathVista     multimodal   0.58    58.0%     Self-reported
MM IF-Eval    multimodal   0.53    52.7%     Self-reported
Showing 1 to 10 of 12 benchmarks
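
Only 10 of the 12 benchmarks are listed, so the listed scores alone do not reproduce the 66.9% average. As a rough consistency check, the sketch below computes the mean of the listed scores, counts those at or above 80%, and backs out the mean that the two unlisted benchmarks would need for the overall average to hold. The helper name `implied_hidden_mean` is hypothetical, and because the reported average is rounded to one decimal place, the implied value is approximate.

```python
# Percent scores copied from the ten benchmarks listed above.
SHOWN_SCORES = {
    "DocVQA": 90.7,
    "ChartQA": 81.8,
    "VQAv2": 78.6,
    "MT-Bench": 76.8,
    "HumanEval": 72.0,
    "MMLU": 69.2,
    "IFEval": 61.3,
    "MM-MT-Bench": 60.5,
    "MathVista": 58.0,
    "MM IF-Eval": 52.7,
}

# Summary figures reported in the Performance Overview.
REPORTED_AVERAGE = 66.9
REPORTED_COUNT = 12


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def implied_hidden_mean(shown, reported_average, reported_count):
    """Mean the unlisted scores must have for the reported average to hold.

    Hypothetical helper: it assumes the reported average is an unweighted
    mean over all `reported_count` benchmarks.
    """
    hidden = reported_count - len(shown)
    if hidden <= 0:
        raise ValueError("no unlisted benchmarks to solve for")
    return (reported_average * reported_count - sum(shown)) / hidden


shown = list(SHOWN_SCORES.values())
shown_mean = mean(shown)
high_performers = [name for name, score in SHOWN_SCORES.items() if score >= 80.0]
hidden_mean = implied_hidden_mean(shown, REPORTED_AVERAGE, REPORTED_COUNT)

if __name__ == "__main__":
    print(f"Mean of listed scores: {shown_mean:.2f}%")
    print(f"High performers (80%+): {high_performers}")
    print(f"Implied mean of unlisted scores: {hidden_mean:.1f}%")
```

With the ten listed scores, the mean is about 70.2%, two benchmarks clear 80% (matching the reported count), and the two unlisted benchmarks would need to average roughly 50.6%. That is below the lowest listed score, which is what a list sorted in descending order would lead one to expect.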