
Pixtral-12B
Multimodal
Zero-eval
#1 MM IF-Eval
#2 VQAv2
#2 MM-MT-Bench
by Mistral AI
About
Pixtral-12B is a multimodal language model developed by Mistral AI. Across 12 benchmarks its self-reported scores average 66.9%, with the strongest results on DocVQA (90.7%), ChartQA (81.8%), and VQAv2 (78.6%). It supports a context window of about 136K tokens for handling large documents and is available through 1 API provider. As a multimodal model, it accepts both text and image input. It is released under the Apache 2.0 license, which permits commercial use. It was announced and released on Sep 17, 2024.
Pricing Range
Input (per 1M tokens): $0.15 - $0.15
Output (per 1M tokens): $0.15 - $0.15
Providers: 1
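As a rough illustration of what the per-token pricing means in practice, the following minimal sketch estimates the cost of a single request. It assumes the listed $0.15 per 1M tokens applies uniformly to input and output; `estimate_cost` and the example token counts are hypothetical and not part of any provider's API.

```python
# Prices as listed above, in USD per 1M tokens (the listed range is $0.15 - $0.15).
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.15


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 2,000-token reply (made-up sizes).
    print(f"${estimate_cost(100_000, 2_000):.4f}")
```

Actual billing may differ by provider (for example in how image input is counted), so treat the result as an estimate only.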
Timeline
Announced: Sep 17, 2024
Released: Sep 17, 2024
Specifications
Capabilities
Multimodal
License & Family
License
Apache 2.0
Performance Overview
Performance metrics and category breakdown
Overall Performance
12 benchmarks
Average Score: 66.9%
Best Score: 90.7%
High Performers (80%+): 2
Performance Metrics
Max Context Window: 136.2K
Avg Throughput: 0.1 tok/s
Avg Latency: 1ms
All Benchmark Results for Pixtral-12B
Complete list of benchmark scores with detailed information
Benchmark | Category | Score (0-1) | Score (%) | Source
DocVQA | multimodal | 0.91 | 90.7% | Self-reported
ChartQA | multimodal | 0.82 | 81.8% | Self-reported
VQAv2 | multimodal | 0.79 | 78.6% | Self-reported
MT-Bench | text | 0.77 | 76.8% | Self-reported
HumanEval | text | 0.72 | 72.0% | Self-reported
MMLU | text | 0.69 | 69.2% | Self-reported
IFEval | text | 0.61 | 61.3% | Self-reported
MM-MT-Bench | multimodal | 0.60 | 60.5% | Self-reported
MathVista | multimodal | 0.58 | 58.0% | Self-reported
MM IF-Eval | multimodal | 0.53 | 52.7% | Self-reported
Showing 1 to 10 of 12 benchmarks
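Because only 10 of the 12 benchmarks are listed, the mean of the visible rows will not match the reported 66.9% average. The sketch below works through that arithmetic: it computes the mean of the visible scores, the benchmarks at or above 80%, and the mean the two unlisted benchmarks would need for the overall average to hold. The `summarize` helper is hypothetical, and since 66.9% is itself rounded, the implied figure is approximate.

```python
# Percent scores for the 10 listed benchmarks (all self-reported).
SCORES = {
    "DocVQA": 90.7,
    "ChartQA": 81.8,
    "VQAv2": 78.6,
    "MT-Bench": 76.8,
    "HumanEval": 72.0,
    "MMLU": 69.2,
    "IFEval": 61.3,
    "MM-MT-Bench": 60.5,
    "MathVista": 58.0,
    "MM IF-Eval": 52.7,
}
REPORTED_AVERAGE = 66.9   # average over all 12 benchmarks, as stated on the page
TOTAL_BENCHMARKS = 12


def summarize(scores, reported_average, total_benchmarks):
    """Return (visible mean, names scoring >= 80%, implied mean of unlisted rows)."""
    visible_sum = sum(scores.values())
    visible_mean = visible_sum / len(scores)
    high_performers = sorted(name for name, s in scores.items() if s >= 80.0)
    unlisted = total_benchmarks - len(scores)
    # Total implied by the reported average, minus what the visible rows contribute.
    implied_mean = (
        (reported_average * total_benchmarks - visible_sum) / unlisted
        if unlisted > 0
        else None
    )
    return visible_mean, high_performers, implied_mean


if __name__ == "__main__":
    visible_mean, high, implied = summarize(SCORES, REPORTED_AVERAGE, TOTAL_BENCHMARKS)
    print(f"Visible mean: {visible_mean:.1f}%")
    print(f"High performers (80%+): {high}")
    print(f"Implied mean of unlisted benchmarks: {implied:.1f}%")
```

The visible rows average about 70.2%, and the two benchmarks at or above 80% (DocVQA and ChartQA) match the "High Performers" count of 2. The two unlisted benchmarks would therefore average roughly 50.6%, subject to rounding in the reported figures.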