
Llama 3.2 90B Instruct
Multimodal
Zero-eval
#1 InfographicsQA
#3 VQAv2
by Meta
About
Llama 3.2 90B Instruct is a multimodal language model developed by Meta. It averages 71.3% across 13 benchmarks, with its strongest results on AI2D (92.3%), DocVQA (90.1%), and MGSM (86.9%). It supports a 256K-token context window for handling large documents and is available through 5 API providers. As a multimodal model, it accepts both text and image input. It is licensed for commercial use under the Llama 3.2 license and was released on September 25, 2024.
Pricing Range
Input (per 1M tokens): $0.35 - $2.00
Output (per 1M tokens): $0.40 - $2.00
Providers: 5
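
As a rough illustration of how the per-million-token prices translate into a per-request cost, the sketch below computes a lower and an upper bound from the range listed above. The function name `estimate_cost_range` is hypothetical, and the sketch assumes that the cheapest input and cheapest output prices can be combined, which the listing does not confirm for any single provider.

```python
# Price ranges per 1M tokens in USD, copied from the listing above.
INPUT_PRICE_RANGE = (0.35, 2.00)
OUTPUT_PRICE_RANGE = (0.40, 2.00)


def estimate_cost_range(input_tokens: int, output_tokens: int) -> tuple:
    """Return (lowest, highest) estimated cost in USD for one request.

    Assumption: the low ends (and the high ends) of the input and output
    ranges are paired together; real providers may mix them differently.
    """
    def cost(input_price: float, output_price: float) -> float:
        total = input_tokens * input_price + output_tokens * output_price
        return total / 1_000_000

    low = cost(INPUT_PRICE_RANGE[0], OUTPUT_PRICE_RANGE[0])
    high = cost(INPUT_PRICE_RANGE[1], OUTPUT_PRICE_RANGE[1])
    return low, high


if __name__ == "__main__":
    # Hypothetical request: 10,000 prompt tokens and 1,000 generated tokens.
    low, high = estimate_cost_range(10_000, 1_000)
    print(f"Estimated cost: ${low:.4f} to ${high:.4f}")
```

Because the bounds pair the extremes of both ranges, the actual price at any given provider should fall somewhere between them.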
Timeline
Announced: Sep 25, 2024
Released: Sep 25, 2024
Specifications
Capabilities
Multimodal
License & Family
License
Llama 3.2
Performance Overview
Performance metrics and category breakdown
Overall Performance
13 benchmarks
Average Score
71.3%
Best Score
92.3%
High Performers (80%+)
5
Performance Metrics
Max Context Window
256.0K
Avg Throughput
54.6 tok/s
Avg Latency
1ms
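
To give a sense of what the average throughput means in practice, the sketch below converts it into an approximate generation time for a response of a given length. The function name `estimate_generation_seconds` is hypothetical, and the estimate assumes a constant decoding rate and ignores latency and prompt processing, so it is only a back-of-the-envelope figure.

```python
# Average throughput in tokens per second, copied from the listing above.
AVG_THROUGHPUT_TOK_PER_S = 54.6


def estimate_generation_seconds(
    output_tokens: int, throughput: float = AVG_THROUGHPUT_TOK_PER_S
) -> float:
    """Approximate time in seconds to generate `output_tokens` tokens.

    Assumption: tokens are produced at a constant rate equal to the
    average throughput; time-to-first-token is not included.
    """
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return output_tokens / throughput


if __name__ == "__main__":
    # Hypothetical response length of 500 tokens.
    print(f"{estimate_generation_seconds(500):.1f} s")
```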
All Benchmark Results for Llama 3.2 90B Instruct
Complete list of benchmark scores with detailed information
Benchmark | Modality | Score | Score (%) | Source
AI2D | multimodal | 0.92 | 92.3% | Self-reported
DocVQA | multimodal | 0.90 | 90.1% | Self-reported
MGSM | text | 0.87 | 86.9% | Self-reported
MMLU | text | 0.86 | 86.0% | Self-reported
ChartQA | multimodal | 0.85 | 85.5% | Self-reported
VQAv2 | multimodal | 0.78 | 78.1% | Self-reported
TextVQA | multimodal | 0.73 | 73.5% | Self-reported
MATH | text | 0.68 | 68.0% | Self-reported
MMMU | multimodal | 0.60 | 60.3% | Self-reported
MathVista | multimodal | 0.57 | 57.3% | Self-reported
Showing 1 to 10 of 13 benchmarks
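
Since only 10 of the 13 benchmarks are listed, the short sketch below shows one way to cross-check the summary figures against the visible rows. It assumes the reported 71.3% is a plain unweighted mean over all 13 benchmarks, which the page does not state; the `summarize` helper is hypothetical.

```python
# Scores (%) for the 10 benchmarks shown on this page, copied from the table.
SHOWN_SCORES = {
    "AI2D": 92.3, "DocVQA": 90.1, "MGSM": 86.9, "MMLU": 86.0,
    "ChartQA": 85.5, "VQAv2": 78.1, "TextVQA": 73.5, "MATH": 68.0,
    "MMMU": 60.3, "MathVista": 57.3,
}
REPORTED_AVERAGE = 71.3  # average score reported across all benchmarks
TOTAL_BENCHMARKS = 13    # total benchmark count reported on the page


def summarize(scores: dict, reported_average: float, total: int) -> dict:
    """Summarize visible scores and infer the mean of the unlisted ones.

    Assumption: the reported average is an unweighted mean of all scores.
    """
    shown_sum = sum(scores.values())
    hidden_count = total - len(scores)
    hidden_mean = None
    if hidden_count > 0:
        hidden_mean = (reported_average * total - shown_sum) / hidden_count
    return {
        "shown_mean": shown_sum / len(scores),
        "high_performers": sorted(n for n, s in scores.items() if s >= 80.0),
        "hidden_count": hidden_count,
        "hidden_mean": hidden_mean,
    }


summary = summarize(SHOWN_SCORES, REPORTED_AVERAGE, TOTAL_BENCHMARKS)

if __name__ == "__main__":
    for key, value in summary.items():
        print(f"{key}: {value}")
```

Under that assumption, the visible rows account for all five scores of 80% or higher, and the three unlisted benchmarks would need to average roughly 50% to bring the overall mean down to the reported figure. The reported average is itself rounded, so the inferred value is approximate.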