Granite 3.3 8B Instruct

Multimodal
Zero-eval
#2 AttaQ
#2 TruthfulQA
#2 AlpacaEval 2.0
+1 more

by IBM

About

Granite 3.3 8B Instruct is a multimodal language model developed by IBM. It averages 69.8% across 14 benchmarks, with its strongest results on HumanEval (89.7%), AttaQ (88.5%), and HumanEval+ (86.1%). As a multimodal model, it is described as accepting text, images, and other input formats; the benchmark results listed on this page are all text-modality. It is released under the Apache 2.0 license, which permits commercial use. IBM announced and released it on April 16, 2025.

Timeline
Announced: Apr 16, 2025
Released: Apr 16, 2025
Knowledge Cutoff: Apr 1, 2024
Specifications
Capabilities: Multimodal
License & Family
License: Apache 2.0
Performance Overview
Performance metrics and category breakdown

Overall Performance

Benchmarks: 14
Average Score: 69.8%
Best Score: 89.7%
High Performers (80%+): 5
All Benchmark Results for Granite 3.3 8B Instruct
Complete list of benchmark scores with detailed information
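| Benchmark | Modality | Score (0-1) | Score (%) | Source |
|---|---|---|---|---|
| HumanEval | text | 0.90 | 89.7% | Self-reported |
| AttaQ | text | 0.89 | 88.5% | Self-reported |
| HumanEval+ | text | 0.86 | 86.1% | Self-reported |
| AIME 2024 | text | 0.81 | 81.2% | Self-reported |
| GSM8k | text | 0.81 | 80.9% | Self-reported |
| IFEval | text | 0.75 | 74.8% | Self-reported |
| BIG-Bench Hard | text | 0.69 | 69.1% | Self-reported |
| MATH-500 | text | 0.69 | 69.0% | Self-reported |
| TruthfulQA | text | 0.67 | 66.9% | Self-reported |
| MMLU | text | 0.66 | 65.5% | Self-reported |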
Showing 1 to 10 of 14 benchmarks
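
The summary figures in the Performance Overview can be partly cross-checked against the ten benchmark rows listed above. The following is a minimal sketch, not the site's own calculation: the variable names are illustrative, the 80% threshold is taken from the "High Performers (80%+)" label, and the four benchmarks not shown on this page are treated as unknown. Because the reported 69.8% average covers all 14 benchmarks, the sketch can only estimate the mean of the four hidden scores, and that estimate carries the rounding error of the published average.

```python
from statistics import mean

# Scores (%) copied from the ten benchmark rows visible on this page.
visible_scores = {
    "HumanEval": 89.7,
    "AttaQ": 88.5,
    "HumanEval+": 86.1,
    "AIME 2024": 81.2,
    "GSM8k": 80.9,
    "IFEval": 74.8,
    "BIG-Bench Hard": 69.1,
    "MATH-500": 69.0,
    "TruthfulQA": 66.9,
    "MMLU": 65.5,
}

# Figures reported in the Performance Overview.
REPORTED_AVERAGE = 69.8  # average over all 14 benchmarks
REPORTED_TOTAL = 14      # total number of benchmarks
HIGH_THRESHOLD = 80.0    # assumed cutoff, from the "80%+" label

# Best score and the benchmarks at or above the threshold.
best_name, best_score = max(visible_scores.items(), key=lambda kv: kv[1])
high_performers = [n for n, s in visible_scores.items() if s >= HIGH_THRESHOLD]

# Mean of the ten visible scores only (not the 14-benchmark average).
visible_mean = mean(visible_scores.values())

# Estimate the mean of the hidden benchmarks implied by the reported average.
hidden_count = REPORTED_TOTAL - len(visible_scores)
hidden_mean_estimate = (
    REPORTED_AVERAGE * REPORTED_TOTAL - sum(visible_scores.values())
) / hidden_count

print(f"Best: {best_name} ({best_score}%)")
print(f"High performers (>= {HIGH_THRESHOLD}%): {len(high_performers)}")
print(f"Mean of visible scores: {visible_mean:.2f}%")
print(f"Implied mean of {hidden_count} hidden scores: ~{hidden_mean_estimate:.1f}%")
```

The best score and the count of high performers computed this way agree with the overview (89.7% and 5). The mean of the visible rows is higher than the reported 69.8%, which is expected because the listing is sorted by score and the four lowest results are on the next page.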