GPT-4.1 mini

Multimodal
Zero-eval
#2 CharXiv-D
#2 OpenAI-MRCR: 2 needle 1M
#2 Graphwalks BFS >128k

by OpenAI

About

GPT-4.1 mini is a multimodal language model developed by OpenAI. It has reported results on 29 benchmarks, with its strongest scores on CharXiv-D (88.4%), MMLU (87.5%), and IFEval (84.1%). Its 1.1M-token context window lets it handle long documents and extended multi-turn conversations. The model accepts both text and image input and is available through 2 API providers. It was announced and released on April 14, 2025, with a knowledge cutoff of May 31, 2024.

Pricing Range
Input (per 1M tokens): $0.40 - $0.40
Output (per 1M tokens): $1.60 - $1.60
Providers: 2
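As a rough illustration of how the listed per-token prices translate into a per-request cost, here is a minimal Python sketch. The function name `estimate_cost` and the example token counts are hypothetical; only the two prices come from the pricing range above, and actual billing by a given provider may differ.

```python
# Prices in USD per 1M tokens, taken from the pricing range listed on this page.
INPUT_PRICE_PER_M = 0.40
OUTPUT_PRICE_PER_M = 1.60


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical request: 200,000 input tokens and 2,000 output tokens.
cost = estimate_cost(200_000, 2_000)
print(f"Estimated cost: ${cost:.4f}")
```

Because output tokens cost four times as much as input tokens at these prices, long prompts with short answers stay comparatively cheap.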
Timeline
Announced: Apr 14, 2025
Released: Apr 14, 2025
Knowledge Cutoff: May 31, 2024
Specifications
Capabilities: Multimodal
License & Family
License: Proprietary
Performance Overview
Performance metrics and category breakdown

Overall Performance

Benchmarks: 29
Average Score: 49.6%
Best Score: 88.4%
High Performers (80%+): 3

Performance Metrics

Max Context Window: 1.1M tokens
Avg Throughput: 150.0 tok/s
Avg Latency: 5 ms
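To show how the throughput and latency figures combine, the sketch below estimates the wall-clock time for a response of a given length. It assumes the listed latency is the delay before the first token arrives and that tokens then stream at the average throughput; the page does not state how either figure was measured, so treat the result as a back-of-the-envelope number. The helper name `estimate_response_seconds` is hypothetical.

```python
# Figures taken from the Performance Metrics listed on this page.
AVG_THROUGHPUT_TOK_PER_S = 150.0
AVG_LATENCY_S = 0.005  # 5 ms, assumed here to mean time to first token


def estimate_response_seconds(output_tokens: int) -> float:
    """Estimate the total response time in seconds (hypothetical helper)."""
    return AVG_LATENCY_S + output_tokens / AVG_THROUGHPUT_TOK_PER_S


# Hypothetical 600-token answer.
seconds = estimate_response_seconds(600)
print(f"Estimated response time: {seconds:.3f} s")
```

At these figures the generation time dominates: the 5 ms latency is negligible next to the roughly 4 seconds needed to stream 600 tokens.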
All Benchmark Results for GPT-4.1 mini
Complete list of benchmark scores with detailed information
Benchmark                | Modality   | Score | Percent | Source
CharXiv-D                | multimodal | 0.88  | 88.4%   | Self-reported
MMLU                     | text       | 0.88  | 87.5%   | Self-reported
IFEval                   | text       | 0.84  | 84.1%   | Self-reported
MMMLU                    | text       | 0.79  | 78.5%   | Self-reported
MathVista                | multimodal | 0.73  | 73.1%   | Self-reported
MMMU                     | multimodal | 0.73  | 72.7%   | Self-reported
Multi-IF                 | text       | 0.67  | 67.0%   | Self-reported
GPQA                     | text       | 0.65  | 65.0%   | Self-reported
Graphwalks BFS <128k     | text       | 0.62  | 61.7%   | Self-reported
Graphwalks parents <128k | text       | 0.60  | 60.5%   | Self-reported
Showing 1 to 10 of 29 benchmarks
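The summary figures in the Performance Overview can be partly cross-checked against the ten results listed here. The Python sketch below counts the benchmarks at or above 80% and averages the ten listed scores. Note that the page's 49.6% average covers all 29 benchmarks, so the mean computed here (over the ten listed results only) is not expected to match it. The variable names are illustrative.

```python
from statistics import mean

# Percent scores for the ten benchmarks listed on this page (all self-reported).
scores = {
    "CharXiv-D": 88.4,
    "MMLU": 87.5,
    "IFEval": 84.1,
    "MMMLU": 78.5,
    "MathVista": 73.1,
    "MMMU": 72.7,
    "Multi-IF": 67.0,
    "GPQA": 65.0,
    "Graphwalks BFS <128k": 61.7,
    "Graphwalks parents <128k": 60.5,
}

# Benchmarks meeting the page's "High Performers (80%+)" threshold.
high_performers = [name for name, pct in scores.items() if pct >= 80.0]

# Mean over the ten listed results only, not all 29 benchmarks.
listed_mean = mean(scores.values())

print(f"High performers: {len(high_performers)} ({', '.join(high_performers)})")
print(f"Mean of listed scores: {listed_mean:.2f}%")
```

The count of three benchmarks at 80% or above agrees with the "High Performers" figure in the overview.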