DeepSeek

DeepSeek-V2.5

Zero-eval
#1 DS-FIM-Eval
#1 Aider
#1 DS-Arena-Code
+5 more

by DeepSeek

About

DeepSeek-V2.5 is a language model developed by DeepSeek. It averages 71.1% across 15 benchmarks, with its highest scores on GSM8k (95.1%), MT-Bench (90.2%), and HumanEval (89.0%). The model is available through 3 API providers and was released in 2024.

Pricing Range
Input (per 1M tokens): $0.14 - $2.00
Output (per 1M tokens): $0.28 - $2.00
Providers: 3
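
As a rough illustration of how the per-1M-token price range translates into the cost of a single request, the sketch below applies the listed low and high prices to a given token count. The function names and the example workload (10,000 input tokens, 2,000 output tokens) are hypothetical, and the sketch assumes the lowest input and output prices come from the same provider, which the listing does not confirm.

```python
# Price range from the listing above, in USD per 1M tokens (low, high).
INPUT_PRICE_RANGE = (0.14, 2.00)
OUTPUT_PRICE_RANGE = (0.28, 2.00)


def estimate_cost(input_tokens, output_tokens, input_price, output_price):
    """Return the USD cost of one request at the given per-1M-token prices."""
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


def cost_range(input_tokens, output_tokens):
    """Return (low, high) USD cost using the cheapest and priciest listed rates.

    Assumption: the low input and low output prices can be combined,
    i.e. they are offered by the same provider.
    """
    low = estimate_cost(input_tokens, output_tokens,
                        INPUT_PRICE_RANGE[0], OUTPUT_PRICE_RANGE[0])
    high = estimate_cost(input_tokens, output_tokens,
                         INPUT_PRICE_RANGE[1], OUTPUT_PRICE_RANGE[1])
    return low, high


if __name__ == "__main__":
    # Hypothetical workload: 10,000 input tokens and 2,000 output tokens.
    low, high = cost_range(10_000, 2_000)
    print(f"Estimated cost: ${low:.5f} to ${high:.5f}")
```

For this hypothetical workload the estimate runs from roughly $0.002 to $0.024 per request, so the choice of provider can change the cost by about an order of magnitude.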
Timeline
Announced: May 8, 2024
Released: May 8, 2024
License & Family
License: deepseek
Performance Overview
Performance metrics and category breakdown

Overall Performance

Benchmarks: 15
Average Score: 71.1%
Best Score: 95.1%
High Performers (80%+): 6

Performance Metrics

Max Context Window: 16.4K
Avg Throughput: 87.7 tok/s
Avg Latency: 1ms
All Benchmark Results for DeepSeek-V2.5
Complete list of benchmark scores with detailed information
Benchmark      Modality  Score (0-1)  Score (%)  Source
GSM8k          text      0.95         95.1%      Self-reported
MT-Bench       text      0.90         90.2%      Self-reported
HumanEval      text      0.89         89.0%      Self-reported
BBH            text      0.84         84.3%      Self-reported
MMLU           text      0.80         80.4%      Self-reported
AlignBench     text      0.80         80.4%      Self-reported
DS-FIM-Eval    text      0.78         78.3%      Self-reported
Arena Hard     text      0.76         76.2%      Self-reported
MATH           text      0.75         74.7%      Self-reported
HumanEval-Mul  text      0.74         73.8%      Self-reported
Showing 1 to 10 of 15 benchmarks
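
The summary figures can be cross-checked against the ten scores listed here. The sketch below counts the benchmarks at or above 80% and, assuming the reported 71.1% average is a simple unweighted mean over all 15 benchmarks (the page does not state how it is computed), works out the mean that the five unlisted benchmarks would need to have. All function and variable names are hypothetical.

```python
from statistics import mean

# The ten scores listed on this page, in percent.
SCORES = {
    "GSM8k": 95.1,
    "MT-Bench": 90.2,
    "HumanEval": 89.0,
    "BBH": 84.3,
    "MMLU": 80.4,
    "AlignBench": 80.4,
    "DS-FIM-Eval": 78.3,
    "Arena Hard": 76.2,
    "MATH": 74.7,
    "HumanEval-Mul": 73.8,
}

REPORTED_AVERAGE = 71.1   # average score shown in the overview
TOTAL_BENCHMARKS = 15     # benchmark count shown in the overview


def high_performers(scores, threshold=80.0):
    """Return the names of benchmarks scoring at or above the threshold."""
    return [name for name, score in scores.items() if score >= threshold]


def implied_mean_of_unlisted(scores, reported_average, total):
    """Return the mean the unlisted benchmarks must have.

    Assumption: reported_average is an unweighted mean over all benchmarks.
    """
    remaining = total - len(scores)
    return (reported_average * total - sum(scores.values())) / remaining


if __name__ == "__main__":
    print("High performers (80%+):", len(high_performers(SCORES)))
    print(f"Mean of listed scores: {mean(SCORES.values()):.2f}%")
    implied = implied_mean_of_unlisted(SCORES, REPORTED_AVERAGE, TOTAL_BENCHMARKS)
    print(f"Implied mean of unlisted scores: {implied:.2f}%")
```

Under that assumption, the six scores at or above 80% match the "High Performers" count, and the five benchmarks not shown would average just under 49%, well below the ten listed here.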