Kimi K2 Instruct
Zero-eval
#1 GSM8k
#1 CBNSL
#1 AutoLogi
+17 more
by Moonshot AI
About
Kimi K2 Instruct is the instruction-tuned version of Kimi K2, designed to follow user instructions reliably across diverse tasks. It targets general-purpose conversational and task-completion applications and is Moonshot AI's general-access model in the Kimi K2 family.
Pricing Range
Input (per 1M tokens): $0.57 - $0.57
Output (per 1M tokens): $2.30 - $2.30
Providers: 1
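As a rough illustration of how the per-1M-token prices above translate into a per-request cost, here is a minimal sketch. The helper name `estimate_cost` is hypothetical, and it assumes billing is strictly linear in token count, which the page does not state.

```python
# Prices copied from the Pricing Range section (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.57
OUTPUT_PRICE_PER_M = 2.30


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes cost scales linearly with tokens; real providers may
    round, apply minimums, or discount cached input.
    """
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # 10,000 input tokens and 2,000 output tokens.
    print(f"${estimate_cost(10_000, 2_000):.4f}")
```

Under these assumptions, a request with 10,000 input tokens and 2,000 output tokens comes to about $0.0103.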
Timeline
Announced: Jul 11, 2025
Released: Jul 11, 2025
Specifications
Training Tokens: 15.5T
License & Family
License: MIT
Base Model: Kimi K2 Base
Performance Overview
Performance metrics and category breakdown
Overall Performance
38 benchmarks
Average Score: 66.7%
Best Score: 97.4%
High Performers (80%+): 12
Performance Metrics
Max Context Window: 262.1K
Avg Throughput: 45.0 tok/s
Avg Latency: 1ms
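To give a feel for what the throughput and latency figures above imply, the sketch below estimates wall-clock time for a response. It assumes "Avg Latency" means time before the first token and "Avg Throughput" is a constant decode rate; the page does not define either metric, and the function name is hypothetical.

```python
# Figures copied from the Performance Metrics section.
AVG_THROUGHPUT_TOK_PER_S = 45.0
AVG_LATENCY_S = 0.001  # 1 ms


def estimate_generation_seconds(output_tokens: int) -> float:
    """Estimate seconds to produce a response (hypothetical helper).

    Assumed model: fixed startup latency plus tokens divided by a
    constant throughput. Real latency varies with load and prompt size.
    """
    return AVG_LATENCY_S + output_tokens / AVG_THROUGHPUT_TOK_PER_S


if __name__ == "__main__":
    print(f"{estimate_generation_seconds(900):.3f} s")
```

Under this simple model a 900-token reply takes roughly 20 seconds, so throughput rather than latency dominates for anything beyond a few tokens.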
All Benchmark Results for Kimi K2 Instruct
Complete list of benchmark scores with detailed information
| Benchmark | Modality | Score | Score (%) | Source |
| --- | --- | --- | --- | --- |
| MATH-500 | text | 0.97 | 97.4% | Self-reported |
| GSM8k | text | 0.97 | 97.3% | Self-reported |
| CBNSL | text | 0.96 | 95.6% | Self-reported |
| HumanEval | text | 0.93 | 93.3% | Self-reported |
| MMLU-Redux | text | 0.93 | 92.7% | Self-reported |
| IFEval | text | 0.90 | 89.8% | Self-reported |
| MMLU | text | 0.90 | 89.5% | Self-reported |
| AutoLogi | text | 0.90 | 89.5% | Self-reported |
| ZebraLogic | text | 0.89 | 89.0% | Self-reported |
| MultiPL-E | text | 0.86 | 85.7% | Self-reported |
Showing 1 to 10 of 38 benchmarks
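The summary figures in Overall Performance (average, best, and 80%+ count) are presumably simple aggregates over all 38 benchmarks. The sketch below shows that kind of aggregation over only the ten rows listed on this page, so it will not reproduce the 66.7% overall average. The function names are hypothetical, and the aggregation method is an assumption rather than the site's documented formula.

```python
# Percent scores for the ten benchmarks shown on this page.
VISIBLE_SCORES = {
    "MATH-500": 97.4,
    "GSM8k": 97.3,
    "CBNSL": 95.6,
    "HumanEval": 93.3,
    "MMLU-Redux": 92.7,
    "IFEval": 89.8,
    "MMLU": 89.5,
    "AutoLogi": 89.5,
    "ZebraLogic": 89.0,
    "MultiPL-E": 85.7,
}


def summarize(scores: dict[str, float], threshold: float = 80.0) -> dict[str, float]:
    """Return mean, best score, and count at or above the threshold."""
    values = list(scores.values())
    return {
        "average": sum(values) / len(values),
        "best": max(values),
        "high_performers": sum(1 for v in values if v >= threshold),
    }


if __name__ == "__main__":
    summary = summarize(VISIBLE_SCORES)
    print(f"average of visible rows: {summary['average']:.2f}%")
    print(f"best: {summary['best']}%")
    print(f"80%+ count: {summary['high_performers']}")
```

Because the visible rows are the ten highest scores, their mean is far above the 38-benchmark average, which is a reminder that a paginated top-10 view overstates typical performance.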