SWE-bench Multilingual

About

SWE-bench Multilingual is a software engineering benchmark that extends the original (Python-only) SWE-bench to cover Java, TypeScript, JavaScript, Go, Rust, C, and C++. Each task asks a model to resolve a real-world issue drawn from an open-source repository, testing multilingual code understanding and debugging capabilities in authentic development scenarios.
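The percentage scores below are resolve rates: the fraction of benchmark issues for which a model's generated patch passes the repository's tests. A minimal sketch of that metric, using hypothetical instance IDs and results:

```python
# Minimal sketch of a SWE-bench-style resolve rate. The instance IDs and
# pass/fail results below are hypothetical, for illustration only.
def resolve_rate(results: dict[str, bool]) -> float:
    """Fraction of instances whose generated patch passed all tests."""
    if not results:
        return 0.0
    return sum(results.values()) / len(results)

results = {
    "repo-a__issue-101": True,   # patch applied and tests passed
    "repo-b__issue-202": False,  # patch failed the held-out tests
    "repo-c__issue-303": True,
}
print(f"{resolve_rate(results):.1%}")  # → 66.7%
```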

Evaluation Stats

Total Models: 5
Organizations: 2
Verified Results: 0
Self-Reported: 5
Benchmark Details

Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

Models: 5
Top Score: 57.9%
Average Score: 47.5%
High Performers (80%+): 0

Top Organizations

#1 DeepSeek: 3 models, 47.6% average
#2 Moonshot AI: 2 models, 47.3% average
Leaderboard
5 models ranked by performance on SWE-bench Multilingual

Date          License  Score
Sep 29, 2025  MIT      57.9%
Jan 10, 2025  MIT      54.5%
Jul 11, 2025  MIT      47.3%
Sep 5, 2025   MIT      47.3%
May 28, 2025  MIT      30.5%
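The summary figures in the Performance Overview can be reproduced directly from the five leaderboard scores:

```python
# Scores from the leaderboard above (percent of issues resolved).
scores = [57.9, 54.5, 47.3, 47.3, 30.5]

print(f"Top score: {max(scores):.1f}%")              # → Top score: 57.9%
print(f"Average:   {sum(scores) / len(scores):.1f}%")  # → Average:   47.5%
```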
Resources