MultiPL-E

Multilingual · text
About

MultiPL-E is a multilingual programming benchmark for evaluating the code generation performance of large language models across many programming languages. It extends existing Python code benchmarks (HumanEval and MBPP) by translating their prompts and unit tests into 18+ additional languages, testing a model's ability to generate syntactically correct and functionally accurate code across diverse paradigms and language ecosystems.
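
Scores on benchmarks of this kind are typically reported as pass@k: the model generates n candidate solutions per problem, each candidate is executed against the translated unit tests, and an unbiased estimator gives the probability that at least one of k samples passes. A minimal sketch of that estimator; whether this page's percentages are pass@1 is an assumption, as the page does not say:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021): probability
    that at least one of k samples, drawn from n generations of
    which c pass the unit tests, is correct."""
    if n - c < k:
        return 1.0  # every size-k draw must include a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. if 15 of 20 generations pass, pass@1 = 1 - 5/20 = 0.75
print(pass_at_k(20, 15, 1))
```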

Evaluation Stats
Total Models: 12
Organizations: 2
Verified Results: 0
Self-Reported: 12
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution (12 models)
Top Score: 87.9%
Average Score: 75.1%
High Performers (80%+): 4

Top Organizations
#1 Moonshot AI (2 models): 85.7%
#2 Alibaba Cloud / Qwen Team (10 models): 72.9%
Leaderboard
12 models ranked by performance on MultiPL-E
Resources