MBPP Plus
text
About
MBPP Plus is an enhanced version of the Mostly Basic Python Problems (MBPP) benchmark that adds test cases and tightens the evaluation criteria. The expanded test coverage checks the correctness and robustness of model-generated Python code more rigorously than the original MBPP does.
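To illustrate why extra test cases matter, here is a minimal sketch of how a candidate solution might pass a small base test set yet fail once more tests are added. The problem, the `buggy_max` function, the test data, and the `passes_all` helper are all hypothetical and written for this example only; this is not the benchmark's actual harness.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical test case shape: (positional arguments, expected output).
TestCase = Tuple[tuple, object]


def passes_all(candidate: Callable, tests: Iterable[TestCase]) -> bool:
    """Return True only if the candidate matches the expected output on every test."""
    for args, expected in tests:
        try:
            if candidate(*args) != expected:
                return False
        except Exception:
            # In this sketch, a crash counts as a failed test.
            return False
    return True


# Illustrative problem: return the largest element of a list.
def buggy_max(values):
    # Bug: starting from 0 gives the wrong answer for all-negative lists.
    best = 0
    for v in values:
        if v > best:
            best = v
    return best


# Assumed test data, invented for this example.
base_tests = [(([1, 2, 3],), 3), (([5, 4],), 5)]
extra_tests = [(([-3, -1, -2],), -1), (([0],), 0)]

passes_base = passes_all(buggy_max, base_tests)
passes_plus = passes_all(buggy_max, base_tests + extra_tests)
print(passes_base, passes_plus)
```

In this sketch the buggy solution passes the base tests but fails the expanded set, which is the kind of gap that additional test coverage is meant to expose.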
Evaluation Stats
Total Models: 1
Organizations: 1
Verified Results: 0
Self-Reported: 1
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution
1 model
Top Score
78.3%
Average Score
78.3%
High Performers (80%+)
0
Top Organizations
#1 Mistral AI
1 model
78.3%
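The summary figures above can be reproduced with a few lines of arithmetic. The sketch below assumes scores are stored as fractions of the maximum score of 1 and that the "80%+" cutoff is inclusive; both the data layout and the variable names are illustrative, not taken from the leaderboard's own code.

```python
# Scores as fractions of the maximum score (1), keyed by organization.
# Only one self-reported result is listed for this benchmark.
scores = {"Mistral AI": 0.783}

# Assumption: "80%+" means a score of at least 0.80.
HIGH_PERFORMER_THRESHOLD = 0.80

top_score = max(scores.values())
average_score = sum(scores.values()) / len(scores)
high_performers = sum(1 for s in scores.values() if s >= HIGH_PERFORMER_THRESHOLD)

print(f"Top: {top_score:.1%}, Average: {average_score:.1%}, High performers: {high_performers}")
```

With a single entry, the top and average scores coincide, and 78.3% falls below the 80% cutoff, so the high-performer count is 0.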
Leaderboard
1 model ranked by performance on MBPP Plus
Release Date | License | Score |
---|---|---|
Jun 20, 2025 | Apache 2.0 | 78.3% |