HumanEvalFIM-Average

About

HumanEvalFIM-Average is a fill-in-the-middle (FIM) code completion benchmark that reports a single score averaged across multiple FIM tasks. Each task asks a model to complete a missing code segment given the surrounding prefix and suffix context, testing contextual code understanding and completion accuracy in realistic IDE-style scenarios.
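To make the task format concrete, the sketch below shows one way a FIM evaluation step can be framed: assemble a prompt from prefix and suffix, ask the model for the middle, and average the results. The sentinel tokens (<PRE>, <SUF>, <MID>) and the exact-match scoring are illustrative assumptions, not the benchmark's official harness or any particular model's prompt format.

```python
# Minimal sketch of a fill-in-the-middle (FIM) evaluation step.
# Sentinel token names and exact-match scoring are assumptions for
# illustration; the real harness and prompt format may differ.

from dataclasses import dataclass

@dataclass
class FIMTask:
    prefix: str   # code before the missing span
    suffix: str   # code after the missing span
    target: str   # ground-truth middle segment

def build_fim_prompt(task: FIMTask) -> str:
    """Assemble a prefix/suffix prompt using placeholder sentinel tokens."""
    return f"<PRE>{task.prefix}<SUF>{task.suffix}<MID>"

def average_score(tasks: list[FIMTask], generate) -> float:
    """Average exact-match accuracy over all tasks (0.0 to 1.0)."""
    correct = 0
    for task in tasks:
        completion = generate(build_fim_prompt(task))
        correct += int(completion.strip() == task.target.strip())
    return correct / len(tasks)

# Example usage with a trivial stand-in "model":
example = FIMTask(prefix="def add(a, b):\n    return ",
                  suffix="\n", target="a + b")
print(average_score([example], generate=lambda prompt: "a + b"))  # -> 1.0
```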

Evaluation Stats
Total Models: 1
Organizations: 1
Verified Results: 0
Self-Reported: 1
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution: 1 model
Top Score: 91.6%
Average Score: 91.6%
High Performers (80%+): 1

Top Organizations

#1 Mistral AI: 1 model, 91.6%
Leaderboard
1 model ranked by performance on HumanEvalFIM-Average

Release Date: May 29, 2024
License: MNPL-0.1
Score: 91.6%
Resources