HumanEvalFIM-Average

About

HumanEvalFIM-Average is a fill-in-the-middle (FIM) code completion benchmark that reports a single score averaged across multiple FIM tasks. Each task asks a model to complete a missing code segment given the surrounding prefix and suffix context, testing contextual code understanding and completion accuracy in realistic IDE-style scenarios.
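To make the task format concrete, the sketch below shows one way a FIM evaluation step can be framed: assemble a prompt from prefix and suffix, ask the model for the middle, and average the results. The sentinel tokens (<PRE>, <SUF>, <MID>) and the exact-match scoring are illustrative assumptions, not the benchmark's official harness or any particular model's prompt format.

```python
# Minimal sketch of a fill-in-the-middle (FIM) evaluation step.
# Sentinel token names and exact-match scoring are assumptions for
# illustration; the real harness and prompt format may differ.

from dataclasses import dataclass

@dataclass
class FIMTask:
    prefix: str   # code before the missing span
    suffix: str   # code after the missing span
    target: str   # ground-truth middle segment

def build_fim_prompt(task: FIMTask) -> str:
    """Assemble a prefix/suffix prompt using placeholder sentinel tokens."""
    return f"<PRE>{task.prefix}<SUF>{task.suffix}<MID>"

def average_score(tasks: list[FIMTask], generate) -> float:
    """Average exact-match accuracy over all tasks (0.0 to 1.0)."""
    correct = 0
    for task in tasks:
        completion = generate(build_fim_prompt(task))
        correct += int(completion.strip() == task.target.strip())
    return correct / len(tasks)

# Example usage with a trivial stand-in "model":
example = FIMTask(prefix="def add(a, b):\n    return ",
                  suffix="\n", target="a + b")
print(average_score([example], generate=lambda prompt: "a + b"))  # -> 1.0
```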

Evaluation Stats
Total Models: 1
Organizations: 1
Verified Results: 0
Self-Reported: 1
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution: 1 model
Top Score: 91.6%
Average Score: 91.6%
High Performers (80%+): 1

Top Organizations

#1 Mistral AI: 1 model, 91.6%
Leaderboard
1 model ranked by performance on HumanEvalFIM-Average

Release Date: May 29, 2024
License: MNPL-0.1
Score: 91.6%
Resources