Aider-Polyglot Edit

About

Aider-Polyglot Edit is an AI code editing benchmark that evaluates language models' ability to modify and integrate code within existing Python codebases. It uses 133 Exercism coding exercises to test both code completion accuracy and adherence to specific edit formats. The benchmark measures practical coding skills, including file editing, code integration, and format compliance, that matter for real-world AI programming assistance.

Evaluation Stats
Total Models: 10
Organizations: 3
Verified Results: 0
Self-Reported: 10
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

Models: 10
Top Score: 79.7%
Average Score: 48.2%
High Performers (80%+): 0
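
The summary figures above can be reproduced from the ten scores listed in the leaderboard. The following is a minimal sketch, assuming the average is a plain arithmetic mean rounded half-up to one decimal place; the page does not state its rounding rule, and the `summarize` helper is hypothetical, written only for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores in percent, copied from the leaderboard on this page.
# Decimal is used so that rounding is not affected by binary float error.
SCORES = [
    Decimal(s)
    for s in ("79.7", "72.7", "60.4", "58.2", "56.7",
              "52.9", "44.9", "31.6", "18.2", "6.2")
]


def summarize(scores, high_threshold=Decimal("80")):
    """Hypothetical helper: recompute the score-distribution summary.

    Assumes an arithmetic mean rounded half-up to one decimal place.
    """
    average = sum(scores) / len(scores)
    return {
        "models": len(scores),
        "top": max(scores),
        "average": average.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP),
        "high_performers": sum(1 for s in scores if s >= high_threshold),
    }


summary = summarize(SCORES)
print(summary["models"], summary["top"], summary["average"], summary["high_performers"])
# prints: 10 79.7 48.2 0
```

Under these assumptions the recomputed values match the reported top score, average score, and high-performer count.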

Top Organizations

#1 DeepSeek: 1 model, 79.7%
#2 Google: 2 models, 64.7%
#3 OpenAI: 7 models, 38.9%
Leaderboard
10 models ranked by performance on Aider-Polyglot Edit
Rank | Date | License | Score
1 | Dec 25, 2024 | MIT + Model License (Commercial use allowed) | 79.7%
2 | May 20, 2025 | Proprietary | 72.7%
3 | Jan 30, 2025 | Proprietary | 60.4%
4 | Apr 16, 2025 | Proprietary | 58.2%
5 | May 20, 2025 | Proprietary | 56.7%
6 | Apr 14, 2025 | Proprietary | 52.9%
7 | Feb 27, 2025 | Proprietary | 44.9%
8 | Apr 14, 2025 | Proprietary | 31.6%
9 | Aug 6, 2024 | Proprietary | 18.2%
10 | Apr 14, 2025 | Proprietary | 6.2%
Resources