Aider-Polyglot Edit
About
Aider-Polyglot Edit is a code editing benchmark that evaluates language models' ability to modify and integrate code within existing Python codebases. It uses 133 Exercism coding exercises to test both code completion accuracy and adherence to specific edit formats. The benchmark measures the practical skills behind real-world AI programming assistance: file editing, code integration, and format compliance.
Evaluation Stats
Total Models: 10
Organizations: 3
Verified Results: 0
Self-Reported: 10
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 10 models
Top Score: 79.7%
Average Score: 48.2%
High Performers (80%+): 0
Top Organizations
#1 DeepSeek: 1 model, 79.7%
#2 Google: 2 models, 64.7%
#3 OpenAI: 7 models, 38.9%
Leaderboard
10 models ranked by performance on Aider-Polyglot Edit
Date | License | Score
---|---|---
Dec 25, 2024 | MIT + Model License (Commercial use allowed) | 79.7%
May 20, 2025 | Proprietary | 72.7%
Jan 30, 2025 | Proprietary | 60.4%
Apr 16, 2025 | Proprietary | 58.2%
May 20, 2025 | Proprietary | 56.7%
Apr 14, 2025 | Proprietary | 52.9%
Feb 27, 2025 | Proprietary | 44.9%
Apr 14, 2025 | Proprietary | 31.6%
Aug 6, 2024 | Proprietary | 18.2%
Apr 14, 2025 | Proprietary | 6.2%
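
The summary figures in the Performance Overview can be cross-checked against the ten leaderboard scores. The sketch below is only an illustration, not the site's own computation. It assumes the per-organization percentages are plain averages rounded half-up to one decimal. The table does not state which row belongs to which organization, so the grouping in `by_org` is an inference (it is the grouping that reproduces the published model counts and percentages). The helper names `pct` and `avg` are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores (percent) copied from the leaderboard rows, in ranked order.
# Decimal avoids binary floating-point surprises when rounding 48.15.
scores = [Decimal(s) for s in (
    "79.7", "72.7", "60.4", "58.2", "56.7",
    "52.9", "44.9", "31.6", "18.2", "6.2",
)]

def pct(value):
    """Round to one decimal place, half-up (an assumed rounding rule)."""
    return value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)

def avg(values):
    """Arithmetic mean of a list of Decimal scores, rounded with pct()."""
    return pct(sum(values) / len(values))

summary = {
    "top": max(scores),
    "average": avg(scores),
    "high_performers": sum(1 for s in scores if s >= Decimal("80")),
}

# ASSUMPTION: row-to-organization mapping inferred from the published
# model counts and percentages; it is not given in the table itself.
by_org = {
    "DeepSeek": [scores[0]],
    "Google": [scores[1], scores[4]],
    "OpenAI": [s for i, s in enumerate(scores) if i not in (0, 1, 4)],
}
org_avg = {org: avg(vals) for org, vals in by_org.items()}

print(summary)
print(org_avg)
```

Under these assumptions the computed values agree with the published overview: a top score of 79.7, an average of 48.2, no model at or above 80%, and organization averages of 79.7, 64.7, and 38.9.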