Aider

About

Aider's AI coding benchmark evaluates language models on 225 polyglot programming exercises in C++, Go, Java, JavaScript, Python, and Rust. It tests code editing, instruction following, and real-world programming capability, with pass rates ranging from 3.6% to 88%. The benchmark measures how well AI coding assistants translate natural language into executable code.

Evaluation Stats
Total Models: 4
Organizations: 2
Verified Results: 0
Self-Reported: 4
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

Models: 4
Top Score: 72.2%
Average Score: 59.9%
High Performers (80%+): 0

Top Organizations

#1 DeepSeek: 1 model, 72.2%
#2 Alibaba Cloud / Qwen Team: 3 models, 55.9%
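
The summary figures above can be reproduced from the four leaderboard scores. The sketch below is illustrative only: it assumes the page uses a simple arithmetic mean, and it infers (from the organization counts, not from any row label) that the 72.2% entry belongs to DeepSeek and the other three to the Qwen team. The names `SCORES` and `summarize` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# (organization, score in percent). The organization of each row is an
# inference from the "Top Organizations" counts, not stated per row.
SCORES = [
    ("DeepSeek", 72.2),
    ("Alibaba Cloud / Qwen Team", 61.8),
    ("Alibaba Cloud / Qwen Team", 55.6),
    ("Alibaba Cloud / Qwen Team", 50.2),
]


def summarize(scores, high_threshold=80.0):
    """Compute top score, mean score, high-performer count, and org ranking."""
    values = [score for _, score in scores]

    # Group scores by organization so each one can be averaged separately.
    by_org = defaultdict(list)
    for org, score in scores:
        by_org[org].append(score)
    org_avg = {org: mean(vals) for org, vals in by_org.items()}

    return {
        "top": max(values),
        "average": mean(values),
        # Count of models at or above the assumed 80% threshold.
        "high_performers": sum(v >= high_threshold for v in values),
        # Organizations sorted by their mean score, best first.
        "org_ranking": sorted(org_avg.items(), key=lambda kv: kv[1], reverse=True),
    }


summary = summarize(SCORES)
```

Under these assumptions the overall mean is 239.8 / 4 = 59.95 and the Qwen team mean is 167.6 / 3, roughly 55.87, which agree with the displayed 59.9% and 55.9% up to how the page rounds.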
Leaderboard
4 models ranked by performance on Aider

Date            License       Score
May 8, 2024     deepseek      72.2%
Apr 29, 2025    Apache 2.0    61.8%
Sep 19, 2024    Apache 2.0    55.6%
Apr 29, 2025    Apache 2.0    50.2%
Resources