Aider
About
Aider's polyglot coding benchmark evaluates language models on 225 programming exercises spanning C++, Go, Java, JavaScript, Python, and Rust. It tests code editing, instruction following, and real-world programming ability, with observed pass rates ranging from 3.6% to 88%, and provides a comprehensive measure of how well AI coding assistants translate natural language into executable code.
Evaluation Stats
Total Models: 4
Organizations: 2
Verified Results: 0
Self-Reported: 4
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 4 models
Top Score: 72.2%
Average Score: 59.9%
High Performers (80%+): 0

Top Organizations
#1 DeepSeek: 1 model, 72.2%
#2 Alibaba Cloud / Qwen Team: 3 models, 55.9%
Leaderboard
4 models ranked by performance on Aider
Release Date | License | Score
---|---|---
May 8, 2024 | deepseek | 72.2%
Apr 29, 2025 | Apache 2.0 | 61.8%
Sep 19, 2024 | Apache 2.0 | 55.6%
Apr 29, 2025 | Apache 2.0 | 50.2%
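The summary figures in the Performance Overview follow directly from the four self-reported scores above. Below is a minimal sketch of that arithmetic; the score values come from this table, while the grouping by organization is assumed from the Top Organizations panel (model names are not listed here).

```python
# Self-reported Aider scores from the leaderboard above, grouped by organization
# (percentages; individual model names are not shown in the table).
scores = {
    "DeepSeek": [72.2],
    "Alibaba Cloud / Qwen Team": [61.8, 55.6, 50.2],
}

# Flatten all scores across organizations.
all_scores = [s for org_scores in scores.values() for s in org_scores]

top_score = max(all_scores)                          # 72.2
average_score = sum(all_scores) / len(all_scores)    # ~59.9
high_performers = sum(s >= 80 for s in all_scores)   # 0

print(f"Top Score: {top_score:.1f}%")
print(f"Average Score: {average_score:.1f}%")
print(f"High Performers (80%+): {high_performers}")

# Per-organization averages, matching the "Top Organizations" panel.
for org, org_scores in scores.items():
    avg = sum(org_scores) / len(org_scores)
    print(f"{org}: {len(org_scores)} model(s), {avg:.1f}% average")
```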