Natural2Code
About
Natural2Code is a benchmark for evaluating natural language to code generation capabilities in interactive data science notebooks. It tests models' ability to generate executable code from natural language descriptions, focusing on data science workflows, programming tasks, and the integration of code generation within computational notebook environments.
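A task in this style pairs a natural-language instruction with notebook context and checks the generated code by execution. The following is a hypothetical illustration only; the prompt, data, and checking logic are invented for this sketch and are not drawn from the benchmark itself:

```python
import pandas as pd

# Hypothetical notebook context the model would see.
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Bergen", "Bergen"],
    "temp": [3.0, 5.0, 7.0, 9.0],
})

# Natural-language instruction (invented for illustration):
# "Compute the mean temperature per city, sorted descending."

# A candidate model completion:
result = df.groupby("city")["temp"].mean().sort_values(ascending=False)

# Execution-based check: compare the cell's output against a
# reference solution rather than matching the code text.
expected = pd.Series({"Bergen": 8.0, "Oslo": 4.0}, name="temp")
expected.index.name = "city"
assert result.equals(expected)
```

Checking by execution rather than string comparison is what lets such benchmarks accept any functionally correct solution, not just one canonical snippet.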
Evaluation Stats
Total Models: 8
Organizations: 1
Verified Results: 0
Self-Reported: 8
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 8 models
Top Score: 92.9%
Average Score: 78.1%
High Performers (80%+): 4
Top Organizations: #1 Google (8 models, avg. 78.1%)
Leaderboard
8 models ranked by performance on Natural2Code
Rank | Release Date | License | Score
---|---|---|---
1 | Dec 1, 2024 | Proprietary | 92.9%
2 | May 1, 2024 | Proprietary | 85.4%
3 | Mar 12, 2025 | Gemma | 84.5%
4 | Mar 12, 2025 | Gemma | 80.7%
5 | May 1, 2024 | Proprietary | 79.8%
6 | Mar 15, 2024 | Proprietary | 75.5%
7 | Mar 12, 2025 | Gemma | 70.3%
8 | Mar 12, 2025 | Gemma | 56.0%
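
The overview statistics reported earlier on this page can be reproduced directly from the leaderboard scores. A minimal sketch (the score list is copied from the table above; the variable names are arbitrary):

```python
# Scores from the leaderboard, in ranked order.
scores = [92.9, 85.4, 84.5, 80.7, 79.8, 75.5, 70.3, 56.0]

top = max(scores)                                  # top score
average = sum(scores) / len(scores)                # mean score
high_performers = sum(s >= 80.0 for s in scores)   # models at 80%+

print(f"Top: {top}%, Avg: {average:.1f}%, 80%+: {high_performers}")
# → Top: 92.9%, Avg: 78.1%, 80%+: 4
```

The computed values match the Performance Overview figures (top 92.9%, average 78.1%, four models at 80% or above).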