CodeForces

About

CodeForces is a competition-level programming benchmark that evaluates Large Language Models' reasoning capabilities on challenging algorithmic problems from the CodeForces platform. It tests advanced problem-solving, algorithmic thinking, and code generation against real competitive programming contests, providing human-comparable evaluation of AI coding ability in complex, contest-quality scenarios that demand sophisticated reasoning and optimization.

Evaluation Stats
Total Models: 6
Organizations: 3
Verified Results: 0
Self-Reported: 6
Benchmark Details
Max Score: 3000
Language: en
Performance Overview
Score distribution and top performers

Score Distribution (6 models)
Top Score: 87.4%
Average Score: 73.6%
High Performers (80%+): 2

Top Organizations

#1 OpenAI: 2 models, 85.6%
#2 DeepSeek: 3 models, 68.2%
#3 Alibaba Cloud / Qwen Team: 1 model, 65.9%
Leaderboard
6 models ranked by performance on CodeForces

Date           License      Score
Aug 5, 2025    Apache 2.0   87.4%
Aug 5, 2025    Apache 2.0   83.9%
Sep 29, 2025   MIT          70.7%
Jan 10, 2025   MIT          69.7%
Apr 29, 2025   Apache 2.0   65.9%
May 28, 2025   MIT          64.3%
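As a sanity check, the Performance Overview figures can be recomputed from the six leaderboard scores above. This is a minimal sketch using only the score values listed on this page; the model-to-organization mapping is not assumed.

```python
# Recompute the Performance Overview stats from the six
# leaderboard scores listed on this page.
scores = [87.4, 83.9, 70.7, 69.7, 65.9, 64.3]

top_score = max(scores)                           # 87.4
average = sum(scores) / len(scores)               # ~73.65, reported as 73.6%
high_performers = sum(s >= 80.0 for s in scores)  # 2 models at 80%+

print(f"Top Score: {top_score}%")
print(f"Average Score: ~{average:.2f}%")
print(f"High Performers (80%+): {high_performers}")
```

The results agree with the overview section: top score 87.4%, an average of about 73.65% (shown as 73.6%), and two models at or above 80%.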
Resources