LiveCodeBench

About

LiveCodeBench is a holistic, contamination-free code evaluation benchmark that continuously collects new programming problems from competitive coding platforms. Because the problem set is refreshed over time, models are less able to rely on memorized solutions, so scores better reflect genuine coding ability across algorithm implementation, problem-solving, and code generation tasks.
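One way to picture the contamination control described above is date-based filtering: a model is scored only on problems published after its training cutoff. The following is a minimal sketch of that idea, not the benchmark's actual code. The `Problem` class, the `problems_after_cutoff` helper, the problem titles, and all dates are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Problem:
    """Hypothetical record for one collected problem."""
    title: str
    released: date  # date the problem was published on its source platform


def problems_after_cutoff(problems, cutoff):
    """Keep only problems published strictly after the given cutoff date."""
    return [p for p in problems if p.released > cutoff]


# Illustrative data only; these are not real benchmark problems.
problems = [
    Problem("two-pointer-sum", date(2024, 3, 1)),
    Problem("grid-paths", date(2025, 1, 15)),
    Problem("interval-merge", date(2025, 6, 2)),
]

# Assumed training cutoff for some model under evaluation.
fresh = problems_after_cutoff(problems, cutoff=date(2024, 12, 31))
print([p.title for p in fresh])  # ['grid-paths', 'interval-merge']
```

Using a strict comparison excludes problems published on the cutoff date itself, which is the more conservative choice when the exact cutoff is uncertain.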

Evaluation Stats

Total Models: 50

Organizations: 10

Verified Results: 0

Self-Reported: 50

Benchmark Details

Max Score: 1

Language

Performance Overview

Score distribution and top performers

Score Distribution

50 models

Top Score

80.4%

Average Score

47.7%

High Performers (80%+)

Top Organizations

#1	xAI	5 models	79.6%
#2	Zhipu AI	2 models	71.8%
#3	NVIDIA	2 models	68.7%
#4	Moonshot AI	1 model	53.7%
#5	Microsoft	2 models	53.4%

Leaderboard

50 models ranked by performance on LiveCodeBench

Rank	Model	Organization	Date	License	Score
#01	Grok-3 Mini	xAI	Feb 17, 2025	Proprietary	80.4%
#02	Grok 4 Fast	xAI	Aug 28, 2025	Proprietary	80.0%
#03	Grok-4 Heavy	xAI	Jul 9, 2025	Proprietary	79.4%
#04	Grok-3	xAI	Feb 17, 2025	Proprietary	79.4%
#05	Grok-4	xAI	Jul 9, 2025	Proprietary	79.0%
#06	DeepSeek-V3.2-Exp	DeepSeek	Sep 29, 2025	MIT	74.1%
#07	DeepSeek-R1-0528	DeepSeek	May 28, 2025	MIT	73.3%
#08	GLM-4.5	Zhipu AI	Jul 28, 2025	MIT	72.9%
#09	Nemotron Nano 9B v2	NVIDIA	Aug 18, 2025	NVIDIA Open Model License Agreement	71.1%
#10	GLM-4.5-Air	Zhipu AI	Jul 28, 2025	MIT	70.7%

Showing 1 to 10 of 50 models
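
The per-organization percentages under Top Organizations appear consistent with a simple mean of each organization's model scores, at least for organizations whose models all appear in the ten rows shown. The sketch below checks that assumption for xAI and Zhipu AI using only the visible rows. The `org_means` helper is hypothetical, and the averaging rule is an inference from the numbers, not something the page states.

```python
from collections import defaultdict

# (model, organization, score in percent), copied from the ten visible rows.
rows = [
    ("Grok-3 Mini", "xAI", 80.4),
    ("Grok 4 Fast", "xAI", 80.0),
    ("Grok-4 Heavy", "xAI", 79.4),
    ("Grok-3", "xAI", 79.4),
    ("Grok-4", "xAI", 79.0),
    ("DeepSeek-V3.2-Exp", "DeepSeek", 74.1),
    ("DeepSeek-R1-0528", "DeepSeek", 73.3),
    ("GLM-4.5", "Zhipu AI", 72.9),
    ("Nemotron Nano 9B v2", "NVIDIA", 71.1),
    ("GLM-4.5-Air", "Zhipu AI", 70.7),
]


def org_means(rows):
    """Return the mean score per organization, rounded to one decimal place."""
    scores = defaultdict(list)
    for _model, org, score in rows:
        scores[org].append(score)
    return {org: round(sum(vals) / len(vals), 1) for org, vals in scores.items()}


means = org_means(rows)
print(means["xAI"])       # 79.6
print(means["Zhipu AI"])  # 71.8
```

Organizations with models outside the visible rows cannot be checked this way. NVIDIA, for example, is listed with 2 models but only one appears among the ten rows shown.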

...

Resources

Research Paper