ARC-C

text

About

ARC-C (ARC Challenge) is an advanced reasoning benchmark featuring complex multiple-choice science questions that require deep understanding and integration of knowledge across various domains. More challenging than ARC-E, it tests sophisticated reasoning, synthesis of information, and domain-specific knowledge in AI systems. The benchmark is designed to challenge even state-of-the-art AI models with questions requiring advanced logical inference and scientific comprehension.

Evaluation Stats

Total Models31

Organizations11

Verified Results0

Self-Reported31

Benchmark Details

Max Score1

Language

Performance Overview

Score distribution and top performers

Score Distribution

31 models

Top Score

96.9%

Average Score

77.6%

High Performers (80%+)

Top Organizations

#1Anthropic

3 models

92.9%

#2Amazon

3 models

92.5%

#3AI21 Labs

2 models

89.3%

#4Meta

4 models

88.4%

#5Microsoft

3 models

86.4%

Leaderboard

31 models ranked by performance on ARC-C

			License
#01Llama 3.1 405B Instruct	Meta	Jul 23, 2024	Llama 3.1 Community License	96.9%
#02Claude 3 Opus	Anthropic	Feb 29, 2024	Proprietary	96.4%
#03Llama 3.1 70B Instruct	Meta	Jul 23, 2024	Llama 3.1 Community License	94.8%
#04Nova Pro	Amazon	Nov 20, 2024	Proprietary	94.8%
#05Claude 3 Sonnet	Anthropic	Feb 29, 2024	Proprietary	93.2%
#06Jamba 1.5 Large	AI21 Labs	Aug 22, 2024	Jamba Open Model License	93.0%
#07Nova Lite	Amazon	Nov 20, 2024	Proprietary	92.4%
#08Mistral Small 3 24B Base	Mistral AI	Jan 30, 2025	Apache 2.0	91.3%
#09Phi-3.5-MoE-instruct	Microsoft	Aug 23, 2024	MIT	91.0%
#10Nova Micro	Amazon	Nov 20, 2024	Proprietary	90.2%

Showing 1 to 10 of 31 models

Resources

Research Paper