CommonSenseQA
About
CommonSenseQA is a multiple-choice question answering benchmark that evaluates AI models' commonsense reasoning abilities through questions requiring everyday knowledge and logical inference. The benchmark features questions designed to test intuitive understanding of the world, social situations, and cause-effect relationships. CommonSenseQA measures how well AI systems can apply common sense knowledge to answer questions that seem obvious to humans but challenge machine reasoning.
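To make the evaluation setup concrete, here is a minimal sketch of how a CommonSenseQA-style multiple-choice item might be formatted as a prompt and how accuracy over predicted answer letters could be computed. The field names and example question are illustrative, not the official dataset schema:

```python
# Minimal sketch: formatting and scoring a CommonSenseQA-style
# multiple-choice item. The schema here is illustrative only.

def format_prompt(question, choices):
    """Render a question with lettered answer choices (A-E)."""
    lines = [question]
    for letter, text in zip("ABCDE", choices):
        lines.append(f"{letter}. {text}")
    return "\n".join(lines)

def accuracy(predictions, gold):
    """Fraction of predicted answer letters matching the gold letters."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)

# Hypothetical example item, not drawn from the actual dataset.
example = {
    "question": "Where would you put a plate after washing it?",
    "choices": ["cupboard", "oven", "garden", "car", "river"],
    "answer": "A",
}

print(format_prompt(example["question"], example["choices"]))
print(accuracy(["A", "B", "C"], ["A", "B", "B"]))  # 2 of 3 correct
```

A reported score such as 70.4% corresponds to this accuracy metric over the full evaluation set.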
Evaluation Stats
Total Models: 1
Organizations: 1
Verified Results: 0
Self-Reported: 1
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 1 model
Top Score: 70.4%
Average Score: 70.4%
High Performers (80%+): 0
Top Organizations:
#1 Mistral AI (1 model, 70.4%)
Leaderboard
1 model ranked by performance on CommonSenseQA

Date | License | Score | Links
---|---|---|---
Jul 18, 2024 | Apache 2.0 | 70.4% |