AMC_2022_23
About
AMC 2022-23 is an AI benchmark derived from American Mathematics Competitions problems from 2022 and 2023, designed to evaluate the mathematical reasoning capabilities of large language models. This challenging dataset tests AI systems on competition-level mathematics problems that require advanced problem-solving techniques, logical reasoning, and mathematical knowledge across algebra, geometry, number theory, and combinatorics. It is used extensively in LLM evaluation.
Evaluation Stats
Total Models: 2
Organizations: 1
Verified Results: 0
Self-Reported: 2
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 2 models
Top Score: 46.4%
Average Score: 40.6%
High Performers (80%+): 0

Top Organizations
#1 Google: 2 models, 40.6% average score
Leaderboard
2 models ranked by performance on AMC_2022_23
| Release Date | License | Score | Links |
|---|---|---|---|
| May 1, 2024 | Proprietary | 46.4% | |
| May 1, 2024 | Proprietary | 34.8% | |
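
The aggregate figures in the Performance Overview follow directly from the two self-reported leaderboard scores. A minimal sketch of that arithmetic, assuming these are the only two results counted:

```python
# Recompute the overview stats from the two self-reported scores above.
scores = [46.4, 34.8]  # per-model scores in percent, from the leaderboard

top_score = max(scores)                            # 46.4%
average_score = sum(scores) / len(scores)          # (46.4 + 34.8) / 2 = 40.6%
high_performers = sum(s >= 80.0 for s in scores)   # 0 models at 80%+

print(f"Top: {top_score}%  Average: {average_score}%  80%+: {high_performers}")
```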