HiddenMath

text
About

HiddenMath is a mathematical reasoning benchmark that evaluates how well AI models solve complex problems requiring multi-step analysis. Its problems go beyond simple arithmetic and measure advanced mathematical reasoning and logical deduction.

Evaluation Stats
Total Models: 13
Organizations: 1
Verified Results: 0
Self-Reported: 13
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

13 models
Top Score: 63.0%
Average Score: 42.7%
High Performers (80%+): 0

Top Organizations

#1 Google
13 models
42.7%
Leaderboard
13 models ranked by performance on HiddenMath
Date            License        Score
Dec 1, 2024     Proprietary    63.0%
Mar 12, 2025    Gemma          60.3%
Feb 5, 2025     Proprietary    55.3%
Mar 12, 2025    Gemma          54.5%
May 1, 2024     Proprietary    52.0%
May 1, 2024     Proprietary    47.2%
Mar 12, 2025    Gemma          43.0%
May 20, 2025    Gemma          37.7%
Jun 26, 2025    Proprietary    37.7%
Mar 15, 2024    Proprietary    32.8%
Showing 1 to 10 of 13 models
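
The summary figures can be cross-checked against the ten rows listed here. The sketch below is illustrative only: it assumes the reported 42.7% average is a plain arithmetic mean over all 13 models (the page does not say how it is computed), and the function name `summarize` is hypothetical. Under that assumption, it estimates the mean score of the three models that are not shown on this page.

```python
# Scores (in percent) of the ten rows visible on this page of the leaderboard.
visible_scores = [63.0, 60.3, 55.3, 54.5, 52.0, 47.2, 43.0, 37.7, 37.7, 32.8]

TOTAL_MODELS = 13         # from "Evaluation Stats"
REPORTED_AVERAGE = 42.7   # from "Score Distribution", in percent


def summarize(scores, total_models, reported_average):
    """Cross-check leaderboard summary stats (hypothetical helper).

    Assumes reported_average is an unweighted mean over all models.
    """
    visible_sum = sum(scores)
    hidden_count = total_models - len(scores)
    # Implied total over all models minus what is visible on this page.
    hidden_sum = reported_average * total_models - visible_sum
    return {
        "top_score": max(scores),
        "visible_mean": visible_sum / len(scores),
        "hidden_count": hidden_count,
        # Only an estimate: the reported average is rounded to one decimal.
        "hidden_mean": hidden_sum / hidden_count if hidden_count else None,
        "high_performers": sum(1 for s in scores if s >= 80.0),
    }


if __name__ == "__main__":
    for key, value in summarize(
        visible_scores, TOTAL_MODELS, REPORTED_AVERAGE
    ).items():
        print(f"{key}: {value}")
```

Because the visible rows average well above 42.7%, the three unlisted models must score noticeably lower than the listed ones if the assumption about the average holds.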
Resources