HiddenMath
text
About
HiddenMath is a mathematical reasoning benchmark that evaluates how well AI models solve complex problems requiring multi-step analysis rather than simple arithmetic. It measures advanced mathematical reasoning and logical deduction.
Evaluation Stats
Total Models: 13
Organizations: 1
Verified Results: 0
Self-Reported: 13
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 13 models
Top Score: 63.0%
Average Score: 42.7%
High Performers (80%+): 0
Top Organizations
#1 Google (13 models, 42.7% average)
Leaderboard
13 models ranked by performance on HiddenMath
Date | License | Score
---|---|---
Dec 1, 2024 | Proprietary | 63.0%
Mar 12, 2025 | Gemma | 60.3%
Feb 5, 2025 | Proprietary | 55.3%
Mar 12, 2025 | Gemma | 54.5%
May 1, 2024 | Proprietary | 52.0%
May 1, 2024 | Proprietary | 47.2%
Mar 12, 2025 | Gemma | 43.0%
May 20, 2025 | Gemma | 37.7%
Jun 26, 2025 | Proprietary | 37.7%
Mar 15, 2024 | Proprietary | 32.8%
Showing 1 to 10 of 13 models
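The summary figures can be cross-checked against the ten rows listed above. The following is a minimal sketch, not part of the benchmark itself: the variable names are illustrative, and the implied mean for the three unlisted models is only approximate because the reported 42.7% average is itself rounded.

```python
# Scores (in percent) of the ten leaderboard rows listed on this page.
visible_scores = [63.0, 60.3, 55.3, 54.5, 52.0, 47.2, 43.0, 37.7, 37.7, 32.8]

TOTAL_MODELS = 13         # reported number of ranked models
REPORTED_AVERAGE = 42.7   # reported average over all 13 models (rounded)
HIGH_THRESHOLD = 80.0     # cutoff used for "High Performers (80%+)"

# Statistics that follow directly from the listed rows.
top_score = max(visible_scores)
high_performers = sum(1 for s in visible_scores if s >= HIGH_THRESHOLD)
visible_mean = sum(visible_scores) / len(visible_scores)

# Rough estimate for the rows that are not shown (assumption: the reported
# average is the plain mean of all 13 scores).
hidden_count = TOTAL_MODELS - len(visible_scores)
hidden_total = REPORTED_AVERAGE * TOTAL_MODELS - sum(visible_scores)
implied_hidden_mean = hidden_total / hidden_count

print(f"Top score: {top_score:.1f}%")
print(f"High performers (80%+): {high_performers}")
print(f"Mean of listed rows: {visible_mean:.2f}%")
print(f"Implied mean of {hidden_count} unlisted models: ~{implied_hidden_mean:.1f}%")
```

Because the leaderboard is sorted by score, the unlisted models all score at or below 32.8%, which is consistent with the overall average sitting below the mean of the listed rows.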