TheoremQA

About

TheoremQA is the first theorem-driven question-answering benchmark. It contains 800 high-quality questions covering 350 theorems across Mathematics, Physics, Electrical Engineering, Computer Science, and Finance. Curated by domain experts, it tests whether AI models can apply theorems and theoretical knowledge to solve challenging science problems.

Evaluation Stats
Total Models: 6
Organizations: 1
Verified Results: 0
Self-Reported: 6
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

6 models
Top Score: 44.4%
Average Score: 39.0%
High Performers (80%+): 0

Top Organizations

#1 Alibaba Cloud / Qwen Team (6 models, 39.0%)
Leaderboard
6 models ranked by performance on TheoremQA
Rank  Date          License         Score
1     Jul 23, 2024  tongyi-qianwen  44.4%
2     Sep 19, 2024  Apache 2.0      44.1%
3     Sep 19, 2024  Apache 2.0      43.1%
4     Sep 19, 2024  Apache 2.0      43.0%
5     Sep 19, 2024  Apache 2.0      34.0%
6     Jul 23, 2024  Apache 2.0      25.3%
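
The summary figures in the Performance Overview can be checked against the six leaderboard scores. The following is a minimal sketch, assuming the reported average is a plain unweighted mean of the listed percentages and that "High Performers" counts scores at or above 80%; the variable names are illustrative and not part of any TheoremQA tooling.

```python
# Scores (in percent) copied from the leaderboard, highest to lowest.
scores = [44.4, 44.1, 43.1, 43.0, 34.0, 25.3]

# Assumed cutoff, taken from the page's "High Performers (80%+)" label.
HIGH_PERFORMER_THRESHOLD = 80.0

top_score = max(scores)
# Assumption: the page reports an unweighted arithmetic mean.
average_score = sum(scores) / len(scores)
high_performers = sum(1 for s in scores if s >= HIGH_PERFORMER_THRESHOLD)

print(f"Models: {len(scores)}")
print(f"Top score: {top_score:.1f}%")
print(f"Average score: {average_score:.1f}%")
print(f"High performers (80%+): {high_performers}")
```

Under these assumptions the computed values agree with the figures shown on the page: a top score of 44.4%, an average that rounds to 39.0%, and no model at or above the 80% threshold.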
Resources