PopQA
text
About
PopQA is an open-domain question-answering benchmark of 14,000 QA pairs, each annotated with fine-grained entity popularity derived from Wikidata and Wikipedia page views. It measures how a model's factual accuracy varies with entity popularity, testing both knowledge recall and bias toward popular over obscure entities.
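As a rough illustration of the kind of analysis this description implies, the sketch below groups answers into order-of-magnitude popularity buckets and reports accuracy per bucket. The record layout (a page-view count paired with a correct/incorrect flag), the function name `accuracy_by_popularity`, and the sample data are all hypothetical and are not taken from the benchmark itself.

```python
import math
from collections import defaultdict


def accuracy_by_popularity(records):
    """Return accuracy per log10 popularity bucket.

    `records` is an iterable of (popularity, correct) pairs, where
    popularity is a page-view count and correct is a bool.
    This layout is an assumption made for illustration only.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for popularity, correct in records:
        # Bucket by order of magnitude; clamp to 1 so log10 is defined.
        bucket = int(math.log10(max(popularity, 1)))
        totals[bucket] += 1
        hits[bucket] += int(correct)
    return {b: hits[b] / totals[b] for b in sorted(totals)}


# Made-up records: (monthly page views, was the answer correct?)
records = [
    (50, False), (80, False),
    (700, True), (900, False),
    (12_000, True), (45_000, True),
]
result = accuracy_by_popularity(records)
print(result)
```

A per-bucket breakdown like this is one plausible way to expose a gap between popular and obscure entities that a single aggregate score would hide.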
Evaluation Stats
Total Models: 3
Organizations: 1
Verified Results: 0
Self-Reported: 3
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution
3 models
Top Score
26.2%
Average Score
25.1%
High Performers (80%+)
0
Top Organizations
#1 IBM
3 models
25.1%
Leaderboard
3 models ranked by performance on PopQA
Date | License | Score
---|---|---
Apr 16, 2025 | Apache 2.0 | 26.2%
Apr 16, 2025 | Apache 2.0 | 26.2%
May 2, 2025 | Apache 2.0 | 22.9%
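The summary figures in the Performance Overview can be checked against the three leaderboard scores. The sketch below assumes the page's average is a plain arithmetic mean rounded to one decimal place and that a "high performer" is any score of 80% or more; both assumptions are consistent with the numbers shown but are not stated by the source.

```python
from statistics import mean

# Scores copied from the leaderboard rows above, in percent.
scores = [26.2, 26.2, 22.9]

top_score = max(scores)
# Assumed aggregation: arithmetic mean, rounded to one decimal place.
average_score = round(mean(scores), 1)
# Assumed threshold: a high performer scores 80% or more.
high_performers = sum(1 for s in scores if s >= 80.0)

print(f"Top score: {top_score}%")
print(f"Average score: {average_score}%")
print(f"High performers (80%+): {high_performers}")
```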