PopQA

About

PopQA is an open-domain question-answering benchmark of about 14,000 QA pairs, each annotated with fine-grained entity popularity information from Wikidata and Wikipedia page views. It measures how a model's factual accuracy varies with entity popularity, testing both knowledge retrieval and potential bias toward popular over obscure entities.
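
As a rough illustration of what comparing accuracy against entity popularity can look like, the sketch below groups answers into popularity buckets and reports accuracy per bucket. The records, the bucket thresholds, and the names `popularity_bucket` and `accuracy_by_bucket` are all hypothetical; they are not taken from the PopQA dataset or from any official evaluation script.

```python
from collections import defaultdict

# Hypothetical records, NOT real PopQA data:
# (monthly page views of the question's entity, whether the model answered correctly)
records = [
    (120, False),
    (950, False),
    (310, True),
    (4_800, True),
    (15_000, True),
    (72_000, True),
]


def popularity_bucket(page_views: int) -> str:
    """Assign an entity to a coarse popularity bucket (thresholds are arbitrary)."""
    if page_views < 1_000:
        return "low"
    if page_views < 10_000:
        return "medium"
    return "high"


def accuracy_by_bucket(rows):
    """Return {bucket: fraction of correct answers} for the given rows."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for page_views, is_correct in rows:
        bucket = popularity_bucket(page_views)
        totals[bucket] += 1
        correct[bucket] += int(is_correct)
    return {bucket: correct[bucket] / totals[bucket] for bucket in totals}


bucket_accuracy = accuracy_by_bucket(records)
for bucket in ("low", "medium", "high"):
    print(f"{bucket}: {bucket_accuracy[bucket]:.2f}")
```

A large gap between the low and high buckets would suggest that a model's factual recall depends heavily on how popular the entity is.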

Evaluation Stats
Total Models: 3
Organizations: 1
Verified Results: 0
Self-Reported: 3
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

Models: 3
Top Score: 26.2%
Average Score: 25.1%
High Performers (80%+): 0

Top Organizations

#1 IBM (3 models, 25.1%)
Leaderboard
3 models ranked by performance on PopQA
Date | License | Score
Apr 16, 2025 | Apache 2.0 | 26.2%
Apr 16, 2025 | Apache 2.0 | 26.2%
May 2, 2025 | Apache 2.0 | 22.9%
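
The summary figures in the Performance Overview can be reproduced from the three leaderboard scores above. The following is a minimal sketch of that arithmetic; the variable names are illustrative, and the 80% cutoff is taken from the "High Performers (80%+)" label.

```python
# Scores (in percent) of the three leaderboard entries listed above.
scores = [26.2, 26.2, 22.9]

top_score = max(scores)
# Mean of the three scores, rounded to one decimal place as on the page.
average_score = round(sum(scores) / len(scores), 1)
# Count of entries at or above the 80% "high performer" threshold.
high_performers = sum(1 for s in scores if s >= 80.0)

print(f"Top Score: {top_score}%")
print(f"Average Score: {average_score}%")
print(f"High Performers (80%+): {high_performers}")
```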
Resources