Qasper

text
About

QASPER is a question-answering benchmark for scientific research papers, with 5,049 information-seeking questions over 1,585 Natural Language Processing papers. It tests a model's ability to understand scientific content, locate relevant information in long academic texts, and answer domain-specific research questions accurately.

Evaluation Stats
Total Models: 2
Organizations: 1
Verified Results: 0
Self-Reported: 2
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

Models: 2
Top Score: 41.9%
Average Score: 40.9%
High Performers (80%+): 0
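
As a rough check on the figures above, the summary can be recomputed from the two leaderboard scores. This is a minimal sketch, not the site's own code: the function name `summarize` is hypothetical, and the only inputs are the two scores and the 80% cutoff shown on the page. The exact mean of the two scores is 40.95, while the page shows 40.9%; how the page rounds is not stated.

```python
# Leaderboard scores as shown on the page, in percent.
scores = [41.9, 40.0]

# Cutoff taken from the page's "High Performers (80%+)" label.
HIGH_PERFORMER_THRESHOLD = 80.0


def summarize(scores, threshold=HIGH_PERFORMER_THRESHOLD):
    """Return model count, top score, mean score, and the number of
    scores at or above the threshold (hypothetical helper)."""
    return {
        "models": len(scores),
        "top": max(scores),
        "average": sum(scores) / len(scores),
        "high_performers": sum(1 for s in scores if s >= threshold),
    }


summary = summarize(scores)
print(summary)
```

Since Max Score is 1, a displayed 41.9% would correspond to a raw score of about 0.419, assuming the percentages are the raw scores multiplied by 100.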

Top Organizations

#1 Microsoft
2 models
40.9%
Leaderboard
2 models ranked by performance on Qasper
Date          License  Score
Aug 23, 2024  MIT      41.9%
Aug 23, 2024  MIT      40.0%
Resources