SQuALITY
About
SQuALITY is a question-focused summarization benchmark built from 100 Project Gutenberg short stories, with 500 questions and 2,000 reference summaries written by trained writers. It tests a model's ability to summarize a long document in response to a specific question, which requires both reading comprehension and selecting only the relevant information.
Evaluation Stats
Total Models: 5
Organizations: 2
Verified Results: 0
Self-Reported: 5
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 5 models
Top Score: 24.3%
Average Score: 21.2%
High Performers (80%+): 0
Top Organizations
#1 Microsoft: 2 models, 24.2%
#2 Amazon: 3 models, 19.3%
Leaderboard
5 models ranked by performance on SQuALITY
Date | License | Score |
---|---|---|
Aug 23, 2024 | MIT | 24.3% |
Aug 23, 2024 | MIT | 24.1% |
Nov 20, 2024 | Proprietary | 19.8% |
Nov 20, 2024 | Proprietary | 19.2% |
Nov 20, 2024 | Proprietary | 18.8% |
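
The summary figures can be cross-checked against the leaderboard rows. The following is a minimal Python sketch, assuming the reported values are plain arithmetic means rounded to one decimal place. Because the rows do not carry model or organization names, it groups by license as a stand-in; the names `rows` and `license_means` are illustrative only.

```python
from collections import defaultdict

# (license, score in percent), copied from the leaderboard rows.
rows = [
    ("MIT", 24.3),
    ("MIT", 24.1),
    ("Proprietary", 19.8),
    ("Proprietary", 19.2),
    ("Proprietary", 18.8),
]

scores = [score for _, score in rows]
top_score = max(scores)
# Assumption: the page reports the mean rounded to one decimal place.
average_score = round(sum(scores) / len(scores), 1)

# Group scores by license, since organization names are not in the rows.
by_license = defaultdict(list)
for license_name, score in rows:
    by_license[license_name].append(score)

license_means = {
    name: round(sum(values) / len(values), 1)
    for name, values in by_license.items()
}

print(top_score, average_score, license_means)
```

Under these assumptions the sketch reproduces the 24.3% top score and the 21.2% average. The per-license means also equal the 24.2% and 19.3% listed for the two organizations, which is consistent with, but does not confirm, the two MIT rows belonging to Microsoft and the three proprietary rows to Amazon.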