TriviaQA
About
TriviaQA is a large-scale reading comprehension benchmark of over 650,000 question-answer-evidence triples built from 95,000 question-answer pairs authored by trivia enthusiasts. Because the evidence documents are gathered independently of the questions (a form of distant supervision), the benchmark tests a model's ability to locate and comprehend relevant information in order to answer factual questions across diverse knowledge domains.
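
For readers who want to inspect the triples directly, here is a minimal sketch using the Hugging Face `datasets` library. The `trivia_qa` dataset id, the `rc` configuration name, and the field names below assume the Hub copy of the dataset:

```python
# Minimal sketch: inspect TriviaQA question-answer-evidence triples.
# Assumes the Hugging Face Hub copy of the dataset ("trivia_qa", config "rc",
# the reading-comprehension setting that pairs questions with evidence documents).
from datasets import load_dataset

trivia_qa = load_dataset("trivia_qa", "rc", split="validation")

example = trivia_qa[0]
print(example["question"])                # the trivia question
print(example["answer"]["value"])         # canonical answer string
print(example["answer"]["aliases"][:5])   # accepted answer aliases
print(example["entity_pages"]["title"])   # titles of Wikipedia evidence pages
```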
Evaluation Stats
Total Models: 13
Organizations: 4
Verified Results: 0
Self-Reported: 13
Benchmark Details
Max Score: 1
Language: en
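
Leaderboard scores are fractions of this maximum score of 1, shown as percentages. TriviaQA is conventionally scored with normalized exact match against the accepted answer aliases; the sketch below shows that convention, since this leaderboard's exact scoring script is not stated:

```python
import re
import string

def normalize(text: str) -> str:
    """Lowercase, drop articles and punctuation, collapse whitespace
    (the SQuAD/TriviaQA-style answer normalization)."""
    text = text.lower()
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    return " ".join(text.split())

def exact_match(prediction: str, gold_aliases: list[str]) -> float:
    """1.0 if the normalized prediction matches any accepted alias, else 0.0."""
    return float(any(normalize(prediction) == normalize(g) for g in gold_aliases))

# A leaderboard score like 85.1% is the mean exact match over all questions.
preds = [("Neil Armstrong", ["Neil Armstrong", "Armstrong"]),
         ("the Pacific", ["Pacific Ocean"])]
score = sum(exact_match(p, gs) for p, gs in preds) / len(preds)
print(f"{score:.1%}")  # 50.0%
```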
Performance Overview
Score distribution and top performers
Score Distribution: 13 models
Top Score: 85.1%
Average Score: 74.3%
High Performers (80%+): 5

Top Organizations
#1 Moonshot AI (1 model): 85.1%
#2 IBM (1 model): 78.2%
#3 Mistral AI (5 models): 76.1%
#4 Google (6 models): 70.4%
Leaderboard
13 models ranked by performance on TriviaQA
| Release Date | License | Score |
|---|---|---|
| Jul 11, 2025 | MIT | 85.1% |
| Jun 27, 2024 | Gemma | 83.7% |
| Mar 17, 2025 | Apache 2.0 | 80.5% |
| Mar 17, 2025 | Apache 2.0 | 80.5% |
| Jan 30, 2025 | Apache 2.0 | 80.3% |
| Apr 16, 2025 | Apache 2.0 | 78.2% |
| Jun 27, 2024 | Gemma | 76.6% |
| Jul 18, 2024 | Apache 2.0 | 73.8% |
| May 20, 2025 | Gemma | 70.2% |
| Jun 26, 2025 | Proprietary | 70.2% |
Showing 1 to 10 of 13 models